[FFmpeg-devel] [PATCH] AVCHD/H.264 parser: determination of frame type, question about timestamps

Sun Jan 25 20:59:22 CET 2009

On Fri, Jan 23, 2009 at 10:46:39PM +0100, Ivan Schreter wrote:
> Hi,
> 
> Michael Niedermayer wrote:
> > On Mon, Jan 19, 2009 at 09:22:27PM +0100, Ivan Schreter wrote
> >> +    s->pict_type= FF_I_TYPE;
> >>     
> > useless?
> >   
> Most probably. Each (ffmpeg) frame will have at least one (H.264) slice.
> >> +    while(buf<buf_end){
> >> +        buf= ff_find_start_code(buf, buf_end, &state);
> >> +        init_get_bits(&h->s.gb, buf, 8*(buf_end - buf));
> >> +        switch(state & 0x1F){
> >> [...]
> >> +        case NAL_IDR_SLICE:
> >> +        case NAL_SLICE:
> >> +            get_ue_golomb(&h->s.gb);
> >> +            s->pict_type= golomb_to_pict_type[get_ue_golomb(&h->s.gb) % 5];
> >>     
> > IIRC that should be get_ue_golomb_31() also its missing some check against
> > negative values
> >   
> I don't believe so. First, slice_type is defined in H.264 standard in 
> 7.3.3 (Slice header syntax) being ue(v), so not limited to 31. If I 

this reasoning is flawed, to proof something is allowed you need to find
an explicit statement that it is allowed or proof the non existence of a
statement making it invalid.
Showing that a single line of text at some point doesnt outright contradict
it is not enough.

> understand golomb.h correctly, this corresponds to get_ue_golomb(). Of 
> course, since there are just 5 slice types, this value should be small, 
> but some broken encoder could write something else in there. Do we want 
> to live with it (and save just a single if per picture)? Further, it is 
> unsigned, so it can't be negative. 

> And modulo 5 makes it always being in 
> the range 0..4, as in the array.

-1 % 5

> 
> I now dived a little more into H.264 standard in the search how to 
> correctly get key frames for seeking.
> 

> As I see, just getting type of one slice is not quite correct. Although 
> the AVCHD/H.264 files I have at hand all use one slice per field, in 
> H.264, it seems to be allowed to have also several slices. And even if 
> the slice is an I-slice, it doesn't mean it is going to be a key frame 
> in ffmpeg sense.

yes

> 
> In H.264, the only "sure" key frames (in the sense of AVPacket::flags) 
> are IDR pictures. However, in the sample files I have, there are 
> actually no IDR pictures (save the very first one)... So even misusing 
> pict_type by filling it with FF_I_TYPE only for IDR pictures would not 
> help to compute key frames correctly.

also true

> 
> I believe the only way is to use recovery point SEI message as 
> documented in H.264 spec Chapter D.2.7 and count down recovery_frame_cnt 
> frames before issuing a "key frame" in order for seeking to work 

no, the keyframe flag must be set at the point of the SEI not after it
you also should set convergence_duration
Still even if we do just part of this its already a step forward, important
is that we move toward the correct solution instead of adding random hacks.

> properly (well, it's a prerequisite, not a solution). But again, how to 
> communicate this to avformat, so it can correctly set key frame? 
> Currenty, compute_pkt_fields() does this:
> 
>     /* update flags */
>     if(is_intra_only(st->codec))
>         pkt->flags |= PKT_FLAG_KEY;
>     else if (pc) {
>         pkt->flags = 0;
>         /* keyframe computation */
>             if (pc->pict_type == FF_I_TYPE)
>                 pkt->flags |= PKT_FLAG_KEY;
>     }
> 
> which is IMHO broken. Of course, we could communicate with it by setting 
> pict_type to FF_I_TYPE for keyframes only (IDR frames and frames after 
> recovery point), for other frames containing I- and P-slices to 
> FF_P_TYPE and for B-frames to FF_B_TYPE. But I don't like it much. Any 
> idea, how to do it correctly without the need to touch other codecs?

pict_type from the parse context should likely be split in pict_type and
keyframe

> 
> Further, I found out that av_read_frame() sometimes reads only one field 
> of a frame (especially in interlaced mpegts AVCHD files), which confuses 
> the rest of ffmpeg. I suppose, this is a bug in h264_parse(), which 
> returns a single field instead of whole frame (when a frame is coded as 
> two fields in two pictures), but I didn't find a way yet, how to address 
> the problem. Any idea?

keep in mind that AFIAK h264 allows unpaired fields that is
1 frame, 1 field, 1 frame
I have no samples that actually does this so we might get away with ignoring
it but if we choose to not support it we should have some argument why
it would be too complex ..

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The worst form of inequality is to try to make unequal things equal.
-- Aristotle
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090125/c5c7b101/attachment.pgp>