[FFmpeg-devel] [PATCH] AVCHD/H.264 parser: determination of frame type, question about timestamps

Fri Jan 23 22:46:39 CET 2009

Hi,

Michael Niedermayer wrote:
> On Mon, Jan 19, 2009 at 09:22:27PM +0100, Ivan Schreter wrote
>> +    s->pict_type= FF_I_TYPE;
>>     
> useless?
>   
Most probably. Each (ffmpeg) frame will have at least one (H.264) slice.
>> +    while(buf<buf_end){
>> +        buf= ff_find_start_code(buf, buf_end, &state);
>> +        init_get_bits(&h->s.gb, buf, 8*(buf_end - buf));
>> +        switch(state & 0x1F){
>> [...]
>> +        case NAL_IDR_SLICE:
>> +        case NAL_SLICE:
>> +            get_ue_golomb(&h->s.gb);
>> +            s->pict_type= golomb_to_pict_type[get_ue_golomb(&h->s.gb) % 5];
>>     
> IIRC that should be get_ue_golomb_31() also its missing some check against
> negative values
>   
I don't believe so. First, slice_type is defined in the H.264 standard, 
section 7.3.3 (Slice header syntax), as ue(v), so it is not limited to 
31. If I understand golomb.h correctly, this corresponds to 
get_ue_golomb(). Of course, since there are just 5 slice types, this 
value should be small, but a broken encoder could write something else 
there. Do we want to live with that (and save just a single if per 
picture)? Further, the value is unsigned, so it can't be negative, and 
the modulo 5 keeps it in the range 0..4, matching the array.
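
To illustrate the argument, here is a minimal standalone sketch of ue(v) 
decoding followed by the modulo-5 mapping. BitReader, read_ue(), 
SliceKind and slice_kind_from_header() are hypothetical names made up 
for this example; they are not the golomb.h or GetBitContext API, and 
the sketch ignores emulation prevention bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal bit reader (MSB first); not FFmpeg's GetBitContext. */
typedef struct {
    const uint8_t *data;
    size_t size_bits;
    size_t pos;                 /* current position in bits */
} BitReader;

/* Returns the next bit, or -1 if the buffer is exhausted. */
static int read_bit(BitReader *br)
{
    if (br->pos >= br->size_bits)
        return -1;
    int bit = (br->data[br->pos >> 3] >> (7 - (br->pos & 7))) & 1;
    br->pos++;
    return bit;
}

/* Decode one ue(v) value: count n leading zero bits, skip the 1 bit,
 * read n suffix bits; the value is 2^n - 1 + suffix.
 * Returns -1 on truncated input or a prefix longer than 31 zeros. */
static int64_t read_ue(BitReader *br)
{
    int zeros = 0, bit;
    while ((bit = read_bit(br)) == 0) {
        if (++zeros > 31)
            return -1;
    }
    if (bit < 0)
        return -1;
    uint64_t suffix = 0;
    for (int i = 0; i < zeros; i++) {
        bit = read_bit(br);
        if (bit < 0)
            return -1;
        suffix = (suffix << 1) | (uint64_t)bit;
    }
    return (int64_t)((((uint64_t)1 << zeros) - 1) + suffix);
}

/* Order follows slice_type % 5 in the H.264 slice header semantics. */
typedef enum { SLICE_P, SLICE_B, SLICE_I, SLICE_SP, SLICE_SI } SliceKind;

/* Reads first_mb_in_slice and slice_type from the start of a slice header.
 * Returns a SliceKind (0..4), or -1 if the data is truncated. An unsigned
 * value taken modulo 5 always lands in 0..4, even if a broken encoder
 * writes an out-of-spec slice_type. */
static int slice_kind_from_header(BitReader *br)
{
    if (read_ue(br) < 0)        /* first_mb_in_slice, value not needed */
        return -1;
    int64_t slice_type = read_ue(br);
    if (slice_type < 0)
        return -1;
    return (int)(slice_type % 5);
}
```

For example, the single byte 0x88 holds first_mb_in_slice = 0 (bit 
string "1") followed by slice_type = 7 (bit string "0001000"), which 
maps to SLICE_I.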

I have now dug a little deeper into the H.264 standard to find out how 
to correctly identify key frames for seeking.

As far as I can see, just taking the type of one slice is not quite 
correct. Although the AVCHD/H.264 files I have at hand all use one slice 
per field, H.264 seems to allow several slices per picture. And even if 
a slice is an I-slice, that doesn't mean the picture is a key frame in 
the ffmpeg sense.

In H.264, the only "sure" key frames (in the sense of AVPacket::flags) 
are IDR pictures. However, in the sample files I have, there are 
actually no IDR pictures (save the very first one)... So even misusing 
pict_type by filling it with FF_I_TYPE only for IDR pictures would not 
help to compute key frames correctly.

I believe the only way is to use the recovery point SEI message, as 
documented in the H.264 spec, section D.2.7, and count down 
recovery_frame_cnt frames before issuing a "key frame", so that seeking 
works properly (well, it's a prerequisite, not a solution). But again, 
how do we communicate this to avformat so that it can set the key frame 
flag correctly? Currently, compute_pkt_fields() does this:

    /* update flags */
    if(is_intra_only(st->codec))
        pkt->flags |= PKT_FLAG_KEY;
    else if (pc) {
        pkt->flags = 0;
        /* keyframe computation */
        if (pc->pict_type == FF_I_TYPE)
            pkt->flags |= PKT_FLAG_KEY;
    }

which is IMHO broken. Of course, we could communicate with it by setting 
pict_type to FF_I_TYPE for key frames only (IDR frames and frames after 
a recovery point), to FF_P_TYPE for other frames containing I- and 
P-slices, and to FF_B_TYPE for B-frames. But I don't like that much. Any 
idea how to do it correctly without having to touch other codecs?
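
The countdown I have in mind could look roughly like the sketch below. 
It is only one possible reading of the idea, under simplifying 
assumptions: it counts whole frames in decoding order, ignores frame_num 
wrapping and field pairs, and KeyframeTracker with its functions is a 
hypothetical helper, not existing parser code.

```c
#include <stdbool.h>

/* Hypothetical per-stream state for the key frame decision. */
typedef struct {
    int frames_until_recovered; /* -1: no recovery point pending */
} KeyframeTracker;

static void keyframe_tracker_init(KeyframeTracker *t)
{
    t->frames_until_recovered = -1;
}

/* Call once per frame, in decoding order.
 * is_idr:             the frame is an IDR picture.
 * recovery_frame_cnt: value from a recovery point SEI attached to this
 *                     frame, or -1 if the frame carries no such SEI.
 * Returns true if the frame would be flagged as a key frame. */
static bool keyframe_tracker_push(KeyframeTracker *t, bool is_idr,
                                  int recovery_frame_cnt)
{
    if (is_idr) {
        /* IDR pictures are always safe; drop any pending countdown. */
        t->frames_until_recovered = -1;
        return true;
    }
    if (recovery_frame_cnt >= 0)
        t->frames_until_recovered = recovery_frame_cnt;
    if (t->frames_until_recovered < 0)
        return false;           /* no recovery point seen yet */
    if (t->frames_until_recovered == 0) {
        t->frames_until_recovered = -1;
        return true;            /* countdown finished on this frame */
    }
    t->frames_until_recovered--;
    return false;
}
```

With recovery_frame_cnt = 2, the frame carrying the SEI and the next one 
are not flagged, and the second frame after the SEI is. How the result 
would then reach compute_pkt_fields() is exactly the open question 
above.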

Further, I found out that av_read_frame() sometimes returns only one 
field of a frame (especially in interlaced mpegts AVCHD files), which 
confuses the rest of ffmpeg. I suppose this is a bug in h264_parse(), 
which returns a single field instead of the whole frame (when a frame is 
coded as two fields in two pictures), but I haven't yet found a way to 
address the problem. Any idea?
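
For discussion, the pairing test I would expect the parser to need is 
sketched below. It is a simplification of the complementary field pair 
rules in the spec (it ignores memory_management_control_operation 5, for 
instance), and FieldInfo and completes_frame() are hypothetical names, 
not something h264_parse() has today.

```c
#include <stdbool.h>

/* Hypothetical summary of the slice header fields relevant for pairing. */
typedef struct {
    bool field_pic_flag;     /* picture is a single field, not a frame */
    bool bottom_field_flag;  /* field parity; only valid for fields */
    bool is_idr;             /* NAL unit type is an IDR slice */
    int  frame_num;
} FieldInfo;

/* Returns true if 'second', which directly follows 'first' in decoding
 * order, looks like the other field of the same frame: both are fields,
 * they share frame_num, have opposite parity, and the second one is not
 * an IDR picture. */
static bool completes_frame(const FieldInfo *first, const FieldInfo *second)
{
    return first->field_pic_flag && second->field_pic_flag &&
           first->frame_num == second->frame_num &&
           first->bottom_field_flag != second->bottom_field_flag &&
           !second->is_idr;
}
```

A parser using such a test would hold back the first field and only 
return a packet once the matching second field (or an unrelated picture) 
has arrived.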

>> Index: libavformat/mpegts.c
>>     
> doesnt belong in this patch
>   
OK, I separated it into another patch (the e-mail has already been sent; 
if it doesn't get applied now, I'll probably start a new thread for it).

Regards,

Ivan