[FFmpeg-devel] [PATCH]Fix for issue694. Dirac A/V sync loss

Sat Dec 20 04:28:23 CET 2008

On Dec 19, 2008, at 6:01 AM, Michael Niedermayer wrote:

> On Tue, Dec 02, 2008 at 01:39:03PM +1100, Anuradha Suraparaju wrote:
>> Hi,
>>
>> Sorry for the delayed response.
>
> no one beats my delayed responses, i still have mails from last year i
> should answer ...
>
>
>> My replies are inline.
> [...]
>>>> Index: libavcodec/libdiracdec.c
>>>> ===================================================================
>>>> --- libavcodec/libdiracdec.c	(revision 15870)
>>>> +++ libavcodec/libdiracdec.c	(working copy)
>>>> @@ -88,10 +88,12 @@
>>>>
>>>>     *data_size = 0;
>>>>
>>>> -    if (buf_size>0)
>>>> +    if (buf_size>0) {
>>>>         /* set data to decode into buffer */
>>>>         dirac_buffer (p_dirac_params->p_decoder, buf, buf + buf_size);
>>>> -
>>>> +        if ((buf[4] &0x08) == 0x08 && (buf[4] & 0x03))
>>>> +            avccontext->has_b_frames = 1;
>>>> +    }
>>>>     while (1) {
>>>>          /* parse data and process result */
>>>>         DecoderState state = dirac_parse (p_dirac_params->p_decoder);
>>>
>>> Just to make sure this code does what you intend it to do
>>> (has_b_frames is poorly named ...), and i don't know dirac well enough
>>> to understand what the checked bits represent exactly:
>>>
>>> has_b_frames == 1 means that a decoder would have a 1 frame reordering
>>> buffer (like mpeg1/2 with IPB frames where IP are delayed while B are not)
>>> has_b_frames=0 means that a decoder would not have any frame delay, that
>>> also implies that there is no frame reordering. (mpeg2 low delay is an
>>> example of this) in mpeg1/2 no reordering also implies no b frames
>>> has_b_frames=1 does not require that there are B frames
>>>
>>> also has_b_frames is mainly used for filling in missing timestamps
>>>
>>
>> Dirac, the specification, has a flexible GOP structure, so the frame
>> reordering can be anything. That said, the current implementations of
>> Dirac (dirac-research and Schroedinger) use a frame reordering of 1.
>> Intra and backward predicted frames can be delayed but bi-directional
>> frames are not, so the mpeg1/2 logic for has_b_frames holds. In Dirac
>> it is not possible to tell easily whether a frame is a backward predicted
>> frame (similar to a 'P' frame) or a bi-directional frame (similar to a 'B'
>> frame) without processing its reference frames. So I set has_b_frames to
>> 1 if I come across any predicted frame: (buf[4] & 0x08) == 0x08 means the
>> parse unit is a picture, and (buf[4] & 0x03) checks the number of
>> references. So an I-frame only sequence will set has_b_frames to 0, but
>> a sequence having only P or P&B frames will set it to 1, since
>> has_b_frames=1 does not require that there be any 'B' frames in the
>> sequence.
>
> what happens with a
> display order I B B I B B P ...
> coded order   I I B B P B B ...
> ?
> the way i understand your explanation is that the first 2 I frames would
> have has_b_frames=0 and then when the first B is encountered it is set to 1
> this doesn't seem correct. has_b_frames should be 1 from the beginning
> ideally
> In that respect i wonder if it wouldn't be more correct to just always set
> has_b_frames=1, but then i don't know dirac, it's really a question about
> how a 1-in 1-out decoder would work
>
> i mean if it returned I frames with a delay of 1 then has_b_frames=1 would
> be correct. If it didn't then i wonder what it would do with
> I,I,B
> immediately decoding & returning the I frames would cause a
> problem once it encounters the B frame.
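
For reference, the parse-code test from the quoted patch can be written out
as a small standalone function. This is only a sketch of the check as
Anuradha describes it; the function name and the length guard are my own
additions and are not in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of libdiracdec.c): returns 1 if the parse
 * unit starting at buf is a picture with at least one reference, i.e. the
 * condition under which the quoted patch sets has_b_frames = 1.
 * buf[4] is the parse code byte tested in the patch.
 */
static int parse_unit_is_predicted_picture(const uint8_t *buf, size_t buf_size)
{
    uint8_t parse_code;

    /* Assumption: reject buffers too short to contain buf[4]. */
    if (buf == NULL || buf_size < 5)
        return 0;

    parse_code = buf[4];

    /* Bit 0x08 set: the parse unit is a picture.
     * Low two bits: number of reference pictures (0 means intra). */
    return (parse_code & 0x08) == 0x08 && (parse_code & 0x03) != 0;
}
```

With this check an intra picture (no references) never triggers the flag,
which is exactly why the I,I,B case above only sets it at the first B.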

Because I've been wondering about the best way to handle this in the
soc decoder, a couple of observations/questions:

In Dirac, the only way we know a frame is delayed is that its frame
number is greater than one more than the frame number of the previous
frame in coded order. If has_b_frames were set on that condition
(instead of on frames with references), then in that sequence the first
I frame would have has_b_frames=0, and it would be set from the second
frame onwards.
Given this, should has_b_frames simply be set for all Dirac streams?
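
To make the frame-number condition concrete, here is a minimal sketch. The
function names are hypothetical (they are not libdirac or Schroedinger
API), and it assumes picture numbers start at 0 and do not wrap around.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: a picture is known to be delayed when its number is
 * greater than one more than the number of the previous picture in coded
 * order. Assumption: no 32-bit wraparound of picture numbers.
 */
static int picture_is_delayed(uint32_t prev_coded_num, uint32_t cur_num)
{
    return cur_num > prev_coded_num + 1;
}

/*
 * Returns the index (in coded order) of the first picture that reveals
 * reordering, or -1 if the stream never reorders.
 */
static int first_reorder_index(const uint32_t *coded_nums, size_t n)
{
    size_t i;

    assert(coded_nums != NULL || n == 0);

    for (i = 1; i < n; i++)
        if (picture_is_delayed(coded_nums[i - 1], coded_nums[i]))
            return (int)i;
    return -1;
}
```

For the coded order I I B B P B B, with display numbers 0 3 1 2 6 4 5, the
first index that reveals reordering is 1, i.e. the second I frame. The
first frame alone can never tell us, which is the argument for setting
has_b_frames unconditionally.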

I think what currently would happen with libdirac/schroedinger in that  
situation is that it would decode and return the first I frame, then  
the second I frame would return no picture (STATE_BUFFER/ 
SCHRO_DECODER_NEED_BITS), then starting with the b-frame would  
continue returning frames in coded order. The soc decoder currently  
does this as well; e.g. for that sequence:
in:  I I B B P B B
out: I   B B I B B P

Is this acceptable? IIRC the ffmpeg API expects the pts of the output
frame to equal the dts of the input frame, which wouldn't be true for
the first frame, but I'm not sure how to fix that...
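
The in/out table above can be reproduced with a toy model of a 1-in 1-out
decoder. This is a sketch of the behaviour as I described it, not code
from libdirac, Schroedinger or the soc decoder; all names are made up, and
pictures are represented only by their display numbers.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_HELD   16   /* assumption: small fixed-size holding buffer */
#define NO_PICTURE (-1) /* stands in for STATE_BUFFER / NEED_BITS */

typedef struct {
    int64_t held[MAX_HELD]; /* display numbers of decoded, unreturned pictures */
    size_t  nb_held;
    int64_t next_display;   /* display number that must be returned next */
} ReorderSim;

static void sim_init(ReorderSim *s)
{
    s->nb_held = 0;
    s->next_display = 0;
}

/*
 * Feed one picture (by display number) and return at most one picture.
 * A negative pic_num means "flush": nothing is fed, but a held picture
 * may still be returned.
 */
static int64_t sim_decode(ReorderSim *s, int64_t pic_num)
{
    size_t i;

    if (pic_num >= 0 && s->nb_held < MAX_HELD)
        s->held[s->nb_held++] = pic_num;

    for (i = 0; i < s->nb_held; i++) {
        if (s->held[i] == s->next_display) {
            /* Remove entry i by moving the last entry into its slot. */
            s->held[i] = s->held[--s->nb_held];
            return s->next_display++;
        }
    }
    return NO_PICTURE;
}
```

Feeding 0 3 1 2 6 4 5 and then one flush call yields 0, no picture, 1, 2,
3, 4, 5, 6. The first picture comes back with no delay and every later one
with a delay of one, which is where the pts/dts mismatch for the first
frame comes from.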

Second, the h264 decoder sets has_b_frames equal to the delay, rather
than just 0 or 1. Is this part of the API or just a convenience for
h264? Nothing except the current encoders prevents Dirac from having an
arbitrarily long delay.