[MPlayer-users] Bug: av sync, bframes, dwStart issues
John Stebbins
stebbins at jetheaddev.com
Thu Jun 28 17:48:16 CEST 2007
John Stebbins wrote:
> Thanks for the feedback and links to additional background. I see your
> rationale and pretty much agree with you (grudgingly).
>
> Corey Hickey wrote:
>
>> John Stebbins wrote:
>>
>>
>>> I've scanned over the mail list archives and found activity in this area
>>> back in March 06 by John Koleszar and Corey Hickey. Some fixes were
>>> applied, but I believe the real problem is being masked, or worked
>>> around rather than truly fixed.
>>>
>>> The symptom is audio leading video by 1 or 2 video frame times
>>> (depending on bframe settings) at start of playback. Currently,
>>> mencoder masks this by applying an audio delay through the use of
>>> dwStart in the audio stream header of the avi file. I believe this to be
>>> an incorrect fix because bframes are not guaranteed to be generated even
>>> though you have enabled them in your encoding. Some video sequences may
>>> be generated that contain no bframes, so your av sync will jitter by as
>>> much as 2 video frame periods between sequences that contain bframes and
>>> sequences that do not.
>>>
>>>
>> Are you sure about this? I can't test right now, so that's not a
>> rhetorical question; I didn't think that was the case, though.
>>
>>
> I don't know specifically what mencoder will generate. But I do know
> that commercial encoders
> are highly configurable and can generate such sequences. This jitter
> problem forces the av sync
> threshold to be fairly sloppy. I'm guessing that AV sync can't be kept
> any tighter than about 45ms
> without creating oscillations. Typical commercial decoders maintain AV
> sync within about 25ms.
>
>>
>>
>>> This masking becomes more problematic when using encoder front-ends that
>>> encode the audio and video separately and then mux them in a final step
>>> using tools such as mkvmerge or ogmmerge. OgmRip is a good example of
>>> such a front-end. Because the streams are encoded separately and the
>>> front-end knows little (nothing) about the delays that could be caused
>>> by the various decoding processes, the front-end does not intelligently
>>> insert compensating stream delays. And it shouldn't need to.
>>>
>>> I believe a correct fix belongs in mplayer and would delay the
>>> presentation of initial audio and video until the decoder has queued
>>> enough decoded video frames to handle its worst case video delay due to
>>> bframe processing. This would be done entirely by the player with no
>>> hints from the encoder (e.g. dwStart fudge factor). The player knows if
>>> it supports bframe processing and should be able to delay initial
>>> presentation until sufficient buffering of decoded frames has occurred
>>> to handle its worst case delay that can be caused by bframe processing.
>>>
>>>
>> As for the rest of your argument, I won't disagree with you. There was
>> no way to do it that was clearly right; while making mplayer compensate
>> for decoding delay seems correct, ideally, it ended up being more
>> practical to use dwStart because other players (at least the ones
>> tested) did not handle the delay either. If you want to implement a
>> different approach, feel free to do so, but be sure to read some of the
>> rationale behind the original decision.
>>
>> Here's where I confer with Michael about how to handle decoder delay. Be
>> sure to read the quoted material:
>> http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/2006-January/040088.html
>> http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/2006-January/040098.html
>>
>> Here's where Rich objects to dropping frames and Michael suggests
>> delaying other streams:
>> http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/2006-January/040115.html
>>
>> Here's where I propose using dwStart to delay other streams:
>> http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/2006-January/040126.html
>>
>>
>> There's other information in the rest of the thread. Beware that I
>> didn't initially know what I was doing; that was a very educational
>> process for me.
>>
>>
> I can understand how my proposal could be difficult to wedge into
> mplayer. I haven't tried to figure
> out where specifically the code would need to be modified, but if it was
> not initially designed to support
> this kind of decoder management, it could be quite difficult to add.
>
> I also see your point that most (if not all) of the commonly used players
> also have this problem. I verified totem-gstreamer and xine just this
> morning. Gstreamer is especially broken as it doesn't even pay attention
> to dwStart.
>
> The thing that really bugs me about this is that all encoders (including
> many encoder front ends) need to be
> aware of this ad-hoc requirement and "do the right thing". In the case
> of front-ends that encode audio and video
> completely separately and then mux them into alternative containers,
> they have to be aware of the decoding requirements
> of the chosen codec and options selected for the codec. It also appears
> that the ogg container format may not have the
> equivalent of dwStart. The method ogmmerge uses to delay audio is to
> duplicate audio packets at the beginning
> of the stream (ugh!).
>
> I agree that it is very unclear what the right thing to do is. In an
> ideal world, all players would be fixed to handle
> this sync issue transparently. But I'm not prepared to lead a crazy
> crusade to try to make that happen. So,
> we get the most bang for the buck by doing what has been done in
> mencoder. I'll just have to see what can be
> worked out with the authors of mkvmerge, ogmmerge, and ogmrip to make my
> personal favorite choice of
> tools work better.
>
> Has anyone pinged the transcode people on this or tested what transcode
> does? It's been a while since I tried avimerge, but it doesn't even have
> an option for setting a delay on a stream.
>
>
>
More feedback for anyone interested in this thread.

Something I stumbled across last night hadn't occurred to me before. If
you re-encode an avi file that already has its audio delayed, you
incrementally delay the audio even more. E.g. you have a high quality
archive in h.264 and you want to transcode it to a lower resolution for a
portable device. To do this correctly, the user needs to be aware that
the audio must be manually adjusted (e.g. -audio-delay -0.83) to get
proper sync. Note that the value of the manual adjustment depends on the
amount of delay that was inserted into the high quality archive copy and
has nothing to do with the new encoding format. So the user must also
remember the original settings used to encode the archive.
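
To make the accumulation concrete, here is a minimal sketch in Python. It
assumes each encode pass writes an audio delay equal to its number of delay
frames divided by the frame rate. The function names and the example numbers
(2 and 1 delay frames at 23.976 fps) are made up for illustration and are not
taken from mencoder.

```python
def bframe_delay_seconds(delay_frames, fps):
    """Audio delay (seconds) written to cover `delay_frames` of video delay."""
    return delay_frames / fps


def accumulated_delay(generations):
    """Total audio delay after a chain of encodes.

    `generations` is a list of (delay_frames, fps) pairs, one per encode
    pass. Each pass adds its own delay on top of what the source carries.
    """
    return sum(bframe_delay_seconds(frames, fps) for frames, fps in generations)


def manual_correction(source_delay_frames, source_fps):
    """Adjustment that cancels the delay already present in the source.

    It depends only on how the source was encoded, not on the new format.
    """
    return -bframe_delay_seconds(source_delay_frames, source_fps)


# Hypothetical settings, for illustration only.
archive = (2, 24000 / 1001)   # archive encode: 2 delay frames at 23.976 fps
portable = (1, 24000 / 1001)  # re-encode for a portable device: 1 delay frame

print(f"archive only:      {accumulated_delay([archive]) * 1000:.1f} ms")
print(f"after re-encode:   {accumulated_delay([archive, portable]) * 1000:.1f} ms")
print(f"correction needed: {manual_correction(*archive) * 1000:.1f} ms")
```

The correction is computed from the archive's settings alone, which is why
the user has to remember them.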

Corey asked if I was sure that there could be a variable number (or 0) of
bframes in video sequences. I did a little investigation and found that
both x264 and xvid can produce a variable number of bframes in a video
sequence. x264 will not insert a bframe if it does not improve bitrate,
and xvid will refrain from inserting bframes depending on some PSNR
thresholds that can be set.
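
The effect on sync can be sketched as follows. This is a simplified model of
the jitter argument made earlier in the thread, not a description of what any
particular decoder does: it assumes the file carries one fixed audio delay
sized for the worst case, while the video delay actually incurred varies with
the number of bframes in each sequence. The names and the 25 fps figure are
illustrative only.

```python
def sync_error_ms(header_delay_frames, actual_delay_frames, fps):
    """Residual A/V offset in ms for one sequence.

    Positive means the audio lags the video, because the fixed header
    delay is larger than the video delay this sequence really needed.
    """
    return (header_delay_frames - actual_delay_frames) * 1000.0 / fps


FPS = 25.0               # assumed frame rate
HEADER_DELAY_FRAMES = 2  # fixed audio delay sized for up to 2 bframes

# Sequences containing 0, 1, or 2 bframes under the assumed model.
errors = {n: sync_error_ms(HEADER_DELAY_FRAMES, n, FPS) for n in (0, 1, 2)}
for n, err in errors.items():
    print(f"{n} bframes: audio lags video by {err:.0f} ms")

# Spread between the best and worst case across sequences.
jitter = max(errors.values()) - min(errors.values())
print(f"jitter between sequences: {jitter:.0f} ms")
```

Under these assumptions the spread is two frame periods, which at 25 fps is
larger than the 45ms threshold guessed at earlier in the thread.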
John