[FFmpeg-devel] [PATCH] avformat/dv: fix timestamps of audio packets in case of dropped corrupt audio frames

Mon Nov 2 22:42:21 EET 2020

On Mon, 2 Nov 2020, Michael Niedermayer wrote:

>>> Please correct me if I am wrong, but
>>> in cases where no audio is missing or damaged, this would also ignore how much
>>> audio is in each packet. So you could have, let's say, a timestamp difference
>>> of exactly 1 second between 2 packets while there is actually not exactly
>>> 1 second worth of audio samples between them.
>>
>> This is true: by using the frame counter (and the video time base) for
>> audio, we inherently lose some audio packet timestamp precision. However, I
>> don't consider this a problem. Audio timestamps do not have to be sample
>> accurate, and for most formats they are not.
>
>
>> Also, it is not practical to keep
>> track of how many samples there are in the packets. For example, when you
>> seek, you obviously can't read all the audio data before the seek point
>> to get a precise, sample-accurate timestamp.
>
> It's true that with seeking there is not enough information for sample-precise
> timestamps. But from packet to packet, as long as no seek happened, there is.

And that timestamp can turn out to be wrong. If the audio clock is running
at a little more than 48 kHz, there will be A-V desync, because after some
time the audio and video timestamps for packets coming from the same DV
frame will diverge significantly.
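
To put a number on that drift, here is a rough sketch. The 48005 Hz clock
and the PAL frame rate are made-up illustration values, and drift_after()
is a hypothetical helper, not anything from dv.c:

```python
from fractions import Fraction

NOMINAL_RATE = 48000              # rate assumed when converting samples to time
FRAME_DURATION = Fraction(1, 25)  # assumption: PAL DV, 25 frames per second


def drift_after(frames, actual_rate):
    """Audio-minus-video timestamp gap in seconds after `frames` DV frames."""
    # Samples a recorder running at actual_rate stores in that many frames.
    samples = frames * FRAME_DURATION * actual_rate
    audio_ts = samples / NOMINAL_RATE   # timestamp from counting samples
    video_ts = frames * FRAME_DURATION  # timestamp from counting frames
    return audio_ts - video_ts


one_hour = 25 * 3600
print(float(drift_after(one_hour, 48005)))  # 0.375
```

So a clock that is off by only 5 Hz puts sample-counted audio timestamps
more than 9 frames ahead of the video after one hour.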

> My concern was more about something like significant frame to frame
> differences in audio sample numbers.
> Because if some hw or sw generates this we would produce packets of
> identical duration which differ substantially in number of samples and
> that would not be handled well in any scenario that accepted the timestamps
> and durations as exact.

In general, you can't assume that timestamps or packet durations are
exact. Consider a format which stores timestamps and durations in
milliseconds: rounding errors will occur. Also, for consumer equipment,
audio and video are rarely locked together, and audio sample rates are
rarely very precise.
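
A small sketch of the millisecond rounding, assuming an NTSC frame duration
of 1001/30000 s and a hypothetical container that stores rounded
millisecond timestamps (ms_timestamps() is an illustrative name):

```python
from fractions import Fraction

FRAME_MS = Fraction(1001 * 1000, 30000)  # 33.3666... ms per NTSC frame


def ms_timestamps(count):
    """Timestamps of the first `count` frames, rounded to whole milliseconds."""
    return [round(n * FRAME_MS) for n in range(count)]


ts = ms_timestamps(7)
durations = [b - a for a, b in zip(ts, ts[1:])]
print(ts)         # [0, 33, 67, 100, 133, 167, 200]
print(durations)  # [33, 34, 33, 33, 34, 33]
```

Every frame has the same real duration, yet the stored durations alternate
between 33 and 34 ms, so none of them is exact.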

> Maybe this never occurs and in that case your patch should be a good idea
> but if it does happen then some code would be needed to deal with that.
> It is detectable when sample counts do not match what is expected.

Yeah, and we have tools to fix that, like -af aresample=async=1.

> That said, I would consider a fix for #8762 to produce correct audio in
> all cases including wav/pcm/mov/... output and not just when the output
> can store "corrupted"/"sparse" audio.

I think ffmpeg.c should be smarter about it, and be aware of whether
unlocked or sparse audio (or audio not starting at the same time as video)
is supported by a given muxer. If it is not supported, then maybe
-af aresample=async=1 or similar should be used automagically.

> Also to me returning the data from the input file which would represent audio
> if it was not corrupt seems to be somehow the "correct" thing to do.
> Maybe this never contains any useful data, then it doesn't matter in
> reality but still it feels a bit odd to fix just the timestamps.

I am not strictly against applying your patch. I can accept that for
users it might be useful to get the data at the demuxer level and not play
with async=1, and yes, sparse audio requires extra care. I might even be OK
with changing the default to pass corrupt packets. But this does not
change the fact that the audio timestamps are currently wrong, because
they ignore that audio and video from the same DV frame are synced
together with at most 1/3 frame duration error.
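
To illustrate what goes wrong when corrupt audio frames are dropped, here
is a simplified model. It is not the dv.c code, audio_timestamps() is a
hypothetical helper, and it assumes PAL DV at 48 kHz with a fixed 1920
samples per frame:

```python
from fractions import Fraction

SAMPLE_RATE = 48000
FRAME_DURATION = Fraction(1, 25)  # assumption: PAL DV
SAMPLES_PER_FRAME = 1920          # 48000 / 25, assumed constant here


def audio_timestamps(corrupt_flags):
    """Return (by_sample_count, by_frame_counter) for emitted audio packets."""
    by_samples, by_frames = [], []
    samples_seen = 0
    for frame_no, corrupt in enumerate(corrupt_flags):
        if corrupt:
            continue  # audio of this DV frame is dropped, no packet emitted
        by_samples.append(Fraction(samples_seen, SAMPLE_RATE))
        by_frames.append(frame_no * FRAME_DURATION)
        samples_seen += SAMPLES_PER_FRAME
    return by_samples, by_frames


# Frames 1 and 2 carry corrupt audio; frames 0 and 3 are fine.
by_samples, by_frames = audio_timestamps([False, True, True, False])
print([float(t) for t in by_samples])  # [0.0, 0.04]
print([float(t) for t in by_frames])   # [0.0, 0.12]
```

With sample counting, the audio packet from frame 3 is stamped 0.04 s
although its video frame is at 0.12 s. With the frame counter, both get
the same timestamp.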

Regards,
Marton