[FFmpeg-devel] [PATCH] avformat/dv: fix timestamps of audio packets in case of dropped corrupt audio frames

Marton Balint cus at passwd.hu
Tue Feb 23 21:42:01 EET 2021



On Sat, 20 Feb 2021, Dave Rice wrote:

> Hi,
>
>> On Oct 31, 2020, at 5:15 PM, Marton Balint <cus at passwd.hu> wrote:
>> On Sat, 31 Oct 2020, Dave Rice wrote:
>>>> On Oct 31, 2020, at 3:47 PM, Marton Balint <cus at passwd.hu> wrote:
>>>> On Sat, 31 Oct 2020, Dave Rice wrote:
>>>>> Hi Marton,
>>>>>> On Oct 31, 2020, at 12:56 PM, Marton Balint <cus at passwd.hu> wrote:
>>>>>> Fixes out of sync timestamps in ticket #8762.
>>>>> Although Michael’s recent patch does address the issue documented in 8762, I haven’t found this patch to fix the issue. I tried with -c:a copy and with -c:a pcm_s16le on some sample files that exhibit this issue, but each output was out of sync. I put an output at <https://gist.github.com/dericed/659bd843bd38b6f24a60198b5e345795>. That output notes that 3597 packets of video are read and 3586 packets of audio. In the resulting file, at the end of the timeline the audio is 9 frames out of sync: my output video stream is 00:02:00.020 and my output audio stream is 00:01:59.653.
>>>>> Beyond copying or encoding the audio, are there other options I should use to test this?
>>>> Well, it depends on what you want. After this patch you should get a file which has audio packets synced to video, but the audio stream is sparse: not every video packet has a corresponding audio packet. (It looks like our MOV muxer does not support muxing of sparse audio and therefore does not produce proper timestamps. But MKV does, please try that.)
>>>> You can also make ffmpeg generate the missing audio based on packet timestamps. Swresample has an async=1 option, so something like this should get you synced audio with continuous audio packets:
>>>> ffmpeg -y -i 1670520000_12.dv -c:v copy \
>>>> -af aresample=async=1:min_hard_comp=0.01 -c:a pcm_s16le 1670520000_12.mov
>>> 
>>> Thank you for this. With the patch and async, the result is synced and the resulting audio was the same as Michael’s patch.
>>> 
>>> Could you explain why you used min_hard_comp here? IIUC min_hard_comp sets a threshold between the strategies of trim/fill and stretch/squeeze to align the audio to time; however, the async documentation says "Setting this to 1 will enable filling and trimming, larger values represent the maximum amount in samples that the data may be stretched or squeezed", so I thought that async=1 would not permit stretch/squeeze anyway.
>> 
>> It is documented poorly, but if you check the source code you will see that async=1 implicitly sets min_comp to 0.001 enabling trimming/dropping. min_hard_comp decides the threshold when silence injection actually happens, and the default for that is 0.1, which is more than a frame, therefore not acceptable if we want to maintain <1 frame accuracy. Or at least that is how I think it should work.
>

> I’ve found that aresample=async=1:min_hard_comp=0.01, as discussed here, 
> works well to add audio samples to maintain timestamp accuracy when 
> muxing into a format like mov. However, this approach doesn’t work if 
> the sparseness of the audio stream is at the end of the stream. Is there 
> a way to use min_hard_comp to consider differences between a timestamp 
> and audio data when one of the ends of that range is the end of the 
> file?

I am not aware of a smart method to generate the missing audio at the
end of the stream, up to the end of the video.

As a possible workaround, you may query the video length using
ffprobe or mediainfo, and then use a second filter, apad, to pad the audio:

-af aresample=async=1:min_hard_comp=0.01,apad=whole_dur=<video_length>

This might do what you want, but it requires an additional step to query
the video length...
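
A minimal sketch of that two-step workaround as a POSIX shell script. It
assumes that the container duration reported by ffprobe (format=duration)
matches the video length, which may not hold for every file. The function
names and the file names in the usage note are placeholders, not anything
from the patch.

```shell
#!/bin/sh
# Sketch only: pad the audio up to the duration reported by ffprobe.

# Print the container duration in seconds.
# Assumption: this equals the video length for the input file.
probe_duration() {
    ffprobe -v error -show_entries format=duration \
        -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Build the filter chain discussed above, with apad appended.
build_af() {
    printf 'aresample=async=1:min_hard_comp=0.01,apad=whole_dur=%s' "$1"
}

# pad_to_video INPUT OUTPUT
# Copies the video and re-encodes the audio with the padded filter chain.
pad_to_video() {
    dur=$(probe_duration "$1") || return 1
    [ -n "$dur" ] || return 1
    ffmpeg -y -i "$1" -c:v copy \
        -af "$(build_af "$dur")" -c:a pcm_s16le "$2"
}
```

Usage would be something like "pad_to_video input.dv output.mov". Whether
the padded length ends up exactly on the last video frame depends on how
the demuxer reports the duration, so the result should be checked.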

Regards,
Marton

