[FFmpeg-user] Live lipsync from different inputs and RTCSTART in (a)setpts

Vilnis Bicevskis vilnis.bicevskis1 at gmail.com
Mon Jan 13 17:09:22 EET 2025


Hello!

I am trying to sync audio and video from different sources for live streaming.
I should mention that the video source PTS cannot be fully trusted here: some frames have to be duplicated or dropped to keep the PTS in sync with the wall clock.

One way to synchronise the sources is to make sure the input PTS are aligned for audio and video.
Input PTS can be rewritten with the setpts/asetpts filters, and the wallclock could serve as the common clock source, since the inputs cannot reference one another.
But there is a problem: the RTCSTART variable (documented as "The wallclock (RTC) time at the start of the movie in microseconds.") doesn't contain the exact wallclock time at which the input's PTS started. It contains the time at which the setpts/asetpts filter was initialised.
That differs from the timestamp the PTS is offset from, so it cannot be used as a trusted timing reference, at least not with the precision required for lip-sync.
And I didn't find any other variable that gives the system time from which a component's PTS is offset.
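
To illustrate the problem with numbers, here is a small Python sketch. It is not FFmpeg code, only a model of the arithmetic: all names and timing values are invented for illustration, and it assumes each input's PTS counts from zero at the moment the input starts. It compares mapping PTS onto the wallclock with the filter-init time (what RTCSTART holds today) against using the real input start time.

```python
# Toy model of the timing problem; all times are in microseconds and invented.

def to_wallclock(pts_us, origin_us):
    """Map a stream-relative PTS onto the wallclock using the given origin."""
    return origin_us + pts_us

# Wallclock at which each input's PTS actually started counting from zero.
video_input_start = 1_000_000
audio_input_start = 1_250_000

# Wallclock at which each setpts/asetpts filter was initialised
# (the value RTCSTART currently holds, per the description above).
video_filter_init = 1_400_000
audio_filter_init = 1_430_000

# One event captured by both inputs at the same wallclock moment.
event_wallclock = 5_000_000
video_pts = event_wallclock - video_input_start
audio_pts = event_wallclock - audio_input_start

# Origin = filter init time: each stream gets a different error,
# so audio and video end up skewed against each other.
skew_with_rtcstart = (to_wallclock(video_pts, video_filter_init)
                      - to_wallclock(audio_pts, audio_filter_init))

# Origin = real input start time: both streams land on the same wallclock.
skew_with_input_start = (to_wallclock(video_pts, video_input_start)
                         - to_wallclock(audio_pts, audio_input_start))

print("A/V skew using filter init time:", skew_with_rtcstart, "us")
print("A/V skew using input start time:", skew_with_input_start, "us")
```

In this model the skew is the difference between the two (filter init minus input start) delays, so it only disappears if both inputs happen to have exactly the same delay.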

One way to pass the desired wallclock value to RTCSTART is a patch that:
1) adds an extra field holding the wallclock time of the input start to InputFile, AVFrame and AVPacket,
2) sets RTCSTART from the AVFrame when the first frame passes through the setpts/asetpts filter, as is already done for the STARTPTS and STARTT variables.
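
The two steps above can be modelled roughly like this in Python. This is only a sketch of the idea, not the actual patch and not FFmpeg API: the `Frame` and `SetptsModel` classes and the `input_start_wallclock_us` field are hypothetical names, and the returned value is just one possible way of using RTCSTART.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    pts_us: int                    # PTS relative to the input's start
    input_start_wallclock_us: int  # hypothetical extra field carried from the input


class SetptsModel:
    """Toy stand-in for one setpts/asetpts filter instance."""

    def __init__(self, filter_init_wallclock_us: int):
        # Behaviour described above: RTCSTART starts out as the filter init time.
        self.rtcstart_us = filter_init_wallclock_us
        self._latched = False

    def filter_frame(self, frame: Frame) -> int:
        # On the first frame only, overwrite RTCSTART with the input's start
        # wallclock, similar to how a start value is captured once and reused.
        if not self._latched:
            self.rtcstart_us = frame.input_start_wallclock_us
            self._latched = True
        # Assumed usage: express the frame's time on the wallclock.
        return self.rtcstart_us + frame.pts_us


# Two independent inputs that saw the same event at wallclock 5_000_000 us.
video = SetptsModel(filter_init_wallclock_us=1_400_000)
audio = SetptsModel(filter_init_wallclock_us=1_430_000)

video_time = video.filter_frame(Frame(pts_us=4_000_000, input_start_wallclock_us=1_000_000))
audio_time = audio.filter_frame(Frame(pts_us=3_750_000, input_start_wallclock_us=1_250_000))

print(video_time, audio_time)
```

Because the value is latched once, later frames do not need to carry a meaningful field, which is why carrying it on every AVFrame and AVPacket feels like overhead.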

I tested this and it works fine, at least for my use case.
I don't really like "dragging" an extra field through AVFrame and AVPacket when it is needed only for initialisation.

Any thoughts on other (maybe better) ways to achieve lip-sync, or to pass a wallclock time value from the input to the setpts filter?

Best regards,
Vilnis

