[FFmpeg-user] IP camera recording via RTSP: audio/video desync (dropped frames?)
me at player701.ru
Thu Jul 14 09:00:14 EEST 2022
Here's some additional information that may or may not be useful in
1. I removed the "+genpts+igndts+ignidx" from "fflags". From what I
could understand, it wasn't necessary to use them, and removing them
didn't make anything better or worse.
2. I tried to find out why the video lag gets introduced only when the
audio stream is present, so I ran two live recording sessions in
parallel: one with audio, and another one without - and found something
very interesting. Remember I said that the lag happens when the video
does not have enough frames? Well, that's not quite true, it seems -
frames do get dropped sometimes, but it's not the actual cause of the
problem. I've been analyzing the recorded files, and it looks like the
actual issue is with incorrect presentation timestamps (PTS) being
transmitted or calculated. The frames themselves come on time, but when
FFmpeg records the video along with the audio, it erroneously puts some
frames in the next segment, probably in an attempt to synchronize the
two streams, which actually makes it worse!
Here's the comparison of two segments recorded during the experiment,
from the time when the lag began:
The segment recorded WITHOUT audio has 15000 frames and PTS ranging from
0 to 612. This is obviously incorrect, because the actual clock time, as
seen in the video itself, only counts 600 seconds (from 00:10:00 to
00:20:00). Due to this discrepancy, the video length is also wrong (10
minutes and 12 seconds). However, the next segment does not exhibit any
lag (the clock starts at 00:20:00 as expected) and does not seem to have
any PTS issues whatsoever.
Now, the segment recorded WITH audio is another story entirely. It has
only 14750 frames, and the PTS values range from 0 to about 602. The
latter corresponds to its reported length of 10 minutes and 2 seconds,
but the clock stops at 00:19:50 - an entire 10 seconds are missing!
But they aren't gone, they are actually included in the next segment,
which reports 15000 frames just like the silent one, but these frames
are not the same - the clock runs from 00:19:50 to 00:29:50 (should be
00:20:00 to 00:30:00), and the PTS also starts from 2 instead of from 0
- but the audio begins playing right from the start. As a result, a lag
of 12 seconds between the audio and the video has been introduced.
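To double-check the arithmetic above, here is a quick sketch in Python.
It assumes a constant frame rate of 25 fps, which is only inferred from
15000 frames covering 600 seconds of on-screen clock; the function names
are made up for illustration and are not part of FFmpeg.

```python
# Assumption: constant 25 fps, inferred from 15000 frames spanning
# 600 s of on-screen clock time (00:10:00 to 00:20:00).
FPS = 25


def clock_seconds(frame_count, fps=FPS):
    """Real time covered by the frames at a constant frame rate."""
    return frame_count / fps


def pts_excess(frame_count, pts_span, fps=FPS):
    """Seconds by which the PTS range exceeds the real time covered."""
    return pts_span - clock_seconds(frame_count, fps)


# Segment recorded WITHOUT audio: 15000 frames, PTS from 0 to 612.
silent_excess = pts_excess(15000, 612)

# Segment recorded WITH audio: 14750 frames, PTS from 0 to about 602.
audio_excess = pts_excess(14750, 602)

# Frames pushed into the next segment, expressed in seconds of video.
deferred_seconds = clock_seconds(15000 - 14750)

print(silent_excess, audio_excess, deferred_seconds)
```

Under that assumption, both segments carry PTS that run 12 seconds ahead
of the real time their frames cover, and the 250 frames missing from the
segment with audio account for the 10 seconds that show up at the start
of the next one.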
Now, the question is how to fix this. As I already said before,
"-use_wallclock_as_timestamps 1" works, but causes constant stuttering
in the video. But it's not actually necessary to aggressively enforce
new PTS at all times. A 10-minute segment that reports a length of 10
minutes and 12 seconds implies that some of the reported (or
calculated) timestamps are ahead of real time - which is obviously wrong
for a live stream. Perhaps a less aggressive correction of these
discrepancies can be implemented with some option or bitstream filter
(like "setts") that can detect this scenario of PTS jumping ahead of
Thank you very much.