[FFmpeg-user] PTS resolution[s]
Jim DeLaHunt
list+ffmpeg-user at jdlh.com
Tue Feb 23 10:58:16 EET 2021
On 2021-02-22 21:35, Mark Filipak (ffmpeg) wrote:
> On 2021-02-23 00:01, Jim DeLaHunt wrote:
>> The Presentation Time Stamp (PTS) value which FFmpeg associates with
>> video frames and audio data is a 64-bit integer. There is an
>> associated time base attribute for each video or audio stream, which
>> gives the number of seconds between successive values of PTS. This
>> time base might be thought of as the resolution of PTS. Thus if you
>> have two PTS values pts1 and pts2, then the difference in seconds
>> between them is (pts2-pts1)*time_base.
>
>
> MPEG PES (Presentation Elemental Stream) uses a 27MHz (exact) clock
> divided by 300 (exact), so that timebase is 1/(90000Hz)…
I've read something similar. My understanding is that MPEG PES encodes
Presentation Time Stamp values as integer tick counts in the data
stream. Is the timebase of 1/(90,000Hz) encoded in the data stream, or
is it only defined in the spec?
> …(which is 0.01[1..]ms between ticks, exactly).
Actually, for this discussion I think it's fair to say that 0.01[1..]ms
is not exactly 1/90 ms; it is just an approximation. Finite decimal
numbers will never get you the exact value. The rational number is
exact. For this discussion, it will be clearer to use exact rational
numbers.
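To illustrate the point about exactness, here is a minimal sketch using
Python's standard fractions module. The variable names and the
particular truncated decimal 0.0111 are my own choices for
illustration, not anything taken from FFmpeg or the MPEG spec.

```python
from fractions import Fraction

# Tick period of a 90 kHz clock in milliseconds, as an exact rational.
tick_ms = Fraction(1000, 90000)      # reduces to 1/90

# A finite decimal approximation of the same period (illustrative choice).
approx_ms = Fraction("0.0111")

ticks = 90000                        # one second's worth of ticks
exact_total = ticks * tick_ms        # exactly 1000 ms
approx_total = ticks * approx_ms     # 999 ms: the truncation error adds up

print(tick_ms, exact_total, approx_total)
```

With the rational, one second of ticks sums to exactly 1000 ms. With
the truncated decimal, the same sum is off by a full millisecond.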
> …my best information so far is that, at least out of the encoder,
> ffmpeg encodes frames with PTS resolution = 1ms.
My impression from reading the FPS filter source code is that it is
incomplete to talk about ffmpeg PTS values without also giving the
corresponding timebase value. It looks to me like the FPS filter does
not attempt to preserve the incoming PTS values or timebase. It sets a
new time base of 1/frame_rate, and generates successive integer values
for PTS. However, and this is crucial, it does seem to take care to keep
the value of PTS*time_base exact.
So, that seems to say that your statement "at least out of the encoder,
ffmpeg encodes frames with PTS resolution = 1ms" is not complete without
stating the time base value ffmpeg sets out of the encoder.
> To put this into perspective, a 24fps video has delta-PTS = 41.[6..]ms
> whereas a 24/1.001fps video has delta-PTS = 41.708[3..]milliseconds.
> That means that the difference between the two is less than the
> resolution of the ffmpeg timebase (at least, for the encoder -- I
> don't know about the decoder and the pipeline). That essentially means
> that ffmpeg can't differentiate between them based on the working PTSs
> that it keeps.
But what are the time base values which ffmpeg uses for these two
cases? If the time base is 1/24 in the first case, and 1,001/24,000 in
the second case, then the same integer PTS values result in
PTS*time_base products being exactly the correct time offsets from the
first frame of the video in each of the two cases.
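Here is a minimal sketch of that arithmetic, again with Python's
fractions module. The two time bases are the hypothetical ones named
above; I am not claiming FFmpeg actually picks them. The 1/1000 time
base in the second half is only your stated assumption about the
encoder, used here to show what rounding to 1 ms would do.

```python
from fractions import Fraction

def frame_time(pts, time_base):
    """Exact time offset in seconds for an integer PTS and a rational time base."""
    return pts * time_base

tb_24 = Fraction(1, 24)            # hypothetical time base for 24 fps
tb_ntsc = Fraction(1001, 24000)    # hypothetical time base for 24/1.001 fps

# The same integer PTS gives a different, exact time in each stream.
t_24 = frame_time(1, tb_24)        # 1/24 s
t_ntsc = frame_time(1, tb_ntsc)    # 1001/24000 s
gap = t_ntsc - t_24                # 1/24000 s per frame

# If instead both streams were forced onto a 1/1000 time base (assumption),
# the first frame interval would round to the same integer in both cases.
ms_24 = round(t_24 * 1000)         # 42
ms_ntsc = round(t_ntsc * 1000)     # 42

print(t_24, t_ntsc, gap, ms_24, ms_ntsc)
```

So the two frame rates are indistinguishable only if one insists on a
1 ms time base. With a per-stream rational time base, plain integer PTS
values keep them apart exactly.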
> I seek someone who can either, 1, confirm what I think, or 2, tell me
> what the resolution of the decoder and pipeline actually is.
Implicit in your use of the definite article "the" is an apparent
assumption that FFmpeg has only one resolution for the decoder and the
pipeline. It looks to me like FFmpeg could well take the liberty of
changing resolution at each stage of decoder and pipeline, as long as it
preserves the values for PTS*time_base at each frame (or modifies them
intentionally, as the FPS filter does).
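As a sketch of what "changing resolution while preserving
PTS*time_base" could mean, here is a small Python example. The function
name rescale is hypothetical and this is not FFmpeg's own code; it only
shows the arithmetic, and it reports whether the conversion was exact
or needed rounding.

```python
from fractions import Fraction

def rescale(pts, src_tb, dst_tb):
    """Convert an integer PTS from src_tb to dst_tb (hypothetical helper).

    Returns (new_pts, exact), where exact is True when
    new_pts * dst_tb equals pts * src_tb with no rounding.
    """
    ideal = pts * src_tb / dst_tb      # exact rational tick count
    new_pts = round(ideal)             # nearest integer tick
    return new_pts, ideal == new_pts

tb_24 = Fraction(1, 24)
tb_ntsc = Fraction(1001, 24000)
tb_90k = Fraction(1, 90000)

a = rescale(1, tb_24, tb_90k)      # (3750, True)
b = rescale(4, tb_ntsc, tb_90k)    # (15015, True)
c = rescale(1, tb_ntsc, tb_90k)    # (3754, False): ideal value is 3753.75

print(a, b, c)
```

The last case shows that a stage can only preserve PTS*time_base
exactly when the new time base divides the old time offsets evenly;
otherwise some rounding is unavoidable at that stage.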
Best regards,
—Jim DeLaHunt