[FFmpeg-devel] [PATCH] pthread_frame: attempt to get frame to reduce latency

Martin Storsjö martin at martin.st
Wed Mar 11 22:42:55 EET 2020


On Wed, 11 Mar 2020, Derek Buitenhuis wrote:

> On 11/03/2020 14:53, Devin Heitmueller wrote:
>> Regardless of the actual proposed patch, I think the author's use of
>> wallclock time to describe the problem is very reasonable.  I do a
>> large amount of work where I'm measuring "glass-to-glass" latency,
>> where I am interested in the total pipeline (encode/network/decode),
>> and I definitely went through the experience of trying to figure out
>> why ffmpeg was introducing nearly 500ms of extra latency during
>> decoding.  It turned out that it was because my particular platform
had 8 cores and thus 16 decoding threads and thus 15x33ms of delay,
>> regardless of the stream complexity.
>
> Heavy disagree. The measurement is *specifically* referring to an API call
> here, and it *specifically* affects the delay, in frames. The email in question
> is conflating timestamps (33ms) per frame with wallclock time later on. It is
> not a meaningful comparison to make. Only pain lies down the path of mixing
> timestamps and wallclock like that.
>
> Glass-to-glass measurement is important too, but don't conflate the two.
>
> For what it's worth, I pick deterministic delay (in frames! packets-in-to-frames-out)
> over the possibility of less delay but non-deterministic every day of the week.
> For my own sanity. *Certainly* not as the default and only mode of operation.

FWIW, while I agree it shouldn't be the default, I have occasionally 
considered the need for this particular feature.

Consider a live stream with a very variable framerate, essentially varying 
in the full range between 0 and 60 fps. To cope with decoding the high end 
of the framerate range, one needs to have frame threading enabled - maybe 
not with something crazy like 16 threads, but say at least 5 or so.

Then you need to feed 5 packets into the decoder before you get the first 
frame output (for a stream without any reordering).

Now if packets are received at 60 fps, you get one new packet to feed the 
decoder per 16 ms, and you get the first frame to output 83 ms later, 
assuming that the decoding of that individual frame on that thread took 
less than 83 ms.

However, if the rate of input packets drops to e.g. 1 packet per second, 
it will take 5 seconds to accumulate the 5 packets that must be fed to 
the decoder before the first frame is output, even though that frame may 
actually have finished decoding less than 100 ms after the first input 
packet was given to the decoder.

So in such a setup, being able to fetch output frames from the decoder 
sooner would be very useful - giving a decoding latency, in wallclock 
time, that stays closer to fixed regardless of the packet input rate.

// Martin
