[FFmpeg-devel] [PATCH 2/2] avformat/mpegtsenc: fix flushing of audio packets

Wed Aug 28 22:13:25 EEST 2019

On Wed, 28 Aug 2019, Andreas Håkon wrote:

> Hi Marton,
>
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Tuesday, 27 August 2019 23:33, Marton Balint <cus at passwd.hu> wrote:
>
>> > Please note that the main problem at this time with the mpegts muxer is that all PES packets are
>> > written sequentially. This causes a lot of problems when the video PES packets are large,
>> > or when the audio packets aren't flushed at regular intervals. If you prefer to improve the
>> > current sequential mode before you do anything with the interleaved mode, then I offer this
>> > suggestion: Use a PES SIZE INTERVAL for audio packets instead of calculating a TIME DELAY. With
>> > CBR audio streams, every audio PES packet has the same payload size.
>>
>> I am not sure what you mean when you say PES size interval, but if you are
>> referring to the size of the PES packet - that is exactly what we had in
>> the very beginning, and it was not sufficient, because for low bitrate
>> streams combining small audio packets into a PES packet took too
>> long, and in order to generate a proper TS we have to make sure that
>> we don't delay the audio packets too much, because if we do, then they will
>> arrive at the destination later than the PCR, which makes presentation
>> impossible.
>
> The problem is that you're thinking of using the same pes_size for all audio packets!
> For each audio stream you need to calculate the correct pes_size.

max_pes_size = max_audio_delay * audio_bitrate

It is the same thing for CBR: you calculate one from the other.
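
To make the relation above concrete, here is a minimal sketch of the conversion in both directions. The helper names and the units (microseconds, bits per second, bytes) are assumptions chosen for illustration; nothing here is taken from mpegtsenc.c.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only, not part of mpegtsenc.c.
 * delay_us is a buffering delay in microseconds,
 * bitrate_bps is a constant audio bitrate in bits per second. */

/* Bytes of CBR audio produced during delay_us. */
static int64_t pes_size_for_delay(int64_t delay_us, int64_t bitrate_bps)
{
    return delay_us * bitrate_bps / (8 * INT64_C(1000000));
}

/* Microseconds needed to accumulate pes_size bytes of CBR audio. */
static int64_t delay_for_pes_size(int64_t pes_size, int64_t bitrate_bps)
{
    return pes_size * 8 * INT64_C(1000000) / bitrate_bps;
}
```

For example, a 0.8 s limit at 192 kb/s corresponds to 19200 bytes, and at 32 kb/s to 3200 bytes, so for CBR either number fully determines the other.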

> And the value is based on the bitrate. So for CBR audio streams the value is
> fixed, and you only need to recalculate it for VBR audio streams.
>
> Please, try to add some "pes_top_size" member at stream level, and use it for
> audio streams. You can calculate the value when you know the bitrate, and
> after that a simple "if ts_st->payload >= ts_st->pes_top_size" will be sufficient
> to trigger the dispatch of the PES packet.

I don't see how calculating a max_pes_size is superior. It _only_ works for
CBR, plus you don't really know the audio bitrate, you'd have to guess it
from some frame sample count and audio packet size.

What is the disadvantage of always using a timestamp-based constraint
instead of a size-based one? A timestamp-based one works for both CBR and
VBR, and you don't have to recalculate anything based on some bitrate
guessing.
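
A rough sketch of what I mean by a timestamp-based constraint, written as a standalone function. The struct, the function name and the NO_TS constant are hypothetical stand-ins for illustration and are not the actual muxer code; timestamps are assumed to be in 90 kHz units.

```c
#include <stdint.h>

#define NO_TS INT64_MIN /* stand-in for AV_NOPTS_VALUE (assumption) */

/* Hypothetical per-stream buffer state, for illustration only. */
typedef struct AudioPesBuffer {
    int     payload_size; /* bytes buffered so far */
    int64_t payload_dts;  /* DTS of the first buffered packet, 90 kHz units */
} AudioPesBuffer;

/* Returns 1 if the buffered payload should be written out as a PES packet
 * before the incoming audio packet (dts, size) is appended. */
static int should_flush(const AudioPesBuffer *buf, int64_t dts, int size,
                        int max_payload, int64_t max_delay)
{
    if (buf->payload_size == 0)
        return 0; /* nothing buffered yet */
    if (buf->payload_size + size > max_payload)
        return 1; /* size limit still applies as an upper bound */
    /* Time limit: needs no bitrate, so it behaves the same for CBR and VBR. */
    return dts != NO_TS && buf->payload_dts != NO_TS &&
           dts - buf->payload_dts >= max_delay;
}
```

With max_delay set to 72000 (0.8 s at 90 kHz), a packet arriving 45000 ticks after the first buffered one does not trigger a flush, while one arriving 72000 ticks after it does, regardless of how many bytes were collected in between.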

Regards,
Marton