[FFmpeg-devel] mpegts_write_header(), the buffering and thus reordering of audio samples.

Will Korbe wkorbe at gmail.com
Thu Nov 3 21:50:48 CET 2011


On Thu, Nov 3, 2011 at 1:42 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Thu, Nov 03, 2011 at 01:21:54PM -0700, Will Korbe wrote:
>> On Thu, Nov 3, 2011 at 12:09 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
>> > On Wed, Nov 02, 2011 at 04:13:57PM -0700, Will Korbe wrote:
>> >> On Tue, Nov 1, 2011 at 11:39 AM, Will Korbe <wkorbe at gmail.com> wrote:
>> >> > On Mon, Oct 31, 2011 at 7:41 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
>> >> >> On Mon, Oct 31, 2011 at 06:24:07PM -0700, Will Korbe wrote:
>> >> >>> Hello,
>> >> >>>
>> >> >>> I hopefully have a minor question about the
>> >> >>> libavformat/mpegtssec.c:mpegts_write_header() implementation. My hours
>> >> >>> of online searching and specification reading for the answer has been
>> >> >>> unsuccessful.
>> >> >>>
>> >> >>> I see there is an efficiency built in to bundle multiple audio samples
>> >> >>> into one PES packet and was wondering if someone could point out a
>> >> >>> paragraph from the MPEG2 specification which indicates the reordering
>> >> >>> of TS Packets this causes is okay? I have a video with many 41 byte
>> >> >>> audio samples at the start, and this seems to push the first audio
>> >> >>> samples 5 seconds into the output. At this point, the PTS and DTS for
>> >> >>> the audio samples are beyond the last PCR value by 5 seconds. The
>> >> >>> video plays fine in QuickTime, the audio and video are precisely in
>> >> >>> sync, so I assume QuickTime has logic to buffer the video frames until
>> >> >>> seeming the first audio sample to account for this, but I wonder if
>> >> >>> most MPEG2TS players handle this the same way. If this is spelled out
>> >> >>> in the specification, I would sure feel more comfortable with it.
>> >> >>>
>> >> >>> Any information would be useful.
>> >> >>
>> >> >> I dont remember a tight limit on the PCR-DTS distance in the spec but
>> >> >> then i dont remember the spec very well ...
>> >> >> but having this large is bad because this means a long startup delay.
>> >> >> So 5 seconds is IMHO not appropriate by default.
>> >> >> Of course that delay would only affect actual broadcast and not playing
>> >> >> from a local file with a modern player.
>> >> >> the user parameter max_delay should be able to limit this, if not
>> >> >> a bugreport & patch are welcome
>> >> >>
>> >> >>
>> >> >> [...]
>> >> >> --
>> >> >> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>> >> >>
>> >> >> Asymptotically faster algorithms should always be preferred if you have
>> >> >> asymptotical amounts of data
>> >> >>
>> >> >> _______________________________________________
>> >> >> ffmpeg-devel mailing list
>> >> >> ffmpeg-devel at ffmpeg.org
>> >> >> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>> >> >>
>> >> >>
>> >> >
>> >> > Hi Michael,
>> >> >
>> >> > Thank you for your response. From what I can tell, max_delay does not
>> >> > limit the amount of audio sample buffering, rather, it is a macro,
>> >> > DEFAULT_PES_PAYLOAD_SIZE (2930 bytes). In my test case,
>> >> > AVFormatContext::max_delay is set to 50000 (50 ms). My sample video is
>> >> > actually a movie trailer with close to 5 seconds of silence before the
>> >> > audio sample byte sizes become substantial, after that point, the
>> >> > audio samples are large enough that the PTS/DTS values become just 500
>> >> > ms behind the last PCR. Given this, it seems when there are
>> >> > multi-second periods of silence, the PTS/DTS of an audio sample has
>> >> > the potential to become multi-seconds behind the last PCR in the TS
>> >> > Packet Stream. As you mentioned, this may be acceptable with regards
>> >> > to the specification and/or current playback implementations, however,
>> >> > to attempt to handle more cases, perhaps the buffering of audio
>> >> > samples should be limited by max_delay as you suggest, that would seem
>> >> > to make it more inline with the mpeg2 timing model, but I'm not and
>> >> > expert in this area. I plan to investigate further to see if a BUG
>> >> > report is warranted.
>> >> >
>> >> > Thank you,
>> >> > Will
>> >> >
>> >>
>> >> When I study the MPEG2 timing and buffering model, it would seem to
>> >> indicate the PCR value is the system-clock used to synchronize access
>> >> units (audio samples and video frames). For the decoding side, the
>> >> only delay specified is the buffers after the demux step, apparently
>> >> one for the video frames, and one for audio samples. If the MPEG2TS
>> >> stream was processed serially where the PCR value indicates the
>> >> current system time as it is received in the MPEGTS stream, then in
>> >> the audio or video buffers, if any access units have a DTS value which
>> >> matches the current PCR, those access units are instantly removed from
>> >> the buffer and decoded. If we don't have a DTS, the PTS is considered.
>> >> When any of the access units have a PTS equal to the current PCR, then
>> >> those access units are presented. Given this, it seems apparent to me
>> >> that this is not how some players are implemented, otherwise the audio
>> >> would arrive late, after the video was decoded and presented, and
>> >> would have been dropped. I suspect players have an additional layer of
>> >> logic and an initial delay built-in to handle MPEG2TS streams that are
>> >> not quite spec compliant, otherwise my test video stream wouldn't have
>> >> played correctly. If anyone knows more about this and can point out a
>> >> flaw in my reasoning, please let me know.
>> >>
>> >> If I compare ffmpeg's MPEG2TS output with Apple's mediastreamsegmenter
>> >> MPEG2TS output, the mediastreamsegmenter tool also groups audio
>> >> samples together to form larger PES packets, however, it places this
>> >> group of audio samples in the correct location in the MPEG2TS stream
>> >> (DTS and PTS < PCR). I suspect they buffer both audio samples and
>> >> video frames to be able to do this, using considerably more memory in
>> >> the process.
>> >>
>> >> I work on an application that doesn't have the luxury to use extra
>> >> memory, so I patched the mpegts_write_pes function making a trade-off
>> >> which increases the output overhead slightly in some cases, but it
>> >> does not use extra memory. It also was a minor modification. In
>> >> mpegts_write_pes(), when it is determined a PCR will be written, which
>> >> is just for video frames, I check the AVStreams associated to the
>> >> AVFormatContext, and if they are buffering access units where the
>> >> first DTS is not less than the PCR value, I write out the buffered
>> >> access units before continuing to write out the PCR value. This keeps
>> >> everything in order.
>> >>
>> >> The extra overhead bytes introduced is due to the additional stuffing
>> >> bytes which results from not filling the TS Packets as completely as
>> >> before with audio samples, and due to the additional PES headers. The
>> >> patch still allows audio samples to get grouped, but only while their
>> >> DTS value is less than the next PCR value mpegts_write_pes will add to
>> >> the output stream. With a max_delay of 50ms, I noticed an increase in
>> >> bytes of 0.8%. With a max_dealy of 1000ms, I didn't see an increase in
>> >> overhead since this allows more audio samples to get grouped.
>> >>
>> >> I suspect this patch isn't general purpose enough, some would rather
>> >> use more memory instead of increasing the potential container
>> >> overhead, for example. Also, this was just tested with ffmpeg 0.5
>> >> since that is the version we are currently working with here, but I
>> >> forward ported it below to the latest version and it is shown below to
>> >> see if anyone has comments regarding it.
>> >
>> > i have some comments
>> > i think write_pcr can be set later and pcr is not set then and thus
>> > used uninitialized
>> >
>> > also somehow it feels like the code would fit better in
>> > mpegts_write_packet().
>> > I mean calling mpegts_write_pes() for all streams out of
>> > mpegts_write_pes() itself feels like a recipe for overflowing the
>> > stack if theres some way to cause this to call too deep recursively
>> >
>> > and i think the code needs to be always called not just when
>> > write_pcr is 1
>> >
>> > either way the patch is a needed bugfix and if you improve it iam
>> > happy to include it
>> >
>> > [..]
>> > --
>> > Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>> >
>> > I know you won't believe me, but the highest form of Human Excellence is
>> > to question oneself and others. -- Socrates
>> >
>> > _______________________________________________
>> > ffmpeg-devel mailing list
>> > ffmpeg-devel at ffmpeg.org
>> > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>> >
>> >
>>
>> Hi Michael,
>>
>> Thank you for your comments, I'll try to address some of them below.
>>
>> > i think write_pcr can be set later and pcr is not set then and thus
>> > used uninitialized
>>
>> Ah, yes, in the latest ffmpeg version that is true, in the 0.5 version
>> it wasn't, I missed that in the forward port due to haste in an
>> attempt to get some feedback sooner. This definitely needs fixed.
>>
>> > also somehow it feels like the code would fit better in
>> > mpegts_write_packet().
>>
>> Yes, I also would have preferred that as well, it is what I first
>> looked at, but it did not look trivial to attempt to predict if it was
>> time to write one more or more PCRs, and what their values would be
>> from within mpegts_write_packet(). Inside mpegts_write_pes(), there is
>> a while loop, and depending on certain conditions, in between one of
>> the many TS_Packets, a PCR is added during one of the loop iterations
>> and potentially during more than one of the loop iterations, though
>> not likely.
>>
>> > I mean calling mpegts_write_pes() for all streams out of
>> > mpegts_write_pes() itself feels like a recipe for overflowing the
>> > stack if theres some way to cause this to call too deep recursively
>>
>> I did ensure the number of recursive calls is limited by the number of
>> AVStreams minus 1 associated to the AVFormatContext (setting
>> ts_st2->payload_index to 0 before calling mpegts_write_pes() again,
>> and not calling mpegts_write_pes() on itself). I assumed the number of
>> streams associated to a context is usually relatively low, it was 2
>> for our use case, one video and one audio, would this get larger than
>> say 10? .. and yes, I agree, the recursive nature of this is tricky, I
>> did try to think through all of the permutations that could arise.
>> Since write_pcr is only true for video streams, and we don't buffer
>> video frames, it seems there shouldn't be any out of order issues
>> causing a PCR value to be written before the access units with a lower
>> DTS/PTS value.
>>
>
>> > and i think the code needs to be always called not just when
>> > write_pcr is 1
>>
>> I chose to do this when write_pcr is 1 since I was only concerned
>> about keeping the access unit's DTS below the PCR. If we are not
>> writing a PCR value, then this can't occur. Was there another case
>> that occurred to you?
>
> the way i remember the spec, the PCR increases even when its not
> stored. so to keep DTS in the future of when the data actually is
> received in a reference decoder. Checking would be needed more often.
>
> [...]
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> In fact, the RIAA has been known to suggest that students drop out
> of college or go to community college in order to be able to afford
> settlements. -- The RIAA
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
>

> the way i remember the spec, the PCR increases even when its not
> stored. so to keep DTS in the future of when the data actually is
> received in a reference decoder. Checking would be needed more often.

I see, I guess we could end up with a case where the PCR increased
beyond the the lowest buffered audio sample's DTS, and the next PCR to
write/store would be greater than the audio-sample's DTS, but we are
not at the point to write/store it yet.

Thank you,
Will


More information about the ffmpeg-devel mailing list