[MPlayer-dev-eng] More on timestamps in NUT

D Richard Felker III dalias at aerifal.cx
Mon May 3 18:51:08 CEST 2004


On Mon, May 03, 2004 at 02:43:43PM +0200, Michael Niedermayer wrote:
> Hi
> 
> On Monday 03 May 2004 07:01, D Richard Felker III wrote:
> [...]
> > First, an explanation of the problem:
> >
> > Michael noted on the cvslog list (in a thread originating from some
> > changes he made to mpcf.txt, the in-progress NUT spec) some
> > unfortunate consequences of the current timestamp coding system in NUT.
> > In the working spec, the muxer and demuxer keep track of the three
> > most recent timestamp deltas for each stream, and then subsequent
> > frames can be coded to reuse these deltas, rather than storing the lsb
> > timestamp or full timestamp explicitly. This is particularly useful
> > for video streams with B frames (which will have a timestamp pattern
> > of +N+1, -N, +1 (N times), where N is the number of B frames), and for
> > vorbis audio where there are 3 frame sizes (128, 576, and 1024), and
> > it allows most packet headers to be encoded in a single byte for
> > very-low-bitrate streams (where overhead matters a lot).
> >
> > The problem arises with error resilience. I had been supposing that
> > after an error, we could resync to the next valid packet using some
> > nice tricks I worked out (which are explained in another thread). But
> > Michael pointed out that such damage can mess up the timestamp delta
> > prediction entirely, leading to completely bogus timestamps. Thus, it
> > seems that in the presence of timestamp delta predictors, error
> > resilience can only recover at the next lsb-coded or fully-coded
> > timestamp.
> [...]
> > THEREFORE, I propose that we remove timestamp delta prediction from
> > NUT, and put in its place fixed timestamp deltas in the framecode
> > table.
> agree

Excellent!
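
To make the B-frame case concrete, here is a minimal Python sketch (not
from the spec) that lists presentation timestamps in coded order and the
deltas between them. It assumes a simple layout, one I frame followed by
groups of one P frame plus N B frames, with timestamps counted in whole
frames; the function names are made up for illustration. The exact delta
values depend on that assumed layout.

```python
def coded_order_pts(n_b, n_groups):
    """PTS in coded order: one I frame, then n_groups of (P, then its n_b B frames).

    Assumed layout for illustration only; timestamps are in whole frames.
    """
    pts = [0]  # the initial I frame
    for g in range(n_groups):
        anchor = (g + 1) * (n_b + 1)             # P frame that closes this group
        pts.append(anchor)                       # the reference is coded first
        pts.extend(range(anchor - n_b, anchor))  # then the B frames displayed before it
    return pts


def deltas(pts):
    """Differences between consecutive timestamps in coded order."""
    return [b - a for a, b in zip(pts, pts[1:])]


pts = coded_order_pts(n_b=2, n_groups=3)
d = deltas(pts)
steady_state = sorted(set(d[1:]))  # distinct deltas once the first group has started
print(pts)
print(d)
print(steady_state)
```

Under these assumptions only a few distinct deltas keep recurring, which
is what lets a small set of fixed deltas in the framecode table cover the
common cases.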

> > THEREFORE, I propose that we replace the type-2 startcodes with a sync
> > point packet containing its own timestamp, specify that subsequent
> > relative/lsb timestamps are based on the sync point timestamp, and
> > remove the requirement that frames following a type-2 startcode fully
> > code their timestamps.
> agree

Excellent!

A couple notes on sync point timestamps: Given 3 time bases b1, b2,
and b3, and two timestamps t1 and t2 in time bases b1 and b2
respectively, it is not necessarily possible to choose a timestamp t3
in base b3 such that t1 <= t3 <= t2. Thus, the existing monotonicity
rule can't apply to sync point timestamps in an arbitrary timebase. And I
don't think we should require the global timebase to be fine-grained
enough to enforce monotonicity, because very coarse units (for example
half-seconds!) can fit in 14 bits (2-byte vlc) and do not increase
overhead if the subsequent frames are already storing lsb timestamps
anyway.
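
A small worked example of the timebase problem, as a Python sketch with
exact rational arithmetic. The concrete timebases (1/25 s video,
1/48000 s audio, half-second global units) and the helper name are
assumptions chosen for illustration, not values from the spec. The last
line also assumes 7 payload bits per vlc byte, which is what the 14-bit
figure suggests.

```python
from fractions import Fraction
from math import ceil, floor


def ticks_between(t1, b1, t2, b2, b3):
    """All integers t3 with t1*b1 <= t3*b3 <= t2*b2, computed exactly."""
    lo = ceil(Fraction(t1) * b1 / b3)
    hi = floor(Fraction(t2) * b2 / b3)
    return list(range(lo, hi + 1))


video = Fraction(1, 25)       # assumed video timebase
audio = Fraction(1, 48000)    # assumed audio timebase
half_second = Fraction(1, 2)  # assumed coarse global timebase

# t1 = 30 video ticks = 1.2 s, t2 = 60000 audio ticks = 1.25 s.
coarse = ticks_between(30, video, 60000, audio, half_second)
fine = ticks_between(30, video, 60000, audio, video)

# Longest time reachable with a 14-bit count of half-seconds.
max_seconds = Fraction(2**14 - 1) * half_second

print(coarse, fine, float(max_seconds))
```

No multiple of half a second falls between 1.2 s and 1.25 s, so the
coarse list is empty, while the video timebase offers candidates.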

IMO we should recommend that the global timebase be the same as the
timebase for the stream that a user is most likely to want to seek
based on. For a movie, this would generally be the primary video
stream, while for a music video file (possibly with several alternate
videos for the same song), it could be the audio stream. This would
only be a recommendation (after all, half-seconds also make a good
unit!), but it allows the next frame in at least one stream to code
pts as delta=0 (which helps efficiency slightly) and it may be useful
for editing applications that want to show the index to users in units
of frames or something.

One thing we _should_ probably require is that the sync point
timestamp be chosen maximally such that it comes before the timestamp
of the following frame. Otherwise every sync point could just have
timestamp 0 and fully code the timestamp in the next frame of each
stream (i.e. the old way). And while this works, it would make the
syncpoint timestamps useless for an index... :))
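
Here is one way the "chosen maximally" rule could be computed, as a
rough Python sketch. It assumes that "before" means "not after" (so the
delta=0 case mentioned earlier stays possible) and that the bound is the
earliest next frame across all streams; both are my reading, not spec
text, and the function name and sample values are hypothetical.

```python
from fractions import Fraction
from math import floor


def sync_point_timestamp(next_frames, global_base):
    """Largest global-timebase tick that is not after any upcoming frame.

    next_frames: list of (ticks, timebase) pairs, one per stream, for the
    first frame of that stream following the sync point.
    """
    earliest = min(Fraction(ticks) * base for ticks, base in next_frames)
    return floor(earliest / global_base)


video = Fraction(1, 25)     # assumed video timebase
audio = Fraction(1, 48000)  # assumed audio timebase

# Next video frame at 31/25 = 1.24 s, next audio frame at 1.25 s.
upcoming = [(31, video), (60000, audio)]

ts_video_base = sync_point_timestamp(upcoming, video)
ts_half_seconds = sync_point_timestamp(upcoming, Fraction(1, 2))
print(ts_video_base, ts_half_seconds)
```

With the video timebase as the global one, the sync point lands exactly
on the next video frame, so that frame can code its pts as delta=0. With
half-second units the sync point falls back to 1.0 s.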

> > THEREFORE, I propose that we add optional 3-byte recovery points to
> > the NUT spec, which muxers can use at their discretion to improve the
> > error resilience of the file. (Discussion point: perhaps 2 bytes is
> > enough?)
> agree
> btw, i did some tests a few days ago, and interestingly there are several
> 3byte sequences starting with 'N' which never occurred in ~500mb of test
> videos, for 2 bytes the best (0x4EFE) was IIRC occurring approximately once
> every 100k but that's just IIRC, as i can't find my notes about it ATM

Wow. If an extra byte reduces collisions by over 5000 times, it's
probably worth it!
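
The arithmetic behind that figure, as a quick Python sketch. It reads
"~500mb" as 500 * 10^6 bytes and "100k" as one hit per 100 * 10^3 bytes;
both readings are assumptions, and the numbers themselves are only
recalled from memory in the quoted text.

```python
sample_bytes = 500 * 10**6  # assumed size of the test material
two_byte_gap = 100 * 10**3  # assumed bytes between hits of the best 2-byte code

# Expected number of false hits for the 2-byte code over the whole sample.
expected_two_byte_hits = sample_bytes // two_byte_gap

# The 3-byte codes were seen zero times in the same sample, so the
# improvement is at least this factor (it cannot be measured more finely
# from a sample this size).
print(expected_two_byte_hits)
```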

Rich



