[MPlayer-dev-eng] Nut and B frames

D Richard Felker III dalias at aerifal.cx
Mon Apr 26 13:38:08 CEST 2004


Nut is supposed to have a "perfect interleaving" rule, i.e. if frame B
comes after frame A, regardless of whether they're part of the same
stream or different streams, frame B must have a later timestamp than
frame A (or the same). This is to avoid situations like AVI where
demuxers have to deal with badly interleaved (or noninterleaved) files
and thus end up rapidly seeking in order to be able to play the movie.
But how do B frames (or out-of-order frames in general) fit into this
picture?
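
To make the rule concrete, here is a minimal sketch of a checker for it. The `Frame` type, its field names, and the function name are all hypothetical (nothing here comes from a NUT spec or from MPlayer code), and it assumes all streams' timestamps have already been converted to one common time base.

```python
from typing import Iterable, NamedTuple


class Frame(NamedTuple):
    stream: int  # hypothetical stream index
    pts: int     # presentation timestamp, assumed to be in a common time base


def is_perfectly_interleaved(frames: Iterable[Frame]) -> bool:
    """Return True if timestamps never decrease in file order, across all streams."""
    last_pts = None
    for frame in frames:
        # A frame stored later in the file may share the previous timestamp,
        # but it must never have an earlier one.
        if last_pts is not None and frame.pts < last_pts:
            return False
        last_pts = frame.pts
    return True


# Two streams, stored alternately: the rule holds.
good = [Frame(0, 0), Frame(1, 0), Frame(0, 40), Frame(1, 40)]
# Stream 1 stored after all of stream 0 (noninterleaved, AVI-style): the rule fails.
bad = [Frame(0, 0), Frame(0, 40), Frame(0, 80), Frame(1, 0)]

print(is_perfectly_interleaved(good), is_perfectly_interleaved(bad))
```

Note that a video stream stored in decode order (I0 P3 B1 B2) fails this check on its own, which is exactly the question raised here.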

I see two possibilities:

1. In files with B frames, out-of-order frames may be stored anywhere
   between the previous frame and the next frame in decoding order.

2. No special treatment for files with B frames. Perfect interleaving
   still applies.

Before you flame and say choice #2 sucks, let's hear the arguments for
both:

Choice 1 is appealing because it gives you frames in the order that
most decoders expect. But it also has some drawbacks. Let's say you
have the following sequence of frames in a file: I0 P3 B1 B2 (decode
order). When you encounter the P frame, you have to be aware that the
file has B frames and look ahead. Otherwise you'll see the (well in
the future) timestamp on the P frame, and think there's nothing to do
for a while! By the time you see the B frames, it'll be too late to
present them. Also, after seeking you have to take special precautions
to reset the codec so you don't decode B frames that belong before the
I frame you seek to.

Some of the problems of choice 1 could be remedied by an out-of-order
flag in the frame header (but that's wasteful) or a global header that
says out-of-order frames are present and that you need to look ahead N
frames to find the next pts... but that's all complicated.
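
A rough sketch of what that look-ahead would mean for a player, using the I0 P3 B1 B2 example. The function name and the `lookahead` parameter are made up for illustration; in a real file N would have to come from something like the global header mentioned above.

```python
from typing import Sequence


def next_pts_with_lookahead(pts_in_decode_order: Sequence[int],
                            pos: int, lookahead: int) -> int:
    """Earliest pts among the frame at `pos` and the next `lookahead` frames."""
    if not 0 <= pos < len(pts_in_decode_order):
        raise ValueError("pos is outside the frame list")
    # Slicing past the end is safe in Python; the window is just shorter there.
    window = pts_in_decode_order[pos:pos + lookahead + 1]
    return min(window)


# Frames I0 P3 B1 B2 as stored in decode order.
stored_pts = [0, 3, 1, 2]

# A naive player positioned at the P frame only sees its own timestamp.
naive = stored_pts[1]
# With a look-ahead of 2 frames it finds that B1 is actually due first.
aware = next_pts_with_lookahead(stored_pts, pos=1, lookahead=2)

print(naive, aware)
```

The naive value is the P frame's pts of 3, while the look-ahead finds 1, the pts of the first B frame.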

Choice 2 immediately looks _bad_, because it requires you to buffer
one or more B frames before actually passing them to the decoder. I've
been a very vocal opponent of doing stuff that requires buffering in
NUT. But unlike in some other cases (audio subpackets, forward
pointers, etc.), the buffering here doesn't increase latency beyond
the latency already present in B frames. Further, any device that has
to cope with B-frame codecs already has to have some degree of
buffering for decoded frames, to display them in the right order, so
requiring some buffering of encoded frames (much smaller) isn't a big
deal.

Now, the advantage of choice 2: you never have to care whether your
file contains B frames. Pulling a frame from the demuxer immediately
gives you the next PTS at which a frame needs to be displayed, without
having to look ahead and see if a B frame follows. The player can then
grab more frames from the demuxer until it gets the next I/P frame it
needs to finish decoding the B frame.

For a naive player, there may be some disadvantages. For example, the
player may end up decoding both the P and the B frame at once right
before the B frame needs to be displayed, as with the "packed" B
frames DivX uses in AVI files. But this can be avoided, and in fact
the demuxer can emulate decode-order demuxing if desired. For example,
a Windows DirectShow demuxer for NUT could do this to make all the
broken Windows player apps happy.
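
Here is a minimal sketch of that decode-order emulation, under choice 2 (frames stored in presentation order). It assumes simple MPEG-style B frames that depend only on the nearest I/P frame on each side; the `Frame` type and `to_decode_order` are hypothetical names, not part of any existing demuxer.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Frame(NamedTuple):
    kind: str  # "I", "P", or "B" (assumed to be known to the demuxer)
    pts: int


def to_decode_order(frames: Iterable[Frame]) -> Iterator[Frame]:
    """Turn presentation-order frames into decode order by holding back B frames."""
    pending_b: List[Frame] = []
    for frame in frames:
        if frame.kind == "B":
            # Cannot be decoded until the following reference frame is available.
            pending_b.append(frame)
        else:
            # Emit the I/P frame first, then the B frames that were waiting on it.
            yield frame
            yield from pending_b
            pending_b.clear()
    # Trailing B frames with no later reference frame are passed through as-is.
    yield from pending_b


# Presentation (storage) order under choice 2.
stored = [Frame("I", 0), Frame("B", 1), Frame("B", 2), Frame("P", 3),
          Frame("B", 4), Frame("B", 5), Frame("P", 6)]

decode_order = [f"{f.kind}{f.pts}" for f in to_decode_order(stored)]
print(" ".join(decode_order))
```

The only state is the list of held-back B frames, so the extra buffering is bounded by the longest run of consecutive B frames in the stream.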

I'm not extremely partial to either choice 1 or 2. In fact I don't
like and don't use B frames because they're very bad for decoding
performance with filters and the tests I've seen show minimal PSNR
gain (and sometimes loss) for anime. But IMO it's worth considering
both options, and I especially think that we should decide on one and
rule out the other, rather than leaving it ambiguous. If the standard
isn't defined, encoders will make de facto standards and more often
than not these are very bad...

Rich






