[MPlayer-dev-eng] a few nut suggestions

D Richard Felker III dalias at aerifal.cx
Sun Oct 3 22:19:37 CEST 2004


let's see what ppl think of these ideas for cleaning up nut. one
request: if you reply, please reply to each point in a separate
message (unless 2 are closely related), rather than together. that way
we'll make a more organized thread out of the discussion.

1. info packets

earlier there was a big argument between me and michael about info
packets. one of the major concerns was about where info packets would
appear, and if they belong just at the beginning of the file or all
over the place in the file...

what if we change things a little bit, and require the existing type
of info packet to be at the beginning of the file, but add a fourth
stream type, info streams, with which users could include interleaved
metainfo? this way there's actually a logical way to interleave the
info packets (pts), rather than just putting them at arbitrary points
in the stream. and if they're two separate things, people would be
less likely to misuse one when they mean the other, imo..

now, what should the info headers look like? my inclination is that
they should just appear immediately after the main/stream headers they
go with. i.e. global metainfo about the file should go in an info
header immediately following (or "attached to"?) the main header, and
metainfo about specific streams should go in an info header
immediately following (or attached to?) the corresponding stream
header. so the headers would look like:

main header
main info
stream 0 header
stream 0 info
stream 1 header
stream 1 info
...

of course if we do decide to do this, we might as well just collapse
the info into a field of the headers, rather than making it a separate
packet type. this is sort of what i was proposing for "essential
metadata" before, but it gets rid of the distinction that michael
didn't like between essential and nonessential. and of course, we have
info streams to handle the other stuff.
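
a rough sketch of that layout, just to make the idea concrete. everything here is hypothetical (the class names, the `info` dict, the stream class names); it is not the actual nut header syntax, only an illustration of info being a field of the header it belongs to, with interleaved metainfo left to a separate "info" stream class:

```python
from dataclasses import dataclass, field


@dataclass
class StreamHeader:
    stream_id: int
    # assumed stream classes: "video", "audio", "subtitle", plus the proposed "info"
    stream_class: str
    # per-stream metainfo (e.g. language) stored as a plain field of the header
    info: dict = field(default_factory=dict)


@dataclass
class MainHeader:
    # global metainfo about the whole file (e.g. title)
    info: dict = field(default_factory=dict)
    streams: list = field(default_factory=list)


def header_layout(main):
    """Return labels in the order the headers would appear at the start of the file."""
    layout = ["main header", "main info"]
    for s in main.streams:
        layout.append(f"stream {s.stream_id} header")
        layout.append(f"stream {s.stream_id} info")
    return layout
```

with info collapsed into the headers like this, a demuxer never has to go looking for a separate packet to learn a stream's language or the file's title.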

what could the uses of an info stream be? basically anything that
before would have required info packets placed within the streams
rather than at the beginning of the file. for example...

- song info on an internet radio stream
- animal/plant/character identification tags & outlines
- background information on scenes, citations for quotes, ...

..and of course whatever else some crazy users come up with.

what real _advantages_ do we get by converting these packets into a
stream with timestamps and interleaving rules? well, one nice side
effect is that demuxing and remuxing (with the same framecode table)
should automatically give a byte-identical file, without having to
process the info packets specially. this could be particularly nice if
a user wants to split a movie between two discs, then re-merge it
later to share it (while not introducing a new "corrupt" version with
a different hash).
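
a toy illustration of why timestamps help here. this assumes a deliberately simplified interleaving rule (sort by pts, then by stream id), which is not nut's real rule, and the packet tuples are made up. the point is only that once info packets carry a pts like everything else, a deterministic rule puts them back in the same place on remux:

```python
import heapq
from collections import defaultdict


def demux(packets):
    """Split (pts, stream_id, payload) packets into per-stream lists."""
    streams = defaultdict(list)
    for packet in packets:
        streams[packet[1]].append(packet)
    return streams


def remux(streams):
    """Interleave per-stream packet lists by (pts, stream_id).

    Assumed toy rule: each stream is already in pts order, and ties
    between streams are broken by stream id.
    """
    per_stream = [streams[sid] for sid in sorted(streams)]
    return list(heapq.merge(*per_stream, key=lambda p: (p[0], p[1])))


# stream 2 plays the role of an info stream (song titles on a radio stream)
packets = [
    (0, 0, b"v0"), (0, 1, b"a0"), (0, 2, b"song: a"),
    (40, 0, b"v1"), (46, 1, b"a1"), (90, 2, b"song: b"),
]
roundtrip = remux(demux(packets))
```

here `roundtrip` comes out equal to `packets`, with no special-casing of the info stream. info packets dropped at arbitrary byte positions have no such rule to restore them.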

2. language code

michael has finally convinced me that language, disposition, and other
"essential" metadata should not get special treatment by being stored
in dedicated fields rather than generic info fields. why? well aside
from the fact that it _is_ a little arbitrary, the main reason is for
widespread support. i noticed that lavf just ignores the language
field in the nut stream header, because processing it doesn't fit in
with lavf's general info-header processing. i suspect the same is true
with many other pieces of software too. i don't want these "essential"
headers to be ignored just because the software doesn't provide a
good interface for accessing them.

so, i'd like to remove language code from the stream headers. but...
this is a problem, because i don't want to break compatibility again.
can we just replace it with a "reserved" vb-type field? and i'll drink
my cola for demanding that it be put in the header to begin with...
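
a sketch of how a parser could cope with that. assumptions, stated loudly: a variable-length value is coded 7 bits per byte, most significant group first, with the high bit of each byte meaning "more bytes follow"; a vb field is such a length followed by that many bytes; and the field after the reserved one is invented for the example. the real stream header has other fields around it:

```python
def read_v(buf, pos):
    """Read one variable-length value (assumed coding: 7 bits per byte, MSB = continue)."""
    value = 0
    while True:
        byte = buf[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, pos


def read_vb(buf, pos):
    """Read a length-prefixed binary blob: a v-coded length, then that many bytes."""
    length, pos = read_v(buf, pos)
    return buf[pos:pos + length], pos + length


def parse_after_reserved(buf, pos):
    """Skip the reserved vb field (formerly language code), then read a hypothetical next field."""
    _reserved, pos = read_vb(buf, pos)  # contents are ignored on purpose
    next_field, pos = read_v(buf, pos)
    return next_field, pos


old_file = bytes([3]) + b"eng" + bytes([5])  # muxer that still wrote a language code
new_file = bytes([0]) + bytes([5])           # muxer that writes an empty reserved field
```

both `old_file` and `new_file` parse to the same following field, which is the whole point: old files stay readable and new muxers can just write a zero-length blob.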

3. m$-specific and apple-specific headers

i'd like to strongly recommend removing support for bitmapinfoheader,
waveformatex, imagedesc, and sounddesc types in the codec specific
header. first, the motivation, and then i'll explain why they're in
fact not needed.

the motivation is actually really simple: i'm sick of new containers
clinging onto this legacy crap. furthermore, since lots of these
headers duplicate stuff already in the nut header, it adds a layer of
redundancy and possible ambiguity, which is bad.

imo these headers are not needed because the vast majority, if not
all, of the information in them is redundant. that is to say, if the
player is using a m$ or apple binary codec to play the file, it can
regenerate the bitmapinfoheader or whatever crap is needed to make
this codec happy based on the other info in the nut header. we don't
need to explicitly include this dumb header which is associated with a
specific piece of software (one decoder for the format) rather than
with the audio or video format itself.
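
to show what "regenerate" would mean in practice, here's a minimal sketch that builds a 40-byte bitmapinfoheader from values a player would already have from the container. the function name and parameters are hypothetical, and the choices for bit_count, biSizeImage and appending codec-private data after the struct are assumptions about what a given binary codec wants, not something nut defines:

```python
import struct

BITMAPINFOHEADER_SIZE = 40  # size of the fixed part of the struct


def bitmapinfoheader_from_nut(width, height, fourcc, extradata=b"", bit_count=24):
    """Build a little-endian BITMAPINFOHEADER, followed by any codec-private data."""
    fixed = struct.pack(
        "<IiiHH4sIiiII",
        BITMAPINFOHEADER_SIZE + len(extradata),  # biSize (assumed to count extradata)
        width,                                   # biWidth
        height,                                  # biHeight
        1,                                       # biPlanes, always 1
        bit_count,                               # biBitCount (assumed default)
        fourcc,                                  # biCompression, 4 raw bytes
        width * height * bit_count // 8,         # biSizeImage (assumed estimate)
        0, 0,                                    # biXPelsPerMeter, biYPelsPerMeter
        0, 0,                                    # biClrUsed, biClrImportant
    )
    return fixed + extradata


hdr = bitmapinfoheader_from_nut(640, 480, b"DIVX")
```

nothing in there had to be stored in the file as an m$ struct. if some codec turns out to need a field that can't be derived like this, that field is the part worth discussing.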

if what i've said proves not quite true, maybe we can find some common
ground where we store extra data from the proprietary headers, but in
a sane format (based on nut's universal vlc coding) and without the redundancy. i
don't know exactly...let's just discuss it. anyway i'd like to just
see one type of codec-specific data, rather than this list of several
useless m$ and apple structures.

4. file id string

i recall someone being against the file id string. don't remember who
it was right off. we could discuss if this is really desirable or
necessary. i don't really care one way or the other.

5. fourcc and codec identification

i'd like to rethink the way we identify codecs, especially for audio.
most of the m$ fourccs for audio are incredibly stupid (just arbitrary
numbers with no string meaning) and in fact they're ambiguous with
regard to nut's header structure.

in the traditional header structures in riff-based formats (avi and
wav), the original idea was that data was uncompressed, so the header
described the sample format really closely with useless stuff like
bytes-per-sample, signed/unsigned, etc. with compressed data, fields
like this are meaningless, since different decoders could in principle
(and will in the future!) output different formats. so i agree with
the way nut is doing things now, not including that mess.

but, with uncompressed audio we're left with a big problem. there's
just one fourcc for pcm, and it doesn't specify the sample format at
all. so we have basically two choices: one is to make up and include a
codec-specific header for pcm that tells the sample format, and the
other is to make up our own fourccs for each sample format (like S16L,
S16B, U8LE, FLLE, ...). i'm not sure which is better in principle, but
if we're already going to make up sane fourccs for audio rather than
using the m$ junk, it wouldn't hurt to make separate pcm fourccs while
we're at it.
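
a sketch of the second option. the fourcc names are the ones from the list above, but their exact meanings here are my assumption (S16L/S16B = signed 16-bit little/big endian, U8LE = unsigned 8-bit, FLLE = 32-bit float little endian), and the table and function names are made up:

```python
import struct

# hypothetical pcm fourcc -> struct format for one sample
PCM_FOURCCS = {
    "S16L": "<h",  # signed 16-bit, little endian
    "S16B": ">h",  # signed 16-bit, big endian
    "U8LE": "<B",  # unsigned 8-bit (byte order is irrelevant)
    "FLLE": "<f",  # 32-bit float, little endian
}


def decode_pcm(fourcc, data):
    """Decode raw pcm bytes into a list of sample values, using only the fourcc."""
    sample_format = PCM_FOURCCS[fourcc]
    return [sample[0] for sample in struct.iter_unpack(sample_format, data)]
```

with this, the fourcc alone is enough to interpret the samples, so no codec-specific header is needed for pcm. the cost is one fourcc per sample format instead of one for all of pcm.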




ok, that's it for now. imo nut is looking pretty good. i'm sorta
writing an informal nut-intro doc now...not sure how long that'll
take. and i've considered writing libnut independent of lavf to make
sure we can take advantage of all the benefits of nut muxing even if
they don't fit nicely into the lavf framework. also if i do write it,
it will be mit-license or public domain to encourage adoption.

please reply with opinions, flames, etc. :)

rich






