[NUT-devel] [nut]: r604 - docs/nutissues.txt

Wed Feb 13 02:37:21 CET 2008

On Tue, Feb 12, 2008 at 08:24:21PM +0100, Michael Niedermayer wrote:
> > Also keep in mind that
> > what I said does NOT only apply to the mmap design but to a nice clean
> > read-based design with low copying that probably IS similar to various
> > current implementations.
> 
> I don't see how.
> It wouldn't affect libavformat; it also wouldn't affect the simple:
> x= malloc(size+header)
> put header in x
> fread into x+header

This design is highly inefficient for high bitrate streams. The
efficient implementation is:

- fill large buffer directly with read()
- while(at least one whole frame in buffer) return pointer to the
  frame in-place in the buffer
- keep a lock on all sections of the buffer which the caller has not
  yet "released". allow the initial segment of the buffer which has
  been released to be reused, copying any partial frame at the end of
  buffer back to the beginning.
- if there is insufficient released space at beginning of the buffer,
  allocate a new buffer.
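
A rough sketch of the steps above in C, just to make the bookkeeping
concrete. Everything named here (FrameBuf, fb_space, fb_commit, fb_next,
fb_release, fb_compact) is made up for illustration, and the 1-byte
length prefix is a stand-in framing, NOT NUT syntax. It is also
simplified: frames must be released in order, and compaction only
happens once every returned frame has been released. Where fb_compact
returns 0, a fuller implementation would allocate a new buffer instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical framing for illustration only: 1 length byte + payload. */
typedef struct {
    unsigned char *buf;
    size_t cap;
    size_t released; /* bytes before this offset are no longer held by the caller */
    size_t pos;      /* start of the next frame not yet returned */
    size_t fill;     /* end of valid data */
} FrameBuf;

static void fb_init(FrameBuf *fb, unsigned char *storage, size_t cap)
{
    fb->buf = storage;
    fb->cap = cap;
    fb->released = fb->pos = fb->fill = 0;
}

/* Free tail of the buffer; the caller read()s straight into it. */
static unsigned char *fb_space(FrameBuf *fb, size_t *avail)
{
    *avail = fb->cap - fb->fill;
    return fb->buf + fb->fill;
}

/* Record how many bytes the caller's read() actually delivered. */
static void fb_commit(FrameBuf *fb, size_t got)
{
    assert(got <= fb->cap - fb->fill);
    fb->fill += got;
}

/* Return a pointer to the next whole frame in place, or NULL if only a
   partial frame (or nothing) is buffered. No payload bytes are copied. */
static const unsigned char *fb_next(FrameBuf *fb, size_t *len)
{
    size_t have = fb->fill - fb->pos;
    size_t n;
    const unsigned char *frame;

    if (have < 1)
        return NULL;
    n = fb->buf[fb->pos];
    if (have < 1 + n)
        return NULL;
    frame = fb->buf + fb->pos + 1;
    *len = n;
    fb->pos += 1 + n;
    return frame;
}

/* The caller is done with this frame (simplification: in-order release). */
static void fb_release(FrameBuf *fb, const unsigned char *frame, size_t len)
{
    size_t end = (size_t)(frame - fb->buf) + len;
    assert(end <= fb->pos && end >= fb->released);
    fb->released = end;
}

/* Move the partial frame at the end back to the beginning so the released
   space can be reused. Returns 0 if the caller still holds frames. */
static int fb_compact(FrameBuf *fb)
{
    size_t partial;

    if (fb->released != fb->pos)
        return 0;
    partial = fb->fill - fb->pos;
    memmove(fb->buf, fb->buf + fb->pos, partial);
    fb->released = fb->pos = 0;
    fb->fill = partial;
    return 1;
}
```

The only bytes ever copied are the partial frame at the tail, so the
copying cost is bounded by one frame per refill rather than by the
bitrate.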

Just because libavformat does not use the efficient implementation
does not mean we should preclude it. Maybe in the future we'll even
want to do that in lavf, but if not, still other implementations
might. It should give at least a 5-10% performance boost on playback
of high-bitrate video.

> Seriously, I don't think that having 2 compressed frames instead of one
> in memory is a significant burden; the uncompressed ones are so much
> bigger ...

Think of a possible hardware implementation where the uncompressed
frame is only ever seen by the decoder chip. Then the demuxer buffers
might be rather small. Obviously this would put limits on the _codec_
parameters, which the device manufacturer would publish as the
supported profile, but it would be rather bad to also have limits on
the container parameters. Remember, we limited framecodes to a single
byte for a very good reason -- the possibility of small hardware
implementations. If a device supports codec X, it should support ANY
valid NUT file containing media encoded with codec X. The NUT
parameters should not preclude support for the file.

> But I am not that much interested in huge frames; a simple
> "header elision with frames > 1024 shall be considered an error"
> would probably be ok.

That would make me happy. I would even be happy allowing sizes up to
4096 or so if you think it would help.

> > It's NOT the business of the person making the file to put
> > restrictions on the process reading the file. This has been one of my
> > key issues for NUT since the beginning. Just because
> > idiot_who_makes_file thinks the only purpose of the file is to watch
> > it with player_x doesn't mean they're exempt from the features of NUT
> > that make the file nicely usable for editing, optimal seeking, etc.
> 
> True if the world was just a bunch of warez kids leeching pr0n off bittorrent.
> 
> But let's take a big internet provider who wants to offer TV to their
> customers and for that purpose gives everyone a HW decoder (the PCs being
> too slow maybe ...). Why should that ISP not be able to make the ideal
> decision during encoding for the decoders which he knows everyone will be
> using?

But that's the thing -- he does not know what everyone will be using.
He knows what he WANTS everyone to be using, but then hackers like us
go and save the streams and want to use them directly for reencoding,
archiving, warezing, etc. Look how nasty the situation now is with
DVDs and digital TV. The content producers just assume everyone is
using a crappy TV to watch their content and not doing anything else
with it. They fill the video with mixed telecine, incorrect
interlacing flags, etc. And in the end, everyone loses. It's not just
the people who want to rip/save the streams to clean files to watch on
a computer, but also the people with nice high-resolution TVs that
need to present a progressive picture. Due to the crap the content
producers deliver, folks always end up with flicker and aliasing.

Obviously NUT can't solve these video processing issues. But we can
stick to the philosophy that just because the content producer is too
short-sighted to envision everything the content recipient might want
to do with the content, that's not an excuse for delivering something
broken with artificial limitations.

> > > It should be in nut.txt as well ...
> > 
> > If it's not feel free to add it, but I think it belongs in the
> > information about codecs, since the definition of a frame is pretty
> > codec-specific. Expressing the abstract idea in the semantic
> > requirements section of nut.txt would be nice though!
> 
> I will as soon as the RAW frame issue is decided, unless I forget, in which
> case please flame me!

PCM you mean?

For video, RAW frames are quite obviously single pictures. :)
Yes, that means even if the resolution is 1x1 pixel. :) :) :)

For audio, I would be happy with a clean SHOULD like I suggested
before, but even more happy if you have a clean technical requirement
to ensure that frames not be too big without resorting to physical
units (limit on number of samples would be okay, but I don't
particularly like limit on number of seconds since it's not
scale-invariant).

> > > Which codec would that be? (we are talking about 5-10kbit/sec).
> > 
> > Vorbis.
> 
> 5kbit/sec ?

I've gotten good results at 24kbit/sec with 32000 Hz sampling, mono.
If we drop the sampling rate, I think 8-10kbit/sec is very realistic.
I don't know about 5... would need to run some experiments. If the
bitstream were optimized heavily to kill the overhead, it might work
very well, while technically no longer being "vorbis".

> > > AMR-NB has 6-32 byte frames see libavcodec/libamr.c
> > > qcelp has 4-35 byte frames see soc/qcelp/qcelp_parser.c
> > 
> > And is there any interest in these codecs aside from watching files
> > generated by camera phones and transcoding them to something sane?
> 
> There's no sense in transcoding the audio. It will just reduce the quality
> and very significantly increase the bitrate, because the codecs you call
> sane are VERY bad at such low bitrates.

Well, until there's a free decoder there's a lot of interest in
transcoding them, but maybe the free decoder will be done and ready
for merge soon? :)

> > For PCM, there's no seeking issue because one always knows how to seek
> > to an arbitrary sample in PCM 'frames'. In this case I would just
> > recommend having a SHOULD clause that PCM frames SHOULD be same or
> > shorter duration than the typical frame durations for other streams in
> > the file (as a way of keeping interleaving clean and aiding players
> > that don't want to do sample-resolution seeking inside PCM frames).
> 
> Well ...
> I do have an AVI with all audio in a single chunk (something like 10 or 20 MB);
> do you want that
> ... in NUT? MPlayer's AVI demuxer can even seek in that AVI :)
> 
> So honestly I don't think a SHOULD requirement alone would be a good idea.

Of course I don't want that in NUT. I suppose we should have a nice
strong technical requirement to prevent it. How about just a limit to
4096 samples per frame? If one then uses the maximum allowed frame
size, the overhead would be trivial (1 byte per 4096 bytes in the
worst case, i.e. about 0.024%, and much less with 16bit/stereo/etc.).
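
To spell out the arithmetic, here is a small C helper. The function
name and parameters are hypothetical, and it assumes the per-frame
overhead is a fixed number of bytes (1 in the best case of a lone
framecode; real frames may carry more).

```c
/* Per-frame overhead as a percentage of the PCM payload.
   overhead_bytes is an assumption supplied by the caller. */
static double pcm_overhead_percent(unsigned samples,
                                   unsigned bytes_per_sample,
                                   unsigned channels,
                                   unsigned overhead_bytes)
{
    double payload = (double)samples * bytes_per_sample * channels;
    return 100.0 * overhead_bytes / payload;
}
```

With 4096 samples of 8-bit mono and 1 byte of overhead this comes to
100/4096, roughly 0.024%; with 16-bit stereo the payload is 16384 bytes
and the figure drops to roughly 0.006%.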

> > If you really want this header-elision, I'm willing to consider it for
> > SMALL frames. But please don't flame about wanting to support it for
> > large frames where it's absolutely useless and has lots of practical
> > problems for efficient implementations!
> 
> Fine, I'll add it for small frames only. Would a 1024 byte limit be ok?

Yeah. As I said above, even a larger limit, maybe up to 4096, would be
fine with me. The limit should be on the _total_ frame size, BTW, not
the size with the elided header, so that the "extra buffer" needed for
reassembly has a fixed max size.
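
A minimal sketch of why limiting the total size matters, in C. The
names (ELISION_MAX_FRAME, reassemble_frame) are invented for this
example, and 1024 is only the figure under discussion, not a decided
value. Because the limit applies to elided header plus stored bytes,
the scratch buffer can be a fixed-size array and anything larger is
rejected up front.

```c
#include <stddef.h>
#include <string.h>

#define ELISION_MAX_FRAME 1024 /* figure under discussion, not ratified */

/* Rebuild a frame from its elided header and the bytes actually stored.
   Returns the total frame size, or -1 if the total exceeds the limit
   (which would be treated as an error). */
static long reassemble_frame(unsigned char out[ELISION_MAX_FRAME],
                             const unsigned char *elided, size_t elided_len,
                             const unsigned char *stored, size_t stored_len)
{
    /* Checked in this order so the subtraction cannot underflow. */
    if (elided_len > ELISION_MAX_FRAME ||
        stored_len > ELISION_MAX_FRAME - elided_len)
        return -1;
    memcpy(out, elided, elided_len);
    memcpy(out + elided_len, stored, stored_len);
    return (long)(elided_len + stored_len);
}
```

If the limit were on the stored size alone, the required scratch space
would depend on the longest elision header in the file and could not be
sized at compile time.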

BTW, note that this also provides a cheap way of compressing silence
with PCM audio: have a framecode for "all zero frame" and then use it
to encode the whole frame. :) The same could potentially work with
other codecs too.
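
As an illustration of the silence trick, here is a hypothetical
muxer-side check in C. The function name is invented, and it assumes a
framecode whose elided header is simply a run of zero bytes as long as
the frame. If the whole frame matches the elided header, nothing is
left to store.

```c
#include <stddef.h>
#include <string.h>

/* Bytes that would still need storing if hdr is elided from frame.
   Returns len unchanged when the frame does not start with hdr. */
static size_t stored_after_elision(const unsigned char *frame, size_t len,
                                   const unsigned char *hdr, size_t hdr_len)
{
    if (hdr_len <= len && memcmp(frame, hdr, hdr_len) == 0)
        return len - hdr_len;
    return len;
}
```

A 512-byte frame of PCM silence checked against a 512-byte all-zero
header yields 0 stored bytes, while a frame with even one nonzero
sample falls back to being stored in full under this framecode.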

Rich