
On Tue, Feb 12, 2008 at 08:24:21PM +0100, Michael Niedermayer wrote:
Also keep in mind that what I said applies NOT only to the mmap design but also to a nice clean read-based design with low copying, which probably IS similar to various current implementations.
I don't see how it wouldn't affect libavformat; it also wouldn't affect the simple approach:

    x = malloc(size + header_size);
    memcpy(x, header, header_size);       /* put header in x */
    fread(x + header_size, 1, size, file);
This design is highly inefficient for high-bitrate streams. The efficient implementation is:

- fill a large buffer directly with read()
- while (at least one whole frame is in the buffer) return a pointer to the frame in-place in the buffer
- keep a lock on all sections of the buffer which the caller has not yet "released"; allow the released initial segment of the buffer to be reused, copying any partial frame at the end of the buffer back to the beginning
- if there is insufficient released space at the beginning of the buffer, allocate a new buffer

Just because libavformat does not use the efficient implementation does not mean we should preclude it. Maybe in the future we'll even want to do that in lavf, but if not, other implementations still might. It should give at least a 5-10% performance boost on playback of high-bitrate video.
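To make that concrete, here is a rough sketch in C of the sort of thing I mean -- the names and struct are made up for illustration (not lavf or libnut API), and it assumes the caller learns each frame's size from its header before asking for the frame:

#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* b->data must be preallocated with b->size > 0 before use. */
typedef struct {
    uint8_t *data;
    size_t size;    /* allocated size of the buffer */
    size_t start;   /* first byte the caller has not yet "released" */
    size_t parsed;  /* first byte not yet handed out as a frame */
    size_t end;     /* one past the last byte filled by read() */
    int fd;
} demux_buf;

/* Return a pointer to the next frame of 'len' bytes in-place in the buffer,
 * refilling with read() as needed. Returns NULL on EOF or error. */
static uint8_t *next_frame(demux_buf *b, size_t len)
{
    while (b->end - b->parsed < len) {
        if (b->end == b->size) {
            if (b->start > 0) {
                /* Reuse the released initial segment: move everything not yet
                 * released (including any partial frame at the end of the
                 * buffer) back to the beginning. */
                memmove(b->data, b->data + b->start, b->end - b->start);
                b->parsed -= b->start;
                b->end -= b->start;
                b->start = 0;
            } else {
                /* Insufficient released space: allocate a bigger buffer. */
                uint8_t *p = realloc(b->data, b->size * 2);
                if (!p)
                    return NULL;
                b->data = p;
                b->size *= 2;
            }
        }
        ssize_t n = read(b->fd, b->data + b->end, b->size - b->end);
        if (n <= 0)
            return NULL;
        b->end += n;
    }
    uint8_t *frame = b->data + b->parsed;
    b->parsed += len;
    return frame;
}

/* The caller releases frames (in order) once it is done with them; only then
 * may that part of the buffer be reused. */
static void release_frame(demux_buf *b, size_t len)
{
    b->start += len;
}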
Seriously, I don't think that having 2 compressed frames instead of one in memory is a significant burden; the uncompressed ones are so much bigger ...
Think of a possible hardware implementation where the uncompressed frame is only ever seen by the decoder chip. Then the demuxer buffers might be rather small. Obviously this would put limits on the _codec_ parameters, which the device manufacturer would publish as the supported profile, but it would be rather bad to also have limits on the container parameters. Remember, we limited framecodes to a single byte for a very good reason -- the possibility of small hardware implementations. If a device supports codec X, it should support ANY valid NUT file containing media encoded with codec X. The NUT parameters should not preclude support for the file.
But I am not that much interested in huge frames; a simple "header elision with frames > 1024 shall be considered an error" would probably be ok.
That would make me happy. I would even be happy allowing sizes up to 4096 or so if you think it would help.
It's NOT the business of the person making the file to put restrictions on the process reading the file. This has been one of my key issues for NUT since the beginning. Just because idiot_who_makes_file thinks the only purpose of the file is to watch it with player_x doesn't mean they're exempt from the features of NUT that make the file nicely usable for editing, optimal seeking, etc.
True, if the world were just a bunch of warez kids leeching pr0n off BitTorrent.
But let's take a big internet provider who wants to offer TV to their customers and for that purpose gives everyone a HW decoder (the PCs being too slow, maybe ...). Why should that ISP not be able to make the ideal decision during encoding for the decoders which he knows everyone will be using?
But that's the thing -- he does not know what everyone will be using. He knows what he WANTS everyone to be using, but then hackers like us go and save the streams and want to use them directly for reencoding, archiving, warezing, etc.

Look how nasty the situation now is with DVDs and digital TV. The content producers just assume everyone is using a crappy TV to watch their content and not doing anything else with it. They fill the video with mixed telecine, incorrect interlacing flags, etc. And in the end, everyone loses. It's not just the people who want to rip/save the streams to clean files to watch on a computer, but also the people with nice high-resolution TVs that need to present a progressive picture. Due to the crap the content producers deliver, folks always end up with flicker and aliasing.

Obviously NUT can't solve these video processing issues. But we can stick to the philosophy that just because the content producer is too short-sighted to envision everything the content recipient might want to do with the content, that's not an excuse for delivering something broken with artificial limitations.
It should be in nut.txt as well ...
If it's not, feel free to add it, but I think it belongs in the information about codecs, since the definition of a frame is pretty codec-specific. Expressing the abstract idea in the semantic requirements section of nut.txt would be nice though!
I will as soon as the RAW frame issue is decided, unless I forget, in which case please flame me!
PCM you mean? For video, RAW frames are quite obviously single pictures. :) Yes, that means even if the resolution is 1x1 pixel. :) :) :) For audio, I would be happy with a clean SHOULD like I suggested before, but even more happy if you have a clean technical requirement to ensure that frames not be too big without resorting to physical units (limit on number of samples would be okay, but I don't particularly like limit on number of seconds since it's not scale-invariant).
Which codec would that be? (we are talking about 5-10kbit/sec).
Vorbis.
5kbit/sec ?
I've gotten good results at 24kbit/sec with 32000 Hz sampling, mono. If we drop the sampling rate, I think 8-10kbit/sec is very realistic. I don't know about 5... would need to run some experiments. If the bitstream were optimized heavily to kill the overhead, it might work very well, while technically no longer being "vorbis".
AMR-NB has 6-32 byte frames (see libavcodec/libamr.c); QCELP has 4-35 byte frames (see soc/qcelp/qcelp_parser.c).
And is there any interest in these codecs aside from watching files generated by camera phones and transcoding them to something sane?
There's no sense in transcoding the audio. It will just reduce the quality and very significantly increase the bitrate, because the codecs you call sane are VERY bad at such low bitrates.
Well, until there's a free decoder there's a lot of interest in transcoding them, but maybe the free decoder will be done and ready for merge soon? :)
For PCM, there's no seeking issue because one always knows how to seek to an arbitrary sample in PCM 'frames'. In this case I would just recommend having a SHOULD clause that PCM frames SHOULD be of the same or shorter duration than the typical frame durations for other streams in the file (as a way of keeping interleaving clean and aiding players that don't want to do sample-resolution seeking inside PCM frames).
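(To spell out why that's trivial: every PCM sample has a fixed, known size, so locating a sample inside a frame is plain arithmetic. A hypothetical helper, not any real API:)

#include <stddef.h>
#include <stdint.h>

/* Byte offset of 'target_sample' within a PCM frame whose first sample is
 * 'frame_first_sample', for interleaved PCM with the given parameters. */
static size_t pcm_offset_in_frame(uint64_t frame_first_sample,
                                  uint64_t target_sample,
                                  unsigned channels,
                                  unsigned bytes_per_sample)
{
    return (size_t)((target_sample - frame_first_sample)
                    * channels * bytes_per_sample);
}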
Well ... I do have an AVI with all audio in a single chunk (something like 10 or 20 MB); do you want that .... in NUT? MPlayer's AVI demuxer can even seek in that AVI :)

So honestly I don't think a SHOULD requirement alone would be a good idea.
Of course I don't want that in NUT. I suppose we should have a nice strong technical requirement to prevent it. How about just a limit of 4096 samples per frame? If one uses the maximum allowed frame size, then the overhead would be trivial (1 byte per 4096 bytes in the worst case, i.e. about 0.024%, and much less with 16bit/stereo/etc.).
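To spell out the arithmetic (taking the worst case as 8-bit mono, one byte per sample, and counting one byte of per-frame overhead; the second figure assumes 16-bit stereo, i.e. 2 channels times 2 bytes per sample):

\[
\frac{1}{4096} \approx 0.024\%,
\qquad
\frac{1}{4096 \cdot 2 \cdot 2} = \frac{1}{16384} \approx 0.006\%
\]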
If you really want this header-elision, I'm willing to consider it for SMALL frames. But please don't flame about wanting to support it for large frames where it's absolutely useless and has lots of practical problems for efficient implementations!
Fine, I'll add it for small frames only. Would a 1024 byte limit be ok?
Yeah. As I said above, even a larger limit, maybe up to 4096, would be fine with me. The limit should be on the _total_ frame size, BTW, not the size with the elided header, so that the "extra buffer" needed for reassembly has a fixed max size.

BTW, note that this also provides a cheap way of compressing silence with PCM audio: have a framecode for "all zero frame" and then use it to encode the whole frame. :) The same could potentially work with other codecs too.
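For illustration (made-up names, not an actual NUT framecode table), a demuxer seeing such a framecode could simply synthesize the zero-filled frame instead of reading its data from the file:

#include <stdint.h>
#include <string.h>

/* Hypothetical sketch of the "all zero frame" idea: if the framecode table
 * marks a code as meaning "this frame is entirely zero bytes", the demuxer
 * hands the decoder a zero-filled buffer without storing any frame data. */
enum { FRAME_NORMAL, FRAME_ALL_ZERO };

static void fill_frame(uint8_t *dst, size_t frame_size,
                       int frame_type, const uint8_t *stored_data)
{
    if (frame_type == FRAME_ALL_ZERO)
        memset(dst, 0, frame_size);           /* silence for PCM */
    else
        memcpy(dst, stored_data, frame_size); /* data as stored in the file */
}

Rich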