[MPlayer-dev-eng] more NUT questions

Michael Niedermayer michaelni at gmx.at
Thu Apr 15 02:45:26 CEST 2004


Hi

On Wednesday 14 April 2004 21:31, Ivan Kalvachev wrote:
> > Hi
> >
> > On Tuesday 13 April 2004 01:01, Ivan Kalvachev wrote:
> > > Hi,
> > > is possible to make more than maximum allowed 256 possible codes.
> >
> > no
> >
> > > I also
> > > wonder how the program knows them before starting encoding.
> >
> > i have the feeling that u misunderstand the way the frame sizes are
> > stored,
>
> yes, I have missed a few bits;)
>
> > they can either be identical to the 2 most recently used, or the least
> > significant bits are encoded in the frame_code and the most significant
> > bits in a vlc afterwards
>
> THAT'S THE POINT. We have 256 frame_codes (hmm less than 256),
> so we can code 256 lsb combinations.
> But frame_code is used also for flags(128 possible)
no, there are just 24 legal combinations of the flags
packet_type(P)  msb_size(D)  pts(TTT)  keyframe(K)
0               X            XXX       X            20 cases
1               X            101       X             4 cases

> stream_id (unlimited?) 
the stream id _can_ be stored in the frame_code, or it can be stored as vlc
afterwards, yes it's unlimited

> and donno what else.
> I cannot understand how we syntheses these codes before start encoding
> (they cannot be written at the end just like the index) and how we are
> sure that they will be enough.
basic logic, if we can encode every possible frame, then they are enough, and
we just need to use 4 frame_codes of the 255 to ensure this, the remaining
251 can be used to store the most likely packets. how does the muxer know
which are likely? it's the problem of the muxer, but it's really not difficult,
just think about it, many combinations simply don't exist: non-keyframes in
audio streams? full timestamps will be rare, keyframes in video streams are
rare too, ...
and if u never store the frame size in the frame_code, u could store all 24
legal flag combinations for 10 stream_ids, this very simple choice still
results in a pretty small overhead, if u have more stream_ids u need to use
one of the 10 to indicate that the stream_id is stored as vlc
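
A minimal sketch of such a static table, in C. It only shows that 24 flag
combinations times 10 stream_ids gives 240 entries, which fits in the 255
usable frame_codes. The struct layout, the names and the use of an opaque
pts_choice index (instead of the real TTT bit patterns, which are not listed
here) are assumptions for illustration, not the layout from the spec.

```c
#include <assert.h>
#include <stdint.h>

enum {
    N_STREAMS         = 10,  /* stream_ids covered by the table */
    PTS_CHOICES_TYPE0 = 5,   /* implied by "20 cases" = 2 * 5 * 2 */
    PTS_CHOICES_TYPE1 = 1,   /* only TTT=101 is legal for packet_type 1 */
    MAX_CODES         = 255  /* 256 byte values minus the reserved one */
};

/* Hypothetical table entry; field names are illustrative only. */
typedef struct {
    uint8_t packet_type; /* P */
    uint8_t msb_size;    /* D */
    uint8_t pts_choice;  /* index into the legal TTT values for this P */
    uint8_t keyframe;    /* K */
    uint8_t stream_id;
} FrameCode;

/* Fill the table with every legal flag combination for every stream_id.
 * Returns the number of entries written. */
static int build_static_table(FrameCode *table, int capacity)
{
    int n = 0;
    for (int s = 0; s < N_STREAMS; s++)
        for (int p = 0; p <= 1; p++) {
            int choices = p ? PTS_CHOICES_TYPE1 : PTS_CHOICES_TYPE0;
            for (int d = 0; d <= 1; d++)
                for (int t = 0; t < choices; t++)
                    for (int k = 0; k <= 1; k++) {
                        assert(n < capacity);
                        table[n].packet_type = (uint8_t)p;
                        table[n].msb_size    = (uint8_t)d;
                        table[n].pts_choice  = (uint8_t)t;
                        table[n].keyframe    = (uint8_t)k;
                        table[n].stream_id   = (uint8_t)s;
                        n++;
                    }
        }
    return n;
}
```

With this table the frame size is never taken from the frame_code, so it
would always follow as vlc, as described above.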

>
> Well I guess that I can code data_size with only 2 lsb (0,1) and send
see above, u could store the whole data size as vlc afterwards if u like

> the rest as msb (mul=2). I will send full timestamps and stream_id. But
> in this case I will lose all advantages as frame_code is turned into
> flags.
even with this intentionally badly chosen frame_code table ur statement is
false, because if u do store all flag combinations, this also means u can
always use the timestamp prediction or the lsb timestamp, so u would very
rarely need to store the full timestamp, i guess in practice this table would
result in about 4-5 bytes per frame: 1 frame_code, 1 stream_id, 2 frame_size

>
> Maybe i should store all packets in the memory before start writing nut?
> Or write nut once and then "shrink" it?
well u could certainly do this and it might result in a 0.001% smaller file,
but it's not really intended to be done that way

>
> Sorry, I prefer simpler format than smaller format.
it IS simple, maybe the spec doesn't explain it well, and if u have any
suggestions to improve the format besides just saying it's bad, we will
certainly consider them
u also should keep in mind that the frame_code stuff just adds optional
complexity to the muxer, it's extremely simple for the demuxer and even the
muxer can choose a simple static frame_code table, i guess we should add an
example frame_code table to the spec so a muxer author could copy&paste it
instead of thinking about how to generate an optimal one depending upon the
number of streams and other information which may be available to the muxer

>
> > > - How timestamp is coded (mul,lsb look too limited/small),
> >
> > use larger font :)
>
> It is unclear flags' TTT explanation, they need just
> rewording or few newlines;) Do it like next_last explanation.
i'll try to clarify it

>
> I wonder how this funky masking formula come into your mind?
it just chooses the timestamp closest to the last timestamp which
matches the given lsb, if u have a less funky one just tell me, i'll replace
it
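
One way to write "closest timestamp to the last one that matches the given
lsb" in C is sketched below. This is not necessarily the exact formula from
the spec; the function name and the lsb_bits parameter are assumptions. The
result always lies in a window of 2^lsb_bits values around last_pts.

```c
#include <assert.h>
#include <stdint.h>

/* Reconstruct a full timestamp from its low lsb_bits bits, picking the
 * candidate inside the window [last_pts - mask/2, last_pts - mask/2 + mask]. */
static int64_t lsb_to_full(int64_t last_pts, int64_t lsb, int lsb_bits)
{
    assert(lsb_bits > 0 && lsb_bits < 62);
    int64_t mask  = ((int64_t)1 << lsb_bits) - 1;
    int64_t delta = last_pts - mask / 2;        /* lower edge of the window */
    /* (lsb - delta) & mask is the distance from the lower edge, modulo 2^bits */
    return ((lsb - delta) & mask) + delta;
}
```

For example, with last_pts=1000 and 8 lsb bits, a frame at 1003 is stored as
1003 & 255 = 235 and the demuxer gets 1003 back; a frame at 990 (stored as
222) also comes back as 990.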

> Won't it have same effect if we use median/average_frame_time?
huh? what?

>
> > > 2. won't it be better checksum to be moved before reserved_bytes?
> >
> > no
> >
> > > I like keeping all known fields in one piece
> > > I guess it is better they to be right after packet_header (or part of
> > > it, if this doesn't conflict readability).
> >
> > NO, this may require seeking back to update them, and this may be a
> > problem for some muxers, the packet size is already there, but its much
> > easier to predict the packet size instead of a checksum
>
> I would not say that if I had so many variable size parameters.
> I thought that whole packet is done in memory and written at once.
> The crc writing won't be issue as it is fixed size.
> Don't forget that the size/pointers/ are vlb, so we may not be able to fix
> them even with seeking (at least I don't know system function for
> inserting bytes):
> Hmm, now I see why you define stuffing vlb code (0x80).
yes, exactly thats why 0x80 is there :)
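
A rough sketch of how the 0x80 stuffing can be used, assuming the usual
layout of 7 data bits per byte with the top bit meaning "more bytes follow",
most significant group first. put_v_padded and get_v are hypothetical names.
A leading 0x80 byte carries zero data bits, so a field can be written with a
fixed length and patched later without inserting bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write value in exactly len bytes; unused leading groups become 0x80. */
static void put_v_padded(uint8_t *out, size_t len, uint64_t value)
{
    assert(len >= 1 && len <= 10);
    assert(len >= 10 || (value >> (7 * len)) == 0);   /* value must fit */
    for (size_t i = 0; i < len; i++) {
        unsigned shift = 7u * (unsigned)(len - 1 - i);
        uint8_t bits = (uint8_t)((value >> shift) & 0x7F);
        out[i] = (i + 1 < len) ? (uint8_t)(bits | 0x80) : bits;
    }
}

/* Read one value; returns the number of bytes consumed, 0 if truncated. */
static size_t get_v(const uint8_t *in, size_t avail, uint64_t *value)
{
    uint64_t v = 0;
    for (size_t i = 0; i < avail; i++) {
        v = (v << 7) | (uint64_t)(in[i] & 0x7F);
        if (!(in[i] & 0x80)) {
            *value = v;
            return i + 1;
        }
    }
    return 0;
}
```

Writing 5 into a 3 byte field gives 0x80 0x80 0x05, and the reader still
decodes 5.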

>
> But by "guessing" I think that you mean to examine all parameters
> before writing. Well you can easily compute and crc with it:P
no, its not easy, its quite complicated, for size we would just need a
size= get_length(a) + get_length(b) + ...;
i leave the code for generating the checksum to u ;)
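
For illustration, get_length() could look like the sketch below, assuming a
vlc with 7 data bits per byte. header_length() and its three fields are
hypothetical, only there to show that the size is a plain sum which needs
the field values but not the written bytes, unlike a checksum.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes a value needs as vlc with 7 data bits per byte. */
static int get_length(uint64_t value)
{
    int n = 1;
    while ((value >>= 7) != 0)
        n++;
    return n;
}

/* Hypothetical header made of three vlc fields. */
static int header_length(uint64_t forward_ptr, uint64_t backward_ptr,
                         uint64_t stream_id)
{
    int size = get_length(forward_ptr) + get_length(backward_ptr)
             + get_length(stream_id);
    assert(size >= 3);   /* every field takes at least one byte */
    return size;
}
```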

>
> > > 3. In stream_header fixed_fps is useless,
> >
> > its useful for transcoding to container formats which only support
> > fixed fps, and its useful for error detection
>
> error detection? HOW do you detect error with this?
if u demux a stream and find a different timestamp delta it's an error
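
A small sketch of that check, assuming the demuxer already has the
timestamps of one stream in the order the fixed delta applies to, and that
the expected delta is known in time_base units. The function name is made
up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first timestamp whose distance to its
 * predecessor differs from expected_delta, or -1 if all of them match. */
static long first_timestamp_violation(const int64_t *pts, size_t n,
                                      int64_t expected_delta)
{
    assert(pts != NULL || n == 0);
    for (size_t i = 1; i < n; i++)
        if (pts[i] - pts[i - 1] != expected_delta)
            return (long)i;
    return -1;
}
```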

> Transcoding _TO_ other formats? Why somebody would do that?
why not? are we microsoft? so we should make it as difficult as possible ...

> Moreover, it is INTEGER! How would you code 23.976? This alone is enough
> to crap everything.
its a flag, 0 -> variable fps, 1-> fixed fps

>
> > > maybe average (or median) would
> > > have more sane (it may even be in timestamp units) e.g.
> > > time_base=1001/24000, average_frame_time=1; => fps=23,976
>
> What's wrong with this?
u didn't explain what it would be useful for

>
> > > 4. That worries me much are the recording issues. The drift of audio
> > > sampling frequency. Audio packets have timestamps so possible
> > > workaround maybe audio to use system time for them, and not the time
> > > calculated from the frequencies. Rich said that audio should be
> > > resampled to correct sampling frequency, but neither FFmpeg nor MPlayer
> > > have such precise resampler (check af_resample if you don't believe
> > > me).
>
> I think that Rich should comment that.
> The main problem I see is if we do that -^, we will need to rework
> mplayer with audio pts support.
sorry i dont understand, please elaborate

>
> > > 5. Is there some protection against start code appearing in the data
> > > stream?
> >
> > no, u will find one approximately every 4 exa byte in a random data
> > stream
>
> Well don't forget that we don't work with random streams. They are quite
> ordered. I mean that the probability is a little bit higher (only a bit),
> because the entropy levels are similar (in other words, startcodes don't
> have repeating symbols and zeroes, and good encoders should not produce
> repeating symbols and lots of zeroes).
our startcodes ARE random, they don't contain repeating zeros

[...]
>
> > but that problem isnt specific to NUT, just try it with another format
> > ...
> >
> > > 6. After latest change, it is not clear when main, audio/video stream,
> > > and frames are positioned in the stream. Only header order is given. ;)
> >
> > yes, i am not exactly sure if some strict ordering should be enforced, it
> > could be very difficult for some muxers to support, just think of realtime
> > capture, we dont want to waste time with buffering there
>
> well I mean something like
>   main;
>
>   video_stream_header
>   audio_stream_header
>
>   video_packet_1 (type 2)
>   audio_packet_1 (type 2)
>   video_packet_2 (type 0/1/2)
>   ...
>   audio_packet_n (type 2)
>   video_packet_n (type 0/1/2)
>
> etc...
> You have given all the pieces. But we have to see the big picture at the end.
> These specifications start from simplest structures to the more
> complex one (btw u(x) and f(x) definition should be moved on top).
agree, moved them

>
> > > 8. Hmm, something even more fishy. We have frame_type2_startcode if we
> > > have frame_type=2. But frame_type=2 is indicated by
> > > (flags[frame_code]&1)==1. Yeh, frame_code is written after the
> > > startcode!
> >
> > i dont see the problem here, if theres a frame_type2 start code its a
> > type 2 frame, if theres no such startcode, its not a type 2 frame
>
> It's kind of a recursive definition. It's confusing.
its not recursive

> I guess it is legacy of the time when there have been type 4.
wtf?!

> The bad thing is that we MUST check for startcode.
we must read the next byte after the last frame to identify the frame type,
that's all, there's no startcode search, u misunderstand the format if u
believe that there is
all startcodes start with 'N', and 'N' is disallowed as frame_code, so if the
next byte is 'N' we know it's a type 2 frame or a repeated main or stream
header, if it's not 'N' we know it's a type 0 or 1 frame, and looking that byte
up in the frame_code table will tell us if it's type 0 or 1
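
In C the decision described above is roughly the following. The enum and the
packet_type[] array (one entry per frame_code, holding the P flag) are
illustrative names, not taken from the spec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum NextUnit {
    UNIT_STARTCODE,  /* type 2 frame, or a repeated main/stream header */
    UNIT_TYPE0,
    UNIT_TYPE1
};

/* Classify what follows the previous frame by looking at one byte only. */
static enum NextUnit classify_next_byte(uint8_t byte,
                                        const uint8_t packet_type[256])
{
    assert(packet_type != NULL);
    if (byte == 'N')                 /* 'N' is never a valid frame_code */
        return UNIT_STARTCODE;
    return packet_type[byte] ? UNIT_TYPE1 : UNIT_TYPE0;
}
```

No searching is involved: one byte is read and one table entry is looked up.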

>
>
> Few more questions.
>
> 10. What's the point of using _BOTH_ MSB and LSB ordering? Just to make
> sure that demuxer have conversion for both?
> Unify them.
i dont understand what u mean, please elaborate

>
> 11. There is no way to check is frame_type 0 broken. If there is small
> broken part in frame_type_0 beginning we will get wrong frame_code.
> We will read some values. We will seek to some position.
> And so on, until we jump out of stream, calculate negative or
> forbidden value.
yes, that's always the case, u parse format foobar until u see an illegal value

>
> 12. Yet another question. The forward/backward pointers are relative.
> It is said that forward pointer is the size of current packet and that
> backward pointer is the size of the previous packet.
> The problem is that if next packet is frame_type 0 it won't have any
> pointers.
> This breaks backward seeking.
well, try ffmpeg/ffplay/mplayer, they can seek, and ffmpeg doesnt write an 
index

> e.g.
>
> We don't have index, and we are about 5/6 position in 4GB file (will be
> very slow to start from beginning)
> We are in main_header. Seek backward backward_ptr bytes. We read
> frame_type_0.
this is not allowed, the backward pointer MUST point to the last packet
header, type 0 frames don't have a packet header

> Frame_type_0 don't have startcode, backward_pointer and forward pointer.
> We can only calculate the data_size, but we already know it.
>
> Solution 1.
> Seek backward until startcode is found. IMHO startcodes should not be
> used in perfectly valid (not damaged) streams.
no, it's not possible to avoid a startcode search if there's no index, think of
a slow cdrom or network: following the pointer chain in either direction in a
large file without an index takes too long, it's O(n) vs. O(log n) seeking. we
could make indexes mandatory but that's not possible for realtime streams, so
it's not a solution either
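
The building block of such a search is sketched below: jump to a guessed
offset, scan forward for the next startcode, read the timestamp there, and
bisect on that. Only the scan is shown. The 64 bit big-endian comparison and
the function names are assumptions, and the code value used for testing
would be a made-up one, not a real NUT startcode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read 8 bytes as a big-endian 64 bit value. */
static uint64_t read_be64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Offset of the first occurrence of code at or after from, or -1. */
static long find_startcode(const uint8_t *buf, size_t len, size_t from,
                           uint64_t code)
{
    assert(buf != NULL || len == 0);
    for (size_t i = from; i + 8 <= len; i++)
        if (read_be64(buf + i) == code)
            return (long)i;
    return -1;
}
```

Each probe costs one short forward scan, so a seek needs about log2(n)
probes instead of walking every packet header in the file.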

>
> Solution 2.
> Write forward/backward pointers so they point to the next packet that
> have such pointers. This will require to buffer of all frames (type 0/1)
wrong, type 1 has a packet header and a backward pointer, only type 0 must be
buffered, and again, u complain but u don't suggest an alternative, it's easy
to say buffering sucks, but if u don't ever buffer anything u cannot store any
forward pointers, so u would always have to search for the next startcode,
but that's just what u didn't like above

> and write them at once. We cannot seek back to write the size, as size
> may change from one byte to two, or from 2 to 3. If we stuffing, we will
> spend more bytes for something useless when we make hell-a-lot-of
> tricks to save few bytes per packet.
hmm, it's extremely simple, there is a limit of 16k bytes max between type 1/2
frames, so u never need to buffer more than 16k, if the next frame is bigger
it's written as type 1 or 2 and the buffer is flushed, a realtime muxer which
needs very low delay could choose to not write type 0 packets at all, the
length of the forward pointer is also guaranteed to fit in 2 bytes unless the
packet itself is larger than 16k
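
A sketch of the muxer side of this rule. It assumes that "16k" means 16384
bytes, that the limit counts everything since (and including) the last frame
with a packet header, and that a frame which would break the limit is written
as type 1. The exact boundary in the spec may differ; MuxState and
choose_frame_type are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_RUN_BYTES = 16384 };  /* assumed meaning of "16k" */

typedef struct {
    size_t buffered;  /* bytes since, and including, the last packet header */
} MuxState;

/* Returns 0 if the frame may be written as type 0 (it stays in the buffer),
 * or 1 if it must get a packet header (the buffer is flushed first). */
static int choose_frame_type(MuxState *st, size_t frame_bytes)
{
    assert(st != NULL);
    if (st->buffered + frame_bytes > MAX_RUN_BYTES) {
        st->buffered = frame_bytes;   /* new run starts with this frame */
        return 1;
    }
    st->buffered += frame_bytes;
    return 0;
}
```

A low delay muxer can simply return 1 every time and never buffer.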

> Other possibility is main_header pointer to point to other main_header
> pointers. IMHO it is still ugly.
>
> Solution 3.
> Remember position of all main_headers. This is not solution at all. It
> is just AVI file index rebuild. We could drop all backward pointers then.
>
> Solution 4.
> GOP. All frame headers at one place. Kinda of #2.
> Even better, it could be some kind of distributed index.
> small indexes all over connected with forward/backward pointers.
> Hmm sound familiar, maybe I have seen it before?
>
>
>
> Sorry, but I don't like nut. It tries to be smaller at all cost. And it
> pays too much. It trades simplicity. It trades stability. And then tries to
> compensate with huge startcodes.
well, IMHO u shouldnt say such things before u understand the format at all
simplicity is something subjective so its difficult to argue about, stability 
is easier, i suggest u damage a nut file and try to play it and seek in it, 
compare it against other formats, and then judge its stability, i attached 
the program i used for such tests

[...]
-- 
Michael
level[i]= get_vlc(); i+=get_vlc();		(violates patent EP0266049)
median(mv[y-1][x], mv[y][x-1], mv[y+1][x+1]);	(violates patent #5,905,535)
buf[i]= qp - buf[i-1];				(violates patent #?)
for more examples, see http://mplayerhq.hu/~michael/patent.html
stop it, see http://petition.eurolinux.org & http://petition.ffii.org/eubsa/en
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trasher.c
Type: text/x-csrc
Size: 644 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/attachments/20040415/093a76a9/attachment.c>