[MPlayer-dev-eng] more NUT questions

Ivan Kalvachev ivan at cacad.com
Wed Apr 14 21:31:26 CEST 2004


> Hi
>
> On Tuesday 13 April 2004 01:01, Ivan Kalvachev wrote:
> > Hi,
> > is possible to make more than maximum allowed 256 possible codes.
> no
>

> > I also
> > wonder how the program knows them before starting encoding.
> i have the feeling that u missunderstand the way the frame sizes are stored,
Yes, I have missed a few bits ;)

> they can either be identical to the 2 last recent used, or the least
> significant bits are encoded in the frame_code and the most significant bits
> in a vlc afterwards
THAT'S THE POINT. We have 256 frame_codes (hmm, fewer than 256),
so we can code at most 256 lsb combinations.
But frame_code is also used for flags (128 possible), stream_id (unlimited?)
and I don't know what else.
I cannot understand how we synthesize these codes before we start encoding
(they cannot be written at the end like the index) and how we can be
sure that they will be enough.

Well, I guess I can code data_size with only 2 lsb values (0,1) and send
the rest as msb (mul=2). I will send full timestamps and stream_id. But
in this case I lose all the advantages, as frame_code is turned into
flags.
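
To make the lsb/msb split concrete, here is a minimal sketch. It assumes
data_size = msb * mul + lsb, with lsb selected by the frame_code and msb
sent as a vlc afterwards; that formula is my reading of the draft, and the
function names are invented for illustration.

```python
def split_size(data_size, mul):
    """Split data_size into (msb, lsb) so that data_size == msb * mul + lsb."""
    return divmod(data_size, mul)


def join_size(msb, lsb, mul):
    """Rebuild data_size from the two parts (assumed formula, see above)."""
    return msb * mul + lsb


# With mul=2 the frame_code carries only one bit of the size (lsb is 0 or 1);
# everything else has to travel in the msb field.
for size in (0, 1, 1500, 65537):
    msb, lsb = split_size(size, 2)
    assert lsb in (0, 1)
    assert join_size(msb, lsb, 2) == size
```

With such a small mul, nearly the whole size ends up in msb, which is the
loss of advantage described above.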

Maybe I should store all packets in memory before I start writing nut?
Or write nut once and then "shrink" it?

Sorry, I prefer a simpler format to a smaller format.


> >
> > - How timestamp is coded (mul,lsb look too limited/small),
> use larger font :)
The explanation of the TTT flags is unclear; it just needs
rewording or a few newlines ;) Do it like the next_last explanation.

I wonder how this funky masking formula came into your mind?
Wouldn't it have the same effect if we used median/average_frame_time?

> > 2. won't it be better checksum to be moved before reserved_bytes?
> no
>
> > I like keeping all known fields in one piece
> > I guess it is better they to be right after packet_header (or part of it,
> > if this doesn't conflict readability).
> NO, this may require seeking back to update them, and this may be a problem
> for some muxers, the packet size is already there, but its much easier to
> predict the packet size instead of a checksum
I would not say that if I had so many variable-size parameters.
I thought that the whole packet is built in memory and written at once.
Writing the crc won't be an issue, as it is fixed size.
Don't forget that the sizes/pointers are vlb, so we may not be able to fix
them even with seeking (at least I don't know of a system function for
inserting bytes).
Hmm, now I see why you define the stuffing vlb code (0x80).
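
A small sketch of why a vlb field cannot simply be patched in place, and
how 0x80 stuffing gets around it. It assumes the vlb layout is big-endian
7-bit groups with the high bit as a "more bytes follow" flag, so a leading
0x80 is a zero group that changes nothing; this is my understanding of the
draft, and the helper names are hypothetical.

```python
def encode_vlb(value, min_len=1):
    """Encode value as 7-bit groups, most significant group first.

    min_len pads the result with leading 0x80 stuffing bytes so that
    space can be reserved before the final value is known.
    """
    groups = [value & 0x7F]
    value >>= 7
    while value:
        groups.append(value & 0x7F)
        value >>= 7
    while len(groups) < min_len:
        groups.append(0)          # becomes a leading 0x80 stuffing byte
    groups.reverse()
    # Every byte except the last carries the continuation bit.
    return bytes([g | 0x80 for g in groups[:-1]] + [groups[-1]])


def decode_vlb(data):
    """Return (value, bytes_consumed) for the vlb at the start of data."""
    value = 0
    for i, b in enumerate(data):
        value = (value << 7) | (b & 0x7F)
        if not b & 0x80:
            return value, i + 1
    raise ValueError("truncated vlb")


# 127 fits in one byte, 128 needs two: the field grows, so seeking back
# to overwrite it would clobber the byte that follows.
assert len(encode_vlb(127)) == 1
assert len(encode_vlb(128)) == 2

# Reserving two bytes up front keeps the length fixed for both values.
assert encode_vlb(127, min_len=2) == b"\x80\x7f"
assert decode_vlb(encode_vlb(127, min_len=2)) == (127, 2)
```

The price of the reservation is the extra stuffing byte in every field
that turns out to be small.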

But by "guessing" I think you mean examining all parameters
before writing. Well, you can just as easily compute the crc along with it :P

> >
> > 3. In stream_header fixed_fps is useless,
> its usefull for transcoding to container formats which only support fixed fps,
> and its usefull for error detection
Error detection? HOW do you detect errors with this?
Transcoding _TO_ other formats? Why would somebody do that?
Moreover, it is an INTEGER! How would you code 23.976? This alone is enough
to break everything.

> > maybe average (or median) would
> > have more sane (it may even be in timestamp units) e.g.
> > time_base=1001/24000, average_frame_time=1; => fps=23,976
What's wrong with this?
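
The arithmetic behind that proposal, as a minimal sketch. average_frame_time
is the field suggested in this thread, not something in the current draft,
and it is assumed to be expressed in time_base units.

```python
from fractions import Fraction

time_base = Fraction(1001, 24000)  # seconds per timestamp unit
average_frame_time = 1             # proposed field, in time_base units

# Frame rate is the reciprocal of the frame duration in seconds.
fps = 1 / (time_base * average_frame_time)

print(fps)         # exact rational frame rate
print(float(fps))  # approximately 23.976

# An integer fixed_fps field can only store the rounded value.
print(round(fps))
```

The rational form keeps the NTSC film rate exact, while the integer field
would have to store 24.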


> >
> > 4. That worries me much are the recording issues. The drift of audio
> > sampling frequency. Audio packets have timestamps so possible
> > workaround maybe audio to use system time for them, and not the time
> > calculated from the frequencies. Rich said that audio should be
> > resampleed to correct sampling frequency, but nor FFMpeg nor MPlayer have
> > such precise resampler (check af_resample if you don't believe me).
I think Rich should comment on that.
The main problem I see is that if we do that -^, we will need to rework
mplayer to support audio pts.

> >
> >
> > 5. Is there some protection against start code appearing in the data
> > stream?
> no, u will find one approximately every 4 exa byte in a random data stream
Well, don't forget that we don't work with random streams. They are quite
ordered. I mean that the probability is a little bit higher (only a bit),
because the entropy levels are similar (in other words, startcodes don't
have repeating symbols and zeroes, and good encoders should not produce
repeating symbols and lots of zeroes).
Anyway, it doesn't matter whether it is one in 4 exabytes or one in
1 exabyte ;)
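
One way to arrive at a figure of that order, as a rough sketch. Both
constants are assumptions for illustration: 64-bit startcodes and four
distinct startcode values. Neither number is taken from this thread.

```python
STARTCODE_BITS = 64      # assumption: each startcode is 64 bits long
DISTINCT_STARTCODES = 4  # assumption: number of different startcodes

# In uniformly random data, a given 64-bit pattern starts at any byte
# offset with probability 2**-64, so the expected gap between accidental
# matches of any of the startcodes is 2**64 / DISTINCT_STARTCODES bytes.
expected_gap_bytes = 2**STARTCODE_BITS // DISTINCT_STARTCODES

print(expected_gap_bytes / 2**60)  # gap in units of 2**60 bytes: 4.0
```

Real streams are not uniformly random, which is the point made above, but
the order of magnitude stays far beyond any practical file size.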


> > Or in the worst case if we have nut in
> > nut, we may find wrong packet with right checksum.
> puting a container format into another like ogg in avi is a ugly hack, putting
> the same container format into itself is so idiotic that it hasnt been done
> yet (AFAIK at least)
Oh yes, it has been done, and by the same format you give as an example.
Ogg Vorbis audio in ogm is ogg in ogg.
Enjoy.


> but that problem isnt specific to NUT, just try it with another format ...
>
> >
> > 6. After latest change, it is not clear when main, audio/video stream, and
> > frames are positioned in the stream. Only header order is given. ;)
> yes, iam not exactly sure if some strict ordering should be enforced, it could
> be very difficult for some muxers to support, just think of realtime capture,
> we dont want to waste time with buffering there
Well, I mean something like:
  main;

  video_stream_header
  audio_stream_header

  video_packet_1 (type 2)
  audio_packet_1 (type 2)
  video_packet_2 (type 0/1/2)
  ...
  audio_packet_n (type 2)
  video_packet_n (type 0/1/2)

etc...
You have given all the pieces. But we have to see the big picture at the end.
These specifications start from the simplest structures and go on to the
more complex ones (btw the u(x) and f(x) definitions should be moved to
the top).

> >
> > 8. Hmm, something even more fishy. We have frame_type2_startcode if we
> > have frame_type=2. But frame_type=2 is indicated by
> > (flags[frame_code]&1)==1. Yeh, frame_code is written after the startcode!
> i dont see the problem here, if theres a frame_type2 start code its a type 2
> frame if theres no such startcode, its not a type 2 frame
It's kind of a recursive definition. It's confusing.
I guess it is a legacy of the time when there was a type 4.
The bad thing is that we MUST check for the startcode.


A few more questions.

10. What's the point of using _BOTH_ MSB and LSB ordering? Just to make
sure that the demuxer has conversions for both?
Unify them.

11. There is no way to check whether a frame_type 0 is broken. If there is
a small broken part at the beginning of a frame_type_0, we will get the
wrong frame_code.
We will read some values. We will seek to some position.
And so on, until we jump out of the stream or calculate a negative or
forbidden value.

12. Yet another question. The forward/backward pointers are relative.
It is said that the forward pointer is the size of the current packet and
that the backward pointer is the size of the previous packet.
The problem is that if the next packet is frame_type 0, it won't have any
pointers.
This breaks backward seeking.
E.g.:

We don't have an index, and we are at about the 5/6 position in a 4GB file
(it would be very slow to start from the beginning).
We are in a main_header. We seek backward by backward_ptr bytes. We read
a frame_type_0.
Frame_type_0 doesn't have a startcode, backward pointer or forward pointer.
We can only calculate the data_size, but we already know it.

Solution 1.
Seek backward until a startcode is found. IMHO startcodes should not have
to be used in perfectly valid (not damaged) streams.
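
A minimal sketch of what Solution 1 amounts to, on an in-memory buffer.
The startcode value is a placeholder, not a real NUT startcode, and the
function name is hypothetical; a real demuxer would scan the file in
chunks instead of holding it all in memory.

```python
STARTCODE = b"NUTSTART"  # placeholder 8-byte pattern, for illustration only


def find_startcode_backward(buf, pos, startcode=STARTCODE):
    """Return the offset of the last startcode starting at or before pos.

    Returns -1 when no startcode begins at or before pos.
    """
    # rfind searches buf[0:end]; extending end by the startcode length
    # lets a match that starts exactly at pos be found too.
    return buf.rfind(startcode, 0, pos + len(startcode))


# Two startcodes, at offsets 10 and 38, separated by payload bytes.
stream = (b"\x00" * 10 + STARTCODE + b"\x01" * 20
          + STARTCODE + b"\x02" * 5)

assert find_startcode_backward(stream, 50) == 38
assert find_startcode_backward(stream, 37) == 10
assert find_startcode_backward(stream, 5) == -1
```

The scan is linear in the distance to the previous startcode, so its cost
depends on how often startcodes are written.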

Solution 2.
Write the forward/backward pointers so that they point to the next packet
that has such pointers. This will require buffering all frames (type 0/1)
and writing them at once. We cannot seek back to write the size, as the
size may grow from one byte to two, or from 2 to 3. If we use stuffing, we
will spend more bytes on something useless while we do a hell of a lot of
tricks to save a few bytes per packet.
Another possibility is for the main_header pointers to point to other
main_header pointers. IMHO it is still ugly.

Solution 3.
Remember the positions of all main_headers. This is not a solution at all.
It is just an AVI file index rebuild. We could then drop all backward
pointers.

Solution 4.
GOP. All frame headers in one place. Kind of like #2.
Even better, it could be some kind of distributed index:
small indexes all over, connected with forward/backward pointers.
Hmm, sounds familiar, maybe I have seen it before?



Sorry, but I don't like nut. It tries to be smaller at all costs. And it
pays too much. It trades away simplicity. It trades away stability. And
then it tries to compensate with huge startcodes.



