[MPlayer-dev-eng] Lots of stuff for NUT
Oded Shimon
ods15 at ods15.dyndns.org
Thu Jan 5 19:14:08 CET 2006
On Thu, Jan 05, 2006 at 03:21:22PM +0100, Michael Niedermayer wrote:
> Hi
>
> On Thu, Jan 05, 2006 at 04:04:17PM +0200, Oded Shimon wrote:
> [...]
> > > furthermore we haven't thought about error resistance of these features. It's
> > > part of the goals, and is being ignored entirely.
> > > what if the pts or back ptrs are damaged? how do we ensure recovery of
> > > the demuxer?
> > >
> > > i really don't want to flame but every few months I am being confronted with a
> > > new set of 3-5 requirements which are absolute and undiscussable and which
> > > nut must conform to while everything else is not worth even mentioning
> > >
> > > first it was 1pass no buffering, strict dts, absolute minimal overhead,
> > > absolutely no redundancy to avoid inconsistent files, ...
> > > now it's optimal seeking, 1pass (dunno if the strict no buffer rule was dropped?)
> > > dts ordering and overhead don't matter at all anymore, ...
> >
> > Well, you miss all the IRC convos where most of this takes place. :)
> >
> > Looking back at it all now, I think you are right, we should go back to
> > single back_ptr, "precise optimal seeking" isn't all that useful, Rich was
> > mostly thinking for video editors when he brought it up, and now that I
> > think of it, they will most likely use intra only codecs anyway. As for
> > music videos, I found that "keyframe precision" is actually very high,
> > I've never had a nut seek going more than 3 seconds off target. (with
> > single back_ptr). Single back_ptr is still necessary IMO, because otherwise
> > using NUT for video editing is impossible altogether. BTW, given this, the
> > rule for syncpoint pts is using the max keyframe pts across all streams,
> > this is the only correct way for doing it (and without global timebase at
> > all, we should still use coded_pts). The index is the old syncpoint index I
> > proposed - those with a common back_ptr can be removed, making a tiny 10kb
> > index per hour. The nice pro to all this, I already have the demuxer fully
> > implemented. :) The biggest thing I still hate with back_ptr - subtitles.
> > (As in, any stream with gaps.) back_ptr will be huge and useless, and EOR
> > can't really be enforced, it is merely a hint.
>
> why? the "non relevant" streams are just ignored in the single back_ptr
Yes, of course EOR streams are ignored in back_ptr. However, we cannot
enforce EOR being set for every subtitle gap: it SHOULD be used, but it is
not a MUST.
BTW, with the "ancient" index method with max_index_distance, we were NOT
under 10kb an hour, unless max_index_distance was 512kb. When it was 32kb,
the index was 100kb or so. However, with a syncpoint index using single
back_ptr and collapsed syncpoints, the index is indeed 5-8kb.
Anyway, here are my new patches, we're back to single back_ptr, hurray.
step 1: mostly cosmetic
1) change date and goals slightly
2) fix 's' in info packets
3) rename sync_point to syncpoint, and frame_startcode to syncpoint_startcode
step 2:
1) MN rule
step 3:
1) remove global_timebase
2) define convert_ts
3) use coded_pts for syncpoint
step 4:
1) rearrange main header
2) add coded_stream_flags
3) add EOR
step 5:
1) remove max_index_distance
2) change goals
3) change index
4) Add pts rule for syncpoint - the max pts of all keyframes in all
streams. I gave this much thought before we were dealing with per
stream stuff, and it is the only perfectly correct pts: the one that
will only ever rewind you too far, never not far enough.
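To illustrate the syncpoint pts rule from step 5, here is a minimal Python sketch. It assumes the candidates are the most recent keyframe of each stream before the syncpoint, and that timebases are given as seconds per tick. The function name and the dict layout are hypothetical and are not part of the spec.

```python
from fractions import Fraction

def syncpoint_pts(last_keyframe_pts, timebases):
    """Pick the keyframe that is latest in presentation time.

    last_keyframe_pts: {stream_id: pts in that stream's timebase}
    timebases:         {stream_id: Fraction(seconds per tick)}
    Returns (stream_id, pts) so the value can be coded in that stream's timebase.
    """
    # Compare in a common timebase (seconds) rather than in raw ticks.
    best = max(last_keyframe_pts,
               key=lambda s: last_keyframe_pts[s] * timebases[s])
    return best, last_keyframe_pts[best]

# Video keyframe at 10.0 s, audio keyframe at 10.25 s.
timebases = {0: Fraction(1, 25), 1: Fraction(1, 44100)}
stream, pts = syncpoint_pts({0: 250, 1: 452025}, timebases)
print(stream, pts, float(pts * timebases[stream]))
```

With the maximum as the syncpoint pts, a seek to 10.1 s cannot pick this syncpoint (10.25 s is past the target), so the demuxer falls back to an earlier one. That errs on the side of rewinding too far, which is the property described above.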
I never did commit 1 & 3, but I think now I can commit everything here?
I guess rearranging the main header is probably the one thing we're still
undecided on.
- ods15
-------------- next part --------------
--- mpcf.txt 2006-01-04 07:53:43.000000000 +0200
+++ mpcf.1.txt 2006-01-04 16:20:56.000000000 +0200
@@ -1,5 +1,5 @@
========================================
-NUT Open Container Format DRAFT 20051118
+NUT Open Container Format DRAFT 20060105
========================================
@@ -23,7 +23,7 @@
~0.2% overhead, for normal bitrates
index is <10kb per hour (1 keyframe every 3sec)
a usual header for a file is about 100 bytes (audio + video headers together)
- a packet header is about ~1-8 bytes
+ a packet header is about ~1-5 bytes
Error resistant
seeking / playback without an index
@@ -243,6 +243,8 @@
name vb
if(type=="v")
value v
+ else if(type=="s")
+ value s
else
value vb
}
@@ -254,8 +256,8 @@
packet header
info_frame
-sync_point:
- frame_startcode f(64)
+syncpoint:
+ syncpoint_startcode f(64)
global_timestamp v
back_ptr v
@@ -277,8 +279,8 @@
info_packet
}
while(next_code != main_startcode){
- if(next_code == frame_startcode)
- sync_point
+ if(next_code == syncpoint_startcode)
+ syncpoint
frame
}
}
@@ -316,10 +318,10 @@
stream_starcode
0x11405BF2F9DBULL + (((uint64_t)('N'<<8) + 'S')<<48)
-frame_startcode
+syncpoint_startcode
0xE4ADEECA4569ULL + (((uint64_t)('N'<<8) + 'K')<<48)
- frame_startcodes SHOULD be placed immediately before a keyframe if the
+ syncpoint_startcodes SHOULD be placed immediately before a keyframe if the
previous frame of the same stream was a non-keyframe, unless such
non-keyframe - keyframe transitions are very frequent
@@ -333,8 +335,8 @@
NUT version. The current value is 2.
max_distance
- max distance of frame_startcodes, the distance may only be larger if
- there is only a single frame between the two frame_startcodes this can
+ max distance of syncpoints, the distance may only be larger if
+ there is no more than a single frame between the two syncpoints. This can
be used by the demuxer to detect damaged frame headers if the damage
results in too long of a chain
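As a quick check of the startcode expressions in this patch, the following Python sketch evaluates them. The helper name is hypothetical, and it assumes f(64) fields are stored most significant byte first.

```python
def startcode(low48, second_char):
    """Evaluate low48 + ((('N' << 8) + second_char) << 48) as in the spec text."""
    return low48 + (((ord('N') << 8) + ord(second_char)) << 48)

STREAM_STARTCODE = startcode(0x11405BF2F9DB, 'S')
SYNCPOINT_STARTCODE = startcode(0xE4ADEECA4569, 'K')

for name, code in (("stream", STREAM_STARTCODE),
                   ("syncpoint", SYNCPOINT_STARTCODE)):
    # Under the big-endian assumption, every startcode begins with byte 'N' (78).
    print(name, hex(code), code.to_bytes(8, "big")[0])
```

Both values start with the byte 'N', which is why the spec reserves frame_code 78 as invalid: a frame header can then never be mistaken for the start of a startcode.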
-------------- next part --------------
--- mpcf.1.txt 2006-01-04 16:20:56.000000000 +0200
+++ mpcf.2.txt 2006-01-04 17:26:05.000000000 +0200
@@ -481,9 +481,9 @@
stream, into which the current pts is inserted and the element with
the smallest value is removed, this is then the current dts
this buffer is initalized with decode_delay -1 elements
- all frames must be monotone, that means a frame
- which occurs later in the stream must have a larger or equal dts
- than an earlier frame
+
+ Pts of all frames in all streams MUST be bigger or equal to dts of all
+ previous frames in all streams, compared in common timebase.
width/height
MUST be set to the coded width/height
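The ordering rule added in this patch can be checked mechanically. Below is a minimal Python sketch of such a checker, assuming frames are given in file order as (stream_id, pts) pairs and timebases as seconds per tick. The function name and argument layout are hypothetical. The dts derivation follows the buffer description quoted in the hunk above.

```python
import heapq
from fractions import Fraction

def first_mn_violation(frames, timebases, decode_delay):
    """Return the index of the first frame whose pts is below the dts of
    some previous frame (in any stream), or None if the rule holds."""
    # One dts buffer per stream, initialized with decode_delay elements of -1.
    buffers = {s: [-1] * decode_delay[s] for s in timebases}
    max_dts = None  # highest dts seen so far, in seconds
    for index, (stream, pts) in enumerate(frames):
        tb = timebases[stream]
        if max_dts is not None and pts * tb < max_dts:
            return index
        # Insert the current pts and remove the smallest element: that is the dts.
        dts = heapq.heappushpop(buffers[stream], pts)
        dts_seconds = dts * tb
        max_dts = dts_seconds if max_dts is None else max(max_dts, dts_seconds)
    return None

timebases = {0: Fraction(1, 25), 1: Fraction(1, 1000)}
decode_delay = {0: 1, 1: 0}
frames = [(0, 0), (1, 0), (0, 2), (0, 1), (1, 40)]
print(first_mn_violation(frames, timebases, decode_delay))
```

Comparing in seconds with exact fractions sidesteps rounding. A real muxer would more likely compare through convert_ts in integer arithmetic.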
-------------- next part --------------
--- mpcf.2.txt 2006-01-04 17:26:05.000000000 +0200
+++ mpcf.3.txt 2006-01-04 17:24:32.000000000 +0200
@@ -135,8 +135,6 @@
stream_count v
max_distance v
max_index_distance v
- global_time_base_nom v
- global_time_base_denom v
for(i=0; i<256; ){
tmp_flag v
tmp_fields v
@@ -258,7 +256,9 @@
syncpoint:
syncpoint_startcode f(64)
- global_timestamp v
+ coded_pts v
+ stream = coded_pts % stream_count
+ global_key_pts = coded_pts/stream_count
back_ptr v
Complete definition:
@@ -306,6 +306,10 @@
one keyframe for each stream lies between the syncpoint to which
real_back_ptr points, and the current syncpoint.
+global_key_pts
+ After a syncpoint, last_pts of each stream is to be set to:
+ last_pts[i] = convert_ts(global_key_pts, timebase[stream], timebase[i])
+
file_id_string
"nut/multimedia container\0"
@@ -383,22 +387,15 @@
29.97 1001 30000
23.976 1001 24000
-global_time_base_nom / global_time_base_denom = global_time_base
- the length of a timer tick in seconds
- global_time_base_nom and global_time_base_denom MUST NOT be 0
- global_time_base_nom and global_time_base_denom MUST be relatively prime
- global_time_base_denom MUST be < 2^31
-
-global_timestamp
- timestamp in global_time_base units
- when a global_timestamp is encountered the last_pts of all
- streams is set to the following:
-
- ln = global_time_base_nom*time_base_denom
- sn = global_timestamp
- d1 = global_time_base_denom
- d2 = time_base_nom
- last_pts = (ln/d1*sn + ln%d1*sn/d1)/d2
+convert_ts
+ To switch from 2 different timebases, the following calculation is
+ defined:
+
+ ln = from_time_base_nom*to_time_base_denom
+ sn = from_timestamp
+ d1 = from_time_base_denom
+ d2 = to_time_base_nom
+ timestamp = (ln/d1*sn + ln%d1*sn/d1)/d2
Note: this calculation MUST be done with unsigned 64 bit integers, and
is equivalent to (ln*sn)/(d1*d2) but this would require a 96bit integer
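The convert_ts calculation and the coded_pts split from this patch can be sketched in Python as follows. Timebases are passed as (nom, denom) tuples, which is a layout chosen for this example only, and the 64-bit check is an illustrative stand-in for the unsigned 64-bit requirement, since Python integers do not overflow.

```python
U64_MAX = (1 << 64) - 1

def _u64(value):
    # Emulate the "unsigned 64 bit" requirement by rejecting larger intermediates.
    if not 0 <= value <= U64_MAX:
        raise OverflowError("intermediate does not fit in 64 bits")
    return value

def convert_ts(timestamp, from_tb, to_tb):
    """Convert a timestamp between timebases given as (nom, denom) tuples."""
    ln = _u64(from_tb[0] * to_tb[1])
    sn = timestamp
    d1 = from_tb[1]
    d2 = to_tb[0]
    return (_u64(ln // d1 * sn) + _u64(ln % d1 * sn) // d1) // d2

def last_pts_after_syncpoint(coded_pts, timebases):
    """Split coded_pts and rescale global_key_pts into every stream's timebase."""
    stream_count = len(timebases)
    stream = coded_pts % stream_count
    global_key_pts = coded_pts // stream_count
    return [convert_ts(global_key_pts, timebases[stream], tb) for tb in timebases]

# 30 ticks of 1001/30000 s (29.97 fps) expressed in milliseconds.
print(convert_ts(30, (1001, 30000), (1, 1000)))
print(last_pts_after_syncpoint(500, [(1, 25), (1, 1000)]))
```

Splitting ln into quotient and remainder by d1 gives the same result as the floor of (ln*sn)/(d1*d2) for non-negative inputs, without ever forming the wide product ln*sn.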
-------------- next part --------------
--- mpcf.3.txt 2006-01-04 17:24:32.000000000 +0200
+++ mpcf.4.txt 2006-01-05 19:41:53.000000000 +0200
@@ -138,20 +138,26 @@
for(i=0; i<256; ){
tmp_flag v
tmp_fields v
- if(tmp_fields>0) tmp_pts s
- if(tmp_fields>1) tmp_mul v
- if(tmp_fields>2) tmp_stream v
- if(tmp_fields>3) tmp_size v
+ if(tmp_fields>0) tmp_mul v
+ else tmp_mul=1
+ if(tmp_fields>1) tmp_sflag v
+ else tmp_sflag=0
+ if(tmp_fields>2) tmp_pts s
+ else tmp_pts=0
+ if(tmp_fields>3) tmp_stream v
+ else tmp_stream=0
+ if(tmp_fields>4) tmp_size v
else tmp_size=0
- if(tmp_fields>4) tmp_res v
+ if(tmp_fields>5) tmp_res v
else tmp_res=0
- if(tmp_fields>5) count v
+ if(tmp_fields>6) count v
else count= tmp_mul - tmp_size
- for(j=6; j<tmp_fields; j++){
+ for(j=7; j<tmp_fields; j++){
tmp_reserved[i] v
}
for(j=0; j<count && i<256; j++, i++){
flags[i]= tmp_flag;
+ stream_flags[i]= tmp_sflag;
stream_id_plus1[i]= tmp_stream;
data_size_mul[i]= tmp_mul;
data_size_lsb[i]= tmp_size + j;
@@ -212,6 +218,9 @@
if(flags[frame_code]&1){
data_size_msb v
}
+ if(flags[frame_code]&2){
+ coded_stream_flags v
+ }
for(i=0; i<reserved_count[frame_code]; i++)
reserved v
data
@@ -306,6 +315,11 @@
one keyframe for each stream lies between the syncpoint to which
real_back_ptr points, and the current syncpoint.
+ A stream where EOR is set is to be ignored for back_ptr.
+
+ Note: back_ptr can be zero if there is only a single relavent stream
+ and has a keyframe immediately following the syncpoint.
+
global_key_pts
After a syncpoint, last_pts of each stream is to be set to:
last_pts[i] = convert_ts(global_key_pts, timebase[stream], timebase[i])
@@ -421,17 +435,27 @@
different from the first byte of any startcode
flags[frame_code]
- first of the flags from MSB to LSB are called KD
- if D is 1 then data_size_msb is coded, otherwise data_size_msb is 0
- K is the keyframe_type
- 0 -> no keyframe,
- 1 -> keyframe,
- flags=4 can be used to mark illegal frame_code bytes
- frame_code=78 must have flags=4
- Note: frames MUST NOT depend(1) upon frames prior to the last
- frame_startcode
- Important: depend(1) means dependency on the container level (NUT) not
- dependency on the codec level
+ Bit Name Description
+ 1 data_size_msb if set, data_size_msb is at frame header,
+ otherwise data_size_msb is 0
+ 2 more_flags if set, stream control flags are at frame header.
+ 4 invalid if set, frame_code is invalid.
+
+ frame_code=78 ('N') MUST have flags=64
+
+stream_flags
+ stream_flags is "stream_flags[frame_code] ^ coded_stream_flags"
+
+ Bit Name Description
+ 1 is_key if set, frame is keyframe
+ 2 end_of_relevance if set, stream has no relevance on
+ presentation. (EOR)
+
+ EOR frames MUST be zero-length and must be set keyframe.
+ All streams SHOULD end with EOR, where the pts of the EOR indicates the
+ end presentation time of the final frame.
+ An EOR set stream is unset by the first content frames.
+ When an EOR is unset, dts_cache of the stream is reset to -1.
stream_id_plus1[frame_code]
must be <250
@@ -480,7 +504,8 @@
this buffer is initalized with decode_delay -1 elements
Pts of all frames in all streams MUST be bigger or equal to dts of all
- previous frames in all streams, compared in common timebase.
+ previous frames in all streams, compared in common timebase. (EOR
+ frames are NOT exempt from this rule)
width/height
MUST be set to the coded width/height
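The stream_flags rules added in this patch can be sketched in Python as below. The constant and function names are hypothetical. The sketch also assumes coded_stream_flags is treated as 0 when more_flags is not set in the frame header, which the patch does not state explicitly.

```python
# Bit values taken from the stream_flags table in the patch.
IS_KEY = 1
EOR = 2

def resolve_stream_flags(table_stream_flags, coded_stream_flags=0):
    """stream_flags[frame_code] ^ coded_stream_flags (0 assumed if not coded)."""
    return table_stream_flags ^ coded_stream_flags

def eor_frame_is_valid(stream_flags, data_size):
    """EOR frames must be zero-length and flagged as keyframes."""
    if stream_flags & EOR:
        return data_size == 0 and bool(stream_flags & IS_KEY)
    return True  # the constraints only apply to EOR frames

# The frame_code table says "keyframe"; the coded flags toggle EOR on.
flags = resolve_stream_flags(IS_KEY, EOR)
print(flags, eor_frame_is_valid(flags, 0))
```

Because the coded flags are XORed with the table default rather than replacing it, a frame_code whose default is "keyframe" can be turned into a non-keyframe by coding IS_KEY again.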
-------------- next part --------------
--- mpcf.4.txt 2006-01-05 19:41:53.000000000 +0200
+++ mpcf.final.txt 2006-01-05 20:02:12.000000000 +0200
@@ -27,7 +27,7 @@
Error resistant
seeking / playback without an index
- headers & index can be repeated
+ headers can be repeated
damaged files can be played back with minimal data loss and fast
resync times
@@ -134,7 +134,6 @@
version v
stream_count v
max_distance v
- max_index_distance v
for(i=0; i<256; ){
tmp_flag v
tmp_fields v
@@ -228,7 +227,6 @@
index:
index_startcode f(64)
packet header
- stream_id v
max_pts v
index_length v
for(i=0; i<index_length; i++){
@@ -294,9 +292,7 @@
}
}
if (next_code == index_startcode){
- while(!eof){
- index
- }
+ index
index_ptr u(64)
}
@@ -324,6 +320,9 @@
After a syncpoint, last_pts of each stream is to be set to:
last_pts[i] = convert_ts(global_key_pts, timebase[stream], timebase[i])
+ global_key_pts MUST be the highest pts of all keyframes across all
+ streams.
+
file_id_string
"nut/multimedia container\0"
@@ -362,13 +361,6 @@
good reason to set it higher, otherwise reasonable error recovery will
be impossible
-max_index_distance
- max distance of keyframes which are represented in the index, the
- distance between consecutive entries A and B may only be larger if
- there are no keyframes within this stream between A and B
- SHOULD be set to <=32768 or at least <=65536 unless there is a very
- good reason to set it higher
-
stream_id
Stream identifier
stream_id MUST be < stream_count
@@ -532,23 +524,30 @@
forward_ptr until last byte before the checksum).
max_pts
- The highest pts in the stream.
+ s = max_pts % stream_count
+ pts = max_pts / stream_count
+ The highest pts in the entire file in the timebase of stream 's'.
index_pts
- value of the pts of a keyframe relative to the last keyframe
- stored in this index
+ s = index_pts % stream_count
+ pts = index_pts / stream_count
+ pts is relative to last index entry by:
+ pts += convert(last_index_pts, timebase[last_index_s], timebase[s])
+ pts is the lowest global_key_pts of a group of syncpoints with a common
+ back_ptr.
index_position
- position in bytes of the first byte of a keyframe, relative to the
- last keyframe stored in this index
- there MUST be no keyframe with the same stream_id as this index between
- two consecutive index entries if they are more than max_index_distance
- apart
+ relative to last index_position.
+ real_back_ptr = index_position * 8
+ offset from begginning of file to up to 7 bytes before the syncpoint,
+ pointed to by the back_ptr of the syncpoint referred to in this index
+ entry. All syncpoints with a common back_ptr MUST be reduced to a
+ single index entry, with the pts of the first syncpoint.
index_ptr
- Length in bytes from the first byte of the first index startcode
- to the first byte of the index_ptr. If there is no index, index_ptr
- MUST NOT be written.
+ Length in bytes from the first byte of the index startcode to the first
+ byte of the index_ptr. If there is no index, index_ptr MUST NOT be
+ written.
id
the ID of the type/name pair, so it is more compact
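The index reduction described in this patch can be sketched in Python as follows. It assumes each syncpoint is represented as a (global_key_pts, back_ptr_target) pair in file order, where back_ptr_target is the absolute byte position the back_ptr resolves to. All names are hypothetical, and the delta coding of pts and position between entries is left out for brevity.

```python
from itertools import groupby

def collapse_syncpoints(syncpoints):
    """Reduce runs of syncpoints sharing a back_ptr target to one entry each,
    keeping the pts of the first syncpoint in the run."""
    entries = []
    for target, run in groupby(syncpoints, key=lambda sp: sp[1]):
        first_pts, _ = next(run)
        # Positions are stored in units of 8 bytes, so the decoded
        # real_back_ptr lands up to 7 bytes before the syncpoint.
        entries.append((first_pts, target // 8))
    return entries

def pack_pts(pts, stream, stream_count):
    """Fold the stream number into the coded value."""
    return pts * stream_count + stream

def unpack_pts(coded, stream_count):
    """Inverse of pack_pts: returns (stream, pts)."""
    return coded % stream_count, coded // stream_count

syncpoints = [(0, 0), (75, 0), (150, 4096), (225, 4096), (300, 9001)]
print(collapse_syncpoints(syncpoints))
print(unpack_pts(pack_pts(150, 1, 2), 2))
```

Only one entry per distinct back_ptr target is needed because every syncpoint in a run sends the demuxer to the same place. This is what keeps the index near the 5-8kb per hour mentioned in the mail.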