[MPlayer-dev-eng] Lots of stuff for NUT

Oded Shimon ods15 at ods15.dyndns.org
Thu Jan 5 19:14:08 CET 2006


On Thu, Jan 05, 2006 at 03:21:22PM +0100, Michael Niedermayer wrote:
> Hi
> 
> On Thu, Jan 05, 2006 at 04:04:17PM +0200, Oded Shimon wrote:
> [...]
> > > furthermore we havnt thought about error resistance of these features its
> > > part of the goals, and being ignored entirely
> > > what if a the pts or back ptrs are damaged? how do we ensure recovery of
> > > the demuxer?
> > > 
> > > i really dont want to flame but every few month iam being confronted with a
> > > new set of 3-5 requirements which are absolute and undisscussable and which
> > > nut must conform too while everything else is not worth even mentioning
> > > 
> > > first it was 1pass no buffering, strict dts, absolute minimal overhead,
> > > absolute no redundancy to avoid inconsistant files, ...
> > > now its optimal seeking, 1pass (dunno if the strict no buffer rule was droped?)
> > > dts ordering and overhead doesnt matter at all anymore, ...
> > 
> > Well, you miss all the IRC convos where most of this takes place. :)
> > 
> > Looking back at it all now, I think you are right, we should go back to 
> > single back_ptr, "percise optimal seeking" isn't all that useful, Rich was 
> > mostly thinking for video editors when he brought it up, and now that I 
> > think of it, they will most likely use intra only codecs anyway. As for 
> > music videos, I found that "keyframe percision" is actually very high, 
> > I've never had a nut seek going more then 3 seconds off target. (with 
> > single back_ptr). Single back_ptr is still necessary IMO, because otherwise 
> > using NUT for video editing is impossible altogether. BTW, given this, the 
> > rule for syncpoint pts is using the max keyframe pts across all streams, 
> > this is the only correct way for doing it (and without global timebase at 
> > all, we should still use coded_pts). The index is the old syncpoint index I 
> > proposed - those with a common back_ptr can be removed, making a tiny 10kb 
> > index per hour. The nice pro to all this, I already have the demuxer fully 
> > implemented. :) The biggest thing I still hate with back_ptr - subtitles. 
> > (As in, any stream with gaps.) back_ptr will be huge and useless, and EOR 
> > can't really be enforced, it is merely a hint.
> 
> why? the "non relevant" streams are just ignored in the single back_ptr

yes, of course EOR streams are ignored in back_ptr; however, we cannot 
enforce EOR being set for every subtitle gap. It SHOULD be used, but not 
MUST.

BTW, with the "ancient" index method with max_index_distance, we were NOT 
under 10kb an hour, unless max_index_distance was 512kb. When it was 32kb, 
the index was 100kb or so. However, with a syncpoint index using single 
back_ptr and collapsed syncpoints, the index is indeed 5-8kb an hour.

Anyway, here are my new patches, we're back to single back_ptr, hurray.

step 1: mostly cosmetic
1) change date and goals slightly
2) fix 's' in info packets
3) rename sync_point to syncpoint, and frame_startcode to syncpoint_startcode

step 2:
1) MN rule

step 3:
1) remove global_timebase
2) define convert_ts
3) use coded_pts for syncpoint

step 4:
1) rearrange main header
2) add coded_stream_flags
3) add EOR

step 5:
1) remove max_index_distance
2) change goals
3) change index
4) Add pts rule for syncpoint - the max pts of all keyframes in all 
   streams. I gave this a lot of thought back before we were dealing with 
   per-stream stuff, and it is the only perfectly correct pts: the one that 
   can only ever rewind you too far, never not far enough.
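
As a rough illustration of the step 5 pts rule (not taken from the patches): the sketch below picks the syncpoint pts as the largest keyframe pts across streams, compared in seconds, and packs it the way the syncpoint hunk in step 3 unpacks it (stream = coded_pts % stream_count). The function names, the (nom, denom) timebase tuples and the choice of one candidate keyframe pts per stream are assumptions made for the example.

```python
from fractions import Fraction


def pick_syncpoint_pts(key_pts, timebases):
    """Return (stream, pts) of the latest keyframe among the candidates.

    key_pts[i] is a keyframe pts of stream i in that stream's own timebase,
    timebases[i] is its (nom, denom) pair. Which keyframes are candidates
    for a given syncpoint is left to the caller (assumption of this sketch).
    """
    # Compare in a common unit (seconds) using exact rationals.
    seconds = [Fraction(pts * nom, denom)
               for pts, (nom, denom) in zip(key_pts, timebases)]
    stream = max(range(len(key_pts)), key=lambda i: seconds[i])
    return stream, key_pts[stream]


def encode_coded_pts(stream, global_key_pts, stream_count):
    # Inverse of the decoding in the patch:
    #   stream = coded_pts % stream_count
    #   global_key_pts = coded_pts / stream_count
    return global_key_pts * stream_count + stream


def decode_coded_pts(coded_pts, stream_count):
    return coded_pts % stream_count, coded_pts // stream_count
```

For example, with a video keyframe at pts 300 in timebase 1001/30000 (10.01 s) and an audio keyframe at pts 470400 in timebase 1/48000 (9.8 s), the video pts is the larger one, so a seek that lands on this syncpoint can start too early for the audio stream but never too late for the video stream.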

I never did commit 1 & 3, but I think now I can commit everything here?
I guess rearranging the main header is probably the thing we're still undecided on.

- ods15
-------------- next part --------------
--- mpcf.txt	2006-01-04 07:53:43.000000000 +0200
+++ mpcf.1.txt	2006-01-04 16:20:56.000000000 +0200
@@ -1,5 +1,5 @@
 ========================================
-NUT Open Container Format DRAFT 20051118
+NUT Open Container Format DRAFT 20060105
 ========================================
 
 
@@ -23,7 +23,7 @@
     ~0.2% overhead, for normal bitrates
     index is <10kb per hour (1 keyframe every 3sec)
     a usual header for a file is about 100 bytes (audio + video headers together)
-    a packet header is about ~1-8 bytes
+    a packet header is about ~1-5 bytes
 
 Error resistant
     seeking / playback without an index
@@ -243,6 +243,8 @@
             name                        vb
         if(type=="v")
             value                       v
+        else if(type=="s")
+            value                       s
         else
             value                       vb
     }
@@ -254,8 +256,8 @@
     packet header
     info_frame
 
-sync_point:
-    frame_startcode                     f(64)
+syncpoint:
+    syncpoint_startcode                 f(64)
     global_timestamp                    v
     back_ptr                            v
 
@@ -277,8 +279,8 @@
             info_packet
         }
         while(next_code != main_startcode){
-            if(next_code == frame_startcode)
-                sync_point
+            if(next_code == syncpoint_startcode)
+                syncpoint
             frame
         }
     }
@@ -316,10 +318,10 @@
 stream_starcode
     0x11405BF2F9DBULL + (((uint64_t)('N'<<8) + 'S')<<48)
 
-frame_startcode
+syncpoint_startcode
     0xE4ADEECA4569ULL + (((uint64_t)('N'<<8) + 'K')<<48)
 
-    frame_startcodes SHOULD be placed immediately before a keyframe if the
+    syncpoint_startcodes SHOULD be placed immediately before a keyframe if the
     previous frame of the same stream was a non-keyframe, unless such
     non-keyframe - keyframe transitions are very frequent
 
@@ -333,8 +335,8 @@
     NUT version. The current value is 2.
 
 max_distance
-    max distance of frame_startcodes, the distance may only be larger if
-    there is only a single frame between the two frame_startcodes this can
+    max distance of syncpoints, the distance may only be larger if
+    there is no more than a single frame between the two syncpoints. This can
     be used by the demuxer to detect damaged frame headers if the damage
     results in too long of a chain
 
-------------- next part --------------
--- mpcf.1.txt	2006-01-04 16:20:56.000000000 +0200
+++ mpcf.2.txt	2006-01-04 17:26:05.000000000 +0200
@@ -481,9 +481,9 @@
     stream, into which the current pts is inserted and the element with
     the smallest value is removed, this is then the current dts
     this buffer is initalized with decode_delay -1 elements
-    all frames must be monotone, that means a frame
-    which occurs later in the stream must have a larger or equal dts
-    than an earlier frame
+
+    Pts of all frames in all streams MUST be bigger or equal to dts of all
+    previous frames in all streams, compared in common timebase.
 
 width/height
     MUST be set to the coded width/height
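
A minimal sketch of how the rule added in this patch could be checked, written in Python for illustration only. It assumes frames are given in file order as (pts, dts, timebase) tuples with the timebase as a (nom, denom) pair; the function name and input shape are not part of the draft.

```python
from fractions import Fraction


def first_violation(frames):
    """Return the index of the first frame whose pts is smaller than the
    dts of some earlier frame (any stream), or None if the rule holds.

    frames: iterable of (pts, dts, (nom, denom)) in file order.
    """
    max_dts = None  # largest dts seen so far, in seconds
    for index, (pts, dts, (nom, denom)) in enumerate(frames):
        pts_seconds = Fraction(pts * nom, denom)
        if max_dts is not None and pts_seconds < max_dts:
            return index
        dts_seconds = Fraction(dts * nom, denom)
        if max_dts is None or dts_seconds > max_dts:
            max_dts = dts_seconds
    return None
```

Tracking only the largest dts seen so far is enough, because a pts that is not below the maximum is not below any earlier dts either.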
-------------- next part --------------
--- mpcf.2.txt	2006-01-04 17:26:05.000000000 +0200
+++ mpcf.3.txt	2006-01-04 17:24:32.000000000 +0200
@@ -135,8 +135,6 @@
     stream_count                        v
     max_distance                        v
     max_index_distance                  v
-    global_time_base_nom                v
-    global_time_base_denom              v
     for(i=0; i<256; ){
         tmp_flag                        v
         tmp_fields                      v
@@ -258,7 +256,9 @@
 
 syncpoint:
     syncpoint_startcode                 f(64)
-    global_timestamp                    v
+    coded_pts                           v
+    stream = coded_pts % stream_count
+    global_key_pts = coded_pts/stream_count
     back_ptr                            v
 
             Complete definition:
@@ -306,6 +306,10 @@
     one keyframe for each stream lies between the syncpoint to which
     real_back_ptr points, and the current syncpoint.
 
+global_key_pts
+    After a syncpoint, last_pts of each stream is to be set to:
+    last_pts[i] = convert_ts(global_key_pts, timebase[stream], timebase[i])
+
 file_id_string
     "nut/multimedia container\0"
 
@@ -383,22 +387,15 @@
         29.97     1001             30000
         23.976    1001             24000
 
-global_time_base_nom / global_time_base_denom = global_time_base
-    the length of a timer tick in seconds
-    global_time_base_nom and global_time_base_denom MUST NOT be 0
-    global_time_base_nom and global_time_base_denom MUST be relatively prime
-    global_time_base_denom MUST be < 2^31
-
-global_timestamp
-    timestamp in global_time_base units
-    when a global_timestamp is encountered the last_pts of all
-    streams is set to the following:
-
-    ln       = global_time_base_nom*time_base_denom
-    sn       = global_timestamp
-    d1       = global_time_base_denom
-    d2       = time_base_nom
-    last_pts = (ln/d1*sn + ln%d1*sn/d1)/d2
+convert_ts
+    To convert a timestamp between 2 different timebases, the following
+    calculation is defined:
+
+    ln        = from_time_base_nom*to_time_base_denom
+    sn        = from_timestamp
+    d1        = from_time_base_denom
+    d2        = to_time_base_nom
+    timestamp = (ln/d1*sn + ln%d1*sn/d1)/d2
     Note: this calculation MUST be done with unsigned 64 bit integers, and
     is equivalent to (ln*sn)/(d1*d2) but this would require a 96bit integer
 
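
The convert_ts arithmetic from this patch can be tried out with a short Python sketch. Python integers do not overflow, so the unsigned 64-bit requirement is emulated with an explicit range check; that check, the function signature and the (nom, denom) tuples are assumptions of the example, while the variable names ln, sn, d1 and d2 follow the patch.

```python
U64_MAX = (1 << 64) - 1


def convert_ts(from_timestamp, from_time_base, to_time_base):
    """Rescale from_timestamp from one timebase to another.

    Timebases are (nom, denom) tuples (an assumption of this sketch).
    """
    from_nom, from_denom = from_time_base
    to_nom, to_denom = to_time_base
    ln = from_nom * to_denom
    sn = from_timestamp
    d1 = from_denom
    d2 = to_nom
    high = ln // d1 * sn
    low = ln % d1 * sn
    # Emulate the unsigned 64-bit limit the draft asks for.
    for value in (ln, high, low, high + low // d1):
        if not 0 <= value <= U64_MAX:
            raise OverflowError("intermediate does not fit in 64 bits")
    return (high + low // d1) // d2
```

For instance, 300 ticks of a 1001/30000 timebase converted to a 1/1000 timebase give 10010, the same as (ln*sn)/(d1*d2) with integer division, which is the equivalence the note in the patch refers to.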
-------------- next part --------------
--- mpcf.3.txt	2006-01-04 17:24:32.000000000 +0200
+++ mpcf.4.txt	2006-01-05 19:41:53.000000000 +0200
@@ -138,20 +138,26 @@
     for(i=0; i<256; ){
         tmp_flag                        v
         tmp_fields                      v
-        if(tmp_fields>0) tmp_pts        s
-        if(tmp_fields>1) tmp_mul        v
-        if(tmp_fields>2) tmp_stream     v
-        if(tmp_fields>3) tmp_size       v
+        if(tmp_fields>0) tmp_mul        v
+        else tmp_mul=1
+        if(tmp_fields>1) tmp_sflag      v
+        else tmp_sflag=0
+        if(tmp_fields>2) tmp_pts        s
+        else tmp_pts=0
+        if(tmp_fields>3) tmp_stream     v
+        else tmp_stream=0
+        if(tmp_fields>4) tmp_size       v
         else tmp_size=0
-        if(tmp_fields>4) tmp_res        v
+        if(tmp_fields>5) tmp_res        v
         else tmp_res=0
-        if(tmp_fields>5) count          v
+        if(tmp_fields>6) count          v
         else count= tmp_mul - tmp_size
-        for(j=6; j<tmp_fields; j++){
+        for(j=7; j<tmp_fields; j++){
             tmp_reserved[i]             v
         }
         for(j=0; j<count && i<256; j++, i++){
             flags[i]= tmp_flag;
+            stream_flags[i]= tmp_sflag;
             stream_id_plus1[i]= tmp_stream;
             data_size_mul[i]= tmp_mul;
             data_size_lsb[i]= tmp_size + j;
@@ -212,6 +218,9 @@
     if(flags[frame_code]&1){
         data_size_msb                   v
     }
+    if(flags[frame_code]&2){
+        coded_stream_flags              v
+    }
     for(i=0; i<reserved_count[frame_code]; i++)
         reserved                        v
     data
@@ -306,6 +315,11 @@
     one keyframe for each stream lies between the syncpoint to which
     real_back_ptr points, and the current syncpoint.
 
+    A stream where EOR is set is to be ignored for back_ptr.
+
+    Note: back_ptr can be zero if there is only a single relevant stream
+    and it has a keyframe immediately following the syncpoint.
+
 global_key_pts
     After a syncpoint, last_pts of each stream is to be set to:
     last_pts[i] = convert_ts(global_key_pts, timebase[stream], timebase[i])
@@ -421,17 +435,27 @@
     different from the first byte of any startcode
 
 flags[frame_code]
-    first of the flags from MSB to LSB are called KD
-    if D is 1 then data_size_msb is coded, otherwise data_size_msb is 0
-    K is the keyframe_type
-        0 -> no keyframe,
-        1 -> keyframe,
-    flags=4 can be used to mark illegal frame_code bytes
-    frame_code=78 must have flags=4
-    Note: frames MUST NOT depend(1) upon frames prior to the last
-          frame_startcode
-    Important: depend(1) means dependency on the container level (NUT) not
-    dependency on the codec level
+    Bit  Name             Description
+      1  data_size_msb    if set, data_size_msb is at frame header,
+                          otherwise data_size_msb is 0
+      2  more_flags       if set, stream control flags are at frame header.
+      4  invalid          if set, frame_code is invalid.
+
+    frame_code=78 ('N') MUST have flags=64
+
+stream_flags
+    stream_flags is "stream_flags[frame_code] ^ coded_stream_flags"
+
+    Bit  Name               Description
+      1  is_key             if set, frame is keyframe
+      2  end_of_relevance   if set, stream has no relevance on
+                            presentation. (EOR)
+
+    EOR frames MUST be zero-length and must be set keyframe.
+    All streams SHOULD end with EOR, where the pts of the EOR indicates the
+    end presentation time of the final frame.
+    An EOR set stream is unset by the first content frames.
+    When an EOR is unset, dts_cache of the stream is reset to -1.
 
 stream_id_plus1[frame_code]
     must be <250
@@ -480,7 +504,8 @@
     this buffer is initalized with decode_delay -1 elements
 
     Pts of all frames in all streams MUST be bigger or equal to dts of all
-    previous frames in all streams, compared in common timebase.
+    previous frames in all streams, compared in common timebase. (EOR
+    frames are NOT exempt from this rule)
 
 width/height
     MUST be set to the coded width/height
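
A small Python sketch of the flag handling this patch introduces, for illustration only. It assumes the "Bit" column lists mask values (matching the flags[frame_code]&1 and &2 tests in the frame syntax) and that coded_stream_flags counts as 0 when more_flags is not set; the constant and function names are made up for the example.

```python
# flags[frame_code] masks, as listed in the table above
FLAG_DATA_SIZE_MSB = 1
FLAG_MORE_FLAGS = 2
FLAG_INVALID = 4

# stream_flags masks
SFLAG_IS_KEY = 1
SFLAG_EOR = 2


def effective_stream_flags(flags, table_stream_flags, coded_stream_flags=0):
    """Combine the per-frame_code default with the value coded in the frame."""
    if flags & FLAG_INVALID:
        raise ValueError("invalid frame_code")
    if not flags & FLAG_MORE_FLAGS:
        # Nothing is coded in the frame header (assumed to mean 0).
        coded_stream_flags = 0
    return table_stream_flags ^ coded_stream_flags


def eor_frame_is_valid(stream_flags, data_size):
    """EOR frames MUST be zero-length and MUST be flagged as keyframes."""
    if stream_flags & SFLAG_EOR:
        return data_size == 0 and bool(stream_flags & SFLAG_IS_KEY)
    return True
```

Because the two values are combined with XOR, a frame_code whose table entry already says "keyframe" can mark an EOR frame by coding only the EOR bit.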
-------------- next part --------------
--- mpcf.4.txt	2006-01-05 19:41:53.000000000 +0200
+++ mpcf.final.txt	2006-01-05 20:02:12.000000000 +0200
@@ -27,7 +27,7 @@
 
 Error resistant
     seeking / playback without an index
-    headers & index can be repeated
+    headers can be repeated
     damaged files can be played back with minimal data loss and fast
     resync times
 
@@ -134,7 +134,6 @@
     version                             v
     stream_count                        v
     max_distance                        v
-    max_index_distance                  v
     for(i=0; i<256; ){
         tmp_flag                        v
         tmp_fields                      v
@@ -228,7 +227,6 @@
 index:
     index_startcode                     f(64)
     packet header
-    stream_id                           v
     max_pts                             v
     index_length                        v
     for(i=0; i<index_length; i++){
@@ -294,9 +292,7 @@
         }
     }
     if (next_code == index_startcode){
-        while(!eof){
-            index
-        }
+        index
         index_ptr                       u(64)
     }
 
@@ -324,6 +320,9 @@
     After a syncpoint, last_pts of each stream is to be set to:
     last_pts[i] = convert_ts(global_key_pts, timebase[stream], timebase[i])
 
+    global_key_pts MUST be the highest pts of all keyframes across all
+    streams.
+
 file_id_string
     "nut/multimedia container\0"
 
@@ -362,13 +361,6 @@
     good reason to set it higher, otherwise reasonable error recovery will
     be impossible
 
-max_index_distance
-    max distance of keyframes which are represented in the index, the
-    distance between consecutive entries A and B may only be larger if
-    there are no keyframes within this stream between A and B
-    SHOULD be set to <=32768 or at least <=65536 unless there is a very
-    good reason to set it higher
-
 stream_id
     Stream identifier
     stream_id MUST be < stream_count
@@ -532,23 +524,30 @@
     forward_ptr until last byte before the checksum).
 
 max_pts
-    The highest pts in the stream.
+    s = max_pts % stream_count
+    pts = max_pts / stream_count
+    The highest pts in the entire file in the timebase of stream 's'.
 
 index_pts
-    value of the pts of a keyframe relative to the last keyframe
-    stored in this index
+    s = index_pts % stream_count
+    pts = index_pts / stream_count
+    pts is relative to last index entry by:
+       pts += convert_ts(last_index_pts, timebase[last_index_s], timebase[s])
+    pts is the lowest global_key_pts of a group of syncpoints with a common
+    back_ptr.
 
 index_position
-    position in bytes of the first byte of a keyframe, relative to the
-    last keyframe stored in this index
-    there MUST be no keyframe with the same stream_id as this index between
-    two consecutive index entries if they are more than max_index_distance
-    apart
+    relative to last index_position.
+    real_back_ptr = index_position * 8
+    offset from beginning of file to up to 7 bytes before the syncpoint,
+    pointed to by the back_ptr of the syncpoint referred to in this index
+    entry. All syncpoints with a common back_ptr MUST be reduced to a
+    single index entry, with the pts of the first syncpoint.
 
 index_ptr
-    Length in bytes from the first byte of the first index startcode
-    to the first byte of the index_ptr. If there is no index, index_ptr
-    MUST NOT be written.
+    Length in bytes from the first byte of the index startcode to the first
+    byte of the index_ptr. If there is no index, index_ptr MUST NOT be
+    written.
 
 id
     the ID of the type/name pair, so it is more compact
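
The collapsed index described in this patch can be sketched roughly as follows in Python. This is a simplification under stated assumptions: all pts are taken to be in one timebase (so the index_pts % stream_count packing is left out), positions are accumulated first and multiplied by 8 afterwards, and the function names and tuple layout are invented for the example.

```python
def collapse_index(syncpoints):
    """Reduce syncpoints with a common back_ptr target to one entry each.

    syncpoints: list of (global_key_pts, target) in file order, where target
    is the absolute file offset the syncpoint's back_ptr points to.
    """
    entries = []
    for pts, target in syncpoints:
        if entries and entries[-1][1] == target:
            # Same group: keep the lowest pts of the group.
            entries[-1] = (min(entries[-1][0], pts), target)
        else:
            entries.append((pts, target))
    return entries


def delta_encode(entries):
    """Store each entry relative to the previous one, positions in units
    of 8 bytes (so up to 7 bytes before the real target)."""
    coded = []
    last_pts = 0
    last_pos = 0
    for pts, target in entries:
        pos = target // 8
        coded.append((pts - last_pts, pos - last_pos))
        last_pts, last_pos = pts, pos
    return coded


def delta_decode(coded):
    """Recover (pts, real_back_ptr) pairs from the relative values."""
    entries = []
    pts = 0
    pos = 0
    for d_pts, d_pos in coded:
        pts += d_pts
        pos += d_pos
        entries.append((pts, pos * 8))
    return entries
```

Only one entry survives per group of syncpoints that share a back_ptr target, which is what keeps the index small compared with one entry per keyframe.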

