[MPlayer-dev-eng] Lots of stuff for NUT

Oded Shimon ods15 at ods15.dyndns.org
Wed Jan 4 15:40:39 CET 2006


On Fri, Dec 30, 2005 at 09:10:02PM +0200, Oded Shimon wrote:
> [..]

I've split the patchs to 5 parts:

step 1: mosly cosmetic
1) change date and goals slightly
2) fix 's' in info packets
3) rename sync_point to syncpoint, and frame_startcode to 
syncpoint_startcode

step 2:
1) MN rule
2) put syncpoints before far keyframes (hmm, just noticed, this is slightly 
broken cause it talks about per stream back_ptr's, which don't exist yet)

step 3:
1) remove global_timebase
2) define convert_ts
3) use coded_pts for syncpoint

step 4:
1) rearrange main header
2) add coded_stream_flags
3) add SOR and EOR

step 5:
1) remove max_index_distance
2) change goals
3) change index
4) add back_ptr and pts per stream in syncpoints


Micheal:
I sort of agree with your repeated index, that it should be allowed, but 
only on one condition - it can only be immediately after the main headers 
in the begginning of the file. Nowhere else except the very end of the file 
is acceptable, as it causes the index altering itself. When put at 
begginning of file, all offsets in the index need to be added to the length 
of the index itself.

I think I also agree that back_ptr and etc. can be made purely optional. 
Nothing but the startcode and pts need to be written in "low percision" 
mode, in which case when you seek you just seek to the syncpoint you want 
and start playing, end of story. The index should also be as simple in low 
percision mode. (A file's behavior should not change due to index)

This way, for those who want can have low overhead and low percision for 
their fun movies, and those who need video editing can have all the power 
of active streams and etc. Default should certainely be high percision 
though IMO.

Can I atleast commit steps 1 and 3? (2 is broken due to back_ptr thing).

I'll make a patch with my proposition for optional high percision.

- ods15
-------------- next part --------------
--- mpcf.txt	2006-01-04 07:53:43.000000000 +0200
+++ mpcf.1.txt	2006-01-04 16:20:56.000000000 +0200
@@ -1,5 +1,5 @@
 ========================================
-NUT Open Container Format DRAFT 20051118
+NUT Open Container Format DRAFT 20060103
 ========================================
 
 
@@ -23,7 +23,7 @@
     ~0.2% overhead, for normal bitrates
     index is <10kb per hour (1 keyframe every 3sec)
     a usual header for a file is about 100 bytes (audio + video headers together)
-    a packet header is about ~1-8 bytes
+    a packet header is about ~1-5 bytes
 
 Error resistant
     seeking / playback without an index
@@ -243,6 +243,8 @@
             name                        vb
         if(type=="v")
             value                       v
+        else if(type=="s")
+            value                       s
         else
             value                       vb
     }
@@ -254,8 +256,8 @@
     packet header
     info_frame
 
-sync_point:
-    frame_startcode                     f(64)
+syncpoint:
+    syncpoint_startcode                 f(64)
     global_timestamp                    v
     back_ptr                            v
 
@@ -277,8 +279,8 @@
             info_packet
         }
         while(next_code != main_startcode){
-            if(next_code == frame_startcode)
-                sync_point
+            if(next_code == syncpoint_startcode)
+                syncpoint
             frame
         }
     }
@@ -316,10 +318,10 @@
 stream_starcode
     0x11405BF2F9DBULL + (((uint64_t)('N'<<8) + 'S')<<48)
 
-frame_startcode
+syncpoint_startcode
     0xE4ADEECA4569ULL + (((uint64_t)('N'<<8) + 'K')<<48)
 
-    frame_startcodes SHOULD be placed immediately before a keyframe if the
+    syncpoint_startcodes SHOULD be placed immediately before a keyframe if the
     previous frame of the same stream was a non-keyframe, unless such
     non-keyframe - keyframe transitions are very frequent
 
@@ -333,8 +335,8 @@
     NUT version. The current value is 2.
 
 max_distance
-    max distance of frame_startcodes, the distance may only be larger if
-    there is only a single frame between the two frame_startcodes this can
+    max distance of syncpoints, the distance may only be larger if
+    there is no more than a single frame between the two syncpoints. This can
     be used by the demuxer to detect damaged frame headers if the damage
     results in too long of a chain
 
-------------- next part --------------
--- mpcf.1.txt	2006-01-04 16:20:56.000000000 +0200
+++ mpcf.2.txt	2006-01-04 16:21:20.000000000 +0200
@@ -321,10 +321,6 @@
 syncpoint_startcode
     0xE4ADEECA4569ULL + (((uint64_t)('N'<<8) + 'K')<<48)
 
-    syncpoint_startcodes SHOULD be placed immediately before a keyframe if the
-    previous frame of the same stream was a non-keyframe, unless such
-    non-keyframe - keyframe transitions are very frequent
-
 index_startcode
     0xDD672F23E64EULL + (((uint64_t)('N'<<8) + 'X')<<48)
 
@@ -340,6 +336,12 @@
     be used by the demuxer to detect damaged frame headers if the damage
     results in too long of a chain
 
+    Syncpoints MUST be placed immediately before a non-EOR keyframe if the
+    back_ptr of this stream in the last syncpoint is greater than
+    max_distance.
+
+    The begginning of a frame is defined by the first byte of the frame header.
+
     SHOULD be set to <=32768 or at least <=65536 unless there is a very
     good reason to set it higher, otherwise reasonable error recovery will
     be impossible
@@ -481,9 +483,9 @@
     stream, into which the current pts is inserted and the element with
     the smallest value is removed, this is then the current dts
     this buffer is initalized with decode_delay -1 elements
-    all frames must be monotone, that means a frame
-    which occurs later in the stream must have a larger or equal dts
-    than an earlier frame
+
+    Pts of all frames in all streams MUST be bigger or equal to dts of all
+    previous frames in all streams, compared in common timebase.
 
 width/height
     MUST be set to the coded width/height
-------------- next part --------------
--- mpcf.2.txt	2006-01-04 16:21:20.000000000 +0200
+++ mpcf.3.txt	2006-01-04 16:27:16.000000000 +0200
@@ -135,8 +135,6 @@
     stream_count                        v
     max_distance                        v
     max_index_distance                  v
-    global_time_base_nom                v
-    global_time_base_denom              v
     for(i=0; i<256; ){
         tmp_flag                        v
         tmp_fields                      v
@@ -258,7 +256,9 @@
 
 syncpoint:
     syncpoint_startcode                 f(64)
-    global_timestamp                    v
+    coded_pts                           v
+    stream = coded_pts % stream_count
+    global_key_pts = coded_pts/stream_count
     back_ptr                            v
 
             Complete definition:
@@ -306,6 +306,10 @@
     one keyframe for each stream lies between the syncpoint to which
     real_back_ptr points, and the current syncpoint.
 
+global_key_pts
+    After a syncpoint, last_pts of each stream is to be set to:
+    last_pts[i] = convert_ts(global_key_pts, timebase[stream], timebase[i])
+
 file_id_string
     "nut/multimedia container\0"
 
@@ -385,22 +389,15 @@
         29.97     1001             30000
         23.976    1001             24000
 
-global_time_base_nom / global_time_base_denom = global_time_base
-    the length of a timer tick in seconds
-    global_time_base_nom and global_time_base_denom MUST NOT be 0
-    global_time_base_nom and global_time_base_denom MUST be relatively prime
-    global_time_base_denom MUST be < 2^31
-
-global_timestamp
-    timestamp in global_time_base units
-    when a global_timestamp is encountered the last_pts of all
-    streams is set to the following:
-
-    ln       = global_time_base_nom*time_base_denom
-    sn       = global_timestamp
-    d1       = global_time_base_denom
-    d2       = time_base_nom
-    last_pts = (ln/d1*sn + ln%d1*sn/d1)/d2
+convert_ts
+    To switch from 2 different timebases, the following calculation is
+    defined:
+
+    ln        = from_time_base_nom*to_time_base_denom
+    sn        = from_timestamp
+    d1        = from_time_base_denom
+    d2        = to_time_base_nom
+    timestamp = (ln/d1*sn + ln%d1*sn/d1)/d2
     Note: this calculation MUST be done with unsigned 64 bit integers, and
     is equivalent to (ln*sn)/(d1*d2) but this would require a 96bit integer
 
-------------- next part --------------
--- mpcf.3.txt	2006-01-04 16:27:16.000000000 +0200
+++ mpcf.4.txt	2006-01-04 16:27:46.000000000 +0200
@@ -138,20 +138,26 @@
     for(i=0; i<256; ){
         tmp_flag                        v
         tmp_fields                      v
-        if(tmp_fields>0) tmp_pts        s
-        if(tmp_fields>1) tmp_mul        v
-        if(tmp_fields>2) tmp_stream     v
-        if(tmp_fields>3) tmp_size       v
+        if(tmp_fields>0) tmp_mul        v
+        else tmp_mul=1
+        if(tmp_fields>1) tmp_sflag      v
+        else tmp_sflag=0
+        if(tmp_fields>2) tmp_pts        s
+        else tmp_pts=0
+        if(tmp_fields>3) tmp_stream     v
+        else tmp_stream=0
+        if(tmp_fields>4) tmp_size       v
         else tmp_size=0
-        if(tmp_fields>4) tmp_res        v
+        if(tmp_fields>5) tmp_res        v
         else tmp_res=0
-        if(tmp_fields>5) count          v
+        if(tmp_fields>6) count          v
         else count= tmp_mul - tmp_size
-        for(j=6; j<tmp_fields; j++){
+        for(j=7; j<tmp_fields; j++){
             tmp_reserved[i]             v
         }
         for(j=0; j<count && i<256; j++, i++){
             flags[i]= tmp_flag;
+            stream_flags[i]= tmp_sflag;
             stream_id_plus1[i]= tmp_stream;
             data_size_mul[i]= tmp_mul;
             data_size_lsb[i]= tmp_size + j;
@@ -212,6 +218,9 @@
     if(flags[frame_code]&1){
         data_size_msb                   v
     }
+    if(flags[frame_code]&2){
+        coded_stream_flags              v
+    }
     for(i=0; i<reserved_count[frame_code]; i++)
         reserved                        v
     data
@@ -423,17 +432,32 @@
     different from the first byte of any startcode
 
 flags[frame_code]
-    first of the flags from MSB to LSB are called KD
-    if D is 1 then data_size_msb is coded, otherwise data_size_msb is 0
-    K is the keyframe_type
-        0 -> no keyframe,
-        1 -> keyframe,
-    flags=4 can be used to mark illegal frame_code bytes
-    frame_code=78 must have flags=4
-    Note: frames MUST NOT depend(1) upon frames prior to the last
-          frame_startcode
-    Important: depend(1) means dependency on the container level (NUT) not
-    dependency on the codec level
+    Bit  Name             Description
+      1  data_size_msb    if set, data_size_msb is at frame header,
+                          otherwise data_size_msb is 0
+      2  more_flags       if set, stream control flags are at frame header.
+      4  invalid          if set, frame_code is invalid.
+
+    frame_code=78 ('N') MUST have flags=64
+
+stream_flags
+    stream_flags is "stream_flags[frame_code] ^ coded_stream_flags"
+
+    Bit  Name               Description
+      1  is_key             if set, frame is keyframe
+      2  end_of_relevance   if set, stream has no relevance on
+                            presentation. (EOR)
+      4  start_of_relevance if set, unsets EOR. (SOR)
+
+    EOR and SOR frames MUST be zero-length and must be set keyframe.
+    All streams SHOULD end with EOR, where the pts of the EOR indicates the
+    end presentation time of the final frame.
+    An EOR set stream MUST be unset by an SOR before any content frames.
+    An SOR sets the dts_cache of the stream to the pts of of the SOR.
+    The dts of an SOR is its pts. SOR pts MUST be smaller to pts of all
+    subsequent frames on this stream.
+    Note: SOR can and SHOULD immediately precede the first content frame
+    of its stream.
 
 stream_id_plus1[frame_code]
     must be <250
@@ -482,7 +506,8 @@
     this buffer is initalized with decode_delay -1 elements
 
     Pts of all frames in all streams MUST be bigger or equal to dts of all
-    previous frames in all streams, compared in common timebase.
+    previous frames in all streams, compared in common timebase. (SOR and
+    EOR frames are NOT exempt from this rule)
 
 width/height
     MUST be set to the coded width/height
-------------- next part --------------
--- mpcf.4.txt	2006-01-04 16:27:46.000000000 +0200
+++ mpcf.final.txt	2006-01-04 16:28:12.000000000 +0200
@@ -21,13 +21,13 @@
 
 Compact
     ~0.2% overhead, for normal bitrates
-    index is <10kb per hour (1 keyframe every 3sec)
+    index is <100kb per hour (1 keyframe every 3sec)
     a usual header for a file is about 100 bytes (audio + video headers together)
     a packet header is about ~1-5 bytes
 
 Error resistant
     seeking / playback without an index
-    headers & index can be repeated
+    headers can be repeated
     damaged files can be played back with minimal data loss and fast
     resync times
 
@@ -134,7 +134,6 @@
     version                             v
     stream_count                        v
     max_distance                        v
-    max_index_distance                  v
     for(i=0; i<256; ){
         tmp_flag                        v
         tmp_fields                      v
@@ -228,12 +227,36 @@
 index:
     index_startcode                     f(64)
     packet header
-    stream_id                           v
     max_pts                             v
-    index_length                        v
-    for(i=0; i<index_length; i++){
-        index_pts                       v
-        index_position                  v
+    syncpoints                          v
+    for(i=0; i<syncpoints; i++){
+        syncpoint_pos_div8              v
+    }
+    for(i=0; i<stream_count; i++) {
+        j = 0
+        while (j < syncpoints) {
+            repeat                      v
+            type = repeat & 1
+            repeat = repeat >> 1
+            b = repeat & 1
+            repeat = (repeat >> 1) + 1
+            if (type) {
+                key_pts                 v
+                key_pts += syncpoint[j-1].stream[i].key_pts
+                for(k=0; k<repeat; k++) {
+                    syncpoint[j+k].stream[i].back_ptr = syncpoint[j-b].pos_div8
+                    syncpoint[j+k].stream[i].key_pts = key_pts
+                }
+            } else {
+                for(k=0; k<repeat; k++) {
+                    syncpoint[j+k].stream[i].back_ptr = syncpoint[j+k-b].pos_div8
+                    key_pts             v
+                    key_pts += syncpoint[j+k-1].stream[i].key_pts
+                    syncpoint[j+k].stream[i].key_pts = key_pts
+                }
+            }
+            j += repeat
+        }
     }
     reserved_bytes
     checksum                            u(32)
@@ -267,8 +290,22 @@
     syncpoint_startcode                 f(64)
     coded_pts                           v
     stream = coded_pts % stream_count
+    back_ptr_div8[0]                    v
+    back_ptr[stream] = back_ptr_div8[0]
     global_key_pts = coded_pts/stream_count
-    back_ptr                            v
+    key_pts[stream] = global_key_pts
+    n=1
+    for (i=0; i<stream_count; i++) {
+        if (i == stream) continue
+        coded_pts                       v
+        A= coded_pts % (n+1)
+        B= coded_pts / (n+1)
+        if(A == n)
+            back_ptr_div8[n++]          v
+        back_ptr[i]= back_ptr_div8[A]
+        key_pts[i] = covert_ts(global_key_pts, timebase[stream], timebase[i])
+        key_pts[i] -= B
+    }
 
             Complete definition:
 
@@ -294,9 +331,7 @@
         }
     }
     if (next_code == index_startcode){
-        while(!eof){
-            index
-        }
+        index
         index_ptr                       u(64)
     }
 
@@ -308,17 +343,34 @@
     size of the packet data (exactly the distance from the first byte
     after the forward_ptr to the first byte of the next packet)
 
-back_ptr
+back_ptr[stream]
     real_back_ptr = back_ptr * 8 + 7
-    real_back_ptr must point to a position such that a syncpoint
-    startcode begins within the next 8 bytes, and such that at least
-    one keyframe for each stream lies between the syncpoint to which
-    real_back_ptr points, and the current syncpoint.
+    real_back_ptr must point to a position within 8 bytes of a syncpoint
+    startcode. This syncpoint MUST be the closest syncpoint such that at
+    least one keyframe for this stream lies between it and the current
+    syncpoint, or immediately after the current syncpoint.
+
+    Note: back_ptr can be zero, when the frame immediately following is
+    a keyframe of this stream, or EOR has been set for this stream.
+    back_ptr of a stream where EOR is set MUST be zero.
+
+    Note: SOR is a keyframe like any other and back_ptr must point to it if
+    necessary.
 
 global_key_pts
     After a syncpoint, last_pts of each stream is to be set to:
     last_pts[i] = convert_ts(global_key_pts, timebase[stream], timebase[i])
 
+    To be able to code key_pts for every stream, global_key_pts MUST be the
+    max key_pts across all streams.
+
+key_pts[stream]
+    The pts of the last keyframe in the stream until the syncpoint
+    including the frame immediately following the syncpoint.
+
+    Note: After an EOR, key_pts MUST be set to global_key_pts in correct
+    timebase. This is to be done by using coded_pts of 0.
+
 file_id_string
     "nut/multimedia container\0"
 
@@ -359,13 +411,6 @@
     good reason to set it higher, otherwise reasonable error recovery will
     be impossible
 
-max_index_distance
-    max distance of keyframes which are represented in the index, the
-    distance between consecutive entries A and B may only be larger if
-    there are no keyframes within this stream between A and B
-    SHOULD be set to <=32768 or at least <=65536 unless there is a very
-    good reason to set it higher
-
 stream_id
     Stream identifier
     stream_id MUST be < stream_count
@@ -534,23 +579,22 @@
     forward_ptr until last byte before the checksum).
 
 max_pts
-    The highest pts in the stream.
-
-index_pts
-    value of the pts of a keyframe relative to the last keyframe
-    stored in this index
-
-index_position
-    position in bytes of the first byte of a keyframe, relative to the
-    last keyframe stored in this index
-    there MUST be no keyframe with the same stream_id as this index between
-    two consecutive index entries if they are more than max_index_distance
-    apart
+    s = max_pts % stream_count
+    pts = max_pts / stream_count
+    The highest pts in the entire file in the timebase of stream 's'.
+
+syncpoints
+    amount of syncpoints in the file.
+
+syncpoint_pos_div8
+    offset from begginning of file to up to 7 bytes before the syncpoint
+    referred to in this index entry. Relative to position of last
+    syncpoint.
 
 index_ptr
-    Length in bytes from the first byte of the first index startcode
-    to the first byte of the index_ptr. If there is no index, index_ptr
-    MUST NOT be written.
+    Length in bytes from the first byte of the index startcode to the first
+    byte of the index_ptr. If there is no index, index_ptr MUST NOT be
+    written.
 
 id
     the ID of the type/name pair, so it is more compact


More information about the MPlayer-dev-eng mailing list