[MPlayer-dev-eng] Lots of stuff for NUT

Oded Shimon ods15 at ods15.dyndns.org
Mon Jan 9 11:24:29 CET 2006


On Sat, Jan 07, 2006 at 09:38:54PM -0500, Rich Felker wrote:
> On Sat, Jan 07, 2006 at 09:07:05PM +0100, Michael Niedermayer wrote:
> > Hi
> > 
> > On Sat, Jan 07, 2006 at 01:41:52PM -0500, Rich Felker wrote:
> > > On Sat, Jan 07, 2006 at 12:24:48PM +0100, Michael Niedermayer wrote:
> > > > > You never really answered the question, does the overhead really bother you 
> > > > > that much Michael? I want to make NUT done already. Rich wants keyframe 
> > > > > exact seeking (and I agree), we all want demuxer and muxer simplicity, and 
> > > > > the only solution for this is per stream pts and ptr in every syncpoint.
> > > > 
> > > > i dont think pts per stream are needed for this, what exactly was the problem
> > > > anyway if we just have a rule like
> > > > output a syncpoint immediately if a back_ptr changes by more than t
> > > > doesnt that ensure that we never have to search more than t?
> > > 
> > > The problem is: how do you find the correct syncpoint to start your
> > > search at? How do you properly assign timestamps to make it work?
> > 
> > hmmmmmmm, what about the following
> > 
> > * store syncpoints with timestamp and a single back pointer and a pointer
> > to the last index chunk
> > * regularly store index chunks using my proposed index format (syncpoint
> > timestamp+pos and keyframe flags for all streams and syncpoints since the last
> > index chunk)
> 
> The proposal still has per-stream pts, even for the index. Without it
> you won't be able to find the correct position with just a single
> media seek.

Index still needs improving. I want this committed...

- ods15
-------------- next part --------------
--- mpcf.3.txt	2006-01-04 17:24:32.000000000 +0200
+++ mpcf.4.txt	2006-01-05 19:41:53.000000000 +0200
@@ -138,20 +138,26 @@
     for(i=0; i<256; ){
         tmp_flag                        v
         tmp_fields                      v
-        if(tmp_fields>0) tmp_pts        s
-        if(tmp_fields>1) tmp_mul        v
-        if(tmp_fields>2) tmp_stream     v
-        if(tmp_fields>3) tmp_size       v
+        if(tmp_fields>0) tmp_mul        v
+        else tmp_mul=1
+        if(tmp_fields>1) tmp_sflag      v
+        else tmp_sflag=0
+        if(tmp_fields>2) tmp_pts        s
+        else tmp_pts=0
+        if(tmp_fields>3) tmp_stream     v
+        else tmp_stream=0
+        if(tmp_fields>4) tmp_size       v
         else tmp_size=0
-        if(tmp_fields>4) tmp_res        v
+        if(tmp_fields>5) tmp_res        v
         else tmp_res=0
-        if(tmp_fields>5) count          v
+        if(tmp_fields>6) count          v
         else count= tmp_mul - tmp_size
-        for(j=6; j<tmp_fields; j++){
+        for(j=7; j<tmp_fields; j++){
             tmp_reserved[i]             v
         }
         for(j=0; j<count && i<256; j++, i++){
             flags[i]= tmp_flag;
+            stream_flags[i]= tmp_sflag;
             stream_id_plus1[i]= tmp_stream;
             data_size_mul[i]= tmp_mul;
             data_size_lsb[i]= tmp_size + j;
@@ -212,6 +218,9 @@
     if(flags[frame_code]&1){
         data_size_msb                   v
     }
+    if(flags[frame_code]&2){
+        coded_stream_flags              v
+    }
     for(i=0; i<reserved_count[frame_code]; i++)
         reserved                        v
     data
@@ -306,6 +315,11 @@
     one keyframe for each stream lies between the syncpoint to which
     real_back_ptr points, and the current syncpoint.
 
+    A stream where EOR is set is to be ignored for back_ptr.
+
+    Note: back_ptr can be zero if there is only a single relavent stream
+    and has a keyframe immediately following the syncpoint.
+
 global_key_pts
     After a syncpoint, last_pts of each stream is to be set to:
     last_pts[i] = convert_ts(global_key_pts, timebase[stream], timebase[i])
@@ -421,17 +435,27 @@
     different from the first byte of any startcode
 
 flags[frame_code]
-    first of the flags from MSB to LSB are called KD
-    if D is 1 then data_size_msb is coded, otherwise data_size_msb is 0
-    K is the keyframe_type
-        0 -> no keyframe,
-        1 -> keyframe,
-    flags=4 can be used to mark illegal frame_code bytes
-    frame_code=78 must have flags=4
-    Note: frames MUST NOT depend(1) upon frames prior to the last
-          frame_startcode
-    Important: depend(1) means dependency on the container level (NUT) not
-    dependency on the codec level
+    Bit  Name             Description
+      1  data_size_msb    if set, data_size_msb is at frame header,
+                          otherwise data_size_msb is 0
+      2  more_flags       if set, stream control flags are at frame header.
+      4  invalid          if set, frame_code is invalid.
+
+    frame_code=78 ('N') MUST have flags=4
+
+stream_flags
+    stream_flags is "stream_flags[frame_code] ^ coded_stream_flags"
+
+    Bit  Name               Description
+      1  is_key             if set, frame is keyframe
+      2  end_of_relevance   if set, stream has no relevance on
+                            presentation. (EOR)
+
+    EOR frames MUST be zero-length and MUST be flagged as keyframes.
+    All streams SHOULD end with EOR, where the pts of the EOR indicates the
+    end presentation time of the final frame.
+    The EOR state of a stream is cleared by its first content frame.
+    When an EOR is unset, dts_cache of the stream is reset to -1.
 
 stream_id_plus1[frame_code]
     must be <250
@@ -480,7 +504,8 @@
     this buffer is initalized with decode_delay -1 elements
 
     Pts of all frames in all streams MUST be bigger or equal to dts of all
-    previous frames in all streams, compared in common timebase.
+    previous frames in all streams, compared in common timebase. (EOR
+    frames are NOT exempt from this rule)
 
 width/height
     MUST be set to the coded width/height
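
A minimal C sketch of how a demuxer might apply the stream_flags rule from
the patch above, offered as an illustration and not as part of the spec.
The helper names (effective_stream_flags, eor_frame_is_valid) and the macro
names are hypothetical; the bit values 1 (is_key) and 2 (end_of_relevance)
and the XOR rule are taken from the patch text. It assumes
coded_stream_flags is 0 when the more_flags bit is not set for the
frame_code.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values from the stream_flags table in the patch. */
#define NUT_SFLAG_KEY 1u /* is_key */
#define NUT_SFLAG_EOR 2u /* end_of_relevance */

/* Hypothetical helper: the effective flags of a frame are the
 * per-frame_code default XORed with the value coded in the frame
 * header (assumed to be 0 when nothing was coded). */
static uint64_t effective_stream_flags(const uint64_t table_sflags[256],
                                       unsigned frame_code,
                                       uint64_t coded_stream_flags)
{
    assert(frame_code < 256);
    return table_sflags[frame_code] ^ coded_stream_flags;
}

/* Hypothetical check for the rule that EOR frames are zero-length
 * and flagged as keyframes. Returns 1 if the frame obeys the rule. */
static int eor_frame_is_valid(uint64_t sflags, uint64_t data_size)
{
    if (!(sflags & NUT_SFLAG_EOR))
        return 1; /* not an EOR frame, the rule does not apply */
    return data_size == 0 && (sflags & NUT_SFLAG_KEY) != 0;
}
```

Because the combination is an XOR, a frame_code whose table default already
has is_key set can still code a non-keyframe by sending
coded_stream_flags=1.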
-------------- next part --------------
--- mpcf.4.txt	2006-01-05 19:41:53.000000000 +0200
+++ mpcf.final.txt	2006-01-09 12:08:21.000000000 +0200
@@ -21,13 +21,13 @@
 
 Compact
     ~0.2% overhead, for normal bitrates
-    index is <10kb per hour (1 keyframe every 3sec)
+    index is <100kb per hour (1 keyframe every 3sec)
     a usual header for a file is about 100 bytes (audio + video headers together)
     a packet header is about ~1-5 bytes
 
 Error resistant
     seeking / playback without an index
-    headers & index can be repeated
+    headers can be repeated
     damaged files can be played back with minimal data loss and fast
     resync times
 
@@ -134,7 +134,6 @@
     version                             v
     stream_count                        v
     max_distance                        v
-    max_index_distance                  v
     for(i=0; i<256; ){
         tmp_flag                        v
         tmp_fields                      v
@@ -228,12 +227,36 @@
 index:
     index_startcode                     f(64)
     packet header
-    stream_id                           v
     max_pts                             v
-    index_length                        v
-    for(i=0; i<index_length; i++){
-        index_pts                       v
-        index_position                  v
+    syncpoints                          v
+    for(i=0; i<syncpoints; i++){
+        syncpoint_pos_div8              v
+    }
+    for(i=0; i<stream_count; i++) {
+        j = 0
+        while (j < syncpoints) {
+            repeat                      v
+            type = repeat & 1
+            repeat = repeat >> 1
+            b = repeat & 1
+            repeat = (repeat >> 1) + 1
+            if (type) {
+                key_pts                 v
+                key_pts += syncpoint[j-1].stream[i].key_pts
+                for(k=0; k<repeat; k++) {
+                    syncpoint[j+k].stream[i].back_ptr = syncpoint[j-b].pos_div8
+                    syncpoint[j+k].stream[i].key_pts = key_pts
+                }
+            } else {
+                for(k=0; k<repeat; k++) {
+                    syncpoint[j+k].stream[i].back_ptr = syncpoint[j+k-b].pos_div8
+                    key_pts             v
+                    key_pts += syncpoint[j+k-1].stream[i].key_pts
+                    syncpoint[j+k].stream[i].key_pts = key_pts
+                }
+            }
+            j += repeat
+        }
     }
     reserved_bytes
     checksum                            u(32)
@@ -268,7 +291,21 @@
     coded_pts                           v
     stream = coded_pts % stream_count
     global_key_pts = coded_pts/stream_count
-    back_ptr                            v
+    back_ptr_div8[0]                    v
+    back_ptr[stream] = back_ptr_div8[0]
+    key_pts[stream] = global_key_pts
+    n=1
+    for (i=0; i<stream_count; i++) {
+        if (i == stream) continue
+        coded_pts                       v
+        A= coded_pts % (n+1)
+        B= coded_pts / (n+1)
+        if(A == n)
+            back_ptr_div8[n++]          v
+        back_ptr[i]= back_ptr_div8[A]
+        key_pts[i] = convert_ts(global_key_pts, timebase[stream], timebase[i])
+        key_pts[i] -= B
+    }
 
             Complete definition:
 
@@ -294,9 +331,7 @@
         }
     }
     if (next_code == index_startcode){
-        while(!eof){
-            index
-        }
+        index
         index_ptr                       u(64)
     }
 
@@ -308,22 +343,31 @@
     size of the packet data (exactly the distance from the first byte
     after the forward_ptr to the first byte of the next packet)
 
-back_ptr
+back_ptr[stream]
     real_back_ptr = back_ptr * 8 + 7
-    real_back_ptr must point to a position such that a syncpoint
-    startcode begins within the next 8 bytes, and such that at least
-    one keyframe for each stream lies between the syncpoint to which
-    real_back_ptr points, and the current syncpoint.
-
-    A stream where EOR is set is to be ignored for back_ptr.
-
-    Note: back_ptr can be zero if there is only a single relavent stream
-    and has a keyframe immediately following the syncpoint.
+    real_back_ptr must point to a position within 8 bytes of a syncpoint
+    startcode. This syncpoint MUST be the closest syncpoint such that at
+    least one keyframe for this stream lies between it and the current
+    syncpoint, or immediately after the current syncpoint.
+
+    Note: back_ptr can be zero, when the frame immediately following is
+    a keyframe of this stream, or EOR has been set for this stream.
+    back_ptr of a stream where EOR is set MUST be zero.
 
 global_key_pts
     After a syncpoint, last_pts of each stream is to be set to:
     last_pts[i] = convert_ts(global_key_pts, timebase[stream], timebase[i])
 
+    To be able to code key_pts for every stream, global_key_pts MUST be the
+    max key_pts across all streams.
+
+key_pts[stream]
+    The pts of the first keyframe in the back_ptr region, including the
+    frame immediately following the syncpoint.
+
+    Note: After an EOR, key_pts MUST be set to global_key_pts in correct
+    timebase. This is to be done by using coded_pts of 0.
+
 file_id_string
     "nut/multimedia container\0"
 
@@ -339,10 +383,6 @@
 syncpoint_startcode
     0xE4ADEECA4569ULL + (((uint64_t)('N'<<8) + 'K')<<48)
 
-    syncpoint_startcodes SHOULD be placed immediately before a keyframe if the
-    previous frame of the same stream was a non-keyframe, unless such
-    non-keyframe - keyframe transitions are very frequent
-
 index_startcode
     0xDD672F23E64EULL + (((uint64_t)('N'<<8) + 'X')<<48)
 
@@ -358,17 +398,16 @@
     be used by the demuxer to detect damaged frame headers if the damage
     results in too long of a chain
 
+    Syncpoints MUST be placed immediately before a non-EOR keyframe if the
+    back_ptr of this stream in the last syncpoint is greater than
+    max_distance.
+
+    The beginning of a frame is defined by the first byte of the frame header.
+
     SHOULD be set to <=32768 or at least <=65536 unless there is a very
     good reason to set it higher, otherwise reasonable error recovery will
     be impossible
 
-max_index_distance
-    max distance of keyframes which are represented in the index, the
-    distance between consecutive entries A and B may only be larger if
-    there are no keyframes within this stream between A and B
-    SHOULD be set to <=32768 or at least <=65536 unless there is a very
-    good reason to set it higher
-
 stream_id
     Stream identifier
     stream_id MUST be < stream_count
@@ -532,23 +571,22 @@
     forward_ptr until last byte before the checksum).
 
 max_pts
-    The highest pts in the stream.
-
-index_pts
-    value of the pts of a keyframe relative to the last keyframe
-    stored in this index
-
-index_position
-    position in bytes of the first byte of a keyframe, relative to the
-    last keyframe stored in this index
-    there MUST be no keyframe with the same stream_id as this index between
-    two consecutive index entries if they are more than max_index_distance
-    apart
+    s = max_pts % stream_count
+    pts = max_pts / stream_count
+    The highest pts in the entire file in the timebase of stream 's'.
+
+syncpoints
+    number of syncpoints in the file.
+
+syncpoint_pos_div8
+    offset from beginning of file to up to 7 bytes before the syncpoint
+    referred to in this index entry. Coded relative to the position of
+    the previous syncpoint.
 
 index_ptr
-    Length in bytes from the first byte of the first index startcode
-    to the first byte of the index_ptr. If there is no index, index_ptr
-    MUST NOT be written.
+    Length in bytes from the first byte of the index startcode to the first
+    byte of the index_ptr. If there is no index, index_ptr MUST NOT be
+    written.
 
 id
     the ID of the type/name pair, so it is more compact
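
A minimal C sketch of the small arithmetic steps used by the syncpoint and
index syntax in the patch above, in case it helps review. The struct and
function names are hypothetical; the formulas themselves (coded_pts split
by stream_count, real_back_ptr = back_ptr * 8 + 7, and the unpacking of the
index "repeat" field into type, b and a run length) are copied from the
patch. Reading the variable-length "v" values from the bitstream is not
shown.

```c
#include <assert.h>
#include <stdint.h>

/* Result of splitting the first coded_pts of a syncpoint. */
typedef struct {
    uint64_t stream;         /* stream whose timebase global_key_pts uses */
    uint64_t global_key_pts;
} SyncpointPts;

static SyncpointPts split_coded_pts(uint64_t coded_pts, uint64_t stream_count)
{
    SyncpointPts r;
    assert(stream_count > 0);
    r.stream         = coded_pts % stream_count;
    r.global_key_pts = coded_pts / stream_count;
    return r;
}

/* real_back_ptr as defined under back_ptr[stream]. */
static uint64_t real_back_ptr(uint64_t back_ptr)
{
    return back_ptr * 8 + 7;
}

/* Fields packed into one "repeat" value of the index. */
typedef struct {
    unsigned type;  /* 1: one key_pts shared by the whole run */
    unsigned b;     /* offset subtracted from the syncpoint index */
    uint64_t count; /* number of syncpoints covered by this run */
} IndexRun;

static IndexRun split_repeat(uint64_t repeat)
{
    IndexRun r;
    r.type  = (unsigned)(repeat & 1);
    repeat >>= 1;
    r.b     = (unsigned)(repeat & 1);
    r.count = (repeat >> 1) + 1;
    return r;
}
```

For example, with stream_count=3 a coded_pts of 17 gives stream 2 and
global_key_pts 5, and a repeat value of 13 gives type=1, b=0 and a run of
4 syncpoints.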

