
Redone from scratch, if you are fine with it, I'm going to take the current nut.txt and make it as rfc and commit it on our docs stuff (currently I'm tracking it on git so I won't mind if we could switch the nut tree to git, someone does?)

Additional authors/editor can be added as shown (currently the rfc has Michael as author and me as editor)

I'm using xmlrfc since it's the sanest markup for such stuff please refer to xml.resource.org for a way to check and validate changes.

lu

--
Luca Barbato
Gentoo/linux Gentoo/PPC
http://dev.gentoo.org/~lu_zero


Network Working Group                                     M. Niedermayer
Internet-Draft                                                    FFmpeg
Intended status: Standards Track                         L. Barbato, Ed.
Expires: March 28, 2008                            Politecnico di Torino
                                                      September 25, 2007


                  NUT Multimedia Container File Format

Status of This Memo

   By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on March 28, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   This memo defines a method for efficiently storing generic multimedia streams so that operation like seeking and recover for error can be performed with minimal computational cost. Minimal overhead and maximal extensibility had been considered in the development of the format.

Niedermayer & Barbato    Expires March 28, 2008                 [Page 1]

Internet-Draft            NUT Container Format            September 2007

Editors Note

   All references to RFC XXXX are to be replaced by references to the RFC number of this memo, when published.

Table of Contents

   1.  Introduction
     1.1.  Terminology
     1.2.  Syntax Convetions
       1.2.1.  Datatypes
   2.  NUT file layout
     2.1.  High level File structure
     2.2.  Main Header
     2.3.  Reserved Headers
     2.4.  Stream Header
     2.5.  Info Packet
     2.6.  Index
     2.7.  Syncpoint
   3.  Interleaving Rules
   4.  IANA Considerations
   5.  Security Considerations
   6.  Acknowledgements
   7.  References
     7.1.  Normative References
     7.2.  Informative References

1.  Introduction

   NUT is a multimedia container format for storage of audio, video, subtitles and related user defined streams, it provides exact timestamps for synchronization and seeking, is simple, has low overhead and can recover in case of errors in the stream.

   This document defines:

      The file format layout

      The common stream interleaving rules

1.1.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

   This document refers to the following definitions

   pts  Presentation time of the first frame/sample that is completed by decoding the coded frame.

   dts  The time when a frame is input into a synchronous 1-in-1-out decoder.

   frame  Minimal unit of information that can be decoded completely, it is usually holds a full frame video frame, a group of audio samples or a subtitle line.

   Keyframe  A keyframe is a frame from which you can start decoding. The nth frame is a keyframe if and only if frames n, n+1, ... in presentation order (that are all frames with a pts >= frame[n].pts) can be decoded successfully without reference to frames prior n in storage order (that are all frames with a dts < frame[n].dts). If no such frames exist (for example due to using overlapped transforms like the MDCT in an audio codec), then the definition shall be extended by dropping n out of the set of frames which must be decodable, if this is still insufficient then n+1 shall be dropped, and so on until there is a keyframe. Every frame which is marked as a keyframe MUST be a keyframe according to the definition above, a muxer MUST mark every frame it knows is a keyframe as such, a muxer SHOULD NOT analyze future frames to determine the keyframe status of the current frame but instead just set the frame as non-keyframe.

1.2.  Syntax Convetions

   Since NUT heavily uses variable length fields, the simplest way to describe it is using a pseudocode approach instead of graphical bitfield descriptions.

   The syntax uses datatypes, tagnames and C-like constructs.

1.2.1.  Datatypes

   f(n)  n fixed bits in bigendian order

   u(n)  Unsigned value encoded in n bits MSB-first

   v  Unsigned variable length value.

         value=0
         do{
             more_data                u(1)
             data                     u(7)
             value= 128*value + data
         }while(more_data)

              Figure 1: Variable Length Unsigned Value

      Values can be encoded using the following logic: the data is in network order, every byte has the most significant bit used as flag and the following 7 used to store the value. The first N bit are to be taken, where N is number of bits representing the value modulo 7, and stored in the first byte. If there are more bits, the flag bit is set to 1 and the subsequent 7bit are stored in the following byte, if there are remaining bits set the flag to 1 and the same procedure is repeated. The ending byte has the flag bit set to 0. In order to decode it is enough to iterate over the bytes until the flag bit set to 0, for every byte the data is added to the accumulated value multiplied by 128.

   s  Signed variable length value.

         temp                         v
         temp++
         if(temp&1) value= -(temp>>1)
         else       value=  (temp>>1)

               Figure 2: Variable Length Signed Value

      The signed values are encoded as the absolute value multiplied by 2, positive numbers have 1 subtracted to the value. [FIXME: why not just shift&sign]

   vb  Variable length binary data (or utf-8 string).

         length                       v
         for(i=0; i<length; i++){
             data[i]                  u(8)
         }

               Figure 3: Variable Length Signed Value

      Strings and binary data can be encoded basically writing the byte count as a Variable Length Unsigned Value and the the string. The strings MUST be encoded in utf-8

   t  Variable length binary data (or utf-8 string).

         tmp                          v
         id= tmp % time_base_count
         value= (tmp / time_base_count) * time_base[id]

                 Figure 4: Variable Length Timestamp

      [FIXME]

2.  NUT file layout

2.1.  High level File structure

   [TODO]

2.2.  Main Header

   [TODO]

2.3.  Reserved Headers

   [TODO]

2.4.  Stream Header

   [TODO]

2.5.  Info Packet

   [TODO]

2.6.  Index

   [TODO]

2.7.  Syncpoint

   [TODO]

3.  Interleaving Rules

4.  IANA Considerations

   [TODO]

   In order to comply with IESG policy as set forth in http://www.ietf.org/ID-Checklist.html, every Internet-Draft that is submitted to the IESG for publication MUST contain an IANA Considerations section.

5.  Security Considerations

   [TODO]

6.  Acknowledgements

   Thanks to Marshall Rose for developing the XML2RFC format.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

7.2.  Informative References

Authors' Addresses

   Michael Niedermayer
   FFmpeg

   EMail: michaelni@gmx.at


   Luca Barbato (editor)
   Politecnico di Torino
   Corso Duca degli Abruzzi 25
   10135 Torino
   Italy

   EMail: lu_zero@gentoo.org

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

   This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Acknowledgement

   Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA).

Hi

On Tue, Sep 25, 2007 at 10:54:09AM +0200, Luca Barbato wrote:
Redone from scratch, if you are fine with it, I'm going to take the current nut.txt and make it as rfc and commit it on our docs stuff
ok
(currently I'm tracking it on git so I won't mind if we could switch the nut tree to git, someone does?)
no objections from me

[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

it is not once nor twice but times without number that the same ideas make
their appearance in the world. -- Aristotle

Hi,

Here are some things I noticed:

On Tuesday 25 September 2007 10:54, Luca Barbato wrote:
[..]
1.1. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
This document refers to the following definitions
definitions: [..]
frame Minimal unit of information that can be decoded completely, it is usually holds a full frame video frame, a group of audio samples or a subtitle line.
it [-is] usually holds

Also, I know what a frame is, but the description is a bit confusing. It says it's a minimal unit that can be decoded completely, but below it says you can only start decoding at a keyframe. So you can only decode a given frame/unit completely if it's either a keyframe or you have decoded previous (in dts sense) frames which the current frame depends on.
Keyframe A keyframe is a frame from which you can start decoding. The nth frame is a keyframe if and only if frames n, n+1, ... in presentation order (that are all frames with a pts >= frame[n].pts) can be decoded successfully without reference to frames prior n in storage order (that are all frames with a dts < frame[n].dts). If no such frames exist (for example due to using overlapped transforms like the MDCT in an audio codec), then the definition shall be extended by dropping n out of the set of frames which must be decodable, if this is still insufficient then n+1 shall be dropped, and so on until there is a keyframe. Every frame which is marked as a keyframe MUST be a keyframe according to the definition above, a muxer MUST mark every frame it knows is a keyframe as such, a muxer SHOULD NOT analyze future frames to determine the keyframe status of the current frame but instead just set the frame as non-keyframe.
IMHO the last comma should be replaced by 'and' (i.e. so it says: A, B and C) or the sentence could be split up in three sentences.
1.2. Syntax Convetions
Since NUT heavily uses variable length fields, the simplest way to describe it is using a pseudocode approach instead of graphical bitfield descriptions.
The syntax uses datatypes, tagnames and C-like constructs.
1.2.1. Datatypes
f(n) n fixed bits in bigendian order
big-endian
u(n) Unsigned value encoded in n bits MSB-first
v Unsigned variable length value.
value=0 do{ more_data u(1) data u(7) value= 128*value + data }while(more_data)
I'd prefer spaces between do and { and } and while, but I suppose that's a matter of personal taste. Same for value = instead of value=.
Figure 1: Variable Length Unsigned Value
Values can be encoded using the following logic: the data is in network order, every byte has the most significant bit used as flag and the following 7 used to store the value. The first N bit
bits
are to be taken, where N is number of bits representing the value modulo 7, and stored in the first byte. If there are more bits, the flag bit is set to 1 and the subsequent 7bit are stored in the
7 bits
following byte, if there are remaining bits set the flag to 1 and the same procedure is repeated. The ending byte has the flag bit set to 0.
I find this description a bit confusing, e.g. it's not clear when you talk about the input value and when about the output bytes. [..]
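To make the input value / output bytes distinction concrete, here is a rough Python sketch of what the Figure 1 pseudocode seems to describe. It is only an illustration, not text from the draft: the names encode_v and decode_v are made up for this example, and the byte layout is inferred from the pseudocode (flag in the top bit, 7 payload bits, most significant group first).

```python
def encode_v(value):
    """Encode a non-negative integer as a 'v' field (hypothetical helper)."""
    if value < 0:
        raise ValueError("v holds unsigned values only")
    # Split the input value into 7-bit groups, least significant first.
    groups = [value & 0x7F]
    value >>= 7
    while value:
        groups.append(value & 0x7F)
        value >>= 7
    groups.reverse()  # output bytes go most significant group first
    # Every output byte except the last has the more_data flag (0x80) set.
    return bytes(g | 0x80 for g in groups[:-1]) + bytes([groups[-1]])


def decode_v(buf, pos=0):
    """Decode one 'v' field from buf at pos; return (value, next_pos)."""
    value = 0
    while True:
        byte = buf[pos]
        pos += 1
        value = 128 * value + (byte & 0x7F)  # same step as Figure 1
        if not byte & 0x80:                  # more_data == 0: last byte
            return value, pos
```

With this sketch the input value 300 becomes the two output bytes 0x82 0x2C, and values below 128 fit in a single byte with the flag bit clear.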
Strings and binary data can be encoded basically writing the byte count as a Variable Length Unsigned Value and the the string. The
and then the
strings MUST be encoded in utf-8
t Variable length binary data (or utf-8 string).
wrong description
tmp v id= tmp % time_base_count value= (tmp / time_base_count) * time_base[id]
Figure 4: Variable Length Timestamp
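Since the t description is still a FIXME, a small Python sketch of how I read Figure 4 may help when rewording it. This is an assumption-laden illustration, not part of the draft: decode_t and encode_t are invented names, tmp is taken to be an already decoded v value, and time_base entries are assumed to be rationals giving seconds per tick (the draft does not say what they are yet).

```python
from fractions import Fraction


def decode_t(tmp, time_bases):
    """Split a decoded v value into (time base id, timestamp).

    time_bases is assumed to be a list of Fraction, one per time base.
    """
    time_base_count = len(time_bases)
    tb_id = tmp % time_base_count       # low part selects the time base
    ticks = tmp // time_base_count      # integer division, as in C
    return tb_id, ticks * time_bases[tb_id]


def encode_t(ticks, tb_id, time_base_count):
    """Inverse of the split above: pack ticks and time base id into one v."""
    return ticks * time_base_count + tb_id


# Example with two hypothetical time bases: 25 fps video, 48 kHz audio.
time_bases = [Fraction(1, 25), Fraction(1, 48000)]
video = decode_t(encode_t(100, 0, len(time_bases)), time_bases)
audio = decode_t(encode_t(48000, 1, len(time_bases)), time_bases)
```

Read this way, t is a timestamp that also carries the id of the time base it is expressed in, which is what the description should probably say.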
--Ivo