
Forgive the cross posting, this affects several projects.
On Fri, Nov 03, 2006 at 12:50:27AM -0800, Unga wrote:
Currently Theora video in Matroska is not supported by Mplayer. To enable the support Michael Niedermayer has made the following proposal sometime back: http://article.gmane.org/gmane.comp.video.mplayer.nut.devel/214
The proposal is that we add a recommendation to the ogg vorbis spec to just concatenate the headers when embedding in a container that needs to store them in a blob, and that readers skip leading and trailing data based on known packet lengths and magic strings.
As Michael says, this works, and is just the sort of hacky spec wrangling ogg is (in)famous for. :)
I guess my only comment is that this isn't particularly general. While vorbis has a fixed set of header packets with "easy" to determine lengths, it's possible to do a codec with external framing in mind where this wouldn't work. The theora spec, for example, allows additional application-defined header packets after the initial required three.
It also means a cross-encapsulator has to understand a codec's header packet format to put the data in an ogg stream, which is something many implementors have complained loudly about. Therefore I'd like to counterpropose something with explicit packet lengths, like matroska has, or the "packed header" format the vorbis and theora rtp drafts use.
If we're going to add this to the vorbis and theora specs, I'd like to see it used as broadly as possible, but luca dislikes the metadata header and omitted it from his packed header design for rtp. Luca, what do you think about adding the metadata header back as an optionally empty field?
-r
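To make the "explicit packet lengths" idea concrete, here is a minimal sketch. It is not the layout used by matroska or by the rtp drafts: the one-byte packet count, the 16-bit big-endian lengths, and the names `pack_headers` and `unpack_headers` are all assumptions made up for illustration. What it shows is that the unpacking side needs no knowledge of any codec's header format, and that extra header packets (such as application-defined ones) survive the round trip.

```python
import struct


def pack_headers(packets):
    """Join header packets into one blob: a count byte, then (length, data) pairs.

    Hypothetical layout for illustration only.
    """
    if len(packets) > 255:
        raise ValueError("too many header packets for a one-byte count")
    out = bytearray([len(packets)])
    for packet in packets:
        if len(packet) > 0xFFFF:
            raise ValueError("packet too large for a 16-bit length")
        out += struct.pack(">H", len(packet))  # explicit length prefix
        out += packet
    return bytes(out)


def unpack_headers(blob):
    """Split a blob made by pack_headers without inspecting packet contents."""
    if not blob:
        raise ValueError("empty blob")
    count = blob[0]
    pos = 1
    packets = []
    for _ in range(count):
        if pos + 2 > len(blob):
            raise ValueError("truncated length field")
        (length,) = struct.unpack_from(">H", blob, pos)
        pos += 2
        if pos + length > len(blob):
            raise ValueError("truncated packet")
        packets.append(blob[pos:pos + length])
        pos += length
    return packets
```

A round trip such as `unpack_headers(pack_headers([ident, comment, setup, extra]))` returns the same four packets, including an empty comment packet if one was supplied.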

Let me see if I understand the problem correctly.
Matroska provides only one header packet per codec to identify it. Michael's proposal suggests to create a new header for each of our codecs (well, his proposal is only for vorbis, but there are other codecs who have secondary header pages)?
I am not sure if Matroska would encapsulate the clean codec stream or an ogg framed stream. I also don't understand if there would be one blob per codec in the case of a multitrack file (e.g. Theora + Vorbis) or whether there would be just one large, interleaved blob. In any case, I might put our experience into the mix to get this right.
When Skeleton was developed for Ogg, we wanted to have one generic type of header that could help identify all the possible codecs inside an Ogg stream and give enough information to an application to seek without having to decode the secondary header pages. Our first approach was to add an additional header before each codec. That was really bad though, because it broke all the existing Ogg decoders out there and essentially created a new format. After lengthy discussions, a much better solution was born: create an additional logical bitstream that had one header per codec inside it. This spec is now Skeleton and is supported by just about all the common media players out there.
May I therefore suggest that if Matroska needs one header per codec, it could use a Skeleton bitstream to do so (see http://wiki.xiph.org/index.php/Ogg_Skeleton)? Or maybe at least to use the fishead headers as a basis for a new spec, since they have gone through a long thought process in development. Also, it might make it easier to implement support in media players since most already support it for Ogg.
Cheers, Silvia.
On 11/4/06, Ralph Giles <giles@xiph.org> wrote:
Forgive the cross posting, this affects several projects.
On Fri, Nov 03, 2006 at 12:50:27AM -0800, Unga wrote:
Currently Theora video in Matroska is not supported by Mplayer. To enable the support Michael Niedermayer has made the following proposal sometime back: http://article.gmane.org/gmane.comp.video.mplayer.nut.devel/214
The proposal is that we add a recommendation to the ogg vorbis spec to just concatenate the headers when embedding in a container that needs to store them in a blob, and that readers skip leading and trailing data based on known packet lengths and magic strings.
As Michael says, this works, and is just the sort of hacky spec wrangling ogg is (in)famous for. :)
I guess my only comment is that this isn't particularly general. While vorbis has a fixed set of header packets with "easy" to determine lengths, it's possible to do a codec with external framing in mind where this wouldn't work. The theora spec, for example, allows additional application-defined header packets after the initial required three.
It also means a cross-encapsulator has to understand a codec's header packet format to put the data in an ogg stream, which is something many implementors have complained loudly about. Therefore I'd like to counterpropose something with explicit packet lengths, like matroska has, or the "packed header" format the vorbis and theora rtp drafts use.
If we're going to add this to the vorbis and theora specs, I'd like to see it used as broadly as possible, but luca dislikes the metadata header and omitted it from his packed header design for rtp. Luca, what do you think about adding the metadata header back as an optionally empty field?
-r
_______________________________________________
theora-dev mailing list
theora-dev@xiph.org
http://lists.xiph.org/mailman/listinfo/theora-dev

On 11/3/06, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:
Let me see if I understand the problem correctly.
Matroska provides only one header packet per codec to identify it. Michael's proposal suggests to create a new header for each of our codecs (well, his proposal is only for vorbis, but there are other codecs who have secondary header pages)?
Just to be clear, the first header does identify Vorbis, but the others are needed for setup. Sort of like a super-keyframe. Monty

Hi On Sat, Nov 04, 2006 at 07:09:23AM +1100, Silvia Pfeiffer wrote:
Let me see if I understand the problem correctly.
Matroska provides only one header packet per codec to identify it.
my proposal was about the 2 or 3 codec initialization packets at the start of vorbis, theora, ... streams
identifying the codecs is not a problem in any container format besides ogg i know of; containers simply have some field which identifies the codec, that can be a 32bit or 16bit integer, or a variable length string, or in case of matroska several redundant systems
the problem with the initialization packets, or super keyframes or sequence headers or whatever you want to call them, is that there are several of them but containers are generally designed to handle just one such packet per stream
if you would simply store these 3 packets like normal packets then a demuxer which is told by the user to seek to lets say 5min into the stream will do so: first it will pass the single global header (this one is empty in our example) to the codec, next it would search for a keyframe around the requested 5min and start passing packets beginning with the keyframe to the decoder, which would fail as it never received the 3 initialization packets ...
if now the 2 or 3 packets are merged into one and stored in the appropriate spot for the global packet for the stream then everything will work fine; of course that requires that the decoder is able to parse or the demuxer is able to split the merged packet (for that a few words in the relevant specs would be helpful, whatever the exact method is which is used to merge the packets ...)
also note this is not about matroska alone, but rather many containers: avi, wav, nut, matroska, nuv, asf to name the ones which IIRC support a single global header but do not really support multiple ones without codec specific hacks ...
mpeg-ps/ts does not support any global header, they expect such headers to be repeated before keyframes (mpeg1/2 video does exactly that with their sequence headers)
mov allows everything but tends to need a special case per codec in the demuxer
also APIs tend to support passing a single global packet around but tend not to support multiple ones ... [...]
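The seek scenario above can be sketched as a toy simulation. Nothing here is a real demuxer or codec API: `ToyDecoder`, `play_from`, and the packet dictionaries are hypothetical, and the global header is modelled as an already-split list of init packets so that the merge format stays out of the picture. It only shows why init packets stored as ordinary packets are lost on a seek, while init data in the per-stream global header is not.

```python
class ToyDecoder:
    """Stand-in for a codec that needs 3 init packets before any frame."""

    REQUIRED_INIT = 3

    def __init__(self):
        self.init_seen = 0

    def set_global_header(self, init_packets):
        # The demuxer always passes the stream's global header first.
        self.init_seen += len(init_packets)

    def feed(self, packet):
        if packet["kind"] == "init":
            self.init_seen += 1
            return None
        if self.init_seen < self.REQUIRED_INIT:
            raise RuntimeError("decoder never received its initialization packets")
        return packet["time"]


def play_from(stream, global_header, seek_time=None):
    """Decode from the start, or from the last keyframe at or before seek_time."""
    decoder = ToyDecoder()
    decoder.set_global_header(global_header)
    start = 0
    if seek_time is not None:
        start = max(i for i, p in enumerate(stream)
                    if p["keyframe"] and p["time"] <= seek_time)
    decoded = (decoder.feed(p) for p in stream[start:])
    return [t for t in decoded if t is not None]


# Frames every 20 s for 10 minutes, with a keyframe once a minute.
frames = [{"kind": "frame", "time": t, "keyframe": t % 60 == 0}
          for t in range(0, 600, 20)]
inits = [{"kind": "init", "time": 0, "keyframe": False} for _ in range(3)]
```

With the init packets stored in-band, `play_from(inits + frames, [])` works from the start but `play_from(inits + frames, [], seek_time=300)` raises; with them in the global header, `play_from(frames, inits, seek_time=300)` decodes normally.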
I am not sure if Matroska would encapsulate the clean codec stream or an ogg framed stream. I also don't understand if there would be one blob per codec in the case of a multitrack file (e.g. Theora + Vorbis) or whether there would be just one large, interleaved blob. In any case, I might put our experience into the mix to get this right.
putting a container into a container is the most insane thing you could do, it also isnt allowed in many containers, avi requires each packet to be a single packet (people do ignore this yes i know but they generally dont put other containers in avi), nut explicitly says that containers inside streams render the file invalid and any player playing such a file is not nut compliant, i dont know about matroska but it would surprise me if a vorbis+theora in ogg stream could be put in matroska without violating some rules
also what is such a stream audio? video? something else?
also ignoring the rules, such files are a nightmare to support, and even if supported will have a lot of random problems with AV sync
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is

On Fri, Nov 03, 2006 at 11:02:09AM -0800, Ralph Giles wrote:
Forgive the cross posting, this affects several projects.
On Fri, Nov 03, 2006 at 12:50:27AM -0800, Unga wrote:
Currently Theora video in Matroska is not supported by Mplayer. To enable the support Michael Niedermayer has made the following proposal sometime back: http://article.gmane.org/gmane.comp.video.mplayer.nut.devel/214
The proposal is that we add a recommendation to the ogg vorbis spec to just concatenate the headers when embedding in a container that needs to store them in a blob, and that readers skip leading and trailing data based on known packet lengths and magic strings.
As Michael says, this works, and is just the sort of hacky spec wrangling ogg is (in)famous for. :)
Michael's proposal allows many formats including the MKV one. Obviously choosing the one that might interoperate with broken software could have advantages. Michael's proposal was just for the vorbis decoders to 'be liberal in what they accept' both for the sake of supporting multiple weird formats and for implementation simplicity (this algorithm is simpler than having N special cases for N broken formats).
I guess my only comment is that this isn't particularly general. While vorbis has a fixed set of header packets with "easy" to determine lengths, it's possible to do a codec with external framing in mind where this wouldn't work. The theora spec, for example, allows additional application-defined header packets after the initial required three.
Obviously APPLICATION-specific headers do not belong here at all. More Xiph idiocy. Anyway such crap should never be stored in nut or any non-ogg container.
It also means a cross-encapsulator has to understand a codec's header packet format to put the data in an ogg stream, which is something many implementors have complained loudly about.
Blame Xiph for _intentionally_ designing their formats to break compatibility between containers.
Therefore I'd like to counterpropose something with explicit packet lengths, like matroska has, or the "packed header" format the vorbis and theora rtp drafts use.
Reference for these?
If we're going to add this to the vorbis and theora specs, I'd like to see it used as broadly as possible, but luca dislikes the metadata header and omitted it from his packed header design for rtp. Luca, what do you think about adding the metadata header back as an optionally empty field?
The metadata header should be empty if present, and only present if required by the spec. This data does NOT belong in the codec headers, only the container-level metadata. Rich

On Fri, Nov 03, 2006 at 04:23:31PM -0500, Rich Felker wrote:
Therefore I'd like to counterpropose something with explicit packet lengths, like matroska has, or the "packed header" format the vorbis and theora rtp drafts use.
Reference for these?
http://www.ietf.org/internet-drafts/draft-ietf-avt-rtp-vorbis-01.txt
http://svn.xiph.org/trunk/vorbis/doc/draft-ietf-avt-rtp-vorbis-01.xml
http://www.ietf.org/internet-drafts/draft-barbato-avt-rtp-theora-01.txt
http://svn.xiph.org/trunk/theora/doc/draft-ietf-avt-rtp-theora-00.xml (some unsubmitted updates)
-r

Hi On Fri, Nov 03, 2006 at 11:02:09AM -0800, Ralph Giles wrote:
Forgive the cross posting, this affects several projects.
and forgive me too for doing the same ... (if the discussion is inappropriate for any of the lists just say so and i wont CC that one anymore ...)
On Fri, Nov 03, 2006 at 12:50:27AM -0800, Unga wrote:
Currently Theora video in Matroska is not supported by Mplayer. To enable the support Michael Niedermayer has made the following proposal sometime back: http://article.gmane.org/gmane.comp.video.mplayer.nut.devel/214
The proposal is that we add a recommendation to the ogg vorbis spec to just concatenate the headers when embedding in a container that needs to store them in a blob, and that readers skip leading and trailing data based on known packet lengths and magic strings.
As Michael says, this works, and is just the sort of hacky spec wrangling ogg is (in)famous for. :)
I guess my only comment is that this isn't particularly general. While vorbis has a fixed set of header packets with "easy" to determine lengths, it's possible to do a codec with external framing in mind where this wouldn't work. The theora spec, for example, allows additional application-defined header packets after the initial required three.
It also means a cross-encapsulator has to understand a codec's header packet format to put the data in an ogg stream, which is something many implementors have complained loudly about. Therefore I'd like to counterpropose something with explicit packet lengths, like matroska has, or the "packed header" format the vorbis and theora rtp drafts use.
ive looked at the rtp draft and as far as i understand it it concatenates the first 2 packets and omits all further ones
" A Theora Packed Configuration is indicated with the payload type field set to 1. Of the three headers, defined in the Theora I specification [16], the identification and the setup will be packed together, the comment header is completely suppressed. It is up to the client to provide a minimal size comment header to the decoder if required by the implementation. "
this definitely has my support, not that that would make any difference... :)
comments, userdata, and other non essential data does not belong to a global codec specific header be it in rtp or any container
normal containers have their own fields to store data like author, comment, user specified metadata and so on
yes its a nightmare to convert from ogg to other containers or back but putting this data in the global codec specific header does not solve anything, the data would be as useful as random data put there, and for rtp resources would be wasted to ensure error free delivery of possibly large and useless data
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is

On Fri, Nov 03, 2006 at 11:40:04PM +0100, Michael Niedermayer wrote:
ive looked at the rtp draft and as far as i understand it it concatenates the first 2 packets and omits all further ones
First and 3rd packets, but yes. I missed that it was omitting further ones in the draft review. It also uses a 16 bit length, which isn't general either. Using unary encoding like matroska and ogg do isn't optimal for large packets either, of course. For RTP we were trying to keep it simple, and of course the RTP formats are by definition codec specific, so we didn't try to do anything like variable length length fields. (I like the jpeg2k scheme, where the length header includes the bytes in the length header, so since the minimum length is greater than zero you have one or more bits to use as a flag to describe the width of the bits field. But a high-bit encoding like utf-8 works too.)
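A rough sketch of the trade-off between these length encodings follows. The lacing functions follow the Ogg segment-table convention (the length is the sum of bytes, a byte below 255 terminates); the "high-bit" functions are a generic 7-bits-per-byte continuation scheme chosen as an assumption for illustration, not UTF-8's exact layout and not anything taken from the drafts. All function names are hypothetical.

```python
def lacing_encode(n):
    """Ogg-style lacing: n // 255 bytes of 255, then one byte below 255."""
    if n < 0:
        raise ValueError("length must be non-negative")
    return bytes([255] * (n // 255) + [n % 255])


def lacing_decode(buf, pos=0):
    """Return (length, next_pos); stops at the first byte below 255."""
    total = 0
    while True:
        b = buf[pos]
        pos += 1
        total += b
        if b < 255:
            return total, pos


def highbit_encode(n):
    """7 data bits per byte, big-endian; high bit set on all but the last byte."""
    if n < 0:
        raise ValueError("length must be non-negative")
    groups = [n & 0x7F]
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(groups))


def highbit_decode(buf, pos=0):
    """Return (length, next_pos); stops at the first byte with the high bit clear."""
    value = 0
    while True:
        b = buf[pos]
        pos += 1
        value = (value << 7) | (b & 0x7F)
        if not b & 0x80:
            return value, pos
```

For a 100000-byte packet the lacing form needs 393 length bytes and the high-bit form needs 3, while a fixed 16-bit field cannot represent it at all; for small packets both variable-length forms cost a single byte.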
this definitely has my support, not that that would make any difference... :) comments, userdata, and other non essential data does not belong to a global codec specific header be it in rtp or any container
The inline metadata helped solve a problem. Of course it's nice to use a container level metadata format if one is available, and in that case it should supersede the codec-level one. But the codec specs are very clear that this header is required, even if it doesn't contain useful information. I think it's more confusing to treat this as a mistake and try to fix it.
-r
participants (5)
- Michael Niedermayer
- Ralph Giles
- Rich Felker
- Silvia Pfeiffer
- xiphmont@xiph.org