[FFmpeg-devel] [PATCH] Matroska demuxer adds WebVTT support
nicolas.george at normalesup.org
Thu Aug 1 13:51:43 CEST 2013
On quartidi 14 Thermidor, year CCXXI, Clement Boesch wrote:
> Actually, I'm not so sure about having SSA deprecated (aside from mkv
> format), but that can be discussed (in another thread please).
Please share the reasons indeed.
> Anyway, about the whole current thread, I'm sorry I didn't have time to
> read every single post, but I'd like to have a few words before random
> development is done, like changing the current (de)muxing of WebVTT.
Of course. If you had not replied today, I would have insisted on exactly
> Summary of the current situation: WebVTT "codec" is defined by a payload
> (the text with its markup) and two extra pieces of information. WebVTT and WebM
> formats mux them differently:
> - in WebVTT a cue looks like this: [<chapter>\n]<timestamp>[<settings>]\n<text>\n\n
> - in WebM a cue looks like this: <chapter>\n<settings>\n<text>
> Please correct me if that's incorrect.
I believe this is technically correct, but by using the word "cue" for both,
you overlook a very important difference:
- in WebVTT, a cue is part of a text file that is being parsed in full;
- in WebM, a cue is packetized in a container format with generic
structures and no WebVTT-specific code.
> Now the problem in my opinion is that WebM uses a fully textual way of
> muxing the three pieces of information: requiring some strchr or similar in
> a binary parser is a bit insane (and dangerous? what about a
> non-null-terminated payload?). IMO it would have been much wiser to mux it
> with \0 separators, but whatever.
I believe you did not think this argument through: memchr(data, '\n', size)
works just as well as memchr(data, 0, size), and you cannot use any str*
function on untrusted data anyway (unless you rely on padding for
0-termination, but I consider that bad practice).
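To make the memchr point concrete, here is a minimal sketch of a size-bounded split of a WebM-style block into its three fields. The FieldView type and the split_webm_cue name are hypothetical, made up for this illustration; this is not the actual Matroska demuxer code. Only the first two '\n' bytes are treated as delimiters, and the buffer is never assumed to be 0-terminated.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical view into one field of a packet; not an FFmpeg type. */
typedef struct FieldView {
    const char *ptr;
    size_t      len;
} FieldView;

/*
 * Split "<id>\n<settings>\n<text>" using only memchr and the known size.
 * The buffer does not need to be 0-terminated.
 * Returns 0 on success, -1 if the two '\n' delimiters are not both present.
 */
static int split_webm_cue(const char *data, size_t size,
                          FieldView *id, FieldView *settings, FieldView *text)
{
    const char *end, *nl1, *nl2, *p;

    if (!data)
        return -1;
    end = data + size;

    /* First delimiter: end of the identifier. */
    nl1 = memchr(data, '\n', size);
    if (!nl1)
        return -1;

    /* Second delimiter: end of the settings; search only the remainder. */
    p   = nl1 + 1;
    nl2 = memchr(p, '\n', (size_t)(end - p));
    if (!nl2)
        return -1;

    id->ptr       = data;
    id->len       = (size_t)(nl1 - data);
    settings->ptr = p;
    settings->len = (size_t)(nl2 - p);
    /* Everything after the second '\n' is text, including any further '\n'. */
    text->ptr     = nl2 + 1;
    text->len     = (size_t)(end - (nl2 + 1));
    return 0;
}
```

Replacing '\n' with 0 in the two memchr calls would give the \0-separated variant with no other change, which is why the choice of delimiter byte makes no difference to the safety of the parser.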
Note that I disagree with the argument you give, but I fully agree with the
conclusion itself: \n is a very bad choice for a delimiter in Matroska
packets, because \n can also appear in the payload of one of the fields.
That will make extensibility much harder.
> The point is, if someone decides to mux WebVTT
> in another format, he might come up with a different and more relevant
> way of muxing it.
We will deal with that if that ever happens.
> TL;DR: the <chapter>, <settings> and <text> are so weirdly (badly?) muxed
> in *both* WebVTT format and WebM that it, in my opinion, makes sense to
> separate them at AVPacket level like it is now (payload for <text> and
> side data for the two other extra info), and let every muxer sanely mux it
> using the API interface.
That means adding specific code for the codec in all muxers: you need to
have a _very_ good reason to do that, having a "weird" packet format is not
Also, note that ASS packets have exactly the same issues, and you agreed
that it was better to eliminate the special cases from the Matroska code.