[FFmpeg-devel] [PATCH] Matroska demuxer adds WebVTT support

Nicolas George nicolas.george at normalesup.org
Sat Jul 20 00:12:40 CEST 2013


On primidi 1 Thermidor, year CCXXI, Matthew Heaney wrote:
> The representation of WebVTT cues in WebM is strictly an artifact of
> Matroska.  There's no expectation that that representation would make sense
> for some other container.  Frank and I (along with various browser vendors)
> designed the representation that way so that a Matroska demuxer could
> losslessly reconstruct the original WebVTT cue from its embedded
> representation.

I do not see why other container formats would not want to losslessly
represent WebVTT files too. The reason behind the technical choices you made
should also apply to any format similar to Matroska.

> As it stands now, the Matroska demuxer reconstructs the WebVTT cue such
> that it exactly matches what the WebVTT demuxer already pushes downstream.
>  To me this makes perfect sense (hence the patch), since the format of the
> WebVTT cue itself is canonical.
> 
> I can change the WebVTT demuxer to convert the original WebVTT cues into a
> format that matches the embedded representation in a WebM file, but one
> issue is that this will pollute the rest of ffmpeg with artifacts of WebM.
>  Are you sure this is what you want?  If you use WebVTT cues as the
> canonical representation (as the WebVTT demuxer does now, and the current
> patch does for the WebM demuxer), this has the benefit that only the WebM
> demuxer has to care about the representation of WebVTT cues in a WebM file.

If I understand you correctly, you seem to consider the text file
representation of WebVTT as the authoritative form. That may be the source
of the misunderstandings, since I do not share that assumption. For me, the
basic unit is the packet. A packet can be decoded into... something (that
currently looks like ASS markup, until a real solution is finalized), that
can later be encoded into a packet for the same or another codec; a packet
can be read from a file or stored into a file.

.mkv or .vtt files are just serializations of sequences of packets.
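
To illustrate what "serialization of packets" means for a .vtt file, here is
a minimal sketch in C. Everything in it is an assumption made for the example:
SubPacket is a hypothetical stand-in (not FFmpeg's AVPacket), and the payload
layout "identifier, newline, settings, newline, cue text" is my reading of how
the cue is embedded in WebM, not something taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subtitle packet; NOT FFmpeg's AVPacket. */
typedef struct {
    int64_t pts_ms;       /* start time, in milliseconds */
    int64_t duration_ms;  /* display duration, in milliseconds */
    const char *data;     /* assumed layout: "id\nsettings\ncue text" */
} SubPacket;

/* Format a millisecond timestamp as hh:mm:ss.ttt. */
static void format_ts(char *buf, size_t size, int64_t ms)
{
    snprintf(buf, size, "%02d:%02d:%02d.%03d",
             (int)(ms / 3600000), (int)(ms / 60000 % 60),
             (int)(ms / 1000 % 60), (int)(ms % 1000));
}

/* Serialize one packet as a text WebVTT cue.
 * Returns the snprintf() result, or -1 if the payload lacks the two
 * newlines that separate identifier, settings and cue text. */
static int packet_to_vtt_cue(char *out, size_t size, const SubPacket *pkt)
{
    const char *id, *id_end, *settings, *settings_end, *text;
    char start[32], end[32];
    int id_len, settings_len;

    assert(pkt && pkt->data);
    id = pkt->data;
    id_end = strchr(id, '\n');
    if (!id_end)
        return -1;
    settings = id_end + 1;
    settings_end = strchr(settings, '\n');
    if (!settings_end)
        return -1;
    text = settings_end + 1;
    id_len = (int)(id_end - id);
    settings_len = (int)(settings_end - settings);

    format_ts(start, sizeof(start), pkt->pts_ms);
    format_ts(end, sizeof(end), pkt->pts_ms + pkt->duration_ms);

    /* Optional identifier line, timing line with optional settings,
     * cue text, then the blank line that terminates the cue. */
    return snprintf(out, size, "%.*s%s%s --> %s%s%.*s\n%s\n\n",
                    id_len, id, id_len ? "\n" : "",
                    start, end,
                    settings_len ? " " : "", settings_len, settings,
                    text);
}
```

Under those assumptions, the text cue is entirely derived from the packet and
its timestamps, so nothing in the text form has to be treated as the primary
representation.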

Since ffmpeg does everything (encoding, decoding, muxing, demuxing), it can
pretty much choose the packet format as it pleases. But there are choices
smarter than others: the choices that lead to minimum code complexity and
maximum compatibility. They are usually the same, unless someone did
something stupid (H.264-in-MP4, I am looking at you).

Compatibility means that you can use ffmpeg's components with third-party
libraries. For example, it is nice to be able to use ffmpeg's Ogg demuxer
with Xiph's Vorbis decoder, or Xiph's Ogg demuxer with ffmpeg's Vorbis
decoder. For that to work, all libraries must have the same packet format;
usually they do.

I do not believe it applies here. Is there a common library for parsing
WebVTT files that outputs something that can count as a packet?

That leaves us with code complexity. Let us go back one year, to when
ffmpeg had no support for WebVTT at all.

To read from text .vtt files (writing to them is symmetrical, so I will
only talk about demuxing), a brand new demuxer must be written, and it does
only that. In that case, the packet format is pretty much free: adding a
counter or a line break here or there, putting this info in that field or
this one, it does not change anything.

The Matroska demuxer is quite another story: it already knows about packets.
If you leave it as is, it should output the WebVTT packets just as they are
stored in the file. The code complexity argument is there: you can get full
WebVTT packets by just adding the mapping between CODEC_ID_WEBVTT and the
codec ID used in Matroska. A single line of code.
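
As a rough sketch of what that single line could look like, assuming a
table-driven lookup: the enum, struct, table and function names below are all
hypothetical and simplified for the example, and "D_WEBVTT/SUBTITLES" is the
codec ID string I understand WebM to use for WebVTT subtitle tracks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced codec enum; the real one is much larger. */
enum CodecID { CODEC_ID_NONE, CODEC_ID_SUBRIP, CODEC_ID_WEBVTT };

/* One entry maps a Matroska CodecID string to an internal codec. */
typedef struct {
    const char *mkv_name;
    enum CodecID id;
} CodecTag;

static const CodecTag mkv_codec_tags[] = {
    { "S_TEXT/UTF8",        CODEC_ID_SUBRIP },
    { "D_WEBVTT/SUBTITLES", CODEC_ID_WEBVTT },  /* the single added line */
};

/* Return the codec for a Matroska CodecID string, or CODEC_ID_NONE. */
static enum CodecID mkv_lookup_codec(const char *name)
{
    size_t i;

    assert(name);
    for (i = 0; i < sizeof(mkv_codec_tags) / sizeof(mkv_codec_tags[0]); i++)
        if (!strcmp(mkv_codec_tags[i].mkv_name, name))
            return mkv_codec_tags[i].id;
    return CODEC_ID_NONE;
}
```

With such an entry in place, the demuxer would hand the block payload
downstream unchanged, just as it does for other codecs.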

Of course, now that the text WebVTT parser and writer exist, things are a
little more complex. But in the long run I believe it would be better to
review the packet format without considering that issue.


I can state another argument, but I believe it boils down to the same basic
principles: the reasons that made you choose this particular format to store
in Matroska probably apply to AVPacket as well. Sure, AVPacket is slightly
more powerful than Matroska, with its side data and stuff. But the way I see
it, side data is for situations where the container features really exceed
the packet model, while WebVTT-in-Matroska fits the model perfectly.

Regards,

-- 
  Nicolas George