[FFmpeg-devel] [RFC] Talk about subtitles

Wed Nov 23 13:56:45 CET 2011

On duodi 2 Frimaire, year CCXX, Clément Bœsch wrote:
> [TL;DR: I want to improve/rewrite a real subtitles support in FFmpeg and need
> some hints.]

My opinion on this:

In the long run, this belongs in libavfilter, with a filtergraph fragment
that looks like this:

 text sub  +----------+ bitmap sub  +---------+  video with alpha
---------->| txt2bmap |------------>| sub2vid |------.
           +----------+             +---------+       \    +---------+
                                                       `-->|         |
    video                                                  | overlay |->
---------------------------------------------------------->|         |
                                                           +---------+

Or possibly a single sub_overlay filter merging sub2vid with overlay, which
may be more efficient since subtitles are only small rectangles.

(In the very long run, I believe libavfilter should handle even the decoders
and encoders, and possibly the demuxers and muxers, but that is another
story.)

Before we get there, we need:

- support for subtitles in libavfilter;

- support for complex filtergraphs in the command line tools, handled more
  efficiently and less awkwardly than with movie/amovie/smovie.

This will not happen tomorrow, but it will come eventually.

In the short run, the above features can be hardcoded into the ffmpeg
command line tool. Fortunately, all code written here can be later reused
almost as is for the corresponding filter. And in fact, most of the code is
probably already in Stefano's proposal for vf_ass.

In practice, that could look like this:

- A -hardsub option, similar to -map, to tell ffmpeg that it needs to
  overlay the subtitle stream #S.s onto the video stream #V.v.

- In transcode_subtitles, if the stream is used for hardsubbing, keep the
  decoded subtitle around instead of avsubtitle_free()ing it.

- In do_video_out, just before the call to avcodec_encode_video, call
  avcodec_overlay_subtitle(big_picture, current_sub).

- For text subtitles, some kind of avcodec_render_subtitle function,
  probably based on libass (but a rudimentary internal implementation may
  be useful), called by avcodec_overlay_subtitle if necessary.
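
To make the overlay step above more concrete, here is a minimal sketch of
what the core of such a function could do for one bitmap rectangle. The
SubRect and Plane types and the overlay_rect name are hypothetical,
simplified stand-ins invented for this example (a single 8-bit plane with a
per-pixel alpha map); they are not FFmpeg's actual structures, and a real
implementation would also have to deal with palettes, chroma planes and
pixel formats.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for a decoded bitmap subtitle
 * rectangle and one 8-bit video plane; not FFmpeg's real structures. */
typedef struct SubRect {
    int x, y, w, h;        /* position and size on the video plane */
    const uint8_t *luma;   /* w*h samples to draw */
    const uint8_t *alpha;  /* w*h opacity samples, 0 = fully transparent */
} SubRect;

typedef struct Plane {
    int w, h, stride;
    uint8_t *data;
} Plane;

/* Blend one rectangle onto the plane, clipping it to the plane's bounds. */
static void overlay_rect(Plane *dst, const SubRect *r)
{
    assert(dst && r && dst->stride >= dst->w);
    for (int sy = 0; sy < r->h; sy++) {
        int dy = r->y + sy;
        if (dy < 0 || dy >= dst->h)
            continue;
        for (int sx = 0; sx < r->w; sx++) {
            int dx = r->x + sx;
            if (dx < 0 || dx >= dst->w)
                continue;
            unsigned a = r->alpha[sy * r->w + sx];
            unsigned s = r->luma[sy * r->w + sx];
            uint8_t *p = &dst->data[dy * dst->stride + dx];
            /* Integer alpha blend with rounding:
             * a = 255 yields s, a = 0 leaves *p unchanged. */
            *p = (uint8_t)((s * a + *p * (255 - a) + 127) / 255);
        }
    }
}
```

Only the pixels covered by the rectangle are touched, which is why a
dedicated overlay for subtitles can be cheaper than a generic full-frame
one.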

As a side note, since everything would be temporary (until proper support is
in lavfi), we can skip optimizations. For example, avcodec_overlay_subtitle
can always copy the whole frame, and later we can rely on lavfi's
permissions framework to take care of that.

Concerning the various markups for text subtitles, there are usually two
options in such a case:

- We choose or define a universal format that can reliably represent all
  other markups, so that the "markup X -> universal markup -> markup X"
  round trip is always lossless. And then we always use it.

- We flag all text subtitle data with the used markup, we define a few
  conversion functions and use them as needed, possibly using some kind of
  "shortest path in a graph" algorithm when direct conversion is not
  implemented.
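
As an illustration of the second option, the "shortest path in a graph"
lookup can be sketched as a breadth-first search over a table of available
direct converters. Everything named here (the Markup enum, the direct
table, find_conversion_path) is hypothetical and only meant to show the
idea; the set of markups is an arbitrary example.

```c
#include <assert.h>

/* Hypothetical markup identifiers, for illustration only. */
enum Markup { MK_PLAIN, MK_SRT, MK_ASS, MK_MICRODVD, MK_NB };

/* direct[a][b] != 0 means a direct converter from a to b is implemented.
 * On success, fills path[0..len] with the chain of markups (from first,
 * to last) and returns len, the number of conversions needed.
 * Returns -1 if no chain of converters leads from 'from' to 'to'. */
static int find_conversion_path(unsigned char direct[MK_NB][MK_NB],
                                int from, int to, int path[MK_NB])
{
    int prev[MK_NB], queue[MK_NB];
    int head = 0, tail = 0, len = 0;

    assert(from >= 0 && from < MK_NB && to >= 0 && to < MK_NB);
    for (int i = 0; i < MK_NB; i++)
        prev[i] = -1;                 /* -1 = not reached yet */
    prev[from] = from;
    queue[tail++] = from;

    /* Breadth-first search: the first time 'to' is reached, it is
     * reached through a minimal number of conversions. */
    while (head < tail && prev[to] < 0) {
        int cur = queue[head++];
        for (int next = 0; next < MK_NB; next++) {
            if (direct[cur][next] && prev[next] < 0) {
                prev[next] = cur;
                queue[tail++] = next;
            }
        }
    }
    if (prev[to] < 0)
        return -1;

    /* Count the hops, then write the path back to front. */
    for (int n = to; n != from; n = prev[n])
        len++;
    for (int i = len, n = to; i >= 0; i--, n = prev[n])
        path[i] = n;
    return len;
}
```

With converters MicroDVD -> SRT, SRT -> ASS, SRT -> plain and ASS -> plain
registered, a request for MicroDVD -> plain would go through SRT in two
steps rather than three. Fewest steps is only one possible criterion; a
real implementation might prefer to weight edges by how lossy each
conversion is.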

Working with a universal markup is easier, and for text subtitles,
efficiency is not really a concern. Unfortunately, I am not sure that a
universal markup can actually exist. With the correct API, converting in all
directions may be relatively painless.

Note that apart from the markup, the encoding is also a problem.

Concerning VOBSUB and closed captions, that looks rather like a demuxer
problem, and thus quite orthogonal to the discussion at hand.

Regards,

-- 
  Nicolas George