[FFmpeg-devel] Format of decoded text subtitles

Wed Aug 1 20:54:24 CEST 2012

Hi.

Currently, when a text subtitle is decoded, it is usually converted back to
ASS, which gives something like this:

sub->num_rects = 1; /* or more */
sub->rects[0]->ass =
  "Dialogue: 0,0:00:01.00,0:00:03.50,Default,,0,0,0,,Hello {\\i1}World{\\i0}!\n";

This is wrong in several ways:

First, it has timestamps in text format, which breaks any kind of trimming
or scaling.

Second, it has a lot of clutter that non-ASS decoders and encoders must deal
with.

Third, the ASS clutter actually depends on external information (the Format
header of the Events section).
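
To illustrate the first point, here is a minimal sketch of what any code that
wants to scale or shift such an event currently has to do: parse the textual
timestamp, do the arithmetic, and format it back. The helper names
(ass_time_to_cs, cs_to_ass_time) are hypothetical and not part of lavc; the
sketch assumes the usual H:MM:SS.CC layout with centiseconds.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: parse an ASS "H:MM:SS.CC" timestamp into
 * centiseconds. Returns -1 if the string does not match. */
static int64_t ass_time_to_cs(const char *s)
{
    int h, m, sec, cs;

    if (sscanf(s, "%d:%2d:%2d.%2d", &h, &m, &sec, &cs) != 4)
        return -1;
    return ((int64_t)h * 3600 + m * 60 + sec) * 100 + cs;
}

/* Hypothetical helper: format centiseconds back into "H:MM:SS.CC". */
static void cs_to_ass_time(int64_t t, char *buf, size_t size)
{
    assert(buf && size > 0 && t >= 0);
    snprintf(buf, size, "%d:%02d:%02d.%02d",
             (int)(t / 360000),      /* hours */
             (int)(t / 6000 % 60),   /* minutes */
             (int)(t / 100 % 60),    /* seconds */
             (int)(t % 100));        /* centiseconds */
}
```

Doubling the end time of the example line means calling
ass_time_to_cs("0:00:03.50"), multiplying the result by 2, and writing it back
with cs_to_ass_time() into the middle of the string. With numeric timestamps
kept outside the text, the same operation would be a single multiplication.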

Here is what I propose to fix this:

1. Deprecate the AVSubtitleRect.ass field; for compatibility reasons, we may
   put code in lavc to resynthesize it for some time, but that is all. For
   the same reason, the text field could contain the text completely
   stripped of markup.

2. Add a rich_text field instead. The text in this rich_text field has only
   local styling information, such as an italic span. The markup needs to be
   simple and to nest properly, so plain ASS is out of the question, but a
   slightly modified ASS is possible.

3. For global styling, things become a bit hairy.

  typedef struct AVSubtitleStyle {
      const char *name;
      AVDictionary *tags; /* or another structure */
  } AVSubtitleStyle;

  enum AVSubtitleStyleLevel {
      AV_SUBTITLE_STYLE_RECT,
      AV_SUBTITLE_STYLE_EVENT,
      AV_SUBTITLE_STYLE_GROUP,
      AV_SUBTITLE_STYLE_FILE,
      AV_SUBTITLE_STYLE_HARDCODED,
      AV_SUBTITLE_STYLE_NUMBER,
  };

  typedef struct AVSubtitleRect {
      ...
      const char *rich_text;
      AVSubtitleStyle *style[AV_SUBTITLE_STYLE_NUMBER];
  } AVSubtitleRect;

  Then, considering the example ASS line I quoted above, we would have:

  rect->style[AV_SUBTITLE_STYLE_EVENT] = &{
      .name = NULL,
      .tags = {
	  { "layer", "0" },
	  { "marginl", "0" }, /* maybe omitted */
	  { "marginr", "0" },
	  { "marginv", "0" },
      },
  };
  rect->style[AV_SUBTITLE_STYLE_GROUP] = &{
      .name = "Default",
      .tags = {
	  { "fontname", "DejaVu Serif" },
	  { "fontsize", "22" },
	  /* etc */
      },
  };
  rect->style[AV_SUBTITLE_STYLE_FILE] = &{
      .name = NULL,
      .tags = {
	  { "playresx", "640" },
	  { "playresy", "360" },
	  { "collisions", "normal" },
	  /* etc */
      },
  };

  I believe this is enough to represent the ASS subtleties abstractly, and
  most other subtitle systems can probably fit into it too.

  Problem: an encoder like ASS needs to be given the FILE-level style and all
  the GROUP-level styles at the start of encoding.
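
As a sketch of how a consumer might read the per-level styles from point 3:
the code below looks a tag up level by level and returns the first match. It
assumes that a more specific level (RECT, then EVENT, GROUP, FILE, HARDCODED)
overrides a more general one; the proposal does not state that precedence, so
treat it as an assumption. Tag, Style and style_get are hypothetical stand-ins
so that the example compiles without libavutil; they are not the
AVSubtitleStyle or AVDictionary API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same order as the proposed AVSubtitleStyleLevel enum. */
enum StyleLevel {
    STYLE_RECT,
    STYLE_EVENT,
    STYLE_GROUP,
    STYLE_FILE,
    STYLE_HARDCODED,
    STYLE_NUMBER,
};

/* Stand-in for a dictionary entry (hypothetical, not AVDictionary). */
typedef struct Tag {
    const char *key;
    const char *value;
} Tag;

/* Stand-in for the proposed AVSubtitleStyle. */
typedef struct Style {
    const char *name;
    const Tag *tags;    /* terminated by an entry with key == NULL */
} Style;

/* Return the value of key from the first level that defines it,
 * or NULL if no level does. Assumes lower enum values take precedence. */
static const char *style_get(const Style *const *style, const char *key)
{
    assert(style && key);
    for (int level = 0; level < STYLE_NUMBER; level++) {
        const Tag *t;

        if (!style[level] || !style[level]->tags)
            continue;   /* this level is not set for the rect */
        for (t = style[level]->tags; t->key; t++)
            if (!strcmp(t->key, key))
                return t->value;
    }
    return NULL;
}
```

With the example above, style_get(rect_styles, "fontname") would skip the
empty RECT level, find nothing at EVENT, and return "DejaVu Serif" from the
GROUP level. An encoder that only understands a few tags can ignore the rest.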

This is not a completely finalized proposal, but I believe it is a good
start.

Regards,

-- 
  Nicolas George