[FFmpeg-devel] [Sponsored work] ASCII Text subtitles in DVB subtitles

Anssi Hannula anssi.hannula at iki.fi
Mon Jan 13 08:31:10 CET 2014


12.01.2014 22:59, Andrey Utkin wrote:
> 2014/1/12 Sylvain <sylvain at lahiette.com>:
>> Thanks for your response.
>>
>> In fact it is possible to send text subtitles in EN 300 743, in
>> addition to bitmap subtitles. The document states that transferring
>> text-only strings shall be negotiated between the broadcaster and the
>> set-top-box manufacturer, as they need to agree on the font used on
>> the decoder side (at least). See page 26 of ETSI EN 300 743 V1.2.1,
>> field "object_coding_method".
> 
> There's indeed something looking like text info, but not exactly usual
> text strings:
> 
> character_code (it is 16 bits each - my remark): Specifies a character
> through its index number in the character table identified in the
> subtitle_descriptor. Each reference to the character table is counted
> as a separate character code, even if the resulting
> character is non spacing. For instance floating accents are counted as
> separate character codes.
> 
> Floating accents as separate character codes - looks unusual for any
> encoding I know.
> How exactly the character table should be defined in
> subtitle_descriptor is not defined in the doc, or anywhere else.
> So this looks to be left implementation specific, although I am not sure yet.
> In that case I think it is unlikely that a patch for such a feature
> could be included in mainline ffmpeg. But anything is possible in
> private development.

The VDR implementation [1] contains the following comment:

// "ETSI EN 300 743 V1.3.1 (2006-11)", chapter 7.2.5 "Object data segment" specifies
// character_code to be a 16-bit index number into the character table identified
// in the subtitle_descriptor. However, the "subtitling_descriptor" <sic> according to
// "ETSI EN 300 468 V1.13.1 (2012-04)" doesn't contain a "character table identifier".
// It only contains a three letter language code, without any specification as to how
// this is related to a specific character table.
// Apparently the first "code" in textual subtitles contains the character table
// identifier, and all codes are 8-bit only. So let's first make Data a string of
// 8-bit characters

Not sure if this is just one provider doing the character map this way,
or if it is more common despite not being in the standard...
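
To make the interpretation in that comment concrete, here is a minimal
sketch in C. It is not the VDR code and not an FFmpeg API: the function
name and parameters are hypothetical, and it assumes the codes still
arrive in 16-bit big-endian slots (as the standard lays them out) with
only the low byte of each slot being meaningful. The first code is
treated as the character table identifier and the rest as 8-bit
characters.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only (not taken from VDR or FFmpeg).
 *
 * buf       : num_codes character_code fields, 2 bytes each, big-endian
 * out       : receives the 8-bit characters, NUL-terminated
 * table_id  : receives the low byte of the first code, which the VDR
 *             comment suggests is the character table identifier
 *
 * Assumption: only the low byte of each 16-bit slot carries data.
 * Returns the number of characters written, excluding the terminator.
 */
static size_t text_codes_to_bytes(const uint8_t *buf, size_t num_codes,
                                  uint8_t *out, size_t out_size,
                                  uint8_t *table_id)
{
    size_t n = 0;

    if (out_size == 0)
        return 0;
    out[0] = '\0';
    if (num_codes == 0)
        return 0;

    /* First code: character table identifier (low byte of slot 0). */
    *table_id = buf[1];

    /* Remaining codes: one 8-bit character per 16-bit slot. */
    for (size_t i = 1; i < num_codes && n + 1 < out_size; i++)
        out[n++] = buf[2 * i + 1];

    out[n] = '\0';
    return n;
}
```

How the resulting table identifier maps to an actual character set is
exactly the part that the standard leaves open, so the sketch stops at
extracting the raw bytes.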

[1] http://projects.vdr-developer.org/git/vdr.git/tree/dvbsubtitle.c

-- 
Anssi Hannula

