[FFmpeg-devel] Internal handling of subtitles in ffmpeg

Michael Niedermayer michaelni
Fri Jan 2 17:47:24 CET 2009


On Fri, Jan 02, 2009 at 05:01:39PM +0100, Reimar D?ffinger wrote:
> On Fri, Jan 02, 2009 at 02:32:18PM +0100, Michael Niedermayer wrote:
> > what about?
> > 
> > Index: libavcodec/avcodec.h
> > ===================================================================
> > --- libavcodec/avcodec.h	(revision 16398)
> > +++ libavcodec/avcodec.h	(working copy)
> > @@ -2375,15 +2375,34 @@
> >  
> >  } AVPaletteControl attribute_deprecated;
> >  
> > +enum AVSubtitleType {
> > +    BITMAP_SUBTITLE,                ///< A bitmap, pict will be set
> > +    TEXT_SUBTITLE,                  ///< Plain text, the text and ass fields will be set
> 
> IMO: Plain text, the text field must be set and is authoritative. ass
> and pict fields may contain approximations.
> 
> > +    ASS_SUBTITLE,                   ///< Text+formating, the text and ass fields will be set
> 
> IMO: Formatted text, the ass field must be set and is authoritative. pict
> and text fields may contain approximations.


> 
> Reason: require generation of alternative formats when they are not
> wanted?

did you want simplicity? if so text/ass surely should always be set for
text/ass types, i cant imagine the O(n) loop over n chars where n being
< 1000 once a second could matter speedwise ...

but if you prefer, it surely could be omitted


> Authoritative means that format will be used when the input is in any
> way processed by libav*, all others will be ignored.
> (actually, probably it is better to make authoritative mean that this
> format is preferred if it is supported 

this was my intent, you though spelled it out much nicer than i implicated
it ...


> - I guess your reason for allowing
> multiple ones in one AVSubtitleRect is to allow caching of conversion results or?).

yes


> 
> >  typedef struct AVSubtitleRect {
> >      uint16_t x;
> >      uint16_t y;
> 
> 
> 
> ///< BITMAP_SUBTITLE: top right corner of AVPicture position
>      TEXT_SUBTITLE: optional recommended center position. 0 means N/A
>      ASS_SUBTITLE: no meaning, must be 0.
> 
> I do not like the special meaning of 0 for TEXT_SUBTITLE.
> It would also mean that only one of pict, text and ass can be set.
> 

> >      uint16_t nb_colors;
> 
> Does that one serve any purpose? 

grep nb_colors has hits beyond that iam not the author of that code ...


> Also, it might make sense to also allow a colour for plain text.
> 
> > -    int linesize;
> > -    uint32_t *rgba_palette;
> > -    uint8_t *bitmap;
> > +
> > +    /**
> > +     * data+linesize for the bitmap of this subtitle.
> > +     * can be set for text/ass as well once they where rendered
> > +     */
> > +    AVPicture pict;
> > +    enum AVSubtitleType type;
> > +
> > +    char *text;                     ///< 0 terminated plain UTF-8 text
> > +
> > +    /**
> > +     * 0 terminated ASS/SSA compatible event line.
> > +     * The pressentation of this is unaffected by the other values in this
> > +     * struct.
> > +     */
> > +    char *ass;
> 
> I think I'd tend towards
> union {
> AVSubtitleBitmap pict;
> AVSubtitleText text;
> char *ass;
> }
> 
> with
> struct AVSubtitleBitmap {
> uint16_t x, y, w, h;
> AVPicture pict;
> }
> 
> struct AVSubtitleText {
> int x, y; // center position in percent (0 - 100), -1 not specified
> uint32_t color; // RGBA colour. Full transparency means unspecified.
> char *text;
> }
> 

> Colour is somewhat questionable I admit, why not add bold, font size,
> name etc. until you have reimplemented ASS. I don't really know the
> answer to that.

Dialogue event lines in ASS have 10 fields
if we drop the actual text and effects 8
They are
Layer, Start, End, Style, Name, MarginL, MarginR, MarginV

What is your oppinion about adding the ones that are interger to the struct?

Also style is a string refering to a style which has
fontname, fontsize, PrimaryColour, SecondaryColour, OutlineColor, Bold
Italic, Underline, Strikeout, ScaleX, ScaleY, Spacing, Angle, BorderStyle,
Outline,Shadow,Alignment,MarginL,MarginR,MarginV

Similarly to above the integer ones (which are most) could be added to the
struct, making access easier and faster.


> I suspect you have your good reasons for not wanting a union, still some
> grouping IMO would be good.

a union would directly conflict with the idea of caching, that is the
requirements you set that there be a plain text and a plain ASS char* field
always and directly could then only be met if these where outside the union.

Also as you said some things like width are likely only calculateable by
rendering the whole. Now if we end up needing width for something the
rendering would only be needed once if it can be cached ...


> Also I think AVSubtitle should specify which types might be used in the
> AVSubtitleRects so you can do something simple in the spirit of
> if (subs.types != TEXT_SUBTITLE)
>   av_convert_subs(subs, TEXT_SUBTITLE);

hmm, yes but i dont know how to do that cleanly (in terms of documentation
because if the enum has 1,2,4 values and is set per OR it will be confusing
in relation to the real enum field in AVSubtitleRect)


[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Democracy is the form of government in which you can choose your dictator
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090102/14aad377/attachment.pgp>



More information about the ffmpeg-devel mailing list