[FFmpeg-devel] [RFC] AVSubtitles rework
Clément Bœsch
ubitux at gmail.com
Tue Sep 11 19:45:31 CEST 2012
On Mon, Sep 03, 2012 at 09:40:25PM +0200, Clément Bœsch wrote:
> On Mon, Sep 03, 2012 at 08:01:04PM +0200, Nicolas George wrote:
> > On octidi, 18 Fructidor, year CCXX, Clément Bœsch wrote:
> > > I'm not very fond of introducing a new structure for a few reasons:
> > > - having an AVSubtitle2 will require maintaining both paths for longer,
> > > and the problem is already hard to deal with even if starting from
> > > scratch
> > > - if we do that, it will require duplicating the current public API for a
> > > while, which sounds like kind of a pain
> >
> > All that is true, but that is the burden of compatibility. If we do not
> > introduce a new structure, all programs that currently allocate AVSubtitle
> > themselves will break if dynamically linked with a more recent lavc.
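
To make the ABI concern above concrete, here is a minimal sketch. The
SubV1/SubV2 layouts and the helper names are hypothetical (they are NOT
the real AVSubtitle definition); the point is only that a caller built
against the old header reserves sizeof(SubV1) bytes, while a newer
library may write a field that lies past that size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout seen by an application built against the old header. */
typedef struct SubV1 {
    uint16_t format;
    unsigned num_rects;
    void   **rects;
    int64_t  pts;
} SubV1;

/* Hypothetical layout used by a newer library: one field appended. */
typedef struct SubV2 {
    uint16_t format;
    unsigned num_rects;
    void   **rects;
    int64_t  pts;
    int      pix_fmt;   /* new field, unknown to old callers */
} SubV2;

/* Bytes the newer layout needs beyond what an old caller allocated. */
static size_t bytes_past_old_allocation(void)
{
    /* Appending must not move existing fields, or even reads would break. */
    assert(offsetof(SubV2, pts) == offsetof(SubV1, pts));
    return sizeof(SubV2) - sizeof(SubV1);
}

/* Non-zero if the appended field still fits in a buffer of sizeof(SubV1). */
static int new_field_fits_old_allocation(void)
{
    return offsetof(SubV2, pix_fmt) + sizeof(int) <= sizeof(SubV1);
}
```

On common ABIs new_field_fits_old_allocation() is 0, so a library writing
pix_fmt into a caller-allocated SubV1 would write out of bounds. A new
structure (or a library-side allocator) avoids that.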
> >
> > > - I don't think the current AVSubtitle API is really used apart from
> > > MPlayer, but I may be wrong
> >
> > A Google search for avcodec_decode_subtitle2 shows VLC, XBMC, and a few
> > small projects.
> >
>
> TL;DR: follow up and extend brainstorming after VDD/subtitles talks
>
> Mmh OK. Well then should we introduce an experimental AVSubtitle2 directly
> into libavutil to ease the integration with libavfilter later on?
>
> If we are to start a new structure, we should design it properly from
> the start: a subtitle structure able to store the two types of subtitles
> we already discussed:
>
> == bitmap subtitles ==
>
> For the bitmap stuff I don't have much of an opinion on how it should be
> done. IIRC, we agreed that the current AVSubtitle structure was mostly
> fine (since AVSubtitle was originally designed for this kind of
> subtitles) except that it is missing the pixel format information, and we
> were wondering where to put that info (in each AVSubtitle2->rects or at
> the root of the AVSubtitle2 structure).
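
A minimal sketch of the two placements, assuming hypothetical Sketch*
names (nothing here is an agreed API). It keeps both options at once: a
format at the root, plus an optional per-rect value that overrides it
when set. Whether an override is wanted at all is exactly the open
question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pixel format enum. */
enum SketchPixFmt {
    SKETCH_PIX_FMT_NONE = -1,
    SKETCH_PIX_FMT_PAL8,
    SKETCH_PIX_FMT_RGBA
};

typedef struct SketchRect {
    int x, y, w, h;
    enum SketchPixFmt pix_fmt;  /* per-rect placement; NONE means "inherit" */
} SketchRect;

typedef struct SketchSubtitle {
    enum SketchPixFmt pix_fmt;  /* root placement: one format for all rects */
    unsigned nb_rects;
    SketchRect *rects;
} SketchSubtitle;

/* Effective format of rect i: the rect's own value if set, else the root's. */
static enum SketchPixFmt sketch_rect_pix_fmt(const SketchSubtitle *sub,
                                             unsigned i)
{
    assert(sub != NULL);
    if (i >= sub->nb_rects)
        return SKETCH_PIX_FMT_NONE;
    if (sub->rects[i].pix_fmt != SKETCH_PIX_FMT_NONE)
        return sub->rects[i].pix_fmt;
    return sub->pix_fmt;
}
```

With only the root field, every rect of one subtitle must share a format;
with only the per-rect field, users have to check each rect separately.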
>
> == styled events for text based subtitles ==
>
> For the styled text events, each AVSubtitle2 would hold N usable
> AVSubtitleEvent entries (or maybe only one?) instead of an
> AVSubtitle->rects[N]->ass. This is what the subtitles decoders would
> output (in a decode2 callback for example, depending on how we keep
> compat with AVSubtitle) and what the users would consume (by reading that
> AST to use it in their rendering engine/converter/etc, or simply passing
> it along to our encoders and muxers). Additionally, we may want to
> provide a "TEXT" encoder that outputs a raw text version (stripping all
> markup) for simple rendering engines.
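
As an illustration of what "stripping all markup" could mean, here is a
minimal sketch for ASS-style text. The function name is hypothetical, and
it assumes the only markup is override blocks in braces plus the \N hard
line break; a real "TEXT" encoder would work from the event AST rather
than re-parse a string.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy src into dst, dropping ASS override blocks ("{...}") and turning
 * the "\N" hard line break into '\n'. Output is always NUL-terminated
 * and truncated to dst_size - 1 characters. Returns the length written.
 * Simplification: an unterminated '{' swallows the rest of the line.
 */
static size_t sketch_strip_ass_markup(const char *src, char *dst,
                                      size_t dst_size)
{
    size_t n = 0;
    int in_block = 0;

    assert(src != NULL && dst != NULL);
    if (dst_size == 0)
        return 0;

    for (; *src && n + 1 < dst_size; src++) {
        if (in_block) {
            if (*src == '}')
                in_block = 0;       /* end of override block */
        } else if (*src == '{') {
            in_block = 1;           /* start of override block */
        } else if (*src == '\\' && src[1] == 'N') {
            dst[n++] = '\n';
            src++;                  /* skip the 'N' as well */
        } else {
            dst[n++] = *src;
        }
    }
    dst[n] = '\0';
    return n;
}
```

For example, the input {\i1}Hello{\i0}\Nworld becomes "Hello", a newline,
then "world".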
>
> So, here is a suggestion for the typical workflow:
>
>                         /* common transmuxing/coding path */
> DEMUXER -> [AVPacket] -> DECODER -> [AVSubtitle2] -> ENCODER -> [AVPacket] -> MUXER
>                                           |
>                                           |
>                       /* lavfi/hardsub or video player path */
>                                           |
>                                          / \
>                                         /   \
>        custom rendering                /     \
>        engine using the <--------- text?     bitmap?
>        AVSubtitle2->events          /           \
>        structure                   /             \
>                          libass to render?    bitmap overlay
>                                 / \
>                            yes /   \ no
>                               /     \
>                   ENCODER:assenc   ENCODER:textenc   (<== both lavc encoders)
>                              /          \
> AVPacket->data is an ASS    /            \
> payload (no timing)        /              \ AVPacket->data is raw text
> (need to mux for timings) /                \
>                          /                  \
>             libass:parse&render         freetype/mplayer-osd/etc
>
>
> At least, that's how I would see the usage from a user perspective.
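
The lower half of that diagram is a small decision tree, sketched below.
All Sketch* names are hypothetical, and giving the custom engine priority
over libass is an assumption of this sketch, not something the diagram
states.

```c
#include <assert.h>

typedef enum SketchSubKind {
    SKETCH_SUB_BITMAP,
    SKETCH_SUB_TEXT
} SketchSubKind;

typedef enum SketchRenderPath {
    SKETCH_PATH_BITMAP_OVERLAY,  /* blend the decoded rects onto the video */
    SKETCH_PATH_CUSTOM_EVENTS,   /* walk AVSubtitle2->events in a custom engine */
    SKETCH_PATH_ASSENC_LIBASS,   /* assenc packet, then libass parse & render */
    SKETCH_PATH_TEXTENC_SIMPLE   /* textenc packet, then freetype/mplayer-osd/etc */
} SketchRenderPath;

/* Pick the rendering path for one decoded subtitle. */
static SketchRenderPath sketch_pick_render_path(SketchSubKind kind,
                                                int has_custom_engine,
                                                int use_libass)
{
    switch (kind) {
    case SKETCH_SUB_BITMAP:
        return SKETCH_PATH_BITMAP_OVERLAY;
    case SKETCH_SUB_TEXT:
        if (has_custom_engine)
            return SKETCH_PATH_CUSTOM_EVENTS;
        return use_libass ? SKETCH_PATH_ASSENC_LIBASS
                          : SKETCH_PATH_TEXTENC_SIMPLE;
    }
    assert(!"unknown subtitle kind");
    return SKETCH_PATH_TEXTENC_SIMPLE;
}
```

The transmuxing/coding path at the top of the diagram needs no such
branching: the AVSubtitle2 goes straight from decoder to encoder.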
>
> Now, if we agree on such a model, we need to focus on how to store the
> events & styles. Basically, each AVSubtitle2 must expose the following
> as an AST:
>
> - an accessible header with all the global styles (such as an external
> .css for WebVTT, the event styles in the ASS header, palettes with some
> formats, etc.); maybe that one would belong in the AVCodecContext
> - one (or more?) events with links to style structures: either in the
> global header, or associated with that specific event. BTW, this
> "styles" info must be able to carry various kinds of information, such
> as karaoke or ruby annotations (WebVTT supports that,
> https://en.wikipedia.org/wiki/Ruby_character)
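
A minimal sketch of that model, with hypothetical Sketch* names and
fields (the storage format is still to be agreed on): a header holding
the global styles, and events that either link to a header style by name
or carry their own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct SketchStyle {
    const char *name;
    int bold, italic;
    uint32_t primary_color;          /* e.g. 0xRRGGBB */
} SketchStyle;

/* Global header: could live in the AVCodecContext, as suggested above. */
typedef struct SketchHeader {
    const SketchStyle *styles;
    unsigned nb_styles;
} SketchHeader;

typedef struct SketchEvent {
    int64_t start_ms, duration_ms;
    const char *text;
    const char *style_name;          /* link into the header, may be NULL */
    const SketchStyle *local_style;  /* event-specific style, may be NULL */
} SketchEvent;

/* Style to use for an event: its own if present, else the header's. */
static const SketchStyle *sketch_event_style(const SketchHeader *hdr,
                                             const SketchEvent *ev)
{
    unsigned i;

    assert(hdr != NULL && ev != NULL);
    if (ev->local_style)
        return ev->local_style;
    if (!ev->style_name)
        return NULL;
    for (i = 0; i < hdr->nb_styles; i++)
        if (!strcmp(hdr->styles[i].name, ev->style_name))
            return &hdr->styles[i];
    return NULL;                     /* unknown style name */
}
```

Karaoke or ruby information would need more than one style per event
(per-span styling), which this flat sketch deliberately leaves out.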
>
> We still need to agree on how to store that (and Nicolas already proposed
> something related), but I'd like to check first whether everyone agrees
> with such a model. Then we can move on to the API for text styling.
>
Any comments?
When I'm done with the 3 projects I'm working on right now, I will likely
start this work.
--
Clément B.