[FFmpeg-devel] Subtitles for GSoC

Gerion Entrup gerion.entrup.ffdev at flump.de
Wed Mar 16 17:02:25 CET 2016


On Thursday, 10 March 2016 22:47:38 CET Clément Bœsch wrote:
> On Thu, Mar 10, 2016 at 06:12:57PM +0100, Gerion Entrup wrote:
> [...]
> 
> > > - an USF demuxer which extracts the timing and text (with its markup) of
> > >   every event, and put them into an AVPacket
> > > - introduce an USF codec and write a decoder that will transform the
> > >   xml-like markup into ASS markup (see below)
> > 
> > I've implemented such a demuxer and decoder, based on SAMI (see other mail).
> > But XML parsing with the builtin tools is a real pain and hard to extend
> > later.
> > 
> > If the GSoC project comes off, please let me change this and maybe the SAMI
> > code into code based on an XML library. Header parsing should be doable then
> > as well.
> I don't mind that much, but keep in mind that SAMI is typically an old
> broken mutant HTML based format, with unclosed tags and various other
> insanities. And aside from the madness of the specifications, you have no
> idea how broken the files in the wild can be. If you want to pick a
> library, you have to make sure it's able to fall back on its feet when
> getting unexpected input. Doing a memory search for the timing marker is
> often the most reliable thing to do when dealing with subtitles, in
> general.
> 
> [...]

Combining this with Nicolas's idea could work. To be specific: write a
(generic) XML parser, but add the ability to declare that a tag XY is more
important than other tags and cannot be nested. Such a parser could then parse
the XML file normally up to the point where the file is broken (closing tag
not reached, etc.), output a warning, and then search for the next "important"
tag.
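
A minimal sketch of the recovery idea, in Python rather than C to keep it
short. It is not FFmpeg code and not a full XML parser: the function name
parse_events, the tag name "event" and the returned tuple layout are all
hypothetical, attributes are kept as a raw string, and a ">" inside an
attribute value is not handled.

```python
import re

# One opening or closing tag: group 1 is "/" for a closing tag, group 2 the
# tag name, group 3 the raw attribute string (left unparsed in this sketch).
TAG_RE = re.compile(r"<(/?)([A-Za-z][\w:-]*)([^<>]*)>")


def parse_events(data, important):
    """Return (events, warnings) for every `important` element in `data`.

    Each event is a tuple (raw_attributes, text). Parsing is strict inside an
    element; on the first inconsistency a warning is recorded, the text read
    so far is kept, and parsing resumes at the next opening `important` tag.
    """
    events, warnings = [], []
    important = important.lower()
    # Plain search for the start of the important tag (hypothetical marker).
    open_re = re.compile(r"<%s(?=[\s>])" % re.escape(important), re.IGNORECASE)

    pos = 0
    while True:
        start = open_re.search(data, pos)
        if start is None:
            break
        first = TAG_RE.match(data, start.start())
        if first is None:
            # The opening tag itself is cut off; skip it and keep searching.
            warnings.append(f"malformed <{important}> at offset {start.start()}")
            pos = start.end()
            continue

        attrs = first.group(3).strip()
        stack, text = [important], []
        pos = first.end()
        while stack:
            tag = TAG_RE.search(data, pos)
            if tag is None:
                # End of input reached with tags still open.
                text.append(data[pos:])
                warnings.append(f"input ends inside <{stack[-1]}>")
                pos = len(data)
                break
            text.append(data[pos:tag.start()])
            closing, name = tag.group(1) == "/", tag.group(2).lower()
            if not closing and name == important:
                # Important tags cannot nest, so the current one was never
                # closed. Restart the outer loop exactly at this tag.
                warnings.append(f"unclosed <{important}> before offset {tag.start()}")
                pos = tag.start()
                break
            if closing:
                if stack[-1] != name:
                    # Broken nesting: give up on this element and resync.
                    warnings.append(f"unexpected </{name}> at offset {tag.start()}")
                    nxt = open_re.search(data, tag.end())
                    pos = nxt.start() if nxt else len(data)
                    break
                stack.pop()
            elif not tag.group(3).rstrip().endswith("/"):
                stack.append(name)  # ordinary opening tag, not self-closing
            pos = tag.end()
        events.append((attrs, "".join(text).strip()))
    return events, warnings
```

Called as parse_events(data, "event"), this returns one entry per <event>
element even when an earlier element is left unclosed or closes the wrong
tag. Keeping the partial text of a broken element instead of dropping it is a
choice made for this sketch; a real decoder might prefer to discard it.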

But back to GSoC. I'm assuming that, sadly, no one is willing to mentor this
project (I have waited quite a while). So I'm thinking of looking for another
task, and therefore have a question: which (mentored) task do you think would
give the deepest understanding of the framework?



