[FFmpeg-devel] [PATCH v5 00/12] Subtitle Filtering

Soft Works softworkz at hotmail.com
Fri Sep 17 08:13:40 EEST 2021



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of Soft Works
> Sent: Friday, 17 September 2021 06:35
> To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH v5 00/12] Subtitle Filtering
> 
> 
> 
> > -----Original Message-----
> > From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> > Soft Works
> > Sent: Thursday, 16 September 2021 19:46
> > To: FFmpeg development discussions and patches <ffmpeg-
> > devel at ffmpeg.org>
> > Subject: Re: [FFmpeg-devel] [PATCH v5 00/12] Subtitle Filtering
> >
> 
> [..]
> 
> > > - Sparseness. Subtitles streams have gaps, and synchronization with
> > >   other streams requires a next frame, that can be minutes away or
> > > never
> > >   come. This needs to be solved in a way compatible with
> > processing.
> >
> > I have kept the heartbeat logic from your sub2video implementation.
> > It makes sense, is required and I can't think of any better way to
> > handle this. It's just renamed to subtitle_heartbeat and used for
> > all subtitle formats.
> 
> I'd like to add a few more details about that subject.
> 
> The known and retained sub2video mechanism is used for feeding
> subtitle frames into the graph: it sends an initial empty frame
> (unless an actual one is available) and then repeats the most
> recent subtitle frame (or an empty one) into the graph at a
> certain minimum interval.
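> 
> A minimal sketch of that feeding logic (the helper names, struct
> fields and the interval constant below are illustrative only, not
> the actual ffmpeg.c identifiers):
> 
>     /* Illustrative sketch - hypothetical helpers, not ffmpeg.c code */
>     static void subtitle_heartbeat(InputStream *ist, int64_t pts)
>     {
>         if (!ist->last_sub_frame) {
>             /* nothing seen yet: send an initial empty frame so
>              * downstream filters have something to sync against */
>             send_empty_subtitle_frame(ist, pts);
>         } else if (pts - ist->last_sub_pts >= MIN_HEARTBEAT_INTERVAL) {
>             /* repeat the most recent subtitle frame whenever the
>              * gap exceeds the minimum interval */
>             send_subtitle_frame(ist, ist->last_sub_frame, pts);
>         }
>     }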
> 
> What's happening inside the graph is a different story and depends
> on each individual filter's implementation. For an example, let's
> look at the new textsub2video filter which renders ass subtitles
> (input) on transparent video frames (output).
> 
> On the input side, the filter receives subtitle frames containing
> ass lines ("ass events"), but with libass there is no relation
> between the time at which it gets fed event lines and the output.
> The two sides are completely independent: on the input side, you
> could feed the whole set of lines at once, while on the output
> side, you specify the exact point in time for which you want the
> subtitle images to be rendered.
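> 
> In libass terms, that decoupling looks roughly like this (a
> simplified sketch; width, line, now_ms etc. are placeholders and
> error handling is omitted):
> 
>     ASS_Library  *lib      = ass_library_init();
>     ASS_Renderer *renderer = ass_renderer_init(lib);
>     ASS_Track    *track    = ass_new_track(lib);
>     ass_set_frame_size(renderer, width, height);
> 
>     /* input side: feed event lines as they arrive - this could
>      * just as well happen all at once, long before rendering */
>     ass_process_chunk(track, line, line_len, start_ms, duration_ms);
> 
>     /* output side: request the rendered image for an arbitrary
>      * point in time, independent of when the events were fed */
>     int changed;
>     ASS_Image *img = ass_render_frame(renderer, track, now_ms, &changed);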
> 
> As such, the textsub2video filter cannot simply produce one video
> frame at the output side for each subtitle text frame at the
> input side: that could be either too many or too few frames,
> depending on the available compute resources, the source material
> and the desired results:
> 
> Normally, subtitle text changes at a rather low frequency, e.g.
> no more often than once or twice per second. That means a filter
> like textsub2video needs to generate a new overlay image no more
> than once or twice per second.
> 
> But wait - ass subtitles can also have animations; in that case,
> every single frame rendered by libass will be different, and
> animations require a much higher frame rate for a smooth
> appearance.
> 

Sorry - accidentally sent before completion...


The new textsub2video filter has a 'framerate' parameter,
which controls the frequency at which the overlay frame
is recreated.

For example:

ffmpeg -i INPUT -filter_complex "[0:2]textsub2video=r=2[sub];[0:1][sub]overlay=repeatlast=0" output.ts

provides better performance, because overlay frames are
recreated only twice per second, while

ffmpeg -i INPUT -filter_complex "[0:2]textsub2video=r=25[sub];[0:1][sub]overlay=repeatlast=0" output.ts

provides smooth animation, because overlay frames are
recreated frequently.
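
Internally, that pacing presumably comes down to stepping the
overlay timestamp by the output frame interval. A minimal sketch
(not the actual filter code; render_overlay_at and the variables
are hypothetical):

    /* with r=2 the step is 500 ms, with r=25 it is 40 ms */
    int64_t step = av_rescale_q(1, av_inv_q(frame_rate), AV_TIME_BASE_Q);
    while (next_pts <= current_pts) {
        render_overlay_at(next_pts);   /* hypothetical helper */
        next_pts += step;
    }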


There's no epic conclusion - just another example of the
wide range of new possibilities.

Regards,
softworkz





