[FFmpeg-devel] [PATCH v5 00/12] Subtitle Filtering

Soft Works softworkz at hotmail.com
Fri Sep 17 07:35:06 EEST 2021



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> Soft Works
> Sent: Thursday, 16 September 2021 19:46
> To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH v5 00/12] Subtitle Filtering
> 

[..]

> > - Sparseness. Subtitles streams have gaps, and synchronization with
> >   other streams requires a next frame, that can be minutes away or
> >   never come. This needs to be solved in a way compatible with
> >   processing.
> 
> I have kept the heartbeat logic from your sub2video implementation.
> It makes sense, is required and I can't think of any better way to
> handle this. It's just renamed to subtitle_heartbeat and used for
> all subtitle formats.

I'd like to add a few more details about that subject.

The known and retained sub2video mechanism is used for feeding
subtitle frames into the graph: it sends an initial empty frame
(unless there is an actual one) and it repeats the most recent
subtitle frame (or an empty subtitle frame) into the graph at a
certain minimum interval.
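
To illustrate the feeding logic described above, here is a minimal
sketch in C. It is not the code from the patch set: the names
HeartbeatState, heartbeat_note_frame and heartbeat_check are made up
for this example, and timestamps are plain integers in one assumed
common time base.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state, not the actual FFmpeg structures. */
typedef struct HeartbeatState {
    bool    sent_any;       /* has any frame entered the graph yet? */
    int64_t last_sent_pts;  /* pts of the most recent frame sent */
    int64_t min_interval;   /* minimum repeat interval (same time base) */
} HeartbeatState;

typedef enum HeartbeatAction {
    HB_NONE,          /* nothing needs to be sent right now */
    HB_SEND_INITIAL,  /* send an initial empty subtitle frame */
    HB_REPEAT_LAST    /* repeat the most recent (or an empty) frame */
} HeartbeatAction;

/* Record that an actual subtitle frame was sent at the given pts. */
static void heartbeat_note_frame(HeartbeatState *s, int64_t pts)
{
    assert(s);
    s->sent_any      = true;
    s->last_sent_pts = pts;
}

/* Decide what the heartbeat has to do at time now_pts. */
static HeartbeatAction heartbeat_check(HeartbeatState *s, int64_t now_pts)
{
    assert(s && s->min_interval > 0);
    if (!s->sent_any) {
        /* No actual frame has arrived so far: start with an empty one. */
        s->sent_any      = true;
        s->last_sent_pts = now_pts;
        return HB_SEND_INITIAL;
    }
    if (now_pts - s->last_sent_pts >= s->min_interval) {
        s->last_sent_pts = now_pts;
        return HB_REPEAT_LAST;
    }
    return HB_NONE;
}
```

In this sketch, the caller would invoke heartbeat_check() whenever
another stream advances, and heartbeat_note_frame() whenever a real
subtitle frame is sent, so that repeats only happen during gaps.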

What's happening inside the graph is a different story and depends
on each individual filter's implementation. For an example, let's
look at the new textsub2video filter which renders ass subtitles
(input) on transparent video frames (output).

On the input side, the filter receives subtitle frames containing
ass lines ("ass events"), but with libass, there's no relation
between the time when it is getting fed event lines and the output.
These are totally independent: at the input side, you could
feed the whole set of lines at once, while at the output side,
you are specifying the exact point in time for which you would
like the subtitle images to get rendered.
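
The independence of input and output can be shown with a small toy
model in C. This is only a sketch and does not use libass: ToyTrack,
toy_feed_event and toy_count_visible are hypothetical names, and
counting the visible events stands in for actually rendering images.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_EVENTS 64

/* Toy stand-in for an ass event: start time, duration and text. */
typedef struct ToyEvent {
    int64_t     start_ms;
    int64_t     duration_ms;
    const char *text;
} ToyEvent;

/* Toy stand-in for a subtitle track holding all events fed so far. */
typedef struct ToyTrack {
    ToyEvent events[MAX_EVENTS];
    size_t   nb_events;
} ToyTrack;

/* Input side: events may be fed at any time and in any order. */
static int toy_feed_event(ToyTrack *t, int64_t start_ms,
                          int64_t duration_ms, const char *text)
{
    assert(t && text && duration_ms >= 0);
    if (t->nb_events >= MAX_EVENTS)
        return -1;
    t->events[t->nb_events++] = (ToyEvent){ start_ms, duration_ms, text };
    return 0;
}

/* Output side: the caller picks the exact point in time to query. */
static size_t toy_count_visible(const ToyTrack *t, int64_t now_ms)
{
    size_t n = 0;
    assert(t);
    for (size_t i = 0; i < t->nb_events; i++) {
        const ToyEvent *e = &t->events[i];
        if (now_ms >= e->start_ms && now_ms < e->start_ms + e->duration_ms)
            n++;
    }
    return n;
}
```

The point of the sketch is that toy_feed_event() carries no notion of
"now": when an event is fed has no influence on what a later query
for a given time returns.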

As such, the textsub2video filter cannot simply produce one video
frame at the output side for each subtitle text frame at the input
side.
That could be either too many or too few frames, depending
on the available compute resources, the source material and
the desired results:

Normally, subtitle text is changing at a rather low frequency,
e.g. no more often than once or twice per second. That means
that a filter like textsub2video would need to generate a new 
overlay image no more than once or twice per second.

But wait - ass subtitles can also have animations; in that case,
every single frame rendered by libass will be different.
And animations require a much higher frame rate for a smooth
appearance.
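
The two cases can be contrasted with a small sketch in C. This is one
conceivable policy for illustration only, not what the patch set
does: next_render_ms and its parameters are hypothetical, and it
assumes the filter somehow knows whether the current events are
animated and when the text changes next.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the time (in ms) at which the next overlay
 * image should be rendered.
 *  - static text: nothing changes until the next text change
 *  - animated text: render at the given frame rate, but never skip
 *    past the next text change
 */
static int64_t next_render_ms(int64_t last_render_ms, bool animated,
                              int fps, int64_t next_text_change_ms)
{
    assert(fps > 0);
    assert(next_text_change_ms > last_render_ms);
    if (animated) {
        int64_t next_tick = last_render_ms + 1000 / fps;
        return next_tick < next_text_change_ms ? next_tick
                                               : next_text_change_ms;
    }
    return next_text_change_ms;
}
```

With static text and the next change 700 ms away, this sketch renders
once at 700 ms; with an animation at 25 fps it renders every 40 ms in
the same span, which is the difference in load described above.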
 


Essentially, this leads to a problem similar to the heartbeat
topic.

> 
> 
> > > - Part3, avfilter support for subtitles in AVFrames. At this point we
> > > have a defined structure to store subtitles in AVFrames, and actual
> > > code that can generate or consume them. When approaching this, the
> > > same rules apply as before, existing subtitle functionality, as crude
> > > as it may be, has to remain functional as exposed to the user.
> 
> Check.
> 
> > We need to decide which aspects of the subtitles formats are
> > negotiated.
> >
> > At least, obviously, the text or bitmap aspect will be, with a
> > conversion filter inserted automatically where needed.
> 
> I'm inserting the graphicsub2video filter for keeping compatibility
> with sub2video command lines, but I'm not a fan of any other
> automatic filter insertion.
> Let's talk about this in a separate conversation.
> 
> 
> > But depending on the answers to the questions in part 1, we may need to
> > negotiate the pixel format and colorspace too.
> 
> Currently, bitmap subtitles are always PAL8 and that should remain
> the meaning of SUBTITLE_BITMAP.
> 
> It will be easy to add additional subtitle formats which could
> use a different kind of bitmap format.
> 
> 
> > Unfortunately, the current negotiation code is messy and fragile. We
> > cannot afford to pile new code on top of it. However good the new code
> > may be, adding it on top of messy code would only make it harder to
> > clean up and maintain later. I absolutely oppose that.
> 
> The situation may be messy for audio and video, but for subtitles
> format negotiation is really simple (SUBTITLE_BITMAP, SUBTITLE_ASS
> or SUBTITLE_TEXT).
> 
> The improvements you are planning can easily be done afterwards
> as the subtitle format negotiation really doesn't add any significant
> technical debt to the situation.
> 
> 
> Kind regards,
> softworkz
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".

