[FFmpeg-devel] [RFC] Subtitle Filtering Ramp-Up
softworkz .
softworkz at hotmail.com
Tue Jun 3 17:20:59 EEST 2025
Hello everybody,
[just deleted a whole page of introduction text that was
essentially pointless]
Getting right to the point, I will give it once another try,
rework and resubmit an updated version of the subtitle filtering
patchset, including all improvements and fixes that have been
made in the meantime.
To avoid unnecessary work, it would be great to get one
crucial part settled up-front.
New fields on AVFrame for Subtitle Filtering
============================================
Some of them are used from so many places, that it's a really
tedious job to change them, hence I'd like to get this cleared-up
before.
Let's go through them:
- enum AVMediaType type;
This had never been questioned, it should be clear why it's needed
- unsigned num_subtitle_areas;
- AVSubtitleArea **subtitle_areas;
There were extensive discussions about why the data isn't stored in
the video buffers, but eventually there was no more objection
- AVBufferRef *subtitle_header;
The reason why the ASS header needs to be moved around has been
explained and wasn't objected eventually
- int repeat_sub;
The use case is evident. But I have seen that a number of similar
fields have been removed meanwhile and replaced by flags.
So, the question here would be whether this should also be migrated
to a flag?
- Time Fields
This has been the one most discussed topic, but nothing has
changed in this regard: it requires a separate start and a
separate duration field for subtitles.
There's one improvement I make, which is the addition of a flow_mode
field. I'll explain the details further down below.
My preferred way for including the field in the AVFrame struct
used to be a sub-struct - simply because it unclutters the list
of AVFrame members:
struct SubtitleTiming
{
int64_t start_pts;
int64_t duration;
AVSubtitleFlowMode flow_mode;
} subtitle_timing;
The same could be done for the other fields:
struct Subtitles
{
unsigned num_subtitle_areas;
AVSubtitleArea **subtitle_areas;
AVBufferRef *subtitle_header;
} subtitles;
Or all in one:
struct Subtitles
{
unsigned num_subtitle_areas;
AVSubtitleArea **subtitle_areas;
AVBufferRef *subtitle_header;
int64_t start_pts;
int64_t duration;
AVSubtitleFlowMode flow_mode;
} subtitles;
Of course, they can also be added as flat members, but will
require a prefix then, like:
int64_t subtitle_start_pts;
int64_t subtitle_duration;
AVSubtitleFlowMode subtitle_flow_mode;
Please let me know which way you would deem to be most suitable and
whatever thoughts or questions you might have beyond this.
Subtitle Flow Mode
==================
Please see the list and explanation of flow modes here:
https://github.com/softworkz/SubtitleFilteringDemos/issues/4
This is not something conceptually new. It has always existed, but just
implicitly behind the scenes and you had to know about it in some way
for creating working FFmpeg command lines, because not
every subtitle decoder, neither every subtitle filter nor every subtitle
encoder is compatible with each other, even when there's a match in
subtitle type (text/bitmap). Actually, it also needs the flow modes
to match or be compatible. In the original set, it just failed in those
cases or deadlocked.
Making it explicit now in the form of the AVSubtitleFlowMode enum
provides a number of benefits:
- It finally provides an understandable explanation for why those two
extra timing fields are needed
- It makes clear what I had continuously failed to explain
- By giving names to the modes, it will be much easier to understand
and talk about
- It allows to provide better feedback to users in case of error,
like writing out something like "incompatible flow modes"
- It will be possible to simplify usage: By including this as
a parameter in filter negotiation, we will be able to perform
auto-filter insertions in cases where its reasonable.
Please let me know about your thoughts!
Best regards,
softworkz
More information about the ffmpeg-devel
mailing list