[FFmpeg-devel] [PATCH v5 00/25] Subtitle Filtering 2022

Soft Works softworkz at hotmail.com
Tue Aug 23 02:08:26 EEST 2022



________________________________________
From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> on behalf of Soft Works <softworkz at hotmail.com>
Sent: Tuesday, August 23, 2022 12:08 AM
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH v5 00/25] Subtitle Filtering 2022



________________________________________
From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> on behalf of Anton Khirnov <anton at khirnov.net>
Sent: Monday, August 22, 2022 2:18 PM
To: ffmpeg-devel
Subject: Re: [FFmpeg-devel] [PATCH v5 00/25] Subtitle Filtering 2022

Quoting Jean-Baptiste Kempf (2022-08-21 12:41:20)
>
> On Sun, 21 Aug 2022, at 11:41, Paul B Mahol wrote:
> > We should move forward and merge this considerable subtitle work
>
> Are there parts of this work that have reached majority consensus?

Almost exactly identical objections to the basic aspects of the API were
raised independently by me, Lynne, and Hendrik.
IIUC Soft Works still refuses to address them (though it's not so easy
to tell in a 200-email thread).

---

Anton,

thanks for the reply. Please correct me if I'm wrong, but from my memory
and understanding, the one and only remaining point of discussion was
the necessity of having a separate field for the subtitle start PTS in
addition to the AVFrame's PTS field.

I wasn't refusing to make a change; rather, I have put a lot of effort
into explaining the reasons for that necessity.
I did that in several IRC chats, on the ML, and recently I wrote
an article specifically to address that concern and explain the
background in more detail:

https://github.com/softworkz/SubtitleFilteringDemos/issues/1

It has remained unanswered (though perhaps it simply went unnoticed?).

The bottom line is that without the separate subtitle PTS field,
the whole patchset cannot work.

@Anton: you said the reason for this is that I designed it that way.
But that is not the case. The reason this field is needed is the way
libavfilter is designed to work.
The only way to avoid that second field would be to fundamentally
rework the scheduling of frames in filter graphs. Making changes
to a core part like that (which is working quite well and reliably, btw)
would impose a huge risk of regressions and incompatibilities
(it's more a guarantee of issues than merely an abstract risk),
which doesn't make any sense to do at this time and in this
context.
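
To illustrate what I mean (a simplified sketch only; the field and helper
names are illustrative and not verbatim from the patchset):

    /* The decoded subtitle frame carries its display start time in a
     * dedicated field, because frame->pts belongs to the filter graph
     * and may get re-stamped while the frame travels through it. */
    AVFrame *sub = get_decoded_subtitle_frame();       /* hypothetical helper */
    int64_t display_start = sub->subtitle_start_pts;   /* proposed extra field */

    /* libavfilter's scheduling may clone the frame and move its pts
     * forward, e.g. to keep it in step with a video stream: */
    AVFrame *repeated = av_frame_clone(sub);
    repeated->pts = next_video_pts;                    /* graph-driven time */

    /* ...but the intended on-screen start time must still be available
     * to downstream filters and encoders: */
    int64_t still_valid = repeated->subtitle_start_pts;

If the start time existed only in frame->pts, it would be lost as soon as
the graph re-stamps the frame.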

My conclusion is that having that one additional field in AVFrame
is by far the better option.

Please let me know what you think should be done and what kind
of concerns you have with regard to the additional field in AVFrame.
It might not qualify as a top candidate for "API beauty", but it's not
one of the worst offenders either.
Your preceding arguments were based on the assumption that it
could easily be avoided. I hope the referenced article helps to
explain why it can't (not without fundamental changes to libavfilter).

Please also let me know whether there are any other objections or
desired changes and I'll try to address them. I am in no way refusing
to make changes, as long as they are feasible.

Thanks,
softworkz


------------

Other alternatives that were discussed:

1. Move the SubtitleStartTime field to AVSubtitleArea

One AVFrame can have multiple AVSubtitleArea instances (bitmaps in the
case of graphic subs, or text events in the case of text subs).

But all AVSubtitleArea instances of an AVFrame must always have the
same start time and duration, so it would be semantically incorrect
to keep those values at the area level (if a frame had multiple areas
with different timings, which value would be authoritative? The one from
area 0, while the values from area 1 to area N are simply ignored?)
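
To make the ambiguity concrete (sketch only; member names and the
accessor shown are illustrative, not taken from the patchset):

    typedef struct AVSubtitleArea {
        int64_t start_pts;   /* would have to be identical for all areas */
        int64_t duration;    /* same here */
        /* ... bitmap or text payload ... */
    } AVSubtitleArea;

    /* Every consumer would have to pick one area, e.g. the first one,
     * and either ignore or validate the values of all the others: */
    int64_t start = frame->subtitle_areas[0]->start_pts;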


2. Put the SubtitleStartTime into AVFrame's user/opaque data field

This is what Lynne had suggested. It avoids adding an additional
field to AVFrame, but it blocks the use of the user/opaque field for
actual user data in the case of subtitle frames.
Also, every subtitle decoder, every subtitle encoder and many subtitle
filters would need to cast the user/opaque field of AVFrame back and
forth to a time value.
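
For illustration, this is roughly what every producer and consumer of
subtitle frames would need to do (a sketch using AVFrame.opaque_ref; the
exact convention would still have to be agreed on):

    /* decoder side: stash the start time in the opaque buffer */
    AVBufferRef *buf = av_buffer_alloc(sizeof(int64_t));
    if (!buf)
        return AVERROR(ENOMEM);
    *(int64_t *)buf->data = subtitle_start_pts;
    frame->opaque_ref = buf;

    /* filter/encoder side: recover it again */
    int64_t start_pts = *(int64_t *)frame->opaque_ref->data;

Besides the boilerplate, the field would then no longer be available
for actual user data on subtitle frames, as mentioned above.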


Neither of the two options makes much sense to me, but if there
were consensus on one of them, I'd be OK with it and would make
those changes.

Thanks,
softworkz

