[FFmpeg-devel] [PATCH v20 02/20] avutil/frame: Prepare AVFrame for subtitle handling
Paul B Mahol
onemda at gmail.com
Fri Dec 10 10:01:11 EET 2021
On Thu, Dec 9, 2021 at 10:33 PM Daniel Cantarín <canta at canta.com.ar> wrote:
> Hi there.
> This is my first message to this list, so please excuse me if I
> unintentionally break some rule.
>
> I've read the debate between Soft Works and others, and would like to
> add something to it.
> I don't have as deep a knowledge of the libs as other people here show. My
> knowledge comes from working with live streams for some years now. And I
> do understand the issue about modifying a public API for some use case
> under debate: I believe it's a legit line of questioning of Soft Works'
> patches. However, I also feel we live streaming people are often left
> aside as a "border case" when it comes to ffmpeg/libav usage, and this
> bias is present in many subtitles/captions debates.
>
> I work with Digital TV signals as input, and several different target
> outputs more related to live streaming (mobiles, PCs, and so on). The
> target location is Latin America, and thus I need subtitles/captions for
> when we use English-language audio (we speak mostly Spanish in LATAM). TV
> people send you TV subtitle formats: scte-27, dvb subs, and so on. And
> live streaming people use other subtitle formats, mostly vtt and ttml.
> I've found that CEA-608 captions are the most compatible caption format,
> as it's understood natively by smart tvs and other devices, as well as
> non-natively by any other device using popular player-side libraries.
> So, I've made my own filter for generating CEA-608 captions for live
> streams, using ffmpeg with the previously available OCR filter. Tried
> VTT first, but it was problematic for live-streaming packaging, and with
> CEA-608 I could just ignore that part of the process.
>
> While doing those filters, besides the whole deal of implementing the
> conversion from text to CEA-608, I struggled with stuff like this:
> - the sparseness of input subtitles, leading to OOM in servers and
> stalled players.
> - the "libavfilter doesn't take subtitle frames" and "it's all ASS
> internally" issues.
> - the "captions timings vs video frame timings vs audio timings"
> problems (people talk a lot about syncing subs with video frames, but
> rarely against actual dialogue audio).
> - other (meta)data problems, like screen positioning or text encoding.
>
> These are all problems Soft Works seems to have faced as well.
>
> But of all the problems regarding live streaming subtitles with ffmpeg
> (and there are LOTS of them), the most annoying problem is always this:
> almost every time someone talked about implementing subtitles in filters
> (in mail lists, in tickets, in other places like stack overflow,
> etcetera), they always assumed input files. When people specifically
> talked about live streams, their peers always reasoned with a file
> mindset, and described live streaming subtitles/captions as a "border case".
>
> Let me be clear: these are not "border case" issues, but actually appear
> in the most common use cases of live streaming transcoding. They all
> appear *immediately* when you try to use subtitles/captions in live
> streams.
>
> I got here (I mean this thread) while looking for ways to fix some
> issues in my setup. I was reconsidering VTT/TTML generation instead of
> CEA-608 (as rendering behaves significantly differently from device to
> device), and thus I was about to generate subtitle type output from some
> filter, was about to create my own standalone "heartbeat" filter to
> normalize the sparseness, and so on and so on: again, all stuff Soft
> Works seems to be handling as well. So I was quite happy to find someone
> working on this again; last time I saw it in ffmpeg's
> mailing list/patchwork
> (https://patchwork.ffmpeg.org/project/ffmpeg/patch/20161102220934.26010-1-u@pkh.me)
> the code there seemed to die, and I was already too late to say anything
> about it. However, reading the other devs' reaction to Soft Works' work
> was worrying, as it felt as if history wanted to repeat itself (take a look
> at the discussions back then).
>
> It has been years so far of this situation. This time I wanted to
> annotate this, as this conversation is still warm, in order to help Soft
> Works's code survive. So, dear devs: I love and respect your work, and
> your opinion is very important to me. I do not claim to know ffmpeg's
> code better than you do. I do not claim to know better what to do with
> libavfilter's API. Please understand: I'm not here to be right, but to
> note my point of view. I'm not better than you; quite the contrary,
> most likely. But I also need to solve some very real problems, and can't
> wait until everything else is in wonderful shape to do it. I also can't
> add lots of conditions in order to just fix the most immediate issues,
> as is the case with sparseness and heartbeat frames, which was a
> heated debate years ago and seems to still be one, while I find it to be
> the most obvious common sense backwards-compatible solution
> implementation. Stuff like "clean" or "well designed" can't be more
> important than actually working use cases while not breaking previously
> implemented ones: because it's far easier to fix little blocks of "bad"
> code rather than design something everybody's happy with (and the history
> of the project seems to be quite eloquent about that, especially when it
> comes to these particular use cases). Also, I have my own patches (which
> I would like to upstream some day), and I can tell the API does change
> quite regularly: I understand that should be a curated process, but
> adding a single property for live-streaming subtitles also isn't
> anybody's death, and thus that shouldn't be the kind of issue that
> blocks big and important code implementations like the ones Soft Works
> is working on; I just don't have the time to do all the work he/she's
> doing myself, and it could be another bunch of years until someone
> else has it. I can't tell if Soft Works' code is good enough for you, or
> if the ideas behind it are the best there are, but I can tell you the
> implementations are on the right track: as a live streaming worker, I
> know the problems he/she mentions in their exchanges with you all, and I
> can tell you they're all blocking issues when dealing with live
> streaming. Soft Works is not "forcing it" into the API, and these are not
> "border cases" but normal and frequent live streaming issues. So,
> please, if you don't have the time Soft Works has, or the will to
> tackle the issues he/she's tackling, I beg you at least don't kill the
> code this time if it does not break working use cases.
>
>
You cannot do much more than send mails, because this long ago stopped being
a technical issue and became an ego-booster issue.
>
> Thanks,
> Daniel.