[FFmpeg-devel] [PATCH 1/2] avcodec/{ass, webvttdec}: fix handling of backslashes

Soft Works softworkz at hotmail.com
Fri Feb 4 03:57:48 EET 2022

> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> Oneric
> Sent: Friday, February 4, 2022 2:01 AM
> To: FFmpeg development discussions and patches <ffmpeg-
> devel at ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 1/2] avcodec/{ass, webvttdec}: fix
> handling of backslashes
> On Thu, Feb 03, 2022 at 20:51:16 +0000, Soft Works wrote:
> > I think when you inject that word-joiner as a workaround for ass
> > parsing, you'll also need to make sure that it gets removed
> > when encoding to other formats.
> There's no way of knowing whether the word-joiner comes from
> a conversion performed by ffmpeg in the past or already existed
> in the original source.

That might be true, but I think it's valid to say that such characters
are very unusual in "original" subtitle sources, and that's why I don't
think it's a good idea for ffmpeg to start injecting them.

Subtitle implementations are often rather minimal, especially in
hardware devices, and might not always cover the full range of
UTF-8 specifics.

> However, the word-joiner does not alter the visual appearance and
> is unlikely to change line-breaking properties; that's why I chose
> a word-joiner. Therefore I don't think removing (only) the inserted
> word-joiners is possible,

Why not? As it seems to be required for ASS encoding only, all other
output formats should remain unaffected. 
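
A minimal sketch of what such a removal step could look like, written in
Python for brevity rather than ffmpeg's C. It is not taken from the patch:
the function name is hypothetical, and it assumes the word-joiner (U+2060)
is always injected directly after a backslash.

```python
WORD_JOINER = "\u2060"  # U+2060 WORD JOINER, invisible when rendered


def strip_joiner_after_backslash(text: str) -> str:
    """Remove a word-joiner only where it directly follows a backslash.

    Hypothetical helper, assuming the injected joiner always sits right
    after the backslash. Joiners elsewhere in the text are left alone.
    """
    return text.replace("\\" + WORD_JOINER, "\\")


if __name__ == "__main__":
    injected = "123456\\" + WORD_JOINER + "NAscending"
    cleaned = strip_joiner_after_backslash(injected)
    print(len(injected), len(cleaned))
```

Note that this would also strip a backslash/word-joiner pair that was
already present in the original source, since the two cases cannot be
told apart.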

> but also not necessary.

I'm not sure whether all ffmpeg text-sub encoders can handle
those chars, though that could of course be verified.

But what remains is the question about the effect on end devices
which are consuming that output.

Finally, those chars are a pest. I'm using them myself for a
specific use case, but when you don't know they are there, they can
drive you totally mad, to the point where you start thinking your
system or software is faulty.


Open your patch file [2/2] and search for the string
"123456\NAscending". You can see the string in two lines, but search
will only find one of them.

Or just look at the two lines directly. They are preceded by + and -
even though both appear identical. 
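
The effect can be reproduced with a small Python sketch. The sample strings
are an assumption for illustration: they place a word-joiner (U+2060)
between the backslash and the "N", which may not match the patch file
byte for byte. The helper name is hypothetical.

```python
import unicodedata


def find_format_chars(line: str) -> list[tuple[int, str]]:
    """Return (index, Unicode name) for each invisible format char (category Cf)."""
    return [
        (i, unicodedata.name(ch, f"U+{ord(ch):04X}"))
        for i, ch in enumerate(line)
        if unicodedata.category(ch) == "Cf"
    ]


plain = "123456\\NAscending"
hidden = "123456\\\u2060NAscending"  # assumed position of the joiner

# Both strings look the same when displayed, but a plain search fails.
print(plain in hidden)            # False
print(find_format_chars(plain))   # []
print(find_format_chars(hidden))  # [(7, 'WORD JOINER')]
```

Running a check like this over a file is one way to find out why two
lines that look the same do not compare equal.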

So the consequences need consideration as well, such as how many
developers (inside and outside of ffmpeg) this would drive nuts
over the years, and how many of them would start hating ffmpeg for
it once they found out.

