[FFmpeg-devel] [PATCH] lavc: make invalid UTF-8 in subtitle output a non-fatal error

wm4 nfxjfg at googlemail.com
Fri Jun 28 10:49:42 CEST 2013

On Fri, 28 Jun 2013 10:42:34 +0200
Nicolas George <nicolas.george at normalesup.org> wrote:

> > If you don't accept his patch, please tell me how to disable the
> > UTF-8 check otherwise.
> Set the sub_charenc options to a reasonable value. I can not help you

Such as?

> more unless you explain exactly how you get the subtitles and what
> you intend to do with them after decoding, but I am pretty sure that
> the current API is useable (while incomplete, missing auto-detection)
> and you are misusing it.

I get them from libavformat demuxers, but also elsewhere. I actually
can perform codepage auto-detection on subs read by libavformat
demuxers (it's really awkward: read a number of subtitle packets,
concatenate their contents, then run the charset detector on it). But
it's disabled by default and doesn't guarantee success anyway. In some
cases, subtitles might be demuxed from interleaved files, in which case
auto-detection can't reasonably be performed.
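
To make the detection procedure concrete, here is a minimal sketch in
Python. It is not what any player actually ships: the function name, the
candidate list and the packet limit are hypothetical, and the "detector"
is just a strict-decode trial that stands in for a real statistical
charset detector.

```python
from itertools import islice
from typing import Iterable, Optional

# Hypothetical candidate list, tried in order. A real detector would use
# statistical models instead of a fixed list.
CANDIDATES = ("utf-8", "cp1252")


def guess_charset(packets: Iterable[bytes], max_packets: int = 50) -> Optional[str]:
    """Guess the encoding of a subtitle stream from its first packets.

    Concatenates up to max_packets payloads and returns the first
    candidate encoding that decodes the sample without error, or None
    if no candidate fits.
    """
    # Read a bounded number of packets and join their contents.
    sample = b"\n".join(islice(packets, max_packets))
    for name in CANDIDATES:
        try:
            sample.decode(name)  # strict decoding: raises on invalid bytes
        except UnicodeDecodeError:
            continue
        return name
    return None
```

Even in this toy form, the guess depends on having enough packets up
front, and it can return None or a wrong answer, which is why it cannot
be relied on.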

I have the impression that you still believe the charset problem can
be solved perfectly. This is not the case. Such problems are very common
even today, and just showing an error message (or even dropping broken
text) won't help.
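
As an illustration of what "non-fatal" could look like on the consumer
side, here is a small Python sketch. It is an assumption about the
desired behaviour, not the patch itself or libavcodec's API: the function
name is hypothetical, and invalid bytes are replaced with U+FFFD so the
rest of the line still reaches the user.

```python
from typing import Tuple


def decode_subtitle_lenient(data: bytes) -> Tuple[str, bool]:
    """Decode subtitle bytes as UTF-8 without failing on bad input.

    Returns (text, was_valid). Invalid sequences become U+FFFD instead
    of causing the whole event to be rejected.
    """
    try:
        return data.decode("utf-8"), True
    except UnicodeDecodeError:
        # Keep whatever is readable; the caller may log a warning.
        return data.decode("utf-8", errors="replace"), False
```

The caller can still report the problem by checking the returned flag,
but the subtitle event is displayed either way.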
