[FFmpeg-devel] [PATCH] lavc: make invalid UTF-8 in subtitle output a non-fatal error

Reimar Döffinger Reimar.Doeffinger at gmx.de
Fri Jun 28 18:32:45 CEST 2013


On 28.06.2013, at 09:52, Nicolas George <nicolas.george at normalesup.org> wrote:
> On décadi 10 Messidor, year CCXXI, Reimar Döffinger wrote:
>> For both audio and video we do not fail with an error by default but
>> instead try to run e.g. error concealment. The subtitle code really is the
>> odd one here that does not behave in a way consistent with the other code.
> 
> The apparent inconsistency comes from the fact that we are talking about a
> completely different problem here.
> 
> Correct me if I am wrong, but error concealment is for when the input data
> is damaged, not for data that was YUV but decoded as RGB. If the input data
> is damaged, there is indeed nothing better to do than concealing the error
> as nicely as possible; there is no -please_magically_fix_my_data option.

> For subtitles, the issue is completely different: decoding fails because the
> application neglected to add the "-sub_charenc KOI8R" option.

Well, there are reasons why applications can't easily/always add the correct options.
One of them, and probably the one this is about, is when the application wants to do its own detection and conversion, for example after some additional buffering.
Maybe a charenc that says "just pass through whatever raw data you have" would be acceptable to you, too?
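
To illustrate what "its own detection and conversion" could look like on the application side, here is a minimal sketch. It assumes the application receives the raw, unconverted subtitle bytes; decode_subtitle and the candidate list are hypothetical names for illustration, not part of any lavc API.

```python
from typing import Sequence, Tuple


def decode_subtitle(raw: bytes,
                    candidates: Sequence[str] = ("utf-8", "koi8-r")) -> Tuple[str, str]:
    """Try each candidate encoding in order; return (text, encoding used).

    Hypothetical helper: a real application would likely buffer several
    subtitle events and use a proper detector before deciding.
    """
    for enc in candidates:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # Last resort: Latin-1 maps every byte to a character, so it never fails.
    return raw.decode("latin-1"), "latin-1"


# Valid UTF-8 is accepted as-is; KOI8-R bytes fail UTF-8 and fall through.
utf8_text, utf8_enc = decode_subtitle("Привет".encode("utf-8"))
koi_text, koi_enc = decode_subtitle("Привет".encode("koi8-r"))
print(utf8_enc, koi_enc)
```

The order of the candidates matters: UTF-8 is strict and rejects most legacy 8-bit text, while single-byte encodings such as KOI8-R accept nearly anything, so the strict one has to be tried first.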

>> Note that e.g. MPlayer does kind of error concealment for subtitles,
>> replacing invalid characters by a special symbol.
> 
> I have no objection to that kind of approach, assuming -sub_charenc was set
> (i.e. the subtitles text is actually broken), but wm4's approach has nothing
> to do with that.

Note that the reason I was considering it (and why falling back to some kind of "error concealment" might make sense) is that it gives the user a simple way to figure out the encoding manually.
I believe many native speakers have some idea of what text in their encoding looks like when it is incorrectly interpreted as ASCII or similar.
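
As a rough sketch of the two behaviours discussed here (replacing invalid sequences with a special symbol, versus showing the raw bytes in a wrong but lossless encoding), assuming KOI8-R input that is mistakenly treated as UTF-8. The function names are made up for illustration, and Latin-1 stands in for "ASCII or similar".

```python
def conceal_invalid_utf8(raw: bytes) -> str:
    """Decode as UTF-8, substituting U+FFFD for every invalid sequence."""
    return raw.decode("utf-8", errors="replace")


def as_latin1(raw: bytes) -> str:
    """Show the bytes as Latin-1: wrong for KOI8-R, but lossless and recognisable."""
    return raw.decode("latin-1")


raw = "Привет".encode("koi8-r")   # subtitle text in KOI8-R, encoding not declared

concealed = conceal_invalid_utf8(raw)
mojibake = as_latin1(raw)

print(concealed)   # replacement characters only; the original bytes are lost
print(mojibake)    # accented Latin letters; a reader may recognise the pattern

# The Latin-1 view can be converted back once the user guesses the encoding.
recovered = mojibake.encode("latin-1").decode("koi8-r")
print(recovered)
```

The replacement-symbol approach keeps the output valid but destroys the information, whereas the misinterpreted view keeps every byte and so remains recoverable.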

