[FFmpeg-devel] [PATCH 2/5] lavu: add text_file API.

Nicolas George nicolas.george at normalesup.org
Sat Aug 10 15:03:50 CEST 2013

On Tridi 23 Thermidor, year CCXXI, wm4 wrote:
> This does not work. You're trying to detect legacy 8-bit codepage
> encodings by testing whether conversion to iconv works.
> For example, as far as I remember, ISO-8859-1 is at best a
> subset of WINDOWS-1252, so your code will never detect ISO-8859-1.

You are wrong, once again. windows-1252 uses some codes in the 128-159 range
for extra characters and leaves the others undefined, while ISO-8859-1
defines the whole range as control characters.

ISO-8859-1 is placed in the default list of encodings as a catch-all case,
since any binary sequence is valid ISO-8859-1.
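
To illustrate the point, here is a minimal sketch of ordered detection with a
catch-all. It is not the code from the patch: the helper names, the candidate
list and its order are assumptions made for the example, and the validity
checks are hand-written instead of going through iconv. The five undefined
windows-1252 bytes are taken from the standard Unicode mapping; some
converters are more lenient about them.

```c
#include <stddef.h>
#include <stdint.h>

/* windows-1252 leaves five byte values without a character:
 * 0x81, 0x8D, 0x8F, 0x90 and 0x9D. */
static int cp1252_byte_is_defined(uint8_t c)
{
    return c != 0x81 && c != 0x8D && c != 0x8F && c != 0x90 && c != 0x9D;
}

/* Hypothetical helper: 1 if every byte maps to a windows-1252 character. */
static int is_valid_cp1252(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (!cp1252_byte_is_defined(buf[i]))
            return 0;
    return 1;
}

/* ISO-8859-1 assigns a character to all 256 byte values,
 * so any input is accepted. */
static int is_valid_latin1(const uint8_t *buf, size_t len)
{
    (void)buf;
    (void)len;
    return 1;
}

struct candidate {
    const char *name;
    int (*check)(const uint8_t *buf, size_t len);
};

/* Try each candidate in order; the first one that accepts the data wins.
 * The list below is an assumption for this sketch, with ISO-8859-1 last
 * so that it only acts as a fallback. */
static const char *detect_encoding(const uint8_t *buf, size_t len)
{
    static const struct candidate list[] = {
        { "WINDOWS-1252", is_valid_cp1252 },
        { "ISO-8859-1",   is_valid_latin1 },
    };

    for (size_t i = 0; i < sizeof(list) / sizeof(list[0]); i++)
        if (list[i].check(buf, len))
            return list[i].name;
    return NULL;
}
```

With this list, a buffer containing byte 0x81 is rejected by the windows-1252
check and falls through to ISO-8859-1, so both encodings can be detected.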

> This kind of detection will most likely lead to broken text, which is
> silently accepted without even printing a warning anywhere.

If ISO-8859-1 is in the list of encodings, that is the normal and expected
behaviour. As Reimer pointed out some time ago, users of exotic encodings
are usually familiar with how their language looks when improperly read as
ISO-8859-1, and know how to deal with it.

> Additionally, the application (and not even a ffmpeg.c user) does not have
> the slightest control over how character set detection and conversion is
> handled

Again, you are wrong, that is exactly what the AVTextFile.encodings field is
there for.

> Not checking for av_calloc result.

Indeed, thanks.

> Why does it create an array of lines? This seems slow, complex, and
> unnecessary.

It makes further line-based processing much easier. That is the whole point
of this kind of API.

> Separate \n and \r processing also prevents you from
> treating MacOS line endings correctly.

I believe this kind of file disappeared more than a decade ago. If that is
not so, adding a flag to deal with them is easy enough.
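
As a rough sketch of what such a flag could look like, the function below
splits a NUL-terminated buffer in place into an array of lines. The names
split_lines and SPLIT_LONE_CR are hypothetical and do not come from the
patch. "\n" and "\r\n" always end a line; a lone "\r" ends one only when the
flag is set.

```c
#include <stdlib.h>
#include <string.h>

#define SPLIT_LONE_CR 1 /* hypothetical flag: treat a lone '\r' as end of line */

/* Split the NUL-terminated buffer in place: line terminators are overwritten
 * with '\0' and the returned array points at the start of each line.
 * Returns NULL on allocation failure; the caller frees the array. */
static char **split_lines(char *buf, int flags, size_t *nb_lines)
{
    size_t cap = 16, n = 0;
    char **lines = malloc(cap * sizeof(*lines));
    char *p = buf;

    if (!lines)
        return NULL;
    while (*p) {
        char *eol = p;

        if (n == cap) {
            char **tmp = realloc(lines, 2 * cap * sizeof(*lines));
            if (!tmp) {
                free(lines);
                return NULL;
            }
            lines = tmp;
            cap *= 2;
        }
        lines[n++] = p;

        /* Stop at '\n', at "\r\n", or at a lone '\r' if the flag asks for it. */
        while (*eol && *eol != '\n' &&
               !(*eol == '\r' && ((flags & SPLIT_LONE_CR) || eol[1] == '\n')))
            eol++;
        if (!*eol)
            break;              /* last line has no terminator */
        if (eol[0] == '\r' && eol[1] == '\n')
            *eol++ = '\0';      /* swallow the '\r' of a "\r\n" pair */
        *eol = '\0';
        p = eol + 1;
    }
    *nb_lines = n;
    return lines;
}
```

Without the flag, "x\ry\rz" comes back as a single line; with SPLIT_LONE_CR
it comes back as three.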

> Why does this use its own callback facility instead of using avio?

Because avio is in lavf, not lavu.

> All in all, this patch adds lots of code without actually improving
> anything, except testing for UTF-16.

Hopefully you will forgive me for not trusting your judgement after the many
technical mistakes you made.


  Nicolas George