[FFmpeg-devel] [PATCH 1/4] avformat/assdec: UTF-16 support

wm4 nfxjfg at googlemail.com
Tue Sep 2 21:09:46 CEST 2014


On Tue, 2 Sep 2014 21:05:08 +0200
Reimar Döffinger <Reimar.Doeffinger at gmx.de> wrote:

> On Tue, Sep 02, 2014 at 08:56:09PM +0200, wm4 wrote:
> > Use the UTF-16 BOM to detect UTF-16 encoding. Convert the file contents
> > to UTF-8 on the fly using FFTextReader, which acts as a converting wrapper
> > around AVIOContext. It can also work on a static buffer, which is needed for
> > format probing. The FFTextReader wrapper now also takes care of skipping
> > the UTF-8 BOM.
> 
> Haven't reviewed it in detail, but shouldn't it also detect anything
> with a 0 byte in the first 2 characters as UTF-16?
> Interpreting it as any other text format is unlikely to work anyway,
> and I think most subtitle formats will start with an ASCII character,
> giving near 100% reliability without any BOM.

Interesting idea, but on the other hand I haven't seen any UTF-16
subtitles without BOM. (My guess is that they're all produced on
Windows...)
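
For illustration, here is a rough sketch of how the two approaches could
be combined: check for a BOM first, then fall back to the zero-byte
heuristic suggested above. This is not the code from the patch and it
does not use FFTextReader; the names guess_text_encoding and
enum text_encoding are made up for this example. It also assumes that
"first 2 characters" means the first two bytes of the file, and that the
file starts with a non-NUL ASCII character.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not part of FFmpeg's API. */
enum text_encoding {
    ENC_8BIT,     /* no BOM and no zero byte: assume UTF-8/ASCII-compatible */
    ENC_UTF8_BOM, /* UTF-8 with a leading BOM */
    ENC_UTF16LE,
    ENC_UTF16BE,
};

/*
 * Guess the encoding from the first bytes of a buffer.
 * *skip receives the number of BOM bytes to skip (0 if there is no BOM).
 */
static enum text_encoding guess_text_encoding(const uint8_t *buf, size_t size,
                                              size_t *skip)
{
    assert(buf && skip);
    *skip = 0;

    /* UTF-8 BOM: EF BB BF */
    if (size >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF) {
        *skip = 3;
        return ENC_UTF8_BOM;
    }

    if (size >= 2) {
        /* UTF-16 BOMs: FF FE (little endian), FE FF (big endian) */
        if (buf[0] == 0xFF && buf[1] == 0xFE) {
            *skip = 2;
            return ENC_UTF16LE;
        }
        if (buf[0] == 0xFE && buf[1] == 0xFF) {
            *skip = 2;
            return ENC_UTF16BE;
        }

        /* No BOM: an ASCII character encoded as UTF-16 has exactly one
         * zero byte, and its position gives the byte order. */
        if (buf[0] != 0 && buf[1] == 0)
            return ENC_UTF16LE;
        if (buf[0] == 0 && buf[1] != 0)
            return ENC_UTF16BE;
    }

    return ENC_8BIT;
}
```

With this sketch, a BOM-less buffer starting with the bytes '[' 0x00
would be reported as UTF-16LE, and 0x00 '[' as UTF-16BE. The heuristic
would misfire on a UTF-16 file whose first character is not ASCII, which
is why the BOM check comes first.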

