[FFmpeg-devel] [PATCH] "Mojibake" in Japanese

Vladimir Mosgalin mosgalin at VM10124.spb.edu
Mon Feb 6 20:33:47 CET 2012


Hi Tetsuya Yoshida!

 On 2012.02.07 at 03:21:24 +0900, Tetsuya Yoshida wrote next:

> > At very least it should be some option to select encoding, where you can
> > pick CP1251, SJIS or some other encoding (I can imagine that this
> > problem exists not only in Russia and Japan).
> 
> I believed it was only Japanese problem,
> because Shift JIS problem is written in ID3 article of Wikipedia(en).
> Thank you for letting me know.

The "Mojibake" article, http://en.wikipedia.org/wiki/Mojibake, actually
describes the same problem for Russian :)
(As for ID3, the Russian Wikipedia page on ID3 mentions it, but the English
one does not, as far as I can see.)

And yes, it's a very common situation with ID3. For many years before ID3v2
came out, everyone wrote Cyrillic tags in .mp3 files in (usually) CP1251.
Even after ID3v2 came out, it took so many years to be adopted that lots of
people kept violating the standard and used CP1251 instead of Unicode. I
don't think this will ever be fixed completely, since it's a habit formed
over many years. It can only be solved by switching from mp3+ID3 to other
formats with better tag formats, such as Vorbis tags or APE tags.

As an example, here is one: http://www.mediafire.com/?8h84a3b9r481e77

Input #0, mp3, from '06. Иноходец.mp3':
  Metadata:
    title           : Èíîõîäåö
    artist          : Âëàäèìèð Âûñîöêèé
    album           : Àíòèàëêîãîëüíàÿ
    track           : 06/11
    TLEN            : 149000

Here, when the tag bytes are treated as CP1251, the correct values would be:
title: Иноходец
artist: Владимир Высоцкий
album: Антиалкогольная
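
The mapping from the garbled values to the correct ones can be reproduced
with a short Python sketch. It assumes the mojibake arose because CP1251
bytes were decoded as ISO-8859-1 (Latin-1), which is what the output above
suggests. The helper name fix_cp1251_mojibake is made up for illustration
and is not part of FFmpeg.

```python
# Sketch only: assumes the tag bytes were CP1251 but got decoded as Latin-1.
def fix_cp1251_mojibake(text: str) -> str:
    """Undo a Latin-1 mis-decoding of CP1251 bytes (hypothetical helper)."""
    # Latin-1 maps code points 0..255 one-to-one onto bytes, so encoding
    # with it recovers the original tag bytes exactly.
    raw = text.encode("latin-1")
    # Reinterpret those bytes in the encoding the tagger actually used.
    return raw.decode("cp1251")


# Values copied from the ffmpeg output above.
tags = {
    "title": "Èíîõîäåö",
    "artist": "Âëàäèìèð Âûñîöêèé",
    "album": "Àíòèàëêîãîëüíàÿ",
}

for key, value in tags.items():
    print(f"{key}: {fix_cp1251_mojibake(value)}")
# title: Иноходец
# artist: Владимир Высоцкий
# album: Антиалкогольная
```

This only works when the source encoding is known in advance. Nothing in
the bytes themselves says whether they are CP1251, SJIS or something else,
which is why a user-selectable encoding option would be needed.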


-- 

Vladimir

