[MPlayer-dev-eng] Moving towards UTF-8

Rich Felker dalias at aerifal.cx
Mon Oct 23 17:56:11 CEST 2006


On Mon, Oct 23, 2006 at 03:43:15PM +0800, Zuxy Meng wrote:
> >> Then for GBK-encoded Chinese, in more than 80% of cases, the string
> >> won't be a legal UTF-8 sequence and hence the user will see the
> >> correct, unconverted string.
> >
> >Unacceptable. If the string is GBK but the user has a UTF-8 system, it
> >will print nonsense to the terminal (possibly even corrupt terminal
> >control sequences). Maybe now this is rare, but eventually everyone
> >will be using UTF-8. Conversion must never be bypassed.
> 
> Well, currently, if the string is in GBK but MSG_CHARSET != GBK, then
> the user has no chance of getting anything sane on the terminal,
> regardless of his/her locale, because mp_msg() converts the string on
> a best-effort basis: it skips to the next byte whenever the previous
> one fails to convert, whereas GBK is a two-byte encoding....

Right. I know it doesn't work currently, but replacing a
broken-by-lack-of-sophistication system with a broken-by-design system
is not acceptable. If you're going to try to fix it, fix it right.
Don't add complexity in the form of broken hacks. If complexity is
needed, then spend the complexity on a correct solution rather than
something that will just need to be replaced again.

Rich
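
The heuristic discussed above ("a GBK string usually isn't legal UTF-8,
so show it unconverted") can be illustrated with a minimal sketch. The
helper is_valid_utf8() and the two sample byte arrays are hypothetical
and are not taken from MPlayer; the only point is that some GBK byte
sequences also pass a well-formedness check, so such a test cannot
reliably tell the two encodings apart.

```c
#include <stddef.h>

/* Hypothetical helper, not MPlayer code: returns 1 if the n bytes at s
 * are well-formed UTF-8, 0 otherwise. */
static int is_valid_utf8(const unsigned char *s, size_t n)
{
    size_t i = 0;
    while (i < n) {
        unsigned char c = s[i];
        size_t len;
        unsigned long cp;

        if (c < 0x80) { i++; continue; }            /* plain ASCII */
        else if (c >= 0xC2 && c <= 0xDF) { len = 2; cp = c & 0x1F; }
        else if (c >= 0xE0 && c <= 0xEF) { len = 3; cp = c & 0x0F; }
        else if (c >= 0xF0 && c <= 0xF4) { len = 4; cp = c & 0x07; }
        else return 0;  /* stray continuation, overlong C0/C1, or F5..FF */

        if (n - i < len) return 0;                  /* truncated sequence */
        for (size_t k = 1; k < len; k++) {
            if ((s[i + k] & 0xC0) != 0x80) return 0; /* not 10xxxxxx */
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }
        if (len == 3 && cp < 0x800) return 0;       /* overlong 3-byte form */
        if (len == 4 && (cp < 0x10000 || cp > 0x10FFFF)) return 0;
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0; /* UTF-16 surrogates */
        i += len;
    }
    return 1;
}

/* Sample data (assumed GBK encodings of two common strings). */
static const unsigned char gbk_zhongwen[] = { 0xD6, 0xD0, 0xCE, 0xC4 };
static const unsigned char gbk_tong[]     = { 0xCD, 0xA8 };
```

With these inputs, gbk_zhongwen is rejected (0xD0 is not a continuation
byte), while gbk_tong is accepted, because 0xCD 0xA8 also happens to be
the UTF-8 encoding of U+0368. A caller that printed "valid" strings
unconverted would therefore put the wrong character on a UTF-8 terminal
in the second case.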



