[MPlayer-users] Playing an URL with special characters, charset problem

Tom Evans tevans.uk at googlemail.com
Fri Sep 10 13:58:30 CEST 2010


On Fri, Sep 10, 2010 at 12:27 PM, arthur at life.net.br <arthur at life.net.br> wrote:
>  Hello,
>
> I wrote a small script that uses Google text-to-speech as a plugin to a
> study program called Anki.
>
> I'm using "subprocess.Popen" in Python to run MPlayer. It works all right
> with normal characters, but not when I use a special character.
>
> So I made some tests, if I run mplayer in the terminal like this:
>
>  ~/mplayer-checkout-2010-09-09 $ ./mplayer "http://translate.google.com/translate_tts?tl=fr&q=â,ê,î,ô,û"
>
> It says some weird things that are not right. If I look at the packet it
> sends to Google, it shows:
>    0x0040:  5aa9 4745 5420 2f74 7261 6e73 6c61 7465  Z.GET./translate
>    0x0050:  5f74 7473 3f74 6c3d 6672 2671 3dc3 a22c  _tts?tl=fr&q=..,
>    0x0060:  c3aa 2cc3 ae2c c3b4 2cc3 bb20 4854 5450  ..,..,..,...HTTP
>
> But if I use the same URL in Firefox, it plays all right and says those
> letters (in French):
>    0x0040:  aaf8 4745 5420 2f74 7261 6e73 6c61 7465  ..GET./translate
>    0x0050:  5f74 7473 3f74 6c3d 6672 2671 3d25 4333  _tts?tl=fr&q=%C3
>    0x0060:  2541 322c 2543 3325 4141 2c25 4333 2541  %A2,%C3%AA,%C3%A
>    0x0070:  452c 2543 3325 4234 2c25 4333 2542 4220  E,%C3%B4,%C3%BB.
>    0x0080:  4854 5450 2f31 2e31 0d0a 486f 7374 3a20  HTTP/1.1..Host:.
>
> so, am I doing something wrong? Is there any parameter that I should set to
> read the right charset?
>
> sorry that I don't know too much about charsets.
>
> I'm testing on Linux, and I built the latest mplayer version I could find
> (checkout-2010-09-09)
>
> I appreciate any help
>
> Kind regards,
>
> Arthur Helfstein Fragoso
> arthur at life.net.br

URLs don't really have a charset, but any non-ASCII characters should be
percent-encoded. Firefox allows you to type in your local charset, and
then does its magic to turn that into the actual URL, whilst still
displaying what you typed in.

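For instance, if you typed in the URL '...?q=â', Firefox requests the
URL '...?q=%C3%A2', i.e. the UTF-8 bytes of 'â' (c3 a2), each written
as %XX. That is the difference between your two packet dumps: mplayer
sends the raw bytes c3 a2, Firefox sends the text %C3%A2. Using UTF-8
for those bytes is a convention rather than something the URL syntax
requires, but it is seemingly supported by all browsers/servers.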
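
A quick way to check this yourself, as a minimal sketch assuming Python 3
(where the quoting helpers live in urllib.parse rather than urllib): it
only shows that the bytes in the mplayer dump and the %XX text in the
Firefox dump are the same two bytes in different clothing.

```python
from urllib.parse import quote

# 'â' is U+00E2; its UTF-8 encoding is the two bytes c3 a2
# that appear raw in the mplayer packet dump.
raw = 'â'.encode('utf-8')

# Percent-encoding writes each of those bytes as %XX,
# which is what appears in the Firefox packet dump.
# quote() encodes str input as UTF-8 by default.
encoded = quote('â')

print(raw.hex(), encoded)
```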

To do this in Python, assuming you have a dictionary of URL arguments
called args, with the keys being ASCII strings (or unicode strings
that can convert to ASCII), and the values being any unicode string:

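# Python 2; in Python 3 use: from urllib.parse import quote_plus
from urllib import quote_plus

# Encode each value as UTF-8, percent-encode it, then join the pairs.
query = '&'.join(['%s=%s' % (k, quote_plus(v.encode('utf-8')))
                  for k, v in args.items()])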
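
Putting it together for your case, here is a rough sketch rather than a
definitive implementation. It assumes Python 3, and the helper names
build_tts_url and mplayer_command are made up for illustration; the
'mplayer' default is also an assumption, so substitute the path to your
own build. The URL and the tl/q parameters are the ones from your
message.

```python
from urllib.parse import urlencode

# Endpoint taken from the original question.
BASE_URL = 'http://translate.google.com/translate_tts'


def build_tts_url(text, lang='fr'):
    """Return the TTS URL with the query string percent-encoded as UTF-8.

    Hypothetical helper, not part of any library.
    """
    # urlencode() applies quote_plus() to every key and value.
    return BASE_URL + '?' + urlencode({'tl': lang, 'q': text})


def mplayer_command(text, lang='fr', mplayer='mplayer'):
    """Build an argument list that can be handed to subprocess.Popen.

    Hypothetical helper; 'mplayer' is an assumed executable name.
    """
    return [mplayer, build_tts_url(text, lang)]


url = build_tts_url('â,ê,î,ô,û')
print(url)
```

You would then call subprocess.Popen(mplayer_command(text)) from your
plugin. Note that quote_plus also encodes the commas as %2C, where
Firefox left them as literal commas; both forms decode to the same
query value.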

Hope that helps

Cheers

Tom

