[MPlayer-users] Playing an URL with special characters, charset problem
Tom Evans
tevans.uk at googlemail.com
Fri Sep 10 13:58:30 CEST 2010
On Fri, Sep 10, 2010 at 12:27 PM, arthur at life.net.br <arthur at life.net.br> wrote:
> Hello,
>
> I wrote a small script that uses Google text-to-speech as a plugin to a
> study program called Anki.
>
> I'm using "subprocess.Popen" in python to run Mplayer. It works all right
> with normal characters. But when I use a special character, it won't.
>
> So I made some tests, if I run mplayer in the terminal like this:
>
> ~/mplayer-checkout-2010-09-09 $ ./mplayer
> "http://translate.google.com/translate_tts?tl=fr&q=â,ê,î,ô,û"
>
> It will say some weird things that is not right, if I look at the packet it
> sends to google, it shows:
> 0x0040: 5aa9 4745 5420 2f74 7261 6e73 6c61 7465 Z.GET./translate
> 0x0050: 5f74 7473 3f74 6c3d 6672 2671 3dc3 a22c _tts?tl=fr&q=..,
> 0x0060: c3aa 2cc3 ae2c c3b4 2cc3 bb20 4854 5450 ..,..,..,...HTTP
>
> but if I use Firefox, and i use the same url, it will play it all right and
> say those letter (in french):
> 0x0040: aaf8 4745 5420 2f74 7261 6e73 6c61 7465 ..GET./translate
> 0x0050: 5f74 7473 3f74 6c3d 6672 2671 3d25 4333 _tts?tl=fr&q=%C3
> 0x0060: 2541 322c 2543 3325 4141 2c25 4333 2541 %A2,%C3%AA,%C3%A
> 0x0070: 452c 2543 3325 4234 2c25 4333 2542 4220 E,%C3%B4,%C3%BB.
> 0x0080: 4854 5450 2f31 2e31 0d0a 486f 7374 3a20 HTTP/1.1..Host:.
>
> so, am I doing something wrong? Is there any parameter that I should set to
> read the right charset?
>
> sorry that I don't know too much about charsets.
>
> I'm testing on Linux, and I build the latest mplayer version I could find
> (checkout-2010-09-09)
>
> I appreciate any help
>
> Kind regards,
>
> Arthur Helfstein Fragoso
> arthur at life.net.br
URLs don't really have a charset, but any non-ASCII characters should be
percent-encoded. Firefox allows you to type in your local charset, and
then does its magic to turn that into the actual URL, whilst still
displaying what you typed in.
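
For instance, if you typed in the URL '...?q=â', Firefox requests the
URL '...?q=%C3%A2', where C3 A2 are the two UTF-8 bytes of 'â'. Your
MPlayer dump shows those same bytes (c3 a2) going out raw instead.
This is non-standard, but seemingly supported by all browsers/servers.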
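
To make the byte-level view concrete, here is a minimal sketch of the
conversion Firefox appears to perform. It assumes Python 3 (where the
quoting functions live in urllib.parse) and UTF-8 as the encoding; the
helper name percent_encode_utf8 is made up for illustration.

```python
from urllib.parse import quote


def percent_encode_utf8(text):
    """Percent-encode text as UTF-8 (hypothetical helper, not part of MPlayer).

    Every byte outside the unreserved set (letters, digits, '_', '.', '-',
    '~') becomes %XX. safe="" means '/' and ',' are escaped as well.
    """
    return quote(text, safe="", encoding="utf-8")


# 'â' is two bytes in UTF-8, matching the c3 a2 seen in the packet dump.
print("â".encode("utf-8").hex())
print(percent_encode_utf8("â"))
```

Note that this escapes the commas too (as %2C), which Firefox left
alone in the dump above; both forms decode to the same query value.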

To do this in Python, assuming you have a dictionary of URL arguments
called args, with the keys being ASCII strings (or unicode strings
that can be converted to ASCII), and the values being any unicode
string, build the query string like this:

    from urllib import quote_plus  # Python 2; moved to urllib.parse in Python 3

    # Encode each value as UTF-8, percent-encode it, then join the pairs.
    '&'.join(['%s=%s' % (k, quote_plus(v.encode('utf-8')))
              for k, v in args.items()])

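
Putting it together with the subprocess.Popen call from the original
question, a rough sketch could look like the following. It assumes
Python 3, that an mplayer binary is on the PATH, and that the
translate_tts URL from the original post is the one wanted; the names
build_tts_url, mplayer_command and play are invented for this example.

```python
import subprocess
from urllib.parse import urlencode

# Endpoint taken from the original post; whether it still answers is not checked here.
TTS_ENDPOINT = "http://translate.google.com/translate_tts"


def build_tts_url(text, lang="fr"):
    """Return the TTS URL with the query percent-encoded as UTF-8."""
    # urlencode applies quote_plus to each value; a list of pairs keeps the order fixed.
    query = urlencode([("tl", lang), ("q", text)])
    return TTS_ENDPOINT + "?" + query


def mplayer_command(text, lang="fr", mplayer="mplayer"):
    """Build the argument list for Popen (assumes 'mplayer' is on the PATH)."""
    return [mplayer, build_tts_url(text, lang)]


def play(text, lang="fr"):
    """Run MPlayer on the encoded URL and wait for it to finish."""
    # An argument list (no shell) means '&' and '?' need no shell quoting.
    return subprocess.Popen(mplayer_command(text, lang)).wait()


print(build_tts_url("â,ê,î,ô,û"))
```

Calling play("â,ê,î,ô,û") would then hand MPlayer a pure-ASCII URL, so
it no longer matters how MPlayer treats raw non-ASCII bytes.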

Hope that helps.

Cheers
Tom