[MPlayer-users] Playing an URL with special characters, charset problem
arthur at life.net.br
Sat Sep 11 04:56:53 CEST 2010
On 09/10/2010 08:58 AM, Tom Evans wrote:
> On Fri, Sep 10, 2010 at 12:27 PM, arthur at life.net.br<arthur at life.net.br> wrote:
>> Hello,
>>
>> I wrote a small script that uses Google text-to-speech as a plugin to a
>> study program called Anki.
>>
>> I'm using "subprocess.Popen" in Python to run MPlayer. It works fine
>> with plain ASCII characters, but not with special (accented) characters.
>>
>> So I ran some tests. If I run mplayer in the terminal like this:
>>
>> ~/mplayer-checkout-2010-09-09 $ ./mplayer
>> "http://translate.google.com/translate_tts?tl=fr&q=â,ê,î,ô,û"
>>
>> it says something garbled instead. If I look at the packet it
>> sends to Google, it shows:
>> 0x0040: 5aa9 4745 5420 2f74 7261 6e73 6c61 7465 Z.GET./translate
>> 0x0050: 5f74 7473 3f74 6c3d 6672 2671 3dc3 a22c _tts?tl=fr&q=..,
>> 0x0060: c3aa 2cc3 ae2c c3b4 2cc3 bb20 4854 5450 ..,..,..,...HTTP
>>
>> But if I use the same URL in Firefox, it plays correctly and
>> says those letters (in French):
>> 0x0040: aaf8 4745 5420 2f74 7261 6e73 6c61 7465 ..GET./translate
>> 0x0050: 5f74 7473 3f74 6c3d 6672 2671 3d25 4333 _tts?tl=fr&q=%C3
>> 0x0060: 2541 322c 2543 3325 4141 2c25 4333 2541 %A2,%C3%AA,%C3%A
>> 0x0070: 452c 2543 3325 4234 2c25 4333 2542 4220 E,%C3%B4,%C3%BB.
>> 0x0080: 4854 5450 2f31 2e31 0d0a 486f 7374 3a20 HTTP/1.1..Host:.
>>
>> So, am I doing something wrong? Is there a parameter I should set so
>> that the right charset is used?
>>
>> Sorry, I don't know much about charsets.
>>
>> I'm testing on Linux, and I built the latest mplayer version I could find
>> (checkout-2010-09-09).
>>
>> I appreciate any help
>>
>> Kind regards,
>>
>> Arthur Helfstein Fragoso
>> arthur at life.net.br
>
> URLs don't really have a charset, but any non-ASCII characters should be
> percent-encoded. Firefox allows you to type in your local charset, and
> then does its magic to turn that into the actual URL, whilst still
> displaying what you typed in.
>
> For instance, if you typed in the URL '...?q=â', Firefox requests the
> URL '...?q=%C3%A2'. This is non-standard, but seemingly supported by
> all browsers/servers.
>
> To do this in python, assuming you have a dictionary of URL arguments
> called args, with the keys being ascii strings (or unicode strings
> that can convert to ascii), and the values being any unicode string:
>
> from urllib import quote_plus
> '&'.join([ '%s=%s' % (k, quote_plus(v.encode('utf-8'))) for k,v in
> args.items() ])
>
> Hope that helps
>
> Cheers
>
> Tom
Tom,
Thank you for the clarification. I tried it, but without success.
From the terminal (only the comma, %2C, stayed percent-encoded in the request):
mplayer "http://translate.google.com/translate_tts?tl=fr&q=%C3%A2%2C%C3%AA%2C%C3%AE%2C%C3%B4%2C%C3%BB"
0x0050: 5f74 7473 3f74 6c3d 6672 2671 3dc3 a225 _tts?tl=fr&q=..%
0x0060: 3243 c3aa 2532 43c3 ae25 3243 c3b4 2532 2C..%2C..%2C..%2
0x0070: 43c3 bb20 4854 5450 2f31 2e30 0d0a 486f C...HTTP/1.0..Ho
I then tried escaping each % with a backslash:
mplayer "http://translate.google.com/translate_tts?tl=fr&q=\%C3\%A2,\%C3\%AA,\%C3\%AE,\%C3\%B4,\%C3\%BB"
0x0050: 5f74 7473 3f74 6c3d 6672 2671 3d25 3543 _tts?tl=fr&q=%5C
0x0060: c325 3543 a22c 2535 43c3 2535 43aa 2c25 .%5C.,%5C.%5C.,%
0x0070: 3543 c325 3543 ae2c 2535 43c3 2535 43b4 5C.%5C.,%5C.%5C.
0x0080: 2c25 3543 c325 3543 bb20 4854 5450 2f31 ,%5C.%5C..HTTP/1
Without the double quotes (so I had to escape the &):
mplayer http://translate.google.com/translate_tts?tl=fr\&q=%C3%A2%2C%C3%AA%2C%C3%AE%2C%C3%B4%2C%C3%BB
0x0050: 5f74 7473 3f74 6c3d 6672 2671 3dc3 a225 _tts?tl=fr&q=..%
0x0060: 3243 c3aa 2532 43c3 ae25 3243 c3b4 2532 2C..%2C..%2C..%2
0x0070: 43c3 bb20 4854 5450 2f31 2e30 0d0a 486f C...HTTP/1.0..Ho
Escaping the % again, still without quotes:
mplayer http://translate.google.com/translate_tts?tl=fr\&q=\%C3\%A2\%2C\%C3\%AA\%2C\%C3\%AE\%2C\%C3\%B4\%2C\%C3\%BB
0x0050: 5f74 7473 3f74 6c3d 6672 2671 3dc3 a225 _tts?tl=fr&q=..%
0x0060: 3243 c3aa 2532 43c3 ae25 3243 c3b4 2532 2C..%2C..%2C..%2
0x0070: 43c3 bb20 4854 5450 2f31 2e30 0d0a 486f C...HTTP/1.0..Ho
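The dumps above can be read without decoding the hex by eye. Below is a minimal Python 3 sketch that turns such lines back into bytes and pulls out the value of "q". The helper name dump_to_bytes is made up for this illustration, and it assumes every line has the layout seen in this message: an offset, eight hex groups, then the ASCII column.

```python
def dump_to_bytes(lines):
    """Join the hex columns of the dump lines into raw bytes.

    Assumes each line is "offset: <8 hex groups> <ascii column>",
    which is the layout of the dumps quoted in this message.
    """
    raw = bytearray()
    for line in lines:
        # Drop the "0x0050:" offset, keep the eight hex groups,
        # ignore the trailing ASCII rendering.
        hex_groups = line.split(":", 1)[1].split()[:8]
        raw += bytes.fromhex("".join(hex_groups))
    return bytes(raw)


MPLAYER_DUMP = [
    "0x0050: 5f74 7473 3f74 6c3d 6672 2671 3dc3 a225 _tts?tl=fr&q=..%",
    "0x0060: 3243 c3aa 2532 43c3 ae25 3243 c3b4 2532 2C..%2C..%2C..%2",
    "0x0070: 43c3 bb20 4854 5450 2f31 2e30 0d0a 486f C...HTTP/1.0..Ho",
]

payload = dump_to_bytes(MPLAYER_DUMP)
# Everything between "q=" and the space before "HTTP/1.0" is the query value.
query = payload.split(b"q=", 1)[1].split(b" ", 1)[0]
# Bytes above 0x7F are the ones that went out without percent-encoding.
raw_high_bytes = [hex(b) for b in query if b > 0x7F]

print(query)
print(raw_high_bytes)
```

For this dump the query value contains the raw UTF-8 bytes (c3 a2, c3 aa, ...) with only the commas left as %2C, which matches what the ASCII column hints at with its dots.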
So I also tried it in Python:
address = ('http://translate.google.com/translate_tts?tl=' + TTS_language +
           '&q=' + quote_plus(text.encode('utf-8')))
#http://translate.google.com/translate_tts?tl=fr&q=%C3%A2%2C%C3%AA%2C%C3%AE%2C%C3%B4%2C%C3%BB
0x0050: 5f74 7473 3f74 6c3d 6672 2671 3dc3 a225 _tts?tl=fr&q=..%
0x0060: 3243 c3aa 2532 43c3 ae25 3243 c3b4 2532 2C..%2C..%2C..%2
0x0070: 43c3 bb20 4854 5450 2f31 2e30 0d0a 486f C...HTTP/1.0..Ho
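For reference, here is a small self-contained sketch of the same URL construction in Python 3, where quote_plus lives in urllib.parse and encodes str input as UTF-8 by default. The function name build_tts_url is only an illustration and is not part of the plugin. It shows that the string handed to mplayer contains nothing but ASCII, so the raw bytes in the dump do not appear to come from the Python side.

```python
from urllib.parse import quote_plus


def build_tts_url(language, text):
    """Build the translate_tts URL with a percent-encoded query.

    Hypothetical helper for illustration; quote_plus encodes the text
    as UTF-8 and percent-encodes every byte that is not URL-safe.
    """
    base = "http://translate.google.com/translate_tts"
    return base + "?tl=" + quote_plus(language) + "&q=" + quote_plus(text)


url = build_tts_url("fr", "â,ê,î,ô,û")
only_ascii = all(ord(ch) < 128 for ch in url)

print(url)
print(only_ascii)
```

The printed URL is the same one shown in the comment above, and only_ascii is True.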
I also tried with quotes around the value of "q":
address = ('http://translate.google.com/translate_tts?tl=' + TTS_language +
           '&q="' + quote_plus(text.encode('utf-8')) + '"')
#http://translate.google.com/translate_tts?tl=fr&q="%C3%A2%2C%C3%AA%2C%C3%AE%2C%C3%B4%2C%C3%BB"
0x0050: 5f74 7473 3f74 6c3d 6672 2671 3d25 3232 _tts?tl=fr&q=%22
0x0060: c3a2 2532 43c3 aa25 3243 c3ae 2532 43c3 ..%2C..%2C..%2C.
0x0070: b425 3243 c3bb 2532 3220 4854 5450 2f31 .%2C..%22.HTTP/1
This is the line I use to run mplayer from Python:
subprocess.Popen(['mplayer', '-slave', address], stdin=PIPE,
stdout=PIPE, stderr=STDOUT)
If I open this URL in Firefox 3.6:
http://translate.google.com/translate_tts?tl=fr&q=%C3%A2%2C%C3%AA%2C%C3%AE%2C%C3%B4%2C%C3%BB
the address bar is rewritten to:
http://translate.google.com/translate_tts?tl=fr&q=â%2Cê%2Cî%2Cô%2Cû
and it plays correctly:
0x0050: 5f74 7473 3f74 6c3d 6672 2671 3d25 4333 _tts?tl=fr&q=%C3
0x0060: 2541 3225 3243 2543 3325 4141 2532 4325 %A2%2C%C3%AA%2C%
0x0070: 4333 2541 4525 3243 2543 3325 4234 2532 C3%AE%2C%C3%B4%2
0x0080: 4325 4333 2542 4220 4854 5450 2f31 2e31 C%C3%BB.HTTP/1.1
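To make the difference between the two requests concrete, the following Python 3 sketch percent-decodes both query values, copied from the Firefox dump and from the mplayer dumps in this message. It only compares the bytes seen on the wire; it makes no claim about why mplayer sends them that way.

```python
from urllib.parse import unquote_to_bytes

# Query value as Firefox sent it (fully percent-encoded).
firefox_query = b"%C3%A2%2C%C3%AA%2C%C3%AE%2C%C3%B4%2C%C3%BB"
# Query value as mplayer sent it (raw UTF-8 bytes, commas still encoded).
mplayer_query = b"\xc3\xa2%2C\xc3\xaa%2C\xc3\xae%2C\xc3\xb4%2C\xc3\xbb"

decoded_firefox = unquote_to_bytes(firefox_query)
decoded_mplayer = unquote_to_bytes(mplayer_query)

# Both decode to the same UTF-8 text; they differ only in whether the
# non-ASCII bytes were percent-encoded in the request line.
same_text = decoded_firefox == decoded_mplayer

print(same_text)
print(decoded_firefox.decode("utf-8"))
```

Both values decode to the same text, "â,ê,î,ô,û", so the two requests differ only in how the non-ASCII bytes are written in the request line.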
So I don't know how to make mplayer send the right request. :/
Any ideas?
--
Arthur Helfstein Fragoso
arthur at life.net.br