[MPlayer-dev-eng] [PATCH] Enable mp3lib's SSE routines on AMD64.
Guillaume Poirier
gpoirier at mplayerhq.hu
Sun May 6 00:32:51 CEST 2007
Hi,
On May 5, 2007, at 11:28 , Attila Kinali wrote:
> On Fri, 4 May 2007 17:52:38 +0200
> Guillaume Poirier <gpoirier at mplayerhq.hu> wrote:
>
>> Attached patch allows to $SUBJ
>> It does so by moving array costab_mmx from decode_MMX.c (which can't be
>> compiled on AMD64) to a separate file costab_MMX.c.
>>
>> I've played a couple of MP3s on my Hi-Fi stereo and as far as I hear,
>> there aren't any artifacts, which isn't surprising.
>>
>> I'm sure it can be improved, so I'm interested in hearing from you
>> guys.
>
> Patch works, output is correct (binary compare), but it's slightly
> slower on my machine (AMD Athlon(tm) 64 Processor 3700+):
>
> ./mplayer -ao pcm:file=/dev/null /tmp/01\ -\ Standing\ in\ the\ Sunset\ Glow.mp3 -benchmark -quiet
>
> w/o patch:
> BENCHMARKs: VC: 0.000s VO: 0.000s A: 5.746s Sys: 0.018s = 5.764s
>
> w/ patch:
> BENCHMARKs: VC: 0.000s VO: 0.000s A: 5.807s Sys: 0.017s = 5.824s
>
> This is a difference of 1%

Ouch! This wasn't intended! I haven't benchmarked it on AMD64. Now
that you bring this up, I realise I hadn't benchmarked it at all and
simply assumed that it could only be faster.

Here are the performance figures on the 2 other CPUs that support
x86-64 mode:

Core2:
--- without:
BENCHMARKs: VC: 0.000s VO: 0.000s A: 1.668s Sys: 0.005s = 1.674s
BENCHMARK%: VC: 0.0000% VO: 0.0000% A: 99.6815% Sys: 0.3185% = 100.0000%
real 0m1.687s
user 0m1.676s
sys 0m0.012s
--- with:
BENCHMARKs: VC: 0.000s VO: 0.000s A: 1.677s Sys: 0.005s = 1.682s
BENCHMARK%: VC: 0.0000% VO: 0.0000% A: 99.6801% Sys: 0.3199% = 100.0000%
Exiting... (End of file)
real 0m1.696s
user 0m1.688s
sys 0m0.004s
P4:
--- without:
BENCHMARKs: VC: 0.000s VO: 0.000s A: 2.283s Sys: 0.006s = 2.289s
BENCHMARK%: VC: 0.0000% VO: 0.0000% A: 99.7250% Sys: 0.2750% = 100.0000%
real 0m2.326s
user 0m2.292s
sys 0m0.028s
--- with:
BENCHMARKs: VC: 0.000s VO: 0.000s A: 2.293s Sys: 0.006s = 2.299s
BENCHMARK%: VC: 0.0000% VO: 0.0000% A: 99.7325% Sys: 0.2675% = 100.0000%
real 0m2.336s
user 0m2.312s
sys 0m0.024s
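
To put the three machines on the same scale, the totals from the
BENCHMARKs lines in this thread can be converted into relative
slowdowns. This is only a rough sketch: the helper name
slowdown_percent is made up for illustration, and each figure comes
from a single run, so differences this small may well be within
measurement noise.

```python
# Total times in seconds (without patch, with patch), copied from the
# "BENCHMARKs" lines quoted in this thread. Single runs only.
totals = {
    "Athlon 64 3700+": (5.764, 5.824),
    "Core2": (1.674, 1.682),
    "P4": (2.289, 2.299),
}


def slowdown_percent(without, with_patch):
    """Relative change of the patched run versus the unpatched run, in percent.

    Hypothetical helper, not part of MPlayer. Positive means the patched
    build took longer.
    """
    return (with_patch - without) / without * 100.0


for cpu, (without, with_patch) in totals.items():
    print(f"{cpu}: {slowdown_percent(without, with_patch):+.2f}%")
```

With these numbers the slowdown comes out at roughly 1% on the Athlon 64
and a bit under 0.5% on both the Core2 and the P4.
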
I don't see much reason why the SSE version would be slower on all
CPUs. I may have fumbled the patch somewhere...

I also wonder if these SSE routines have ever been faster on common
CPUs...

Guillaume