[MPlayer-dev-eng] [PATCH] x86_64 mmx/sse/3dnow optimisation support

Reimar Döffinger Reimar.Doeffinger at stud.uni-karlsruhe.de
Tue Oct 12 22:51:59 CEST 2004


Hi,

> I personnaly prefer considering x86_64 as a really different arch
> (having 64 bits addressing instead of 32 bits, is a big enough
> difference IMHO).

Well, the code will be hardly different after you patch... ;-)

>>>-               : "S" (y), "D" (dst), "a" (u), "b" (v), "d" (&us),
>>>"c" (w/16)
>>>+               : "S" (y), "D" (dst), "a" (u), "b" (v),
>>>"c" (w/16),
>>>+#ifdef ARCH_X86_64
>>>+               "d" (us), "r" (vs)
>>>+#else
>>>+               "d" (&us)
>>>+#endif
>>
>>Weird. Why was this done this way? Because your X86_64 version works
>>on my AMD Athlon, too... I think this difference shouldn't be there...
> 
> I thought it was done that way because all registers were allready used.
> In fact esp and ebp was not used but I thought they couldn't be used.

I think ecx isn't used... you could consider specifying it explicitly

> I used anoter way on x86_64 because of the bigger number of GPR.

Just out of interest: which registers does it have in addition?

>>I don't think this is correct. Without #undef ARCH_X86 you won't get
>>the pure C versions of some functions AFAIK... You should add #undef 
>>ARCH_X86_64 instead.
> 
> No. The pure C version will only be built when CAN_COMPILE_X86_ASM is
> not defined, which mean that neither ARCH_X86 nor ARCH_X86_64 is
> defined at that point.

Ok, convinced...

>>also adding comments on the non-obvious part/changes will greatly 
>>increase the probability of getting this applied.
> 
> Unfortunatly, after writting this code, everything seem obvious ;-)
> Sorry, but I don't really see what I could comment.

Just in case you spot something ;-)

>>Btw: I'd expect there was some reason why "addl"/"popl" etc. were used
>>in the original code and not "add"/"pop" etc. Can somebody explain?
> 
> They were there to explicitly show that this was a 32 bits operation,
> But 'as' is smart enough to deduce the operation size from the type
> of operand (eax => 32 bits, rax => 64 bits).
> Moreover this modification (addl => add) was suggested by Michael on
> ffmpeg-devel, and he accepted my patch for ffmpeg, so I can't see how
> it could be wrong ;-)

Ok then.

Thanks for explaining.

Greetings,
Reimar Döffinger




More information about the MPlayer-dev-eng mailing list