[MPlayer-dev-eng] [Patch] pullup.c: Add MMX2 and SSE2 optimization

Alexander Strange astrange at ithinksw.com
Wed Mar 24 17:47:55 CET 2010


On Mar 24, 2010, at 11:26 AM, Zhou Zongyi wrote:

> Hi all,
> 
> See attached patch.
> 
> Zhou Zongyi
> 2010-03-24
> <pullup_mmx2_sse2.patch>


> +static int diff_y_mmx2(unsigned char *a, unsigned char *b, int s)
> +{
> +	int ret;
> +	__asm__ volatile (
> +		"movq (%2), %%mm0 \n\t"
> +		"movq (%2,%4), %%mm1 \n\t"
> +		"psadbw (%3), %%mm0 \n\t"

psadbw is in SSE, not MMX2.
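
For reference, a scalar sketch of what one psadbw on a pair of 8-byte operands computes (the sum of absolute byte differences). The name sad8_ref is made up for illustration and is not part of pullup.c or the patch; something like it could serve as a C fallback or as a check against the asm version:

```c
#include <stdlib.h>

/* Hypothetical scalar reference (not from the patch): the value that
 * psadbw produces for one pair of 8-byte operands, i.e. the sum of
 * absolute differences of the unsigned bytes. */
static int sad8_ref(const unsigned char *a, const unsigned char *b)
{
    int sum = 0;
    for (int i = 0; i < 8; i++)
        sum += abs(a[i] - b[i]);   /* bytes promote to int, so no wraparound */
    return sum;
}
```

The maximum result is 8 * 255 = 2040, so the sum fits in the low 16 bits of the destination register.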

pullup's functions could just call emms_c instead of doing emms once per function, but that's a pre-existing problem.

> +	int ret;
> +	__asm__ volatile(
> +		"xorps %%xmm6, %%xmm6 \n\t"
> +		"xorps %%xmm7, %%xmm7 \n\t"

pxor should be faster than xorps on AMD if the registers are used in integer operations afterwards.
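
A rough sketch of the same idea with SSE2 intrinsics, assuming an accumulator that is only ever used by integer instructions. The helper name sad16_rows and its parameters are hypothetical, not taken from the patch. _mm_setzero_si128() asks for an integer-typed zero and leaves the choice of zeroing instruction to the compiler, whereas hand-written asm would spell out pxor directly:

```c
#include <stddef.h>
#include <stdlib.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper (not from the patch): sum of absolute differences
 * over `rows` rows of 16 bytes each, rows `stride` bytes apart. */
static int sad16_rows(const unsigned char *a, const unsigned char *b,
                      ptrdiff_t stride, int rows)
{
#if defined(__SSE2__)
    __m128i acc = _mm_setzero_si128();   /* integer-typed zero accumulator */
    for (int y = 0; y < rows; y++) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + y * stride));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + y * stride));
        /* psadbw yields two partial sums, one per 64-bit half */
        acc = _mm_add_epi64(acc, _mm_sad_epu8(va, vb));
    }
    /* add the low-half and high-half partial sums */
    return _mm_cvtsi128_si32(acc) +
           _mm_cvtsi128_si32(_mm_srli_si128(acc, 8));
#else
    /* scalar fallback so the sketch still builds without SSE2 */
    int sum = 0;
    for (int y = 0; y < rows; y++)
        for (int x = 0; x < 16; x++)
            sum += abs(a[y * stride + x] - b[y * stride + x]);
    return sum;
#endif
}
```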




