[MPlayer-dev-eng] [Patch] pullup.c: Add MMX2 and SSE2 optimization
Alexander Strange
astrange at ithinksw.com
Wed Mar 24 17:47:55 CET 2010
On Mar 24, 2010, at 11:26 AM, Zhou Zongyi wrote:
> Hi all,
>
> See attached patch.
>
> Zhou Zongyi
> 2010-03-24
> <pullup_mmx2_sse2.patch>
> +static int diff_y_mmx2(unsigned char *a, unsigned char *b, int s)
> +{
> + int ret;
> + __asm__ volatile (
> + "movq (%2), %%mm0 \n\t"
> + "movq (%2,%4), %%mm1 \n\t"
> + "psadbw (%3), %%mm0 \n\t"
psadbw is in SSE, not MMX2.

pullup's functions could just call emms_c instead of doing emms once per function, but that's a pre-existing problem.
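For readers unfamiliar with the instruction under discussion: the 64-bit form of psadbw sums the absolute differences of eight unsigned bytes. The following is only a scalar sketch of that arithmetic, not code from the patch. The names sad8_scalar and sad8_rows_scalar are hypothetical, and the row loop with stride s is an assumption about how a function shaped like diff_y_mmx2 would accumulate several lines.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Scalar model of one 64-bit psadbw:
 * sum of |a[i] - b[i]| over 8 unsigned bytes. */
static unsigned sad8_scalar(const uint8_t *a, const uint8_t *b)
{
    unsigned sum = 0;
    assert(a != NULL && b != NULL);
    for (int i = 0; i < 8; i++)
        sum += (unsigned)abs((int)a[i] - (int)b[i]);
    return sum;
}

/* Hypothetical helper (assumption, not from the patch): accumulate the
 * 8-byte SAD over `rows` lines that are `s` bytes apart. */
static unsigned sad8_rows_scalar(const uint8_t *a, const uint8_t *b,
                                 int s, int rows)
{
    unsigned sum = 0;
    for (int i = 0; i < rows; i++) {
        sum += sad8_scalar(a, b);
        a += s;
        b += s;
    }
    return sum;
}
```

A scalar model like this can also serve as a reference when checking an asm version against known inputs.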
> + int ret;
> + __asm__ volatile(
> + "xorps %%xmm6, %%xmm6 \n\t"
> + "xorps %%xmm7, %%xmm7 \n\t"
pxor should be faster on AMD if you use integer operations later.
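To illustrate the point about keeping the zeroed register in the integer domain, here is a minimal sketch using SSE2 intrinsics rather than inline asm. It is not taken from the patch: sad16 is a hypothetical helper, and it assumes that compilers typically emit pxor for _mm_setzero_si128() and xorps for _mm_setzero_ps(), which is common but not guaranteed. A scalar fallback is included for builds without SSE2.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: SAD over 16 bytes, with the accumulator zeroed
 * and kept in the integer domain. */
static unsigned sad16(const uint8_t *a, const uint8_t *b)
{
    assert(a != NULL && b != NULL);
#if defined(__SSE2__)
    __m128i acc = _mm_setzero_si128();  /* integer zero, usually pxor */
    __m128i va  = _mm_loadu_si128((const __m128i *)a);
    __m128i vb  = _mm_loadu_si128((const __m128i *)b);
    /* 128-bit psadbw: one partial sum in the low 16 bits of each 64-bit half */
    acc = _mm_add_epi64(acc, _mm_sad_epu8(va, vb));
    return (unsigned)_mm_cvtsi128_si32(acc)
         + (unsigned)_mm_extract_epi16(acc, 4);
#else
    /* Scalar fallback for targets without SSE2 */
    unsigned sum = 0;
    for (int i = 0; i < 16; i++) {
        int d = (int)a[i] - (int)b[i];
        sum += (unsigned)(d < 0 ? -d : d);
    }
    return sum;
#endif
}
```

Whether the integer-domain zero is measurably faster depends on the CPU; the sketch only shows where the choice appears in the code.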