[MPlayer-dev-eng] [PATCH] yadif SSE2/SSSE3 optimization

Michael Niedermayer michaelni at gmx.at
Tue Nov 25 18:36:52 CET 2008


On Mon, Nov 24, 2008 at 08:21:38PM +0800, Zhou, Zongyi wrote:
> >cosmetic patch:
> >- "paddw %%mm3, %%mm2 \n\t"\
> >+ "paddw "MM"3, "MM"2 \n\t"\
> >...
> >+#define MM "%%mm"
> >...
> >+FILTER_LINE_ROUTINE(mmx2)
> >
> >
> >SSE2 adding patch:
> >+#undef MM
> >+#define MM "%%xmm"
> >...
> >+FILTER_LINE_ROUTINE(sse2)
> 
> This is the cosmetics patch. I will make the SSE2 adding patch when this gets applied.
> 
> ZZ

[...]
> -#define CHECK(pj,mj) \
> -            "movq "#pj"(%[cur],%[mrefs]), %%mm2 \n\t" /* cur[x-refs-1+j] */\
> -            "movq "#mj"(%[cur],%[prefs]), %%mm3 \n\t" /* cur[x+refs-1-j] */\
> -            "movq      %%mm2, %%mm4 \n\t"\
> -            "movq      %%mm2, %%mm5 \n\t"\
> -            "pxor      %%mm3, %%mm4 \n\t"\
> -            "pavgb     %%mm3, %%mm5 \n\t"\
> -            "pand     %[pb1], %%mm4 \n\t"\
> -            "psubusb   %%mm4, %%mm5 \n\t"\
> -            "psrlq     $8,    %%mm5 \n\t"\
[...]

> +            PSRL1\

This macro should not hide the register inside its definition; the
register should be passed explicitly at the point of use, as is done with MOVA.
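
A minimal sketch of what an explicit-register form could look like. The
signature PSRL1(reg) and the helper function names are assumptions made for
illustration; they are not taken from the patch. The asm text is only built
as a string here so the expansion can be checked without executing any MMX code.

```c
#include <assert.h>
#include <string.h>

/* Register prefix, as in the cosmetic patch. */
#define MM "%%mm"

/* Hypothetical form: the register is an explicit argument, so the call
 * site shows which register gets shifted. PSRL1(MM"5") yields the same
 * instruction as the original "psrlq $8, %%mm5" line (modulo padding). */
#define PSRL1(reg) "psrlq $8, " reg " \n\t"

/* Returns the asm text produced by the explicit-register form. */
const char *psrl1_on_mm5(void)
{
    return PSRL1(MM"5");
}

/* Self-check: the expansion equals the hand-written instruction text. */
void check_psrl1(void)
{
    assert(strcmp(psrl1_on_mm5(), "psrlq $8, %%mm5 \n\t") == 0);
}
```

With this shape the use inside CHECK would read PSRL1(MM"5") rather than a
bare PSRL1, which keeps the operand visible next to the surrounding lines.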


> +            "punpcklbw   "MM"7, "MM"5 \n\t" /* (cur[x-refs+j] + cur[x+refs-j])>>1 */\
> +            MOVA MM"2, "MM"4 \n\t"\
> +            "psubusb     "MM"3, "MM"2 \n\t"\
> +            "psubusb     "MM"4, "MM"3 \n\t"\
> +            "pmaxub      "MM"3, "MM"2 \n\t"\
> +            MOVA MM"2, "MM"3 \n\t"\
> +            MOVA MM"2, "MM"4 \n\t" /* ABS(cur[x-refs-1+j] - cur[x+refs-1-j]) */\

Please align these vertically, for example:

            "punpcklbw   "MM"7, "MM"5 \n\t" /* (cur[x-refs+j] + cur[x+refs-j])>>1 */\
            MOVA          MM"2, "MM"4 \n\t"\
            "psubusb     "MM"3, "MM"2 \n\t"\
            "psubusb     "MM"4, "MM"3 \n\t"\
            "pmaxub      "MM"3, "MM"2 \n\t"\
            MOVA          MM"2, "MM"3 \n\t"\
            MOVA          MM"2, "MM"4 \n\t" /* ABS(cur[x-refs-1+j] - cur[x+refs-1-j]) */\

[...]

> +            PSHUF \

same issue as PSRL1


[...]
> +#define MOV "movd "
> +#define MOVA "movq "
> +#define MOVU "movq "

I think including the trailing " " inside the instruction macro is ugly.
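
One possible alternative, sketched under the assumption that the separating
space moves to the use site. The function names are hypothetical and exist
only so the resulting string can be checked.

```c
#include <assert.h>
#include <string.h>

#define MM   "%%mm"
/* Mnemonic only; no trailing space baked into the macro. */
#define MOVA "movq"

/* The space between mnemonic and operands is written where MOVA is used. */
const char *mova_mm2_to_mm4(void)
{
    return MOVA " " MM"2, " MM"4 \n\t";
}

/* Self-check: adjacent string literals concatenate to the plain instruction. */
void check_mova(void)
{
    assert(strcmp(mova_mm2_to_mm4(), "movq %%mm2, %%mm4 \n\t") == 0);
}
```

The instruction text is unchanged; only the place where the whitespace is
written differs.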


[...]
> +#undef MM
> +#undef MOV
> +#undef MOVA
> +#undef MOVU
> +#undef STEP
> +#undef PSRL1
> +#undef PSRL2
> +#undef PSHUF
> +
> +

These #undefs are not needed yet; this patch instantiates the routine only once.
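
A small sketch of why the #undefs only matter once a second instantiation
exists. DEMO_ROUTINE is a hypothetical stand-in for FILTER_LINE_ROUTINE, and
the use of "movdqa" for the SSE2 variant is an assumption for illustration,
not something stated in the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for FILTER_LINE_ROUTINE: emits a function that
 * returns one instruction string built from the MM and MOVA definitions
 * in effect at the point where the macro is invoked. */
#define DEMO_ROUTINE(ext) \
    const char *demo_##ext(void) { return MOVA " " MM"2, " MM"4 \n\t"; }

#define MM   "%%mm"
#define MOVA "movq"
DEMO_ROUTINE(mmx2)   /* single instantiation: nothing to undefine yet */

/* Only a second instantiation requires clearing the old definitions. */
#undef MM
#undef MOVA
#define MM   "%%xmm"
#define MOVA "movdqa" /* assumed aligned-move mnemonic for the SSE2 case */
DEMO_ROUTINE(sse2)
```

Calling demo_mmx2() gives the MMX text and demo_sse2() the XMM text, since
macros inside a replacement list are expanded when the outer macro is used.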


[...]
> @@ -364,7 +386,7 @@
>          }
>      }
>  #if defined(HAVE_MMX) && defined(NAMED_ASM_ARGS)
> -    if(gCpuCaps.hasMMX2) __asm__ volatile("emms \n\t" : : : "memory");
> +    __asm__ volatile("emms \n\t" : : : "memory");
>  #endif
>  }

This change is unrelated and thus does not belong in this patch.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

There will always be a question for which you do not know the correct answer.