[MPlayer-dev-eng] [PATCH] yadif SSE2/SSSE3 optimization
Michael Niedermayer
michaelni at gmx.at
Tue Nov 25 18:36:52 CET 2008
On Mon, Nov 24, 2008 at 08:21:38PM +0800, Zhou, Zongyi wrote:
> >cosmetic patch:
> >- "paddw %%mm3, "%%mm2 \n\t"\
> >+ "paddw "MM"3, "MM"2 \n\t"\
> >...
> >+#define MM "%%mm"
> >...
> >+FILTER_LINE_ROUTINE(mmx2)
> >
> >
> >SSE2 adding patch:
> >+#undef MM
> >+#define MM "%%xmm"
> >...
> >+FILTER_LINE_ROUTINE(sse2)
>
> This is the cosmetics patch. I will make the SSE2 adding patch when this gets applied.
>
> ZZ
[...]
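For readers following the two-patch plan quoted above, here is a minimal sketch of the templating idea. It is an illustration only: the generated functions return the instruction text instead of containing inline asm, and the names filter_line_text_mmx2/filter_line_text_sse2 are hypothetical, not taken from the patch. The point it shows is that MM is resolved when FILTER_LINE_ROUTINE is expanded, so redefining MM between two expansions yields two variants of the same body.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the real template: the actual routine would
 * wrap inline asm; here each generated function just returns the text of
 * one instruction so the macro expansion can be inspected portably. */
#define FILTER_LINE_ROUTINE(name)                 \
    const char *filter_line_text_##name(void)     \
    {                                             \
        return "paddw " MM "3, " MM "2 \n\t";     \
    }

/* Cosmetic patch: same instructions as before, register prefix via MM. */
#define MM "%%mm"
FILTER_LINE_ROUTINE(mmx2)

/* SSE2 adding patch: redefine the prefix and expand the template again. */
#undef MM
#define MM "%%xmm"
FILTER_LINE_ROUTINE(sse2)
#undef MM

/* Sanity check: the two expansions differ only in the register prefix. */
void check_filter_line_templates(void)
{
    assert(strcmp(filter_line_text_mmx2(), "paddw %%mm3, %%mm2 \n\t") == 0);
    assert(strcmp(filter_line_text_sse2(), "paddw %%xmm3, %%xmm2 \n\t") == 0);
}
```

The doubled percent signs stay in the string as written; they only collapse to a single % once the string is used as an extended asm template.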
> -#define CHECK(pj,mj) \
> - "movq "#pj"(%[cur],%[mrefs]), %%mm2 \n\t" /* cur[x-refs-1+j] */\
> - "movq "#mj"(%[cur],%[prefs]), %%mm3 \n\t" /* cur[x+refs-1-j] */\
> - "movq %%mm2, %%mm4 \n\t"\
> - "movq %%mm2, %%mm5 \n\t"\
> - "pxor %%mm3, %%mm4 \n\t"\
> - "pavgb %%mm3, %%mm5 \n\t"\
> - "pand %[pb1], %%mm4 \n\t"\
> - "psubusb %%mm4, %%mm5 \n\t"\
> - "psrlq $8, %%mm5 \n\t"\
[...]
> + PSRL1\
This macro should not hide the register inside its definition like that;
instead, write the register explicitly at the point of use, as is already done with MOVA.
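A small sketch of what is meant, assuming PSRL1 stands for the "psrlq $8" on register 5 that the removed line used (the actual definition is not shown in the quoted hunk). PSRL1_HIDDEN is a hypothetical name used only to contrast the two forms; both produce the same instruction text.

```c
#include <assert.h>
#include <string.h>

#define MM "%%mm"

/* Assumed shape of the form being objected to: the operand is buried in
 * the macro, so the use site gives no hint which register is shifted. */
#define PSRL1_HIDDEN "psrlq $8, " MM "5 \n\t"

/* Suggested shape: the macro carries only the operation and the register
 * is spelled out where the macro is used, the same way MOVA is used. */
#define PSRL1 "psrlq $8, "

const char *psrl1_hidden_text(void)   { return PSRL1_HIDDEN; }
const char *psrl1_explicit_text(void) { return PSRL1 MM "5 \n\t"; }

#undef MM

/* Both forms assemble to the same text; only readability differs. */
void check_psrl1_forms(void)
{
    assert(strcmp(psrl1_hidden_text(), psrl1_explicit_text()) == 0);
    assert(strcmp(psrl1_explicit_text(), "psrlq $8, %%mm5 \n\t") == 0);
}
```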
> + "punpcklbw "MM"7, "MM"5 \n\t" /* (cur[x-refs+j] + cur[x+refs-j])>>1 */\
> + MOVA MM"2, "MM"4 \n\t"\
> + "psubusb "MM"3, "MM"2 \n\t"\
> + "psubusb "MM"4, "MM"3 \n\t"\
> + "pmaxub "MM"3, "MM"2 \n\t"\
> + MOVA MM"2, "MM"3 \n\t"\
> + MOVA MM"2, "MM"4 \n\t" /* ABS(cur[x-refs-1+j] - cur[x+refs-1-j]) */\
Please vertically align these, for example like this:
"punpcklbw "MM"7, "MM"5 \n\t" /* (cur[x-refs+j] + cur[x+refs-j])>>1 */\
MOVA       MM"2, "MM"4 \n\t"\
"psubusb   "MM"3, "MM"2 \n\t"\
"psubusb   "MM"4, "MM"3 \n\t"\
"pmaxub    "MM"3, "MM"2 \n\t"\
MOVA       MM"2, "MM"3 \n\t"\
MOVA       MM"2, "MM"4 \n\t" /* ABS(cur[x-refs-1+j] - cur[x+refs-1-j]) */\
[...]
> + PSHUF \
Same issue as with PSRL1.
[...]
> +#define MOV "movd "
> +#define MOVA "movq "
> +#define MOVU "movq "
I think including the trailing " " in the instruction macros is ugly.
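One possible reading of this, sketched under the assumption that the separator simply moves to the use site: the macro holds only the mnemonic and the space is written where the instruction is assembled. MOVA_WITH_SPACE is a hypothetical name for the form from the patch, kept here only for comparison.

```c
#include <assert.h>
#include <string.h>

#define MM "%%mm"

/* Form from the patch: the separating space is part of the mnemonic. */
#define MOVA_WITH_SPACE "movq "

/* Assumed alternative: the macro is just the mnemonic. */
#define MOVA "movq"

const char *mova_with_space_text(void)
{
    return MOVA_WITH_SPACE MM "2, " MM "4 \n\t";
}

const char *mova_plain_text(void)
{
    /* The space between mnemonic and operands is explicit here. */
    return MOVA " " MM "2, " MM "4 \n\t";
}

#undef MM

/* Both spellings concatenate to the same instruction text. */
void check_mova_forms(void)
{
    assert(strcmp(mova_with_space_text(), mova_plain_text()) == 0);
    assert(strcmp(mova_plain_text(), "movq %%mm2, %%mm4 \n\t") == 0);
}
```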
[...]
> +#undef MM
> +#undef MOV
> +#undef MOVA
> +#undef MOVU
> +#undef STEP
> +#undef PSRL1
> +#undef PSRL2
> +#undef PSHUF
> +
> +
These #undefs are not needed yet.
[...]
> @@ -364,7 +386,7 @@
> }
> }
> #if defined(HAVE_MMX) && defined(NAMED_ASM_ARGS)
> - if(gCpuCaps.hasMMX2) __asm__ volatile("emms \n\t" : : : "memory");
> + __asm__ volatile("emms \n\t" : : : "memory");
> #endif
> }
This change is unrelated and thus does not belong in this patch.
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
There will always be a question for which you do not know the correct answer.