[Ffmpeg-devel] a little optim for a SSE version of H263_LOOP_FILTER
skal
skal65535
Sat Nov 4 19:48:04 CET 2006
Hi everybody,
In case it's of interest: it seems to me that an SSE version of
H263_LOOP_FILTER is possible by replacing
"psubusb %%mm4, %%mm2 \n\t"\
"movq %%mm2, %%mm3 \n\t"\
"psubusb %%mm4, %%mm3 \n\t"\
"psubb %%mm3, %%mm2 \n\t"\
at dsputil_mmx.c:587 (fresh CVS) with:
"psubusb %%mm4, %%mm2 \n\t"\
"pminub %%mm4, %%mm2 \n\t"\
plus maybe a little reorganization of the loop (mm3 is no longer needed).
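
For anyone who wants to convince themselves that the two sequences agree, here is a rough scalar sketch in C that models one byte lane of each instruction and compares the old and new sequences for every input pair. The helper names (sat_sub_u8, min_u8, old_sequence, new_sequence, assert_sequences_agree) are made up for illustration and are not part of dsputil_mmx.c; x stands for one byte of mm2 on entry and s for one byte of mm4.

```c
#include <assert.h>
#include <stdint.h>

/* One byte lane of psubusb: unsigned subtract, saturating at 0. */
static uint8_t sat_sub_u8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : 0;
}

/* One byte lane of pminub: unsigned minimum. */
static uint8_t min_u8(uint8_t a, uint8_t b)
{
    return a < b ? a : b;
}

/* Original four-instruction sequence (hypothetical scalar model). */
static uint8_t old_sequence(uint8_t x, uint8_t s)
{
    uint8_t mm2 = sat_sub_u8(x, s);   /* psubusb %%mm4, %%mm2 */
    uint8_t mm3 = mm2;                /* movq    %%mm2, %%mm3 */
    mm3 = sat_sub_u8(mm3, s);         /* psubusb %%mm4, %%mm3 */
    return (uint8_t)(mm2 - mm3);      /* psubb   %%mm3, %%mm2 (mm3 <= mm2, so no wrap) */
}

/* Proposed two-instruction sequence (hypothetical scalar model). */
static uint8_t new_sequence(uint8_t x, uint8_t s)
{
    uint8_t mm2 = sat_sub_u8(x, s);   /* psubusb %%mm4, %%mm2 */
    return min_u8(mm2, s);            /* pminub  %%mm4, %%mm2 */
}

/* Exhaustive check over all 256 x 256 byte pairs. */
static void assert_sequences_agree(void)
{
    for (int x = 0; x < 256; x++)
        for (int s = 0; s < 256; s++)
            assert(old_sequence((uint8_t)x, (uint8_t)s) ==
                   new_sequence((uint8_t)x, (uint8_t)s));
}
```

Calling assert_sequences_agree() from a main() should pass silently, since a - max(a - s, 0) equals min(a, s) for unsigned bytes. This only models the arithmetic per lane; it says nothing about speed.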
Well, this is just for the fun of it, since the speed-up
(if any) might not be worth a special version...
bye!
Skal
(gotta love these saturated instructions. All of h263's
UpDownRamp() with 2 instructions is quite fun)