[FFmpeg-devel] [PATCH] Add x86-optimized function ac3_or_abs_int16() and use in log2_tab().
Loren Merritt
lorenm
Sat Feb 12 06:52:36 CET 2011
>+%macro PABSW2_MMX 6 ; dst1, dst2, src1, src2, temp1, temp2
>+ mova %1, %3
>+ mova %2, %4
>+ mova %5, %1
>+ mova %6, %2
>+ psraw %5, 15
>+ psraw %6, 15
>+ pxor %1, %5
>+ pxor %2, %6
>+ psubw %1, %5
>+ psubw %2, %6
>+%endmacro
>+
>+%macro PABSW2_SSSE3 6 ; dst1, dst2, src1, src2, unused, unused
>+ pabsw %1, %3
>+ pabsw %2, %4
>+%endmacro
Equivalent macros are already in x86util.asm.
But you don't actually want to compute (bit-or of abs), right? You want
to compute (log2 of max of abs). Since MMX2 (MMXEXT) has signed 16-bit
min/max instructions (pminsw/pmaxsw) and doesn't have abs (pabsw is
SSSE3), try running signed min/max first and doing abs only once in
the tail.
That way might be faster in C too, on CPUs that have scalar cmov/min/max
but no scalar abs.
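
A rough scalar C sketch of the idea, for illustration only. The function
names (or_abs_int16, max_abs_int16, ilog2, check_same_log2) are made up
for this example and are not taken from the patch or from libavcodec.
It keeps a running signed minimum and maximum, takes the absolute value
once after the loop, and checks that the result has the same floor(log2)
as the bit-or of the absolute values:

```c
#include <assert.h>
#include <stdint.h>

/* Bit-or of absolute values: roughly what the patch under review computes. */
static int or_abs_int16(const int16_t *src, int len)
{
    int i, v = 0;
    for (i = 0; i < len; i++) {
        int s = src[i];              /* promoted to int, so -(-32768) is safe */
        v |= s < 0 ? -s : s;
    }
    return v;
}

/* Suggested alternative: signed min/max in the loop, abs once in the tail. */
static int max_abs_int16(const int16_t *src, int len)
{
    int i, lo = 0, hi = 0;           /* lo <= 0 <= hi holds throughout */
    for (i = 0; i < len; i++) {
        if (src[i] < lo) lo = src[i];
        if (src[i] > hi) hi = src[i];
    }
    return -lo > hi ? -lo : hi;      /* the only "abs" step */
}

/* floor(log2(v)) for v > 0; returns 0 for v == 0. */
static int ilog2(unsigned v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}

/* The highest set bit of the OR equals the highest set bit of the max,
 * so both reductions give the same log2. */
static void check_same_log2(const int16_t *src, int len)
{
    assert(ilog2(or_abs_int16(src, len)) == ilog2(max_abs_int16(src, len)));
}
```

Whether the compiler turns the two comparisons into cmov or min/max
instructions depends on the target and flags, so this would need
benchmarking before drawing any conclusion about speed.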
--Loren Merritt