[FFmpeg-devel] [PATCH] vp9: initial attempt at a idct_idct_4x4 12bpp x86 simd (sse2) impl.

Henrik Gramner henrik at gramner.com
Sat Oct 10 18:31:52 CEST 2015


On Tue, Oct 6, 2015 at 9:59 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
> +cglobal vp9_idct_idct_4x4_add_12, 4, 4, 6, dst, stride, block, eob
[...]
> +    movd                m0, coefd
> +    punpcklwd           m0, m0
> +    pshufd              m0, m0, q0000

For this word broadcast, pshuflw + punpcklqdq instead of punpcklwd + pshufd
is faster on some older CPUs, such as Conroe.
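
As a rough illustration, the two sequences can be modelled in C with SSE2
intrinsics to check that they produce the same result. This is only a sketch:
it assumes the intent of the quoted code is to broadcast the low 16 bits of
coefd to all eight word lanes, the function names are made up for this example
and are not part of the patch, and it says nothing about speed on any CPU.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Sequence from the quoted patch: movd, punpcklwd, pshufd q0000. */
static void splat_w_pshufd(int coef, int16_t out[8])
{
    __m128i v = _mm_cvtsi32_si128(coef);   /* movd       m0, coefd     */
    v = _mm_unpacklo_epi16(v, v);          /* punpcklwd  m0, m0        */
    v = _mm_shuffle_epi32(v, 0x00);        /* pshufd     m0, m0, q0000 */
    _mm_storeu_si128((__m128i *)out, v);
}

/* Assumed form of the suggested sequence: movd, pshuflw q0000, punpcklqdq. */
static void splat_w_pshuflw(int coef, int16_t out[8])
{
    __m128i v = _mm_cvtsi32_si128(coef);   /* movd       m0, coefd     */
    v = _mm_shufflelo_epi16(v, 0x00);      /* pshuflw    m0, m0, q0000 */
    v = _mm_unpacklo_epi64(v, v);          /* punpcklqdq m0, m0        */
    _mm_storeu_si128((__m128i *)out, v);
}

/* Returns 1 if both sequences fill all eight lanes with the low word of coef. */
static int splat_sequences_match(int coef)
{
    int16_t a[8], b[8];
    splat_w_pshufd(coef, a);
    splat_w_pshuflw(coef, b);
    for (int i = 0; i < 8; i++)
        if ((uint16_t)a[i] != (uint16_t)coef || (uint16_t)b[i] != (uint16_t)coef)
            return 0;
    return 1;
}

/* Hypothetical self-check over a few sample values, including one whose
 * upper 16 bits are non-zero (they must not leak into the result). */
static void check_splat_sequences(void)
{
    static const int samples[] = { 0, 1, -1, 1234, -4095, 0x12345678 };
    for (unsigned i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
        assert(splat_sequences_match(samples[i]));
}
```

Calling check_splat_sequences() from a test program should pass silently if
the two sequences are interchangeable for this use.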

