[FFmpeg-devel] [PATCH] x86/vp9lpf: add ff_vp9_loop_filter_[vh]_88_16_sse2()

Christophe Gisquet christophe.gisquet at gmail.com
Tue Jan 28 12:05:41 CET 2014


Hi,

2014-01-28 James Almer <jamrial at gmail.com>:
> +%if cpuflag(ssse3)
>      mova                m0, [mask_mix]
> +%endif
>      movd                m2, Id
>      movd                m3, Ed
> -    pshufb              m2, m0
> -    pshufb              m3, m0
> +    SPLATB_MASK         m2, m0
> +    SPLATB_MASK         m3, m0

Is there any gain in loading mask_mix into m0, in particular considering that:

>  %endif
>      mova                m0, [pb_80]
>      pxor                m2, m0
> @@ -456,7 +469,7 @@ SECTION .text
>      SPLATB_REG          m7, H, m0                       ; H H H H ...
>  %else
>      movd                m7, Hd
> -    pshufb              m7, [mask_mix]
> +    SPLATB_MASK         m7, [mask_mix]
>  %endif

it is used directly as a memory operand here, without being loaded first?

I'm asking because I have noticed that using the memory operand directly
sometimes (outside of vp9) makes no difference, or is even 1 cycle faster.
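
For reference, here is a rough scalar C model of what the "movd + pshufb"
path computes. It is only a sketch: it assumes mask_mix is eight 0 bytes
followed by eight 1 bytes (its definition is not in the quoted hunks), and
the function names are made up for illustration. It says nothing about the
register vs. memory operand timing, which has to be measured.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of SSSE3 pshufb on one 16-byte register:
 * dst[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0f]. */
void pshufb_model(uint8_t dst[16], const uint8_t src[16],
                  const uint8_t mask[16])
{
    uint8_t tmp[16]; /* temporary, so dst may alias src */
    for (int i = 0; i < 16; i++)
        tmp[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0f];
    memcpy(dst, tmp, 16);
}

/* ASSUMPTION: layout of mask_mix (eight 0s, then eight 1s). */
const uint8_t mask_mix_assumed[16] = {
    0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1
};

/* Hypothetical model of "movd m2, Id" followed by "pshufb m2, mask":
 * movd puts the 32-bit value in the low lanes and zeroes the rest. */
void splat_mix_model(uint8_t dst[16], uint32_t packed)
{
    uint8_t reg[16] = { 0 };
    reg[0] = (uint8_t)(packed & 0xff);
    reg[1] = (uint8_t)((packed >> 8) & 0xff);
    reg[2] = (uint8_t)((packed >> 16) & 0xff);
    reg[3] = (uint8_t)((packed >> 24) & 0xff);
    pshufb_model(dst, reg, mask_mix_assumed);
}

/* Usage: byte 0 fills the low 8 lanes, byte 1 fills the high 8 lanes. */
void splat_mix_example(void)
{
    uint8_t out[16];
    splat_mix_model(out, 0x3412);
    for (int i = 0; i < 8; i++)
        assert(out[i] == 0x12);
    for (int i = 8; i < 16; i++)
        assert(out[i] == 0x34);
}
```

The result is the same whether the mask comes from a register or from
memory, so the choice between the two forms is purely about speed.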

-- 
Christophe

