[FFmpeg-devel] [PATCH 3/7] x86/hevc: use CLIPW macro when possible

James Almer jamrial at gmail.com
Fri Feb 6 01:28:41 CET 2015


On 05/02/15 4:20 PM, Christophe Gisquet wrote:
> From: Mickaël Raulet <mraulet at insa-rennes.fr>
> 
> Conflicts:
> 	libavcodec/x86/hevc_mc.asm
> ---
>  libavcodec/x86/hevc_mc.asm | 12 ++++--------
>  1 file changed, 4 insertions(+), 8 deletions(-)
> 
> diff --git a/libavcodec/x86/hevc_mc.asm b/libavcodec/x86/hevc_mc.asm
> index efb4d1f..e8a5032 100644
> --- a/libavcodec/x86/hevc_mc.asm
> +++ b/libavcodec/x86/hevc_mc.asm
> @@ -665,11 +665,9 @@ QPEL_TABLE 10, 8, w, avx2
>  %if %2 == 8
>      packuswb          %3, %4
>  %else
> -    pminsw            %3, [max_pixels_%2]
> -    pmaxsw            %3, [zero]
> +    CLIPW             %3, [zero], [max_pixels_%2]
>  %if (%1 > 8 && notcpuflag(avx)) || %1 > 16
> -    pminsw            %4, [max_pixels_%2]
> -    pmaxsw            %4, [zero]
> +    CLIPW             %4, [zero], [max_pixels_%2]

Many (but not all) of the functions calling these macros have free registers where
max_pixels_%2 and zero could be kept (for zero, a simple pxor m*, m* would do instead of a load).
That would probably be faster than reloading these constants from memory inside a loop.

But again, that's for a different patch.
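
To illustrate the idea of keeping the clip bounds in registers, here is a minimal sketch in C with SSE2 intrinsics rather than x86inc assembly. It is not code from the patch: the function name clip_rows_hoisted is hypothetical, and it assumes max_pixels_%2 holds (1 << bit_depth) - 1 in every 16-bit lane. The bounds are set up once before the loop, and the loop body only does the pmaxsw/pminsw pair, which is what a max-then-min word clip such as CLIPW would presumably expand to.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_max_epi16 (pmaxsw), _mm_min_epi16 (pminsw) */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of FFmpeg: clamp n signed 16-bit samples
 * to [0, (1 << bit_depth) - 1]. n must be a multiple of 8. */
void clip_rows_hoisted(int16_t *buf, size_t n, int bit_depth)
{
    /* Both bounds are built once, outside the loop. They stand in for the
     * free registers mentioned above; zero needs no memory load at all. */
    const __m128i zero       = _mm_setzero_si128();  /* pxor m, m */
    const __m128i max_pixels = _mm_set1_epi16((int16_t)((1 << bit_depth) - 1));

    assert(n % 8 == 0);

    for (size_t i = 0; i < n; i += 8) {
        __m128i v = _mm_loadu_si128((const __m128i *)(buf + i));
        v = _mm_max_epi16(v, zero);        /* lower bound (pmaxsw) */
        v = _mm_min_epi16(v, max_pixels);  /* upper bound (pminsw) */
        _mm_storeu_si128((__m128i *)(buf + i), v);
    }
}
```

A C compiler will usually hoist such loop-invariant values by itself; in hand-written assembly the author has to do it, and whether it pays off depends on how many registers each function still has free, so it would need benchmarking per function.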

>  %endif
>  %endif
>  %endmacro
> @@ -1467,8 +1465,7 @@ cglobal hevc_put_hevc_uni_w%1_%2, 6, 6, 7, dst, dststride, src, srcstride, heigh
>  %if %2 == 8
>      packuswb          m0, m0
>  %else
> -    pminsw            m0, [max_pixels_%2]
> -    pmaxsw            m0, [zero]
> +    CLIPW             m0, [zero], [max_pixels_%2]
>  %endif
>      PEL_%2STORE%1   dstq, m0, m1
>      add             dstq, dststrideq             ; dst += dststride
> @@ -1539,8 +1536,7 @@ cglobal hevc_put_hevc_bi_w%1_%2, 5, 7, 10, dst, dststride, src, srcstride, src2,
>  %if %2 == 8
>      packuswb          m0, m0
>  %else
> -    pminsw            m0, [max_pixels_%2]
> -    pmaxsw            m0, [zero]
> +     CLIPW            m0, [zero], [max_pixels_%2]
>  %endif
>      PEL_%2STORE%1   dstq, m0, m1
>      add             dstq, dststrideq             ; dst += dststride
> 

lgtm otherwise.

