[FFmpeg-devel] [PATCH] avcodec/v210: add avx2 version of the line encoder

Henrik Gramner henrik at gramner.com
Thu Jan 14 20:21:24 CET 2016


On Wed, Jan 13, 2016 at 4:55 PM, James Darnley <james.darnley at gmail.com> wrote:
> diff --git a/libavcodec/x86/v210enc.asm b/libavcodec/x86/v210enc.asm
> index 859e2d9..a8f3d3c 100644
> --- a/libavcodec/x86/v210enc.asm
> +++ b/libavcodec/x86/v210enc.asm
> -cextern pb_FE
> -%define v210_enc_max_8 pb_FE
> +;cextern pb_FE
> +local_pb_FE: times 32 db 0xfe
> +%define v210_enc_max_8 local_pb_FE

You could change ff_pb_FE to be 32-byte instead of duplicating it.

> +%if cpuflag(avx2)
> +    movu        xm1, [yq+widthq*2]
> +    vinserti128 m1,   m1, [yq+widthq*2+12], 1
> +%else
>      movu    m1, [yq+2*widthq]
> +%endif

xmN can be used unconditionally, which gets rid of the %else. E.g.

    movu       xm1, [yq+widthq*2]
%if cpuflag(avx2)
    vinserti128 m1, m1, [yq+widthq*2+12], 1
%endif

> +%if cpuflag(avx2)
> +    movq         xm3, [uq+widthq]
> +    movhps       xm3, [vq+widthq]
> +    movq         xm7, [uq+widthq+6]
> +    movhps       xm7, [vq+widthq+6]
> +    vinserti128  m3,   m3, xm7, 1
> +%else
>      movq    m3, [uq+widthq]
>      movhps  m3, [vq+widthq]
> +%endif

Ditto. Also use xm2 instead of xm7 since it's unused at this point and
it avoids having to use an extra vector register in the AVX2 version.
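
Applied to the chroma loads, that might look something like this
(untested sketch, assuming xm2 really is free at this point):

    movq         xm3, [uq+widthq]
    movhps       xm3, [vq+widthq]
%if cpuflag(avx2)
    ; second half of the chroma goes into the upper lane, via xm2 as scratch
    movq         xm2, [uq+widthq+6]
    movhps       xm2, [vq+widthq+6]
    vinserti128   m3, m3, xm2, 1
%endif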

> +%if cpuflag(avx2)
> +    movu         [dstq],    xm0
> +    movu         [dstq+16], xm1
> +    vextracti128 [dstq+32], m0, 1
> +    vextracti128 [dstq+48], m1, 1
> +%else
>      movu    [dstq], m0
>      movu    [dstq+mmsize], m1
> +%endif

Ditto.
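
I.e. something along these lines (untested sketch; with 16-byte
registers xmN is the same as mN and mmsize is 16, so the first two
stores match the existing ones):

    movu         [dstq],    xm0
    movu         [dstq+16], xm1
%if cpuflag(avx2)
    ; upper lanes hold the second group of pixels
    vextracti128 [dstq+32], m0, 1
    vextracti128 [dstq+48], m1, 1
%endif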

Otherwise LGTM.

