[FFmpeg-devel] [PATCH 2/2] x86/idctdsp: port {put, add}_pixels_clamped to yasm

Michael Niedermayer michaelni at gmx.at
Wed Sep 24 23:54:10 CEST 2014


On Wed, Sep 24, 2014 at 05:44:17PM -0300, James Almer wrote:
> Also add sse2 versions for both.
> put_pixels_clamped port and sse2 version originally written by Timothy Gu.
> 
> Signed-off-by: James Almer <jamrial at gmail.com>
> ---
>  libavcodec/x86/Makefile       |   3 +-
>  libavcodec/x86/idctdsp.asm    | 103 ++++++++++++++++++++++++++++++++
>  libavcodec/x86/idctdsp.h      |   4 ++
>  libavcodec/x86/idctdsp_init.c |   7 ++-
>  libavcodec/x86/idctdsp_mmx.c  | 133 ------------------------------------------
>  5 files changed, 112 insertions(+), 138 deletions(-)
>  delete mode 100644 libavcodec/x86/idctdsp_mmx.c
> 
> diff --git a/libavcodec/x86/Makefile b/libavcodec/x86/Makefile
> index 7bf0e82..9f34abd 100644
> --- a/libavcodec/x86/Makefile
> +++ b/libavcodec/x86/Makefile
> @@ -66,8 +66,7 @@ OBJS-$(CONFIG_WEBP_DECODER)            += x86/vp8dsp_init.o
>  # subsystems
>  MMX-OBJS-$(CONFIG_DIRAC_DECODER)       += x86/dirac_dwt.o
>  MMX-OBJS-$(CONFIG_FDCTDSP)             += x86/fdct.o
> -MMX-OBJS-$(CONFIG_IDCTDSP)             += x86/idctdsp_mmx.o             \
> -                                          x86/simple_idct.o
> +MMX-OBJS-$(CONFIG_IDCTDSP)             += x86/simple_idct.o
>  
>  # decoders/encoders
>  MMX-OBJS-$(CONFIG_MPEG4_DECODER)       += x86/xvididct_mmx.o            \
> diff --git a/libavcodec/x86/idctdsp.asm b/libavcodec/x86/idctdsp.asm
> index 44a1a6e..b816e84 100644
> --- a/libavcodec/x86/idctdsp.asm
> +++ b/libavcodec/x86/idctdsp.asm
> @@ -78,3 +78,106 @@ INIT_MMX mmx
>  PUT_SIGNED_PIXELS_CLAMPED 0
>  INIT_XMM sse2
>  PUT_SIGNED_PIXELS_CLAMPED 3
> +
> +;--------------------------------------------------------------------------
> +; void ff_put_pixels_clamped(const int16_t *block, uint8_t *pixels,
> +;                            int line_size);
> +;--------------------------------------------------------------------------
> +; %1 = block offset
> +%macro PUT_PIXELS_CLAMPED_HALF 1
> +    mova     m0, [blockq+mmsize*0+%1]
> +    mova     m1, [blockq+mmsize*2+%1]
> +%if mmsize == 8
> +    mova     m2, [blockq+mmsize*4+%1]
> +    mova     m3, [blockq+mmsize*6+%1]
> +%endif
> +    packuswb m0, [blockq+mmsize*1+%1]
> +    packuswb m1, [blockq+mmsize*3+%1]
> +%if mmsize == 8
> +    packuswb m2, [blockq+mmsize*5+%1]
> +    packuswb m3, [blockq+mmsize*7+%1]
> +    movq           [pixelsq], m0
> +    movq    [lsizeq+pixelsq], m1
> +    movq  [2*lsizeq+pixelsq], m2
> +    movq   [lsize3q+pixelsq], m3
> +%else
> +    movq           [pixelsq], m0
> +    movhps  [lsizeq+pixelsq], m0
> +    movq  [2*lsizeq+pixelsq], m1
> +    movhps [lsize3q+pixelsq], m1
> +%endif
> +%endmacro
> +
> +%macro PUT_PIXELS_CLAMPED 0
> +cglobal put_pixels_clamped, 3, 4, 2, block, pixels, lsize, lsize3
> +    lea lsize3q, [lsizeq*3]
> +    PUT_PIXELS_CLAMPED_HALF 0

this doesnt match the prototype
line_size is 32bit in the prototype but the code treats it as 64bit
this will crash if its negative

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

When the tyrant has disposed of foreign enemies by conquest or treaty, and
there is nothing more to fear from them, then he is always stirring up
some war or other, in order that the people may require a leader. -- Plato
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 181 bytes
Desc: Digital signature
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20140924/60588168/attachment.asc>


More information about the ffmpeg-devel mailing list