[FFmpeg-devel] [PATCH v3 14/17] swscale/x86: add SIMD backend

Michael Niedermayer michael at niedermayer.cc
Fri May 30 05:23:12 EEST 2025


On Tue, May 27, 2025 at 09:55:33AM +0200, Niklas Haas wrote:
> From: Niklas Haas <git at haasn.dev>
> 
> This covers most 8-bit and 16-bit ops, and some 32-bit ops. It also covers all
> floating point operations. While this is not yet 100% coverage, it's good
> enough for the vast majority of formats out there.
> 
> Of special note is the packed shuffle fast path, which uses pshufb at vector
> sizes up to AVX512.
> ---
>  libswscale/ops.c              |    4 +
>  libswscale/x86/Makefile       |    3 +
>  libswscale/x86/ops.c          |  722 +++++++++++++++++++++++
>  libswscale/x86/ops_common.asm |  305 ++++++++++
>  libswscale/x86/ops_float.asm  |  389 ++++++++++++
>  libswscale/x86/ops_int.asm    | 1049 +++++++++++++++++++++++++++++++++
>  6 files changed, 2472 insertions(+)
>  create mode 100644 libswscale/x86/ops.c
>  create mode 100644 libswscale/x86/ops_common.asm
>  create mode 100644 libswscale/x86/ops_float.asm
>  create mode 100644 libswscale/x86/ops_int.asm

This seems to break the build on x86-32 Linux:

...
src/libswscale/x86/ops_float.asm:389: error: symbol `m9' undefined
src/libswscale/x86/ops_float.asm:378: ... from macro `linear_fns' defined here
src/libswscale/x86/ops_float.asm:339: ... from macro `linear_mask' defined here
src/libswscale/x86/ops_float.asm:330: ... from macro `linear_inner' defined here
src/libswscale/x86/ops_common.asm:296: ... from macro `IF' defined here
src//libavutil/x86/x86inc.asm:1639: ... from macro `movdqa' defined here
src//libavutil/x86/x86inc.asm:1501: ... from macro `RUN_AVX_INSTR' defined here
src//libavutil/x86/x86inc.asm:1996: ... from macro `vmovdqa' defined here
src/libswscale/x86/ops_float.asm:389: error: symbol `m10' undefined
src/libswscale/x86/ops_float.asm:378: ... from macro `linear_fns' defined here
src/libswscale/x86/ops_float.asm:339: ... from macro `linear_mask' defined here
src/libswscale/x86/ops_float.asm:331: ... from macro `linear_inner' defined here
src/libswscale/x86/ops_common.asm:296: ... from macro `IF' defined here
src//libavutil/x86/x86inc.asm:1639: ... from macro `movdqa' defined here
src//libavutil/x86/x86inc.asm:1501: ... from macro `RUN_AVX_INSTR' defined here
src//libavutil/x86/x86inc.asm:1996: ... from macro `vmovdqa' defined here
src/libswscale/x86/ops_float.asm:389: error: symbol `m11' undefined
src/libswscale/x86/ops_float.asm:378: ... from macro `linear_fns' defined here
src/libswscale/x86/ops_float.asm:339: ... from macro `linear_mask' defined here
src/libswscale/x86/ops_float.asm:332: ... from macro `linear_inner' defined here
src/libswscale/x86/ops_common.asm:296: ... from macro `IF' defined here
src//libavutil/x86/x86inc.asm:1639: ... from macro `movdqa' defined here
src//libavutil/x86/x86inc.asm:1501: ... from macro `RUN_AVX_INSTR' defined here
src//libavutil/x86/x86inc.asm:1996: ... from macro `vmovdqa' defined here
make: *** [src/ffbuild/common.mak:103: libswscale/x86/ops_float.o] Error 1



[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Old school: Use the lowest level language in which you can solve the problem
            conveniently.
New school: Use the highest level language in which the latest supercomputer
            can solve the problem without the user falling asleep waiting.