[FFmpeg-devel] [PATCH v3 14/17] swscale/x86: add SIMD backend
Michael Niedermayer
michael at niedermayer.cc
Fri May 30 05:23:12 EEST 2025
On Tue, May 27, 2025 at 09:55:33AM +0200, Niklas Haas wrote:
> From: Niklas Haas <git at haasn.dev>
>
> This covers most 8-bit and 16-bit ops, and some 32-bit ops. It also covers all
> floating point operations. While this is not yet 100% coverage, it's good
> enough for the vast majority of formats out there.
>
> Of special note is the packed shuffle fast path, which uses pshufb at vector
> sizes up to AVX512.
> ---
> libswscale/ops.c | 4 +
> libswscale/x86/Makefile | 3 +
> libswscale/x86/ops.c | 722 +++++++++++++++++++++++
> libswscale/x86/ops_common.asm | 305 ++++++++++
> libswscale/x86/ops_float.asm | 389 ++++++++++++
> libswscale/x86/ops_int.asm | 1049 +++++++++++++++++++++++++++++++++
> 6 files changed, 2472 insertions(+)
> create mode 100644 libswscale/x86/ops.c
> create mode 100644 libswscale/x86/ops_common.asm
> create mode 100644 libswscale/x86/ops_float.asm
> create mode 100644 libswscale/x86/ops_int.asm
This seems to break the build on x86-32 Linux:
...
src/libswscale/x86/ops_float.asm:389: error: symbol `m9' undefined
src/libswscale/x86/ops_float.asm:378: ... from macro `linear_fns' defined here
src/libswscale/x86/ops_float.asm:339: ... from macro `linear_mask' defined here
src/libswscale/x86/ops_float.asm:330: ... from macro `linear_inner' defined here
src/libswscale/x86/ops_common.asm:296: ... from macro `IF' defined here
src//libavutil/x86/x86inc.asm:1639: ... from macro `movdqa' defined here
src//libavutil/x86/x86inc.asm:1501: ... from macro `RUN_AVX_INSTR' defined here
src//libavutil/x86/x86inc.asm:1996: ... from macro `vmovdqa' defined here
src/libswscale/x86/ops_float.asm:389: error: symbol `m10' undefined
src/libswscale/x86/ops_float.asm:378: ... from macro `linear_fns' defined here
src/libswscale/x86/ops_float.asm:339: ... from macro `linear_mask' defined here
src/libswscale/x86/ops_float.asm:331: ... from macro `linear_inner' defined here
src/libswscale/x86/ops_common.asm:296: ... from macro `IF' defined here
src//libavutil/x86/x86inc.asm:1639: ... from macro `movdqa' defined here
src//libavutil/x86/x86inc.asm:1501: ... from macro `RUN_AVX_INSTR' defined here
src//libavutil/x86/x86inc.asm:1996: ... from macro `vmovdqa' defined here
src/libswscale/x86/ops_float.asm:389: error: symbol `m11' undefined
src/libswscale/x86/ops_float.asm:378: ... from macro `linear_fns' defined here
src/libswscale/x86/ops_float.asm:339: ... from macro `linear_mask' defined here
src/libswscale/x86/ops_float.asm:332: ... from macro `linear_inner' defined here
src/libswscale/x86/ops_common.asm:296: ... from macro `IF' defined here
src//libavutil/x86/x86inc.asm:1639: ... from macro `movdqa' defined here
src//libavutil/x86/x86inc.asm:1501: ... from macro `RUN_AVX_INSTR' defined here
src//libavutil/x86/x86inc.asm:1996: ... from macro `vmovdqa' defined here
make: *** [src/ffbuild/common.mak:103: libswscale/x86/ops_float.o] Error 1
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Old school: Use the lowest level language in which you can solve the problem
conveniently.
New school: Use the highest level language in which the latest supercomputer
can solve the problem without the user falling asleep waiting.