[FFmpeg-devel] [PATCH] avcodec/x86/vp9: Add AVX-512ICL for 16x16 and 32x32 10bpc inverse transforms

Henrik Gramner henrik at gramner.com
Wed May 21 18:48:41 EEST 2025


Tested to pass FATE on Linux and Windows.

Checkasm numbers vs the existing SSE2 code on Zen 5 (Strix Halo); lower is
better, and the factor in parentheses is the speedup over the C reference:
vp9_inv_adst_adst_16x16_sub16_add_10_sse2:       1041.8 ( 1.92x)
vp9_inv_adst_adst_16x16_sub16_add_10_avx512icl:   132.5 (15.06x)

vp9_inv_dct_adst_16x16_sub16_add_10_sse2:         901.0 ( 1.98x)
vp9_inv_dct_adst_16x16_sub16_add_10_avx512icl:    120.8 (14.79x)

vp9_inv_dct_dct_16x16_sub16_add_10_sse2:          750.6 ( 2.10x)
vp9_inv_dct_dct_16x16_sub16_add_10_avx512icl:     110.9 (14.18x)

vp9_inv_dct_dct_32x32_sub32_add_10_sse2:         3922.6 ( 2.24x)
vp9_inv_dct_dct_32x32_sub32_add_10_avx512icl:     506.6 (17.37x)
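
Since the factors in parentheses share a common baseline, the direct AVX-512ICL vs SSE2 ratio has to be derived from the raw timings. A minimal sketch of that arithmetic follows; the dictionary layout, the shortened transform labels, and the helper name are assumptions made for illustration and are not part of the patch or of checkasm.

```python
# Checkasm timings copied from the numbers above (lower is better).
# The shortened transform names used as keys are labels chosen for this sketch.
timings = {
    "adst_adst_16x16": {"sse2": 1041.8, "avx512icl": 132.5},
    "dct_adst_16x16": {"sse2": 901.0, "avx512icl": 120.8},
    "dct_dct_16x16": {"sse2": 750.6, "avx512icl": 110.9},
    "dct_dct_32x32": {"sse2": 3922.6, "avx512icl": 506.6},
}


def speedup_vs_sse2(entry):
    """Return how many times faster the AVX-512ICL version is than SSE2."""
    return entry["sse2"] / entry["avx512icl"]


# Hypothetical helper applied to every transform in the table.
speedups = {name: speedup_vs_sse2(entry) for name, entry in timings.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x")
```

By this calculation the new code is roughly 6.8x to 7.9x faster than the SSE2 versions for the four transforms listed.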
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vp9_itx_10_avx512.patch
Type: application/octet-stream
Size: 48167 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20250521/2acb5e54/attachment.obj>

