[FFmpeg-devel] [PATCH] avcodec/x86/vp9: Add AVX-512ICL for 16x16 and 32x32 10bpc inverse transforms
Henrik Gramner
henrik at gramner.com
Wed May 21 18:48:41 EEST 2025
Tested to pass FATE on Linux and Windows.
Checkasm numbers vs the existing SSE2 code on Zen 5 (Strix Halo):
vp9_inv_adst_adst_16x16_sub16_add_10_sse2:      1041.8 ( 1.92x)
vp9_inv_adst_adst_16x16_sub16_add_10_avx512icl:  132.5 (15.06x)
vp9_inv_dct_adst_16x16_sub16_add_10_sse2:        901.0 ( 1.98x)
vp9_inv_dct_adst_16x16_sub16_add_10_avx512icl:   120.8 (14.79x)
vp9_inv_dct_dct_16x16_sub16_add_10_sse2:         750.6 ( 2.10x)
vp9_inv_dct_dct_16x16_sub16_add_10_avx512icl:    110.9 (14.18x)
vp9_inv_dct_dct_32x32_sub32_add_10_sse2:        3922.6 ( 2.24x)
vp9_inv_dct_dct_32x32_sub32_add_10_avx512icl:    506.6 (17.37x)
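
The parenthesized factors appear to be relative to the C reference, as checkasm
normally reports them, so the direct AVX-512ICL gain over SSE2 has to be derived
from the raw timings. A minimal sketch of that arithmetic follows; the short
dictionary keys and the speedup() helper are illustrative names, not part of
checkasm or the patch.

```python
# Timings copied from the checkasm output above (Zen 5, 10bpc).
# Keys are shortened labels chosen for this sketch only.
timings = {
    "adst_adst_16x16": (1041.8, 132.5),  # (sse2, avx512icl)
    "dct_adst_16x16": (901.0, 120.8),
    "dct_dct_16x16": (750.6, 110.9),
    "dct_dct_32x32": (3922.6, 506.6),
}


def speedup(sse2, avx512icl):
    """Return how many times faster the AVX-512ICL timing is than SSE2."""
    return sse2 / avx512icl


# Relative speedup of AVX-512ICL over SSE2 for each transform.
ratios = {name: speedup(*pair) for name, pair in timings.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x faster than SSE2")
```

By this calculation each of the four functions runs somewhere between 6.5x and
8x faster than its SSE2 counterpart on this machine.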
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vp9_itx_10_avx512.patch
Type: application/octet-stream
Size: 48167 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20250521/2acb5e54/attachment.obj>