[FFmpeg-devel] [PATCH v4 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC
Ronald S. Bultje
rsbultje at gmail.com
Mon May 20 18:52:56 EEST 2024
Hi,
one more, I forgot.
On Sun, May 19, 2024 at 8:46 PM Stone Chen <chen.stonechen at gmail.com> wrote:
> +pw_1: dw 1
>
[..]
> + vpbroadcastw m4, [pw_1]
>
We typically suggest using vpbroadcastd, not vpbroadcastw (and then
pw_1: times 2 dw 1). Agner Fog's instruction tables show that on e.g.
Haswell, the former (d) is 1 uop with 5 cycles latency, whereas the latter
(w) is 3 uops with 7 cycles latency; more generally, d is faster than w.
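As a rough sketch of the change (only the two quoted lines are from the
patch; the comments are illustrative, and m4 being a ymm register is an
assumption about the surrounding code):

```
; two copies of the word, so the 32-bit value at pw_1 is 0x00010001
pw_1: times 2 dw 1
[..]
    ; broadcast that dword; every 16-bit lane of m4 still ends up as 1
    vpbroadcastd m4, [pw_1]
```

The register contents are the same as with the word broadcast, because each
broadcast dword holds two words that are both 1.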
Ronald