[FFmpeg-devel] [PATCH v2 0/2] swscale: neon aarch64 rgb_to_yuv optimizations
Dmitriy Kovalenko
dmtr.kovalenko at outlook.com
Fri May 30 00:34:22 EEST 2025
This is a follow-up based on the review feedback from Martin Storsjö. I
fixed all the indentation issues, added post-load increment to all the
macros in use, and moved back into macros all the code that was expanded
inline in the previous version.
Regarding the prefetch instructions: they definitely make no visible
difference on newer CPUs such as those in a MacBook Pro, but in my tests I
see a noticeable 3-5% performance difference on more mobile-class devices,
for example the iPhone 8 with the A11 Bionic CPU or the Amazon Fire HD,
which I am especially interested in optimizing for. According to my
checkasm tests, they slow down neither the macOS nor the Linux ARM builds,
so why not keep them?
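To illustrate the kind of prefetching under discussion, here is a minimal C
sketch. It is not the code from this series, which is AArch64 NEON assembly
in libswscale/aarch64/input.S. The function name rgb24_to_y_prefetch, the
256-byte prefetch distance, and the BT.601-style integer coefficients are
all assumptions picked for the example, and __builtin_prefetch is a
GCC/Clang builtin rather than anything swscale provides.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: prefetch distance in bytes, chosen arbitrarily for illustration. */
#define PREFETCH_AHEAD 256

/* Hypothetical scalar stand-in for a NEON RGB24 -> luma loop. */
static void rgb24_to_y_prefetch(uint8_t *dst, const uint8_t *src, size_t width)
{
    const size_t src_bytes = 3 * width;

    for (size_t i = 0; i < width; i++) {
        const size_t off = 3 * i;

        /* Once per 16 pixels, hint the cache about input we will read soon.
         * The bounds check keeps the hinted address inside the buffer. */
        if ((i & 15) == 0 && off + PREFETCH_AHEAD < src_bytes)
            __builtin_prefetch(src + off + PREFETCH_AHEAD, 0, 0);

        const int r = src[off], g = src[off + 1], b = src[off + 2];

        /* Assumed limited-range BT.601 approximation, not swscale's tables. */
        dst[i] = (uint8_t)(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
    }
}
```

The prefetch is only a hint, so the output is the same with or without it.
Whether it helps depends on the core's hardware prefetcher, which would be
consistent with it showing up on older mobile CPUs and not on newer ones.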
Dmitriy Kovalenko (2):
swscale: rgb_to_yuv neon optimizations
swscale: Neon rgb_to_yuv_half process 32 pixels at a time
libswscale/aarch64/input.S | 212 +++++++++++++++++++++++++++----------
1 file changed, 155 insertions(+), 57 deletions(-)
--
2.49.0