[FFmpeg-devel] [PATCH] Moves yuv2yuvX_sse3 to yasm, unrolls main loop and other small optimizations for ~20% speedup. AVX2 version is ready and tested; although local tests show a significant speed-up in this function using AVX2, swscale code slows down overall, probably due to CPU frequency scaling.

Michael Niedermayer michael at niedermayer.cc
Sat Oct 24 15:20:08 EEST 2020


On Fri, Oct 23, 2020 at 03:34:18PM +0200, Alan Kelly wrote:
>  Fixed. The wrong step size was used, causing a write past the end of
>  the buffer. yuv2yuvX_mmxext is now called if there are any remaining
> pixels.
> 
>  There is currently no checkasm for these functions. Is this required for
> submission?
> 
>  (Apologies for the double mail, I used git send-email but it didn't
> respond to the correct thread)
> ---
>  libswscale/x86/Makefile     |   1 +
>  libswscale/x86/swscale.c    |  75 ++++----------------------
>  libswscale/x86/yuv2yuvX.asm | 105 ++++++++++++++++++++++++++++++++++++
>  3 files changed, 116 insertions(+), 65 deletions(-)
>  create mode 100644 libswscale/x86/yuv2yuvX.asm
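
The fix described in the quoted message (a main loop that advances by its real step size, with a fallback for any remaining pixels) can be sketched in plain C roughly as below. This is only an illustration, not code from the patch: STEP, kernel_wide, kernel_fallback and process_row are hypothetical names, and the halving operation is a placeholder for the actual filtering. kernel_fallback plays the role that yuv2yuvX_mmxext has in the description above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STEP 16  /* hypothetical: pixels written per unrolled iteration */

/* Stand-in for the wide kernel: always writes exactly STEP bytes. */
static void kernel_wide(uint8_t *dst, const uint8_t *src)
{
    for (size_t i = 0; i < STEP; i++)
        dst[i] = (uint8_t)(src[i] >> 1);
}

/* Stand-in for the narrower fallback, which accepts any length. */
static void kernel_fallback(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)(src[i] >> 1);
}

/* Run the wide kernel only over whole STEP-sized blocks, then hand the
 * remaining pixels to the fallback so nothing is written past width. */
static void process_row(uint8_t *dst, const uint8_t *src, size_t width)
{
    size_t main_len = width - width % STEP;  /* largest multiple of STEP */
    size_t i;

    for (i = 0; i < main_len; i += STEP)     /* advance by the real step */
        kernel_wide(dst + i, src + i);

    if (i < width)                           /* leftover pixels, if any */
        kernel_fallback(dst + i, src + i, width - i);
}
```

If the loop advanced by less than the kernel writes, or ran while i < width rather than i < main_len, the last iteration could store beyond the end of dst, which is the kind of overrun the message says was fixed.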

error: corrupt patch at line 18

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The worst form of inequality is to try to make unequal things equal.
-- Aristotle