[FFmpeg-devel] swscale/rgb2rgb : add x86_64 SIMD (SSSE3 and AVX2) for shuffle_bytes func

Martin Vignali martin.vignali at gmail.com
Sun Mar 18 18:46:04 EET 2018


2018-03-18 17:37 GMT+01:00 Paul B Mahol <onemda at gmail.com>:

> On 3/18/18, Nicolas George <george at nsup.org> wrote:
> > Martin Vignali (2018-03-18):
> >> I run the test again with a bigger width (512 instead of 128)
> >> This is my result :
> >> shuffle_bytes_0321_c: 128.6
> >> shuffle_bytes_0321_ssse3: 41.6
> >> shuffle_bytes_0321_avx2: 23.4
> >
> > IIUC, these benchmarks are expressed in CPU cycles. But what James says
> > is that it can cause the CPU frequency to be throttled: if that happens,
> > fewer cycles can take more time and, even worse, cause other, unrelated
> > code to take more time. A benchmark in actual time on a typical use case
> > would be needed to decide.
>
> Yes, always also test overall with a typical use case.
>
>
I tested it using a "benchmark" command line, which exercises two shuffle
functions:
./ffmpeg -benchmark -f lavfi -i rgbtestsrc=size=3840x2160:duration=10 -vf format=argb,format=rgba -f null -
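
For readers unfamiliar with these functions, here is a minimal Python sketch of
what a 4-byte shuffle does. It assumes that the digits in a name such as
shuffle_bytes_0321 give, for each output byte, the index of the source byte
within a 4-byte pixel. That reading of the name is an assumption, and the
helper below is purely illustrative, not the swscale code:

```python
def shuffle_bytes(src: bytes, order: tuple) -> bytes:
    """Reorder every 4-byte group of src according to order (illustrative helper)."""
    if len(src) % 4 != 0:
        raise ValueError("input length must be a multiple of 4")
    out = bytearray(len(src))
    for i in range(0, len(src), 4):
        # Output byte k of each pixel is taken from source byte order[k].
        for k, idx in enumerate(order):
            out[i + k] = src[i + idx]
    return bytes(out)


# One pixel with four distinct byte values.
pixel = bytes([0x10, 0x20, 0x30, 0x40])

# Assumed meaning of "0321": keep byte 0, swap bytes 1 and 3.
shuffled = shuffle_bytes(pixel, (0, 3, 2, 1))
```

A SIMD version does the same reordering on many pixels per instruction, which
is where the cycle counts above differ.
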

With the patch :
bench: utime=3.611s
With only SSSE3 (AVX2 part disabled), I get a similar result.

Without the patch :
bench: utime=6.972s
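
The ratios implied by the figures in this thread can be worked out with a short
Python sketch. The variable names are illustrative only; the numbers are the
ones quoted above (cycle counts from the function benchmark, utime from the
command-line benchmark):

```python
# Per-function benchmark at width 512, in cycles (as quoted above).
cycles_c = 128.6
cycles_ssse3 = 41.6
cycles_avx2 = 23.4

# Whole-command user time in seconds (as quoted above).
utime_without_patch = 6.972
utime_with_patch = 3.611

# Speedup of each SIMD version over the C version, per function call.
speedup_ssse3 = cycles_c / cycles_ssse3
speedup_avx2 = cycles_c / cycles_avx2

# Speedup of the whole command, which also includes non-shuffle work.
speedup_overall = utime_without_patch / utime_with_patch

print(f"SSSE3 vs C (cycles): {speedup_ssse3:.2f}x")
print(f"AVX2 vs C (cycles):  {speedup_avx2:.2f}x")
print(f"Overall (utime):     {speedup_overall:.2f}x")
```

The overall figure is lower than the per-function ones, as expected when the
shuffle is only part of the work done by the command.
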

Martin

