[FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffle_bytes func

Martin Vignali martin.vignali at gmail.com
Sun Mar 18 17:08:17 EET 2018


2018-03-03 18:20 GMT+01:00 Martin Vignali <martin.vignali at gmail.com>:

> Hello,
>
> The attached patch adds SIMD for the 5 shuffle_bytes functions of rgb2rgb.
> The new SIMD is written using external ASM.
>
> It also adds a checkasm test for these functions.
> Restricted to x86_64, because the scalar part doesn't compile on x86_32.
>
> For the scalar part, I assume that the src_size value is a multiple of 4
> (because the shuffle works on groups of 4 bytes).
>
> Passes FATE tests on X86_64 and X86_32 (os 10.12)
>
>
>
>
New patches attached:
- Now compiles on x86_32 and x86_64
- Adds a cosmetic patch to put all shuffle_bytes declarations in the same place
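
For readers who do not have the patches at hand, here is a minimal sketch of what a scalar 4-byte shuffle looks like. It is not the code from the patch: the helper name shuffle_bytes_sketch and the order[] parameter are hypothetical, and it assumes that the digits in a name such as shuffle_bytes_2103 give, for each output position, the index of the source byte inside the 4-byte group. As in the original mail, src_size is assumed to be a multiple of 4.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the code from the patch: reorders each 4-byte
 * group of src into dst. order[k] is the index, inside the group, of the
 * source byte copied to output position k, so {2, 1, 0, 3} would match the
 * "2103" naming (assumed convention). src and dst must not overlap. */
static void shuffle_bytes_sketch(const uint8_t *src, uint8_t *dst,
                                 size_t src_size, const uint8_t order[4])
{
    /* Assumes src_size is a multiple of 4; trailing bytes are left alone. */
    for (size_t i = 0; i + 4 <= src_size; i += 4) {
        dst[i + 0] = src[i + order[0]];
        dst[i + 1] = src[i + order[1]];
        dst[i + 2] = src[i + order[2]];
        dst[i + 3] = src[i + order[3]];
    }
}
```

A SIMD version would typically do the same reordering on 16 or 32 bytes per iteration and fall back to a loop like this one for the remaining bytes, which is why the multiple-of-4 assumption matters for the scalar part.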

Tested on X86_64 and X86_32 (os 10.12)

Checkasm result (./tests/checkasm/checkasm --test=sw_rgb --bench):

checkasm: using random seed 292997963
MMX:
 - sw_rgb.shuffle_bytes_2103 [OK]
MMXEXT:
 - sw_rgb.shuffle_bytes_2103 [OK]
SSSE3:
 - sw_rgb.shuffle_bytes_2103 [OK]
 - sw_rgb.shuffle_bytes_0321 [OK]
 - sw_rgb.shuffle_bytes_1230 [OK]
 - sw_rgb.shuffle_bytes_3012 [OK]
 - sw_rgb.shuffle_bytes_3210 [OK]
AVX2:
 - sw_rgb.shuffle_bytes_2103 [OK]
 - sw_rgb.shuffle_bytes_0321 [OK]
 - sw_rgb.shuffle_bytes_1230 [OK]
 - sw_rgb.shuffle_bytes_3012 [OK]
 - sw_rgb.shuffle_bytes_3210 [OK]
checkasm: all 12 tests passed
shuffle_bytes_0321_c: 51.4
shuffle_bytes_0321_ssse3: 18.7
shuffle_bytes_0321_avx2: 12.7
shuffle_bytes_1230_c: 126.9
shuffle_bytes_1230_ssse3: 16.7
shuffle_bytes_1230_avx2: 12.9
shuffle_bytes_2103_c: 52.4
shuffle_bytes_2103_mmx: 76.7
shuffle_bytes_2103_mmxext: 197.2
shuffle_bytes_2103_ssse3: 17.4
shuffle_bytes_2103_avx2: 12.4
shuffle_bytes_3012_c: 127.4
shuffle_bytes_3012_ssse3: 14.7
shuffle_bytes_3012_avx2: 12.4
shuffle_bytes_3210_c: 127.4
shuffle_bytes_3210_ssse3: 18.2
shuffle_bytes_3210_avx2: 12.9
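
To read the figures above as speedups over the C version, a small sketch like the following can be used. The struct and function names are hypothetical; the numbers are copied from the benchmark output above, and the ratio is simply the C time divided by the SIMD time.

```c
#include <stdio.h>

/* Hypothetical record holding one row of the checkasm --bench output. */
struct bench_row {
    const char *name;
    double c, ssse3, avx2;
};

/* Figures copied from the checkasm run above. */
static const struct bench_row rows[] = {
    { "shuffle_bytes_0321",  51.4, 18.7, 12.7 },
    { "shuffle_bytes_1230", 126.9, 16.7, 12.9 },
    { "shuffle_bytes_2103",  52.4, 17.4, 12.4 },
    { "shuffle_bytes_3012", 127.4, 14.7, 12.4 },
    { "shuffle_bytes_3210", 127.4, 18.2, 12.9 },
};

/* Speedup of a SIMD version relative to the C reference time. */
static double speedup(double c_time, double simd_time)
{
    return c_time / simd_time;
}

/* Prints one line per function with the SSSE3 and AVX2 speedups. */
static void print_speedups(void)
{
    for (size_t i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
        printf("%s: ssse3 %.2fx, avx2 %.2fx\n", rows[i].name,
               speedup(rows[i].c, rows[i].ssse3),
               speedup(rows[i].c, rows[i].avx2));
}
```

Calling print_speedups() from a main() prints the ratios. Note that in this run the existing MMX and MMXEXT versions of shuffle_bytes_2103 measure slower than the C version.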


Martin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-swscale-rgb-add-X86-SIMD-SSSE3-AVX2-for.patch
Type: application/octet-stream
Size: 5184 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20180318/9820ffec/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-swscale-rgb-add-X86-SIMD-SSSE3-AVX2-for.patch
Type: application/octet-stream
Size: 8106 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20180318/9820ffec/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-checkasm-swscale-add-test-for-rgb-shuffle_bytes-func.patch
Type: application/octet-stream
Size: 5752 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20180318/9820ffec/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-swscale-rgb2rgb-cosmetic-move-shuffle_bytes-func-dec.patch
Type: application/octet-stream
Size: 1954 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20180318/9820ffec/attachment-0003.obj>

