[FFmpeg-devel] swscale/arm/yuv2rgb: make the code bitexact with its aarch64 counter part

Matthieu Bouron matthieu.bouron at gmail.com
Fri Mar 25 23:45:55 CET 2016


The following patchset aims to make the armv7 NEON yuv->rgba code path
bitexact with the aarch64 one. It also aims to make the two code bases as
close as possible.

[PATCH 01/10] swscale/arm/yuv2rgb: remove 32bit code path

The current 32bit code path, which is unused, is removed.

[PATCH 06/10] swscale/arm/yuv2rgb: only process one line at a time

The code now processes only one line at a time for the yuv420p, nv12 and nv21
formats. No performance regression was observed on an rpi2 (I even observed a
slight performance increase for the nv12 and nv21 formats).
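
To illustrate what per-line processing implies for these 4:2:0 formats, here
is a minimal Python sketch. It is not the NEON code, and the function names are
hypothetical. It relies only on the layout of the formats: chroma is subsampled
by two vertically, so luma row y reads chroma row y // 2, and nv12/nv21 store
chroma interleaved (UV for nv12, VU for nv21) while yuv420p uses separate
planes.

```python
def chroma_row_for_line(y):
    """Row of the chroma plane(s) used by luma row y in a 4:2:0 format."""
    return y // 2


def split_chroma_row(fmt, row):
    """Return (u_samples, v_samples) for one chroma row.

    For yuv420p, `row` is a (u_row, v_row) pair taken from the two planes.
    For nv12/nv21, `row` is a single interleaved row of bytes.
    """
    if fmt == "yuv420p":
        u_row, v_row = row
        return bytes(u_row), bytes(v_row)
    if fmt == "nv12":  # U first, then V
        return bytes(row[0::2]), bytes(row[1::2])
    if fmt == "nv21":  # V first, then U
        return bytes(row[1::2]), bytes(row[0::2])
    raise ValueError("unsupported format: %s" % fmt)
```

With one line per iteration, two consecutive output lines (2k and 2k + 1)
simply read the same chroma row k.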

[PATCH 10/10] swscale/arm/yuv2rgb: make the code bitexact with its aarch64 counter part

The last patch of the series makes the code bitexact with the aarch64 version.
The increase in precision (which introduces a performance loss) is compensated
by a refactor/optimisation that saves quite a few mov, vdup and vqdmulh
instructions.
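
For readers unfamiliar with vqdmulh, the following is a scalar Python model of
one signed 16-bit lane of that NEON instruction (saturating doubling multiply,
returning the high half). It is only a sketch of the instruction's arithmetic;
the function name is hypothetical and it says nothing about how the patch
arranges its multiplies. The point is that the low 16 bits of the doubled
product are discarded, which is the kind of precision detail two
implementations must agree on to produce identical output.

```python
INT16_MIN, INT16_MAX = -32768, 32767
INT32_MAX = 0x7FFFFFFF


def vqdmulh_s16(a, b):
    """Model one lane of VQDMULH.S16: high half of the saturated 2*a*b."""
    for operand in (a, b):
        if not INT16_MIN <= operand <= INT16_MAX:
            raise ValueError("operand out of int16 range")
    doubled = 2 * a * b
    # The only overflow case is a == b == -32768, which saturates.
    if doubled > INT32_MAX:
        doubled = INT32_MAX
    # Python's >> on ints is an arithmetic shift (rounds toward -infinity).
    return doubled >> 16
```

For example, vqdmulh_s16(16384, 16384) returns 8192, i.e. 0.5 * 0.5 = 0.25 when
the operands are read as Q15 fixed-point values.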

./ffmpeg_g -nostats -f lavfi -i testsrc2=1920x1080:d=5,format=nv12,bench=start,format=bgra,bench=stop -f null -

without patchset:
[bench @ 0x3eb6a0] t:0.020660 avg:0.020813 max:0.039399 min:0.020605

with patchset:
[bench @ 0xe5f6a0] t:0.018924 avg:0.019075 max:0.037472 min:0.018846
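
To put the two bench lines above side by side, here is a small Python sketch
that computes the relative change of each reported statistic. The helper name
and the dictionaries are illustrative only; the numbers are copied from the
output above.

```python
def relative_change(before, after):
    """Relative change from `before` to `after` (negative means faster)."""
    return (after - before) / before


without_patchset = {"avg": 0.020813, "max": 0.039399, "min": 0.020605}
with_patchset = {"avg": 0.019075, "max": 0.037472, "min": 0.018846}

changes = {
    key: relative_change(without_patchset[key], with_patchset[key])
    for key in without_patchset
}

for key, value in changes.items():
    print("%s: %+.1f%%" % (key, value * 100.0))
```

The average time per frame drops by roughly 8.4% with the patchset applied.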

Matthieu

