[FFmpeg-devel] [PATCH] PPC64: Add versions of functions in libswscale/input.c optimized for POWER8 VSX SIMD.

Carl Eugen Hoyos cehoyos at ag.or.at
Mon Jul 4 12:20:03 EEST 2016


Dan Parrot <dan.parrot <at> mail.com> writes:

> The dataset used was the entire FATE regression suite.

I don't think this is a particularly useful test case:
It takes a very long time but mostly exercises other code.

Did you test whether ffmpeg -benchmark -f rawvideo -i /dev/zero...
shows different results?
I believe this would be both easier and faster to run.

> name: rgb24ToY_c_vsx. 
> no. of calls: 9999. min: 3832 ns. avg: 4709 ns. max: 37550 ns. 
> total: 47093533 ns. 
> 
> name: rgb24ToY_c. 
> no. of calls: 9999. min: 3809 ns. avg: 4707 ns. max: 29041 ns. 
> total: 47072923 ns.

Without any data, I would have expected this to be the most
important function (and "no. of calls" seems to confirm this).

Why is this not faster?
Can you confirm with START_TIMER / STOP_TIMER that there is no 
gain?
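
For a quick check outside the tree, a minimal standalone sketch of a
per-call timer that collects the same min / avg / max / total figures
as quoted above could look like the following. This is not
START_TIMER / STOP_TIMER (those are macros from libavutil/timer.h);
the names rgb24_to_y_ref, time_calls and TimingStats are hypothetical,
and the conversion is a generic BT.601 integer approximation standing
in for the function under test, not the libswscale code.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime() */
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the function under test: a generic
 * BT.601 limited-range luma approximation, NOT libswscale's code. */
static void rgb24_to_y_ref(uint8_t *dst, const uint8_t *src, int width)
{
    for (int i = 0; i < width; i++) {
        int r = src[3 * i], g = src[3 * i + 1], b = src[3 * i + 2];
        dst[i] = (uint8_t)(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
    }
}

/* Per-function statistics; avg is total_ns / calls. */
typedef struct {
    uint64_t calls, min_ns, max_ns, total_ns;
} TimingStats;

typedef void (*convert_fn)(uint8_t *dst, const uint8_t *src, int width);

/* Monotonic wall-clock time in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Call fn() `calls` times on the same buffers, timing each call. */
static TimingStats time_calls(convert_fn fn, uint8_t *dst,
                              const uint8_t *src, int width,
                              unsigned calls)
{
    TimingStats st = { 0, UINT64_MAX, 0, 0 };

    assert(calls > 0);
    for (unsigned n = 0; n < calls; n++) {
        uint64_t t0 = now_ns();
        fn(dst, src, width);
        uint64_t dt = now_ns() - t0;

        if (dt < st.min_ns) st.min_ns = dt;
        if (dt > st.max_ns) st.max_ns = dt;
        st.total_ns += dt;
        st.calls++;
    }
    return st;
}
```

Passing the C and the VSX variant to time_calls() with identical
buffers and a realistic line width makes the two sets of numbers
directly comparable; the timer overhead is included in every sample,
so very short calls should be measured with a larger width.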

Thank you, Carl Eugen


