[FFmpeg-devel] [PATCH] avfilter/vf_w3fdif: add x86 SIMD
jamrial at gmail.com
Wed Oct 7 22:44:19 CEST 2015
On 10/7/2015 5:27 PM, Paul B Mahol wrote:
> +cglobal w3fdif_simple_high, 5, 10, 8, 0, work_line, in_lines_cur0, in_lines_adj0, coef, linesize
> +cglobal w3fdif_complex_high, 5, 14, 8, 0, work_line, in_lines_cur0, in_lines_adj0, coef, linesize
All the values in coeff_hf fit in 16-bit words, so you should be able to get
these two functions working with pmaddwd (SSE2) instead of pmulld (SSE4.1).
It will be faster and will also work on older CPUs.
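
To illustrate the idea, here is a rough sketch in C intrinsics rather than x86inc asm. It assumes 8-bit input pixels and interleaves two input lines so that one pmaddwd (_mm_madd_epi16) produces a[i]*ca + b[i]*cb per dword. The helper name madd_two_lines and the coefficient values in the usage note are made up for the example and are not taken from the patch.

```c
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: out[i] = a[i] * ca + b[i] * cb for 8 pixels.
 * Both coefficients must fit in a signed 16-bit word. */
static void madd_two_lines(int32_t out[8], const uint8_t *a, const uint8_t *b,
                           int16_t ca, int16_t cb)
{
    const __m128i zero = _mm_setzero_si128();
    /* Word lanes, low to high: ca, cb, ca, cb, ... */
    const __m128i coef = _mm_set_epi16(cb, ca, cb, ca, cb, ca, cb, ca);

    /* Zero extend 8 bytes of each line to 8 words (punpcklbw). */
    __m128i va = _mm_unpacklo_epi8(_mm_loadl_epi64((const __m128i *)a), zero);
    __m128i vb = _mm_unpacklo_epi8(_mm_loadl_epi64((const __m128i *)b), zero);

    /* Interleave the two lines: a0 b0 a1 b1 ... (punpcklwd / punpckhwd). */
    __m128i lo = _mm_unpacklo_epi16(va, vb);
    __m128i hi = _mm_unpackhi_epi16(va, vb);

    /* pmaddwd: each dword = a[i] * ca + b[i] * cb, signed 16x16 -> 32. */
    _mm_storeu_si128((__m128i *)(out + 0), _mm_madd_epi16(lo, coef));
    _mm_storeu_si128((__m128i *)(out + 4), _mm_madd_epi16(hi, coef));
}
```

Calling madd_two_lines(out, a, b, -1024, 2047) on two 8-pixel rows should give the same dwords as doing the two multiplies and the add separately in 32 bits, but with a single SSE2 multiply-add per four pixels.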
You can probably also replace the pshufb with some punpck* instructions,
since (unless I'm reading this wrong) you're just doing a zero extend,
and effectively get these functions down to SSE2.
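
A minimal sketch of the zero extend done with unpacks only, again in C intrinsics for readability. It assumes the shuffle is widening unsigned bytes to dwords; the function name zext_u8_to_u32 is hypothetical.

```c
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: widen 8 unsigned bytes to 8 dwords using only
 * SSE2 unpacks against a zero register, with no pshufb mask needed. */
static void zext_u8_to_u32(uint32_t out[8], const uint8_t *src)
{
    const __m128i zero = _mm_setzero_si128();
    __m128i bytes = _mm_loadl_epi64((const __m128i *)src);

    /* punpcklbw with zero: bytes -> words */
    __m128i words = _mm_unpacklo_epi8(bytes, zero);

    /* punpcklwd / punpckhwd with zero: words -> dwords */
    _mm_storeu_si128((__m128i *)(out + 0), _mm_unpacklo_epi16(words, zero));
    _mm_storeu_si128((__m128i *)(out + 4), _mm_unpackhi_epi16(words, zero));
}
```

The unpacks cost two instructions per widening step where pshufb needs one plus a mask constant, which is why it is worth benchmarking both.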
If pshufb is measurably faster, then you can keep both SSE2 and SSSE3
versions and let the init code choose the best at runtime.
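
The runtime choice could look roughly like the sketch below. Everything here is a stand-in: the CPU_* flag bits, the DSPContext struct and the filter_* functions are hypothetical and are not the actual FFmpeg API or the functions from the patch. The point is only the ordering, where later checks override earlier ones so the best available version wins.

```c
#include <stdint.h>

/* Hypothetical CPU feature bits; real init code would query the CPU. */
enum { CPU_SSE2 = 1 << 0, CPU_SSSE3 = 1 << 1 };

typedef void (*filter_fn)(int32_t *work_line, const uint8_t *in,
                          int coef, int n);

/* Plain C reference version. */
static void filter_c(int32_t *work_line, const uint8_t *in, int coef, int n)
{
    for (int i = 0; i < n; i++)
        work_line[i] += in[i] * coef;
}

/* Stand-ins for the asm versions: same result, separate entry points. */
static void filter_sse2(int32_t *w, const uint8_t *in, int coef, int n)
{
    filter_c(w, in, coef, n);
}

static void filter_ssse3(int32_t *w, const uint8_t *in, int coef, int n)
{
    filter_c(w, in, coef, n);
}

typedef struct {
    filter_fn   filter;
    const char *name; /* only here so the choice is easy to inspect */
} DSPContext;

/* Start from the C version, then let each supported extension override it. */
static void dsp_init(DSPContext *dsp, int cpu_flags)
{
    dsp->filter = filter_c;
    dsp->name   = "c";
    if (cpu_flags & CPU_SSE2) {
        dsp->filter = filter_sse2;
        dsp->name   = "sse2";
    }
    if (cpu_flags & CPU_SSSE3) {
        dsp->filter = filter_ssse3;
        dsp->name   = "ssse3";
    }
}
```

If benchmarks show no gain from pshufb, the SSSE3 branch can simply be dropped and SSE2 machines and newer all take the same path.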