[FFmpeg-devel] [PATCH 4/7] x86: sbrdsp: implement SSE hf_apply_noise
christophe.gisquet at gmail.com
Sun Apr 7 12:25:13 CEST 2013
2013/4/6 Michael Niedermayer <michaelni at gmx.at>:
Argh, lapsus codae I guess :D
>> +%if cpuflag(sse2)
>> + punpckhdq m4, m3, m3
>> + punpckldq m3, m3, m3
>> +%else
>> + unpckhps m4, m3, m3
>> + unpcklps m3, m3, m3
>> +%endif
Same suggestion as Diego, which makes sense.
> it might make sense to do something in some header with a macro
> maybe so that punpckl/dq get turned into unpck* on SSE1
Like mova/movh for instance, as the insns are indeed completely similar.
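To illustrate why such a macro is plausible, here is a minimal C sketch using intrinsics rather than the asm under discussion. It assumes an x86 target with SSE2 available; the helper name dup_dwords is hypothetical and not part of the patch. It shows that the integer-domain unpack (punpck{l,h}dq) and the float-domain unpack (unpck{l,h}ps) move the same 32-bit lanes, so the results are bit-identical.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (also provides the SSE1 ones) */
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: duplicate each dword of `in` into `lo` and `hi`.
 * use_float == 0: integer-domain unpack (punpckldq / punpckhdq, SSE2)
 * use_float != 0: float-domain unpack   (unpcklps  / unpckhps,  SSE1)
 */
static void dup_dwords(const uint32_t in[4], uint32_t lo[4], uint32_t hi[4],
                       int use_float)
{
    __m128i a = _mm_loadu_si128((const __m128i *)in);
    __m128i l, h;

    if (use_float) {
        /* The casts only reinterpret the bits; no conversion happens. */
        __m128 f = _mm_castsi128_ps(a);
        l = _mm_castps_si128(_mm_unpacklo_ps(f, f));
        h = _mm_castps_si128(_mm_unpackhi_ps(f, f));
    } else {
        l = _mm_unpacklo_epi32(a, a);
        h = _mm_unpackhi_epi32(a, a);
    }
    _mm_storeu_si128((__m128i *)lo, l);
    _mm_storeu_si128((__m128i *)hi, h);
}
```

Calling dup_dwords on the same input with use_float set to 0 and then to 1 gives the same lo and hi arrays, which is the property a header macro would rely on when picking the instruction per cpuflag.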
Jason and Loren suggest using and upgrading SBUTTERFLY. But:
1) There may be cases where we still want to use those unpck for something else
2) pxor/xorps could also be considered (pshufd/shufps are very
similar, but probably not enough to go through this)
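One possible reading of point 2, sketched in C intrinsics and assuming an x86 target with SSE2 (the function names are hypothetical): pxor and xorps are interchangeable at the bit level, whereas pshufd selects all four dwords from one source while shufps takes the low two from its first source and the high two from its second, so the two shuffles only agree when both shufps sources are the same.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (also provides the SSE1 ones) */
#include <stdint.h>
#include <string.h>

/* Returns 1 if pxor and xorps give bit-identical results for x ^ y. */
static int xor_forms_agree(const uint32_t x[4], const uint32_t y[4])
{
    __m128i a  = _mm_loadu_si128((const __m128i *)x);
    __m128i b  = _mm_loadu_si128((const __m128i *)y);
    __m128i ri = _mm_xor_si128(a, b);
    __m128i rf = _mm_castps_si128(_mm_xor_ps(_mm_castsi128_ps(a),
                                             _mm_castsi128_ps(b)));
    return memcmp(&ri, &rf, sizeof ri) == 0;
}

/*
 * Returns 1 if pshufd(x) equals shufps(x, y) for the "reverse" selector.
 * pshufd reads every lane from x; shufps reads lanes 0-1 of the result
 * from x and lanes 2-3 from y.
 */
static int shuffle_forms_agree(const uint32_t x[4], const uint32_t y[4])
{
    __m128i a  = _mm_loadu_si128((const __m128i *)x);
    __m128i b  = _mm_loadu_si128((const __m128i *)y);
    __m128i ri = _mm_shuffle_epi32(a, 0x1B);
    __m128i rf = _mm_castps_si128(_mm_shuffle_ps(_mm_castsi128_ps(a),
                                                 _mm_castsi128_ps(b), 0x1B));
    return memcmp(&ri, &rf, sizeof ri) == 0;
}
```

Under this reading, an xor alias would be a plain one-to-one substitution, while a shuffle alias would need to constrain its operands, which may be why it is not worth the trouble.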
Regarding SBUTTERFLY, the AVX1 _f128 version would need to be integrated.
But here the noise table addressing wraps around, so I think using
ymm would require unrolling the function to handle this. Therefore I'd
prefer to focus on xmm versions and leave ymm to someone who really
wants to write an AVX version and test it (e.g. for GSoC/whatever).
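A rough C sketch of the wrap-around concern, under assumptions not stated in this mail: the noise table is taken to hold 512 complex entries indexed modulo its size, and all names below are illustrative. A contiguous vector load of several entries can run past the end of the table, and the wider the load, the more start positions are affected.

```c
/* Assumed table size; a power of two so that masking wraps the index. */
#define NOISE_TABLE_SIZE 512u
#define NOISE_TABLE_MASK (NOISE_TABLE_SIZE - 1u)

/* Scalar reference: advance the noise index by one entry, wrapping. */
static unsigned next_noise(unsigned noise)
{
    return (noise + 1u) & NOISE_TABLE_MASK;
}

/* Returns 1 if loading `width` contiguous entries from `start` would
 * read past the last table entry instead of wrapping to entry 0. */
static int load_crosses_end(unsigned start, unsigned width)
{
    return (start & NOISE_TABLE_MASK) + width > NOISE_TABLE_SIZE;
}

/* Counts the start indices for which a `width`-entry load crosses the end. */
static unsigned count_crossing_starts(unsigned width)
{
    unsigned s, n = 0;

    for (s = 0; s < NOISE_TABLE_SIZE; s++)
        n += (unsigned)load_crosses_end(s, width);
    return n;
}
```

With one complex float pair per entry, an xmm register covers 2 entries and a ymm register covers 4, so count_crossing_starts(2) and count_crossing_starts(4) give a feel for how many more cases a ymm version would have to special-case or unroll around.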