[FFmpeg-devel] [PATCH] av_filter/x86/idet: MMX/SSE2 implementation of 16bits filter_line()
James Almer
jamrial at gmail.com
Tue Sep 9 19:31:32 CEST 2014
On 09/09/14 9:52 AM, Pascal Massimino wrote:
> + mova m2, m_sum
> +%if mmsize == 16
> + psrldq m2, 4
> + paddd m_sum, m2
> + psrldq m2, 4
> + paddd m_sum, m2
> + psrldq m2, 4
> + paddd m_sum, m2
> +%else
> + psrlq m2, 32
> + paddd m_sum, m2
> +%endif
The SSE2 version uses three more instructions than necessary here.
You could replace the code above with the HADDD macro, which expands
to a shorter SSE2 sequence.
And now that I check the old code again, you could also use it in the
IDET_FILTER_LINE macro. That would save one instruction in the mmxext
version.
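
To make the instruction count concrete, here is a rough C model of the two
horizontal-add strategies, written with SSE2 intrinsics rather than x86inc
syntax. The function names hsum_shift and hsum_fold are hypothetical. The
mapping of intrinsics to the instructions HADDD emits (a movhlps-style fold
followed by pshuflw) is an assumption that should be checked against
x86util.asm.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Shift-and-add reduction modelled on the quoted patch:
 * one register copy plus three psrldq/paddd pairs (7 instructions). */
static int32_t hsum_shift(__m128i sum)
{
    __m128i tmp = sum;                  /* mova   m2, m_sum */
    tmp = _mm_srli_si128(tmp, 4);       /* psrldq m2, 4     */
    sum = _mm_add_epi32(sum, tmp);      /* paddd  m_sum, m2 */
    tmp = _mm_srli_si128(tmp, 4);
    sum = _mm_add_epi32(sum, tmp);
    tmp = _mm_srli_si128(tmp, 4);
    sum = _mm_add_epi32(sum, tmp);
    return _mm_cvtsi128_si32(sum);      /* total ends up in the low dword */
}

/* Fold-in-half reduction (assumed to resemble HADDD on SSE2):
 * add the high 64 bits onto the low 64 bits, then dword 1 onto dword 0.
 * Two shuffles and two adds (4 instructions). */
static int32_t hsum_fold(__m128i sum)
{
    __m128i tmp = _mm_unpackhi_epi64(sum, sum);  /* stands in for movhlps */
    sum = _mm_add_epi32(sum, tmp);
    tmp = _mm_shufflelo_epi16(sum, 0x0E);        /* pshuflw: dword 1 -> dword 0 */
    sum = _mm_add_epi32(sum, tmp);
    return _mm_cvtsi128_si32(sum);
}
```

Both functions return the sum of the four 32-bit lanes. Only the low dword
of the result is meaningful in either case, which is all the caller needs
when the total is moved to a general-purpose register afterwards.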