[FFmpeg-devel] [PATCH] x86/swr: make int32_to_int32 un/pack_2ch functions SSE

Michael Niedermayer michaelni at gmx.at
Wed Jan 14 17:59:54 CET 2015


On Wed, Jan 14, 2015 at 01:53:48AM -0300, James Almer wrote:
> unpack_2ch is already using sse float ops only, and pack_2ch is a trivial change.
> Rename both to float_to_float for consistency.
> 
> Signed-off-by: James Almer <jamrial at gmail.com>
> ---
>  libswresample/x86/audio_convert.asm    | 14 ++++++++------
>  libswresample/x86/audio_convert_init.c | 11 +++++++----
>  2 files changed, 15 insertions(+), 10 deletions(-)
> 
> diff --git a/libswresample/x86/audio_convert.asm b/libswresample/x86/audio_convert.asm
> index 1617e0b..c13c26f 100644
> --- a/libswresample/x86/audio_convert.asm
> +++ b/libswresample/x86/audio_convert.asm
> @@ -60,8 +60,8 @@ pack_2ch_%2_to_%1_u_int %+ SUFFIX
>      punpcklwd m0, m2
>      punpckhwd m1, m2
>  %else
> -    punpckldq m0, m2
> -    punpckhdq m1, m2
> +    unpcklps  m0, m2
> +    unpckhps  m1, m2
>  %endif
>      %6 m0,m1,m2,m3,m4,m5
>  %else

Did you benchmark this?
I've just checked, and on Pentium M, Core Solo and Core Duo these are
listed as having only 1/5 the throughput.
On Sandy Bridge they are still listed with half the throughput of
their integer counterparts.
I didn't benchmark it though.
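
For reference, a minimal C sketch (not part of the patch) of why the swap is
functionally safe: unpcklps/unpckhps and punpckldq/punpckhdq interleave 32-bit
lanes in the same way, so the output bits should be identical and only
throughput is in question. The helper names are hypothetical, and the
intrinsics stand in for the asm; this checks bit-exactness only, it does not
measure speed.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (also provides the SSE ones) */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: interleave a and b with the integer unpacks
 * (punpckldq / punpckhdq), as the code did before the patch. */
static void interleave_int(const int32_t *a, const int32_t *b, int32_t *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out,       _mm_unpacklo_epi32(va, vb));
    _mm_storeu_si128((__m128i *)(out + 4), _mm_unpackhi_epi32(va, vb));
}

/* Hypothetical helper: same interleave with the float unpacks
 * (unpcklps / unpckhps), as the patch proposes. The casts only
 * reinterpret the register, they do not convert values. */
static void interleave_float(const int32_t *a, const int32_t *b, int32_t *out)
{
    __m128 va = _mm_castsi128_ps(_mm_loadu_si128((const __m128i *)a));
    __m128 vb = _mm_castsi128_ps(_mm_loadu_si128((const __m128i *)b));
    _mm_storeu_si128((__m128i *)out,
                     _mm_castps_si128(_mm_unpacklo_ps(va, vb)));
    _mm_storeu_si128((__m128i *)(out + 4),
                     _mm_castps_si128(_mm_unpackhi_ps(va, vb)));
}

/* Returns 1 if both variants produce bit-identical output, including
 * for bit patterns that would be NaN or denormal if read as floats. */
static int unpack_outputs_match(void)
{
    const int32_t a[4] = { 1, -1, 0x7FC00001, INT32_MIN };
    const int32_t b[4] = { 0, 0x00000001, 0x7F800000, 42 };
    int32_t out_int[8], out_float[8];

    interleave_int(a, b, out_int);
    interleave_float(a, b, out_float);
    return memcmp(out_int, out_float, sizeof(out_int)) == 0;
}
```

Answering the throughput question would still need a timed loop over both
variants on the CPUs in question.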

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

During times of universal deceit, telling the truth becomes a
revolutionary act. -- George Orwell