[FFmpeg-devel] [PATCH] x86/dsputil: implement 3DNow version of vector_clipf

James Almer jamrial at gmail.com
Tue May 27 21:08:30 CEST 2014


On 27/05/14 3:17 PM, Clément Bœsch wrote:
> On Tue, May 27, 2014 at 03:16:03PM -0300, James Almer wrote:
>> Signed-off-by: James Almer <jamrial at gmail.com>
>> ---
>> Those old k6-2 and k7 need some love
>>
>>  libavcodec/x86/dsputil.asm    | 47 +++++++++++++++++++++++++++++++++----------
>>  libavcodec/x86/dsputil_init.c | 11 ++++++++++
>>  libavcodec/x86/dsputil_x86.h  |  2 ++
>>  3 files changed, 49 insertions(+), 11 deletions(-)
>>
>> diff --git a/libavcodec/x86/dsputil.asm b/libavcodec/x86/dsputil.asm
>> index 4804682..36c9258 100644
>> --- a/libavcodec/x86/dsputil.asm
>> +++ b/libavcodec/x86/dsputil.asm
>> @@ -630,19 +630,35 @@ PUT_SIGNED_PIXELS_CLAMPED 3
>>  ;void ff_vector_clipf(float *dst, const float *src,
>>  ;                     float min, float max, int len)
>>  ;-----------------------------------------------------
>> -INIT_XMM sse
>> +%macro CLIPF_3DNOW 3
>> +    pfmin   %1, %3
>> +    pfmax   %1, %2
>> +%endmacro
>> +
>> +%macro CLIPF_SSE 3
>> +    minps   %1, %3
>> +    maxps   %1, %2
>> +%endmacro
>> +
> 
> Maybe do twice at a time so you can do some pairing (which might help with
> performance)?

I didn't bench, but wouldn't out-of-order execution deal with this anyway?
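
For anyone following along, here is a rough C sketch of what "twice at a
time" would mean. It is not the code from the patch: the names clipf,
vector_clipf_ref and vector_clipf_x2 are made up for illustration, and the
odd-length tail handling is an assumption (the asm version may well require
a padded len). The clamp keeps the same min-then-max order as the macros
above. Whether the interleaved form is actually faster than relying on the
CPU's out-of-order execution would have to be benchmarked.

```c
/* Clamp one value; same order as the macros: min against max, then max against min. */
static float clipf(float x, float lo, float hi)
{
    x = x < hi ? x : hi;   /* corresponds to minps/pfmin with the upper bound */
    x = x > lo ? x : lo;   /* corresponds to maxps/pfmax with the lower bound */
    return x;
}

/* Hypothetical reference: one element per iteration. */
void vector_clipf_ref(float *dst, const float *src,
                      float min, float max, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = clipf(src[i], min, max);
}

/* Hypothetical "two at a time" variant: two independent dependency chains
 * are interleaved in each iteration so they can be issued side by side. */
void vector_clipf_x2(float *dst, const float *src,
                     float min, float max, int len)
{
    int i;

    for (i = 0; i + 1 < len; i += 2) {
        float a = src[i];
        float b = src[i + 1];

        a = a < max ? a : max;   /* min step, first element  */
        b = b < max ? b : max;   /* min step, second element */
        a = a > min ? a : min;   /* max step, first element  */
        b = b > min ? b : min;   /* max step, second element */

        dst[i]     = a;
        dst[i + 1] = b;
    }

    /* Leftover element when len is odd (assumed; not taken from the patch). */
    if (i < len)
        dst[i] = clipf(src[i], min, max);
}
```

Both functions should give identical output for the same input; only the
instruction ordering differs.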

