[FFmpeg-devel] [PATCH] x86/dsputil: port clear_block functions to yasm

James Almer jamrial at gmail.com
Wed May 21 18:42:42 CEST 2014


On 21/05/14 4:43 AM, Christophe Gisquet wrote:
> Hi,
> 
> 2014-05-21 8:53 GMT+02:00 James Almer <jamrial at gmail.com>:
>> +INIT_XMM sse
>> +%define ZERO xorps
>> +CLEAR_BLOCK 1, 1
> [...]
>> +INIT_XMM sse
>> +%define ZERO xorps
>> +CLEAR_BLOCKS 1
> 
> Maybe it crossed your mind and then you crossed it out for lack of
> benefit, but a sse2 and even maybe an avx version might make sense?

Tried an AVX version, but it crashed on me, so it seems the blocks are only
16-byte aligned, which is not enough for aligned 32-byte ymm stores.
Didn't look too much into it, though.
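
To illustrate the alignment point, here is a minimal C sketch (not part of
the patch). It assumes the AVX attempt used aligned stores; the helper names
is_aligned and widest_aligned_store are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if ptr is aligned to `align` bytes; align must be a power of two. */
static int is_aligned(const void *ptr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)ptr & (align - 1)) == 0;
}

/* Hypothetical helper: widest aligned store (in bytes) that is safe for block.
 * 32 would allow aligned ymm stores, 16 only aligned xmm stores, 0 neither. */
static int widest_aligned_store(const void *block)
{
    if (is_aligned(block, 32))
        return 32;
    if (is_aligned(block, 16))
        return 16;
    return 0;
}
```

A block that yields 16 here is fine for movaps with xmm registers, but an
aligned 32-byte store to it would fault.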

I'm also not sure an SSE2 version is worth it. The function is not a critical
one (it's mostly used by vc1), and swapping xorps -> pxor and movaps -> movdqa
will probably not make much of a difference.
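
For comparison, a rough C intrinsics sketch of the two flavours (not the yasm
code from the patch). It assumes one block is 64 int16_t coefficients (128
bytes) and 16-byte aligned; the function names are made up, and the compiler
is free to pick different instructions than the ones named in the comments.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 header; also pulls in the SSE header xmmintrin.h */
#endif

#define BLOCK_COEFFS 64 /* assumption: 64 int16_t = 128 bytes per block */

/* SSE flavour: float-domain zero plus aligned stores (think xorps + movaps). */
static void clear_block_sse_sketch(int16_t *block)
{
    assert(((uintptr_t)block & 15) == 0);
#if defined(__SSE2__)
    const __m128 zero = _mm_setzero_ps();
    float *p = (float *)block;
    for (int i = 0; i < 8; i++)      /* 8 stores x 16 bytes = 128 bytes */
        _mm_store_ps(p + 4 * i, zero);
#else
    memset(block, 0, BLOCK_COEFFS * sizeof(*block)); /* portable fallback */
#endif
}

/* SSE2 flavour: integer-domain zero plus aligned stores (think pxor + movdqa). */
static void clear_block_sse2_sketch(int16_t *block)
{
    assert(((uintptr_t)block & 15) == 0);
#if defined(__SSE2__)
    const __m128i zero = _mm_setzero_si128();
    __m128i *p = (__m128i *)block;
    for (int i = 0; i < 8; i++)      /* same 8 x 16-byte stores */
        _mm_store_si128(p + i, zero);
#else
    memset(block, 0, BLOCK_COEFFS * sizeof(*block)); /* portable fallback */
#endif
}
```

Both variants issue the same number of 16-byte stores, so any gain from the
integer forms would have to come from elsewhere.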

> 
>> +#if HAVE_YASM
>> +#if HAVE_SSE_EXTERNAL
> 
> From the discussion on HAVE_MMX_EXTERNAL, I would expect
> HAVE_SSE_EXTERNAL implies HAVE_YASM.
> Probably needs a confirmation from someone knowing what he's talking
> about (i.e. not me).

Yes it does, but the HAVE_YASM check there covers other stuff below the
initializations I added, so I left it in place.
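
Roughly, the guard layout looks like the C sketch below. The macro values are
stand-ins for what configure would generate, and init_sketch with its bit
flags is purely illustrative, not the actual init function.

```c
#include <assert.h>

/* Stand-ins for configure-generated macros (assumed values). */
#ifndef HAVE_YASM
#define HAVE_YASM 1
#endif
#ifndef HAVE_SSE_EXTERNAL
#define HAVE_SSE_EXTERNAL HAVE_YASM
#endif

/* Hypothetical init: bit 0 = SSE clear_block pointers were set,
 * bit 1 = the other yasm-only initializations further down were done. */
static int init_sketch(void)
{
    int installed = 0;

    /* HAVE_SSE_EXTERNAL is expected to imply HAVE_YASM. */
    assert(!HAVE_SSE_EXTERNAL || HAVE_YASM);

#if HAVE_YASM
#if HAVE_SSE_EXTERNAL
    installed |= 1; /* new SSE initializations */
#endif
    installed |= 2; /* pre-existing code that only needs HAVE_YASM */
#endif
    return installed;
}
```

The inner check alone would be enough for the new lines, but the outer one
still has to stay for the code that only depends on HAVE_YASM.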


> Otherwise OK, this is a straightforward conversion.
> 
> Best regards,
> 


