[FFmpeg-devel] [PATCH 3/3] x86/vp9lpf: use fewer instructions in SPLATB_MIX

James Almer jamrial at gmail.com
Mon Aug 4 18:17:28 CEST 2014


On 04/08/14 10:27 AM, Ronald S. Bultje wrote:
> Hi,
> 
> 
> On Sun, Aug 3, 2014 at 10:53 PM, James Almer <jamrial at gmail.com> wrote:
> 
>> Signed-off-by: James Almer <jamrial at gmail.com>
>> ---
>>  libavcodec/x86/vp9lpf.asm | 5 ++---
>>  1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/libavcodec/x86/vp9lpf.asm b/libavcodec/x86/vp9lpf.asm
>> index c5db0ca..def7d5a 100644
>> --- a/libavcodec/x86/vp9lpf.asm
>> +++ b/libavcodec/x86/vp9lpf.asm
>> @@ -302,9 +302,8 @@ SECTION .text
>>      pshufb     %1, %2
>>  %else
>>      punpcklbw  %1, %1
>> -    punpcklqdq %1, %1
>> -    pshuflw    %1, %1, 0
>> -    pshufhw    %1, %1, 0x55
>> +    punpcklwd  %1, %1
>> +    punpckldq  %1, %1
> 
> 
> Doesn't this miss the upper half of the register?
> 
> Ronald

No, the upper half is filled as well. Using the example above the macro:

..............AB (start value)
punpcklbw
............AABB
punpcklwd
........AAAABBBB
punpckldq
AAAAAAAABBBBBBBB
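
For anyone who wants to check this without assembling anything, here is a rough byte-level model of both sequences. It is only a sketch: the helper names (punpckl, pshufw_half, splat_new, splat_old) are made up for illustration and are not part of FFmpeg, and the register is modelled as a 16-element list with index 0 as the least significant byte. It suggests the three-unpack version fills all 16 bytes and matches the old punpcklqdq/pshuflw/pshufhw sequence.

```python
# Byte-level model of a 16-byte XMM register; index 0 is the least
# significant byte. All helper names are illustrative, not FFmpeg code.

def punpckl(dst, src, width):
    """Interleave width-byte elements from the low 8 bytes of dst and src."""
    out = []
    for i in range(0, 8, width):
        out += dst[i:i + width] + src[i:i + width]
    return out

def pshufw_half(reg, imm, high):
    """Model pshuflw (high=False) or pshufhw (high=True) with immediate imm."""
    base = 8 if high else 0
    words = [reg[base + 2 * i: base + 2 * i + 2] for i in range(4)]
    half = []
    for i in range(4):
        half += words[(imm >> (2 * i)) & 3]  # 2-bit selector per output word
    return reg[:8] + half if high else half + reg[8:]

def show(reg):
    """Render most significant byte first, as in the diagram above."""
    return "".join(reversed(reg))

def splat_new(reg):
    reg = punpckl(reg, reg, 1)  # punpcklbw
    reg = punpckl(reg, reg, 2)  # punpcklwd
    reg = punpckl(reg, reg, 4)  # punpckldq
    return reg

def splat_old(reg):
    reg = punpckl(reg, reg, 1)           # punpcklbw
    reg = punpckl(reg, reg, 8)           # punpcklqdq
    reg = pshufw_half(reg, 0x00, False)  # pshuflw 0
    reg = pshufw_half(reg, 0x55, True)   # pshufhw 0x55
    return reg

start = ["B", "A"] + ["."] * 14  # "..............AB"
print(show(splat_new(start)))    # AAAAAAAABBBBBBBB
print(show(splat_old(start)))    # AAAAAAAABBBBBBBB
```

Each unpack doubles the run length of A and B (1, 2, 4, then 8 bytes), so after the third one no byte of the register is left untouched.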

