[FFmpeg-devel] [PATCH 3/3] x86/vp9lpf: use fewer instructions in SPLATB_MIX

Ronald S. Bultje rsbultje at gmail.com
Mon Aug 4 18:20:37 CEST 2014


Hi,

On Mon, Aug 4, 2014 at 12:17 PM, James Almer <jamrial at gmail.com> wrote:

> On 04/08/14 10:27 AM, Ronald S. Bultje wrote:
> > Hi,
> >
> >
> > On Sun, Aug 3, 2014 at 10:53 PM, James Almer <jamrial at gmail.com> wrote:
> >
> >> Signed-off-by: James Almer <jamrial at gmail.com>
> >> ---
> >>  libavcodec/x86/vp9lpf.asm | 5 ++---
> >>  1 file changed, 2 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/libavcodec/x86/vp9lpf.asm b/libavcodec/x86/vp9lpf.asm
> >> index c5db0ca..def7d5a 100644
> >> --- a/libavcodec/x86/vp9lpf.asm
> >> +++ b/libavcodec/x86/vp9lpf.asm
> >> @@ -302,9 +302,8 @@ SECTION .text
> >>      pshufb     %1, %2
> >>  %else
> >>      punpcklbw  %1, %1
> >> -    punpcklqdq %1, %1
> >> -    pshuflw    %1, %1, 0
> >> -    pshufhw    %1, %1, 0x55
> >> +    punpcklwd  %1, %1
> >> +    punpckldq  %1, %1
> >
> >
> > Doesn't this miss the upper half of the register?
> >
> > Ronald
>
> Using the example above the macro:
>
> ..............AB (start value)
> punpcklbw
> ............AABB
> punpcklwd
> ........AAAABBBB
> punpckldq
> AAAAAAAABBBBBBBB


Oh, I see, it's not a plain byte-splat. My bad, sorry, please ignore my comment.
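
For anyone who wants to check the trace above without an assembler, here is a
minimal scalar sketch in C that models both the old and the new non-SSSE3
sequence. It is not code from the patch: the type xmm_t and the helpers
unpack_low_self(), broadcast_word_in_half(), splatb_mix_old() and
splatb_mix_new() are hypothetical names made up for this illustration, and the
model assumes v[0] is the lowest byte of the register (the rightmost character
in the diagram).

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of a 128-bit XMM register; v[0] is the lowest byte.
 * (Hypothetical type, for illustration only.) */
typedef struct { uint8_t v[16]; } xmm_t;

/* Models punpckl{bw,wd,dq,qdq} with the same register as both operands:
 * each `size`-byte element of the low 8 bytes is written twice in a row. */
static xmm_t unpack_low_self(xmm_t a, int size)
{
    xmm_t r;
    for (int i = 0; i < 8 / size; i++) {
        memcpy(&r.v[2 * i * size],        &a.v[i * size], size);
        memcpy(&r.v[2 * i * size + size], &a.v[i * size], size);
    }
    return r;
}

/* Models pshuflw (half = 0) or pshufhw (half = 1) with an immediate that
 * picks the same word for all four slots: 0 -> word 0, 0x55 -> word 1. */
static xmm_t broadcast_word_in_half(xmm_t a, int half, int word)
{
    xmm_t r = a;
    for (int i = 0; i < 4; i++)
        memcpy(&r.v[8 * half + 2 * i], &a.v[8 * half + 2 * word], 2);
    return r;
}

/* Old sequence: four instructions. */
static xmm_t splatb_mix_old(xmm_t x)
{
    x = unpack_low_self(x, 1);            /* punpcklbw  %1, %1       */
    x = unpack_low_self(x, 8);            /* punpcklqdq %1, %1       */
    x = broadcast_word_in_half(x, 0, 0);  /* pshuflw    %1, %1, 0    */
    x = broadcast_word_in_half(x, 1, 1);  /* pshufhw    %1, %1, 0x55 */
    return x;
}

/* New sequence from the patch: three instructions. */
static xmm_t splatb_mix_new(xmm_t x)
{
    x = unpack_low_self(x, 1);            /* punpcklbw  %1, %1 */
    x = unpack_low_self(x, 2);            /* punpcklwd  %1, %1 */
    x = unpack_low_self(x, 4);            /* punpckldq  %1, %1 */
    return x;
}
```

In this model both functions should fill the low 8 bytes with the input's
byte 0 and the high 8 bytes with its byte 1, whatever the other 14 input bytes
hold, so the upper half of the register is covered by the shorter sequence too.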

Ronald

