[FFmpeg-devel] [PATCH 02/10] diracdsp: add dequantization SIMD

Rostislav Pehlivanov atomnuker at gmail.com
Wed Jun 29 19:53:09 CEST 2016


On 27 June 2016 at 22:38, James Almer <jamrial at gmail.com> wrote:

> On 6/27/2016 8:53 AM, Rostislav Pehlivanov wrote:
> > I've attached another patch which should work fine now.
> > I did this after the put_signed_rect so it does require the first patch,
> > but if this patch is okay I'll amend and tidy things before I push.
> > For some reason changing dstq to be stored at r4 or r3 broke it and I've no
> > idea why. Neither is used after loading m2 and m3. Should work on x86_32
> > now, but I'm wondering why I can't save that register.
>
> [...]
>
> > diff --git a/libavcodec/x86/diracdsp.asm b/libavcodec/x86/diracdsp.asm
> > index c5cc530..4bc8b2d 100644
> > --- a/libavcodec/x86/diracdsp.asm
> > +++ b/libavcodec/x86/diracdsp.asm
> > @@ -266,9 +266,45 @@ HPEL_FILTER sse2
> >  ADD_OBMC 32, sse2
> >  ADD_OBMC 16, sse2
> >
> > -%if ARCH_X86_64 == 1
> >  INIT_XMM sse4
> >
> > +; void dequant_subband_32(uint8_t *src, uint8_t *dst, ptrdiff_t stride, const int qf, const int qs, int tot_v, int tot_h)
> > +cglobal dequant_subband_32, 7, 8, 4, src, dst, stride, qf, qs, tot_v, tot_h
>
> x86_32 has 8 gprs but you can only use 7 as the last one is reserved
> to keep the stack pointer.
>
> > +
> > +    movd   m2, qfd
> > +    movd   m3, qsd
> > +    SPLATD m2
> > +    SPLATD m3
> > +    mov    r4, tot_hq
> > +    mov    r7, dstq
> > +
> > +    .loop_v:
> > +    mov    tot_hq, r4
> > +    mov    dstq,   r7
> > +
> > +    .loop_h:
> > +    movu   m0, [srcq]
> > +
> > +    pabsd  m1, m0
> > +    pmulld m1, m2
> > +    paddd  m1, m3
> > +    psrld  m1,  2
> > +    psignd m1, m0
> > +
> > +    movu   [dstq], m1
> > +
> > +    add    srcq, mmsize
> > +    add    dstq, mmsize
> > +    sub    tot_hd, 4
> > +    jg     .loop_h
> > +
> > +    add    r7, strideq
> > +    dec    tot_vd
> > +    jg     .loop_v
> > +
> > +    RET
>
> I'm not sure why you say using r3 instead of r7 here didn't work for
> you. I just tried it (after applying all patches up to 6/10) and fate
> at least still passes, on both x86_64 and x86_32.
>

Odd, works fine now. I guess it just needed a clean build.
Attached a working patch.

I'd like to get some feedback on the other patches before I push though,
particularly the Golomb reader.
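
For anyone reviewing the quoted hunk, the per-coefficient math of the SIMD
loop can be written out in scalar C roughly as below. This is only a sketch
inferred from the instructions in the quoted diff, not code from the attached
patch; the name dequant_subband_32_ref is hypothetical. It assumes 32-bit
coefficients, a stride in bytes, and tot_h counted in coefficients. Unlike
the asm, it processes exactly tot_h coefficients per row instead of rounding
up to a multiple of 4.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar reference for the quoted SIMD loop; not part of the patch. */
static void dequant_subband_32_ref(const uint8_t *src, uint8_t *dst,
                                   ptrdiff_t stride, int qf, int qs,
                                   int tot_v, int tot_h)
{
    for (int y = 0; y < tot_v; y++) {
        /* dst restarts at the beginning of each row (r7 in the asm) */
        uint8_t *row = dst + y * stride;

        for (int x = 0; x < tot_h; x++) {
            int32_t c, out;
            uint32_t mag, val;

            memcpy(&c, src, sizeof(c));                    /* movu   m0, [srcq] */
            mag = c < 0 ? 0u - (uint32_t)c : (uint32_t)c;  /* pabsd  m1, m0     */
            val = mag * (uint32_t)qf + (uint32_t)qs;       /* pmulld, paddd     */
            val >>= 2;                                     /* psrld  m1, 2      */
            /* psignd m1, m0: negate for c < 0, zero for c == 0, keep otherwise */
            out = c < 0 ? -(int32_t)val : c > 0 ? (int32_t)val : 0;

            memcpy(row, &out, sizeof(out));                /* movu   [dstq], m1 */
            src += sizeof(c);    /* src is read linearly, with no stride */
            row += sizeof(out);
        }
    }
}
```

Note that a zero coefficient stays zero even though qs is added to its
magnitude, because the sign step zeroes the result.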
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-diracdsp-add-dequantization-SIMD.patch
Type: text/x-patch
Size: 6575 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20160629/2e1c42fd/attachment.bin>

