[FFmpeg-devel] [PATCH] Dsputilize some functions from APE decode 2/2 - SSE2
Kostya
kostya.shishkov
Fri Jul 11 06:46:21 CEST 2008
On Thu, Jul 10, 2008 at 10:36:07PM +0200, Michael Niedermayer wrote:
> On Thu, Jul 10, 2008 at 06:17:59PM +0300, Kostya wrote:
> > On Thu, Jul 10, 2008 at 01:20:18PM +0200, Michael Niedermayer wrote:
> > > On Thu, Jul 10, 2008 at 01:48:46PM +0300, Kostya wrote:
> > > > On Thu, Jul 10, 2008 at 11:48:14AM +0200, Michael Niedermayer wrote:
> > > > > On Thu, Jul 10, 2008 at 11:16:01AM +0300, Kostya wrote:
> > > > > [...]
> > > > > > +static void add_int16_sse2(int16_t * v1, int16_t * v2, int order)
> > > > > > +{
> > > > > > + x86_reg o = -(order << 1);
> > > > > > + v1 += order;
> > > > > > + v2 += order;
> > > > > > + asm volatile(
> > > > > > + "1: \n\t"
> > > > >
> > > > > > + "movdqu (%1,%2), %%xmm0 \n\t"
> > > > > > + "paddw (%0,%2), %%xmm0 \n\t"
> > > > > > + "movdqa %%xmm0, (%0,%2) \n\t"
> > > > > > + "add $16, %2 \n\t"
> > > > > > + "movdqu (%1,%2), %%xmm0 \n\t"
> > > > > > + "paddw (%0,%2), %%xmm0 \n\t"
> > > > > > + "movdqa %%xmm0, (%0,%2) \n\t"
> > > > > > + "add $16, %2 \n\t"
> > > > >
> > > > > is that faster than:
> > > > > "movdqu (%1,%2), %%xmm0 \n\t"
> > > > > "paddw (%0,%2), %%xmm0 \n\t"
> > > > > "movdqa %%xmm0, (%0,%2) \n\t"
> > > > > "movdqu 16(%1,%2), %%xmm0 \n\t"
> > > > > "paddw 16(%0,%2), %%xmm0 \n\t"
> > > > > "movdqa %%xmm0, 16(%0,%2) \n\t"
> > > > > "add $32, %2 \n\t"
> > > > >
> > > > > ?
> > > >
> > > > It was the first thing I tried. It was slower (on Core2).
> > >
> > > and:
> > >
> > > "movdqu (%1,%2), %%xmm0 \n\t"
> > > "movdqu 16(%1,%2), %%xmm1 \n\t"
> > > "paddw (%0,%2), %%xmm0 \n\t"
> > > "paddw 16(%0,%2), %%xmm1 \n\t"
> > > "movdqa %%xmm0, (%0,%2) \n\t"
> > > "movdqa %%xmm1, 16(%0,%2) \n\t"
> > > "add $32, %2 \n\t"
> >
> > It's on par. Patch attached for reference.
>
> patch looks ok
applied
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> The misfortune of the wise is better than the prosperity of the fool.
> -- Epicurus
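
For readers following the thread, the two-register variant that was measured as "on par" can be sketched with SSE2 intrinsics instead of inline asm. This is only an illustration, not the applied patch: the names add_int16_sketch and add_int16_selfcheck are hypothetical, and the sketch assumes that order is a multiple of 16, that v1 is 16-byte aligned (as the movdqa stores and paddw memory operands require), and that v2 may be unaligned (hence movdqu). A scalar fallback is included for builds without SSE2.

```c
#include <assert.h>
#include <stdint.h>
#ifdef __SSE2__
#include <emmintrin.h>
#endif

/* Hypothetical sketch: v1[i] += v2[i] with wrapping 16-bit arithmetic.
 * Assumptions (not taken from the patch): order is a multiple of 16,
 * v1 is 16-byte aligned, v2 may be unaligned. */
static void add_int16_sketch(int16_t *v1, const int16_t *v2, int order)
{
#ifdef __SSE2__
    for (int i = 0; i < order; i += 16) {
        /* two independent unaligned loads from v2 (movdqu) */
        __m128i a0 = _mm_loadu_si128((const __m128i *)(v2 + i));
        __m128i a1 = _mm_loadu_si128((const __m128i *)(v2 + i + 8));
        /* add the aligned v1 data (paddw with a memory operand) */
        a0 = _mm_add_epi16(a0, _mm_load_si128((const __m128i *)(v1 + i)));
        a1 = _mm_add_epi16(a1, _mm_load_si128((const __m128i *)(v1 + i + 8)));
        /* aligned stores back to v1 (movdqa) */
        _mm_store_si128((__m128i *)(v1 + i), a0);
        _mm_store_si128((__m128i *)(v1 + i + 8), a1);
    }
#else
    /* scalar fallback with the same wrapping behaviour */
    for (int i = 0; i < order; i++)
        v1[i] = (int16_t)(uint16_t)((uint16_t)v1[i] + (uint16_t)v2[i]);
#endif
}

/* Compares the sketch against a plain scalar loop; returns 1 on success. */
static int add_int16_selfcheck(void)
{
    _Alignas(16) int16_t dst[32];
    int16_t ref[32];
    int16_t src[33];

    for (int i = 0; i < 32; i++) {
        dst[i]     = (int16_t)(1000 * i);
        src[i + 1] = (int16_t)(i * 7 - 5);
        ref[i]     = (int16_t)(uint16_t)((uint16_t)dst[i] + (uint16_t)src[i + 1]);
    }
    /* src + 1 is deliberately offset from src to exercise the unaligned loads */
    add_int16_sketch(dst, src + 1, 32);
    for (int i = 0; i < 32; i++)
        assert(dst[i] == ref[i]);
    return 1;
}
```

Using two registers lets the two load/add/store chains proceed independently, which is the idea behind the last variant quoted above. Whether it actually beats the add-in-between version depends on the CPU, as the Core2 numbers in this thread show.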