[Ffmpeg-devel] PATCH Blackfin optimized byte swapping mechanism
Michael Niedermayer
Mon Apr 23 15:04:57 CEST 2007
Hi
On Tue, Apr 17, 2007 at 08:49:40AM -0400, Marc Hoffman wrote:
> Michael Niedermayer writes:
> > Hi
> >
> > On Tue, Apr 17, 2007 at 07:40:47AM -0400, Marc Hoffman wrote:
> > >
> > > > Low level bswap primitive for the Blackfin Architecture.
> > >
> > > Sorry, the last patch was mangled by the wrong encoding.
> >
> > what advantage do these functions have over the default?
> > are they faster? if so you should provide some benchmarks
>
> Sorry about the top post, please forgive me.
>
> The current 32-bit byte swap routine compiles to this code sequence:
>
>
> R1 = 255 (X);
> R1 <<= 16;
> R1 = R0 & R1;
> R2 = R0 >> 24;
> R1 >>= 8;
> R2 = R2 | R1;
> R1 = 65280 (Z);
> R1 = R0 & R1;
> R1 <<= 8;
> R0 <<= 24;
> R1 = R1 | R0;
> R2 = R2 | R1;
>
> R0 = R2; <<--- result
>
> The suggested replacement is:
>
> asm("%1 = %0 >> 8 (V);\n\t"
> "%0 = %0 << 8 (V);\n\t"
> "%0 = %0 | %1;\n\t"
> "%0 = PACK(%0.L, %0.H);\n\t"
>
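For readers without a Blackfin at hand, the four instructions quoted above can be modeled in portable C. This is only a sketch: it assumes that the (V) shifts are logical shifts applied to each 16-bit half independently and that PACK places its first operand in the upper half. The function names are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Generic shift-and-mask swap, i.e. what the compiler output above computes. */
static uint32_t bswap32_generic(uint32_t x)
{
    return (x >> 24) | ((x & 0x00FF0000u) >> 8) |
           ((x & 0x0000FF00u) << 8) | (x << 24);
}

/* Portable model of the four Blackfin instructions (hypothetical helper).
 * Assumption: "(V)" shifts act on the two 16-bit halves separately. */
static uint32_t bswap32_model(uint32_t x)
{
    uint32_t hi = x >> 16;
    uint32_t lo = x & 0xFFFFu;

    /* %1 = %0 >> 8 (V): each half shifted right by 8 */
    uint32_t t = ((hi >> 8) << 16) | (lo >> 8);

    /* %0 = %0 << 8 (V): each half shifted left by 8, kept to 16 bits */
    uint32_t s = (((hi << 8) & 0xFFFFu) << 16) | ((lo << 8) & 0xFFFFu);

    /* %0 = %0 | %1: the two bytes inside each half are now swapped */
    s |= t;

    /* %0 = PACK(%0.L, %0.H): exchange the two halves */
    return (s << 16) | (s >> 16);
}

/* Check the model against the generic version on a few sample values. */
static void check_model(void)
{
    static const uint32_t samples[] = {
        0x00000000u, 0xFFFFFFFFu, 0xAABBCCDDu, 0x12345678u, 0x80000001u
    };
    for (unsigned i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
        assert(bswap32_model(samples[i]) == bswap32_generic(samples[i]));
}
```

Calling check_model() on a host machine confirms only that the sequence is a byte swap under the stated assumptions. It says nothing about speed on the target.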
> So I guess this is about a 300% improvement in performance for this function.
A guess is good, but a hard benchmark is better. It's just 5 minutes of work to
write a loop of bswap calls and run "time myprog".
Also, don't forget to set the proper -mcpu / -march and -O3 with gcc.
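A minimal sketch of such a loop, assuming the generic C swap as the routine under test. The names bswap32_ref and run_bswap_loop are hypothetical. Each result is fed into the next input and folded into a checksum so that the compiler cannot hoist or drop the work at -O3.

```c
#include <stdint.h>

/* Routine under test: the generic shift-and-mask swap. Substitute the
 * platform-specific version here to compare the two builds. */
static inline uint32_t bswap32_ref(uint32_t x)
{
    return (x >> 24) | ((x & 0x00FF0000u) >> 8) |
           ((x & 0x0000FF00u) << 8) | (x << 24);
}

/* Swap `iterations` values in a dependent chain and return a checksum.
 * The caller should print or otherwise use the return value so the loop
 * stays live after optimization. */
uint32_t run_bswap_loop(uint32_t iterations)
{
    uint32_t x = 0x12345678u;
    uint32_t acc = 0;

    for (uint32_t i = 0; i < iterations; i++) {
        x = bswap32_ref(x + i);   /* next input depends on previous output */
        acc ^= x;                 /* fold into the checksum */
    }
    return acc;
}
```

To use it, call run_bswap_loop from main with a large iteration count, print the checksum, build once per bswap variant with the same flags, and run each binary under "time". The add and xor in the loop are part of the measured cost, so the difference between the two builds is more meaningful than either absolute time.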
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
No snowflake in an avalanche ever feels responsible. -- Voltaire