[FFmpeg-devel] libavutil simd
Luca Barbato
lu_zero
Tue Oct 2 21:22:35 CEST 2007
Michael Niedermayer wrote:
> you are comparing naive C code against optimized altivec
I am comparing barely optimized scalar code with barely optimized SIMD code.
You can do a lot of strange tricks with certain AltiVec operations
(vec_perm is one of them).
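
For readers who have not met vec_perm, here is a rough scalar model of what it computes on 16-byte vectors. This is only a sketch: `perm_model` is a hypothetical helper written in portable C for illustration, not AltiVec code and not anything from FFmpeg, and it assumes the element numbering used in the AltiVec documentation (byte 0 first).

```c
#include <stdint.h>

/* Scalar model of a vec_perm-style byte permute (hypothetical helper).
 * Each output byte is picked from the 32-byte concatenation of a and b,
 * using the low 5 bits of the matching selector byte.
 * dst must not overlap a, b or sel. */
static void perm_model(uint8_t dst[16], const uint8_t a[16],
                       const uint8_t b[16], const uint8_t sel[16])
{
    for (int i = 0; i < 16; i++) {
        unsigned idx = sel[i] & 0x1f;            /* upper 3 bits are ignored */
        dst[i] = idx < 16 ? a[idx] : b[idx - 16];
    }
}
```

With a suitable selector the same operation can reverse, interleave, or realign bytes, which is why it enables so many tricks; the real instruction does all 16 selections at once.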
> you can easily work with 4 bytes at a time, try something like
> (totally untested and I am certain it does contain some bugs; it's just to
> demonstrate how it can be done, and yes, it can be optimized further)
 4 bytes ->  32 bits -> one register
 8 bytes ->  64 bits -> one register (on certain arches)
16 bytes -> 128 bits -> one register (on certain arches)
ppc/ppc64 offer you 32 registers each, and Cell adds 8x128 128-bit registers
to the mix...
I can work with at least 16 bytes at a time, and I have plenty of registers
to do my best not to starve the pipeline ^^;
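
To make the width argument concrete, here is a minimal sketch using byte-wise addition as a stand-in workload (an assumption; it is not the routine discussed in this thread). `add_bytes_u32` and `add_bytes_16` are hypothetical names. The first packs 4 bytes into one 32-bit scalar register and needs masking to keep carries from crossing byte boundaries; the second is the 16-byte form that a 128-bit vector unit can handle as a single modular add (vec_add on AltiVec), written here as a plain loop so it stays portable.

```c
#include <stdint.h>

/* 4 bytes per step in one 32-bit register: add the low 7 bits of every
 * byte, then patch in the top bit with XOR so no carry leaks into the
 * neighbouring byte. Each byte wraps modulo 256. */
static uint32_t add_bytes_u32(uint32_t a, uint32_t b)
{
    uint32_t low = (a & 0x7f7f7f7fu) + (b & 0x7f7f7f7fu);
    return low ^ ((a ^ b) & 0x80808080u);
}

/* 16 bytes per step: the same wrapping add over a full 128-bit vector's
 * worth of data. A SIMD unit does this without any masking tricks. */
static void add_bytes_16(uint8_t dst[16], const uint8_t a[16],
                         const uint8_t b[16])
{
    for (int i = 0; i < 16; i++)
        dst[i] = (uint8_t)(a[i] + b[i]);
}
```

The scalar version spends extra operations just emulating byte lanes and still covers only a quarter of the data per step.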
Maybe SIMD could be useful, couldn't it?
lu
--
Luca Barbato
Gentoo/linux Gentoo/PPC
http://dev.gentoo.org/~lu_zero