[Ffmpeg-devel] [RFC] smallcpy for h264

Michael Niedermayer michaelni
Sat Oct 7 14:56:05 CEST 2006


Hi

On Sat, Oct 07, 2006 at 02:18:52PM +0200, Luca Barbato wrote:
> Michael Niedermayer wrote:
> > but before i will agree to this i want
> > 1. to know why we spend a significant time doing small memcpys
> 
> Loren do you have time to have a look on it? The on x86simd codepath has
> many of them...
> 
> > 2. why ppc doesnt inline memcpy like x86 does
> 
> inlined memcpy are triggered with -O3 iirc, so having them doesn't help
> speed at all (see the threads about avoiding -O3 to get better speed)

the -O2 vs. -O3 gains were about dsputil*, and these are already arch specific,
so no small_cpy is needed

furthermore, if -O2 is faster, it's likely either not changing memcpy inlining
behavior, or memcpy is so insignificant that it doesn't matter


> I'll dig glibc to see if we have inlined variants available.

this belongs to gcc, not libc; libc cannot detect a fixed-size memcpy
and inline a special optimized version
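
A minimal sketch of the distinction, assuming GCC with builtins enabled (the
default). The names copy8 and copy_n are hypothetical and not taken from the
FFmpeg source. In copy8 the size is a compile-time constant, so the compiler
is free to expand the call into a few loads and stores; in copy_n the size is
only known at run time, so the copy usually ends up as a call into libc.
Whether the expansion actually happens depends on the target and the
optimization flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the size is a compile-time constant, so the
 * compiler may replace the memcpy call with inline loads and stores. */
static inline void copy8(uint8_t *dst, const uint8_t *src)
{
    memcpy(dst, src, 8);
}

/* Hypothetical helper: the size is a run-time value, so the compiler
 * generally cannot pick a specialized fixed-size sequence and the copy
 * is typically left as a call to the library memcpy. */
static void copy_n(uint8_t *dst, const uint8_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);
}
```

Both helpers copy the same bytes; only the generated code may differ.
Comparing the assembly output (for example with gcc -S) at -O2 and -O3 for
the target in question is one way to see what the compiler really did.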


> 
> > 
> > furthermore these alignment related changes must be split, reviewed
> > and applied before any benchmarking makes sense (= your benchmark
> > of misaligned arrays with memcpy vs. your code with aligned arrays
> > might show more the speed difference of alignment and less that
> > of the actual code)
> 
> please check the attached code.

rejected, the only memcpys i have found which touch these are in the init code

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is



