[Ffmpeg-devel] [PATCH] put_mpeg4_qpel16_h_lowpass altivec, take 2

Brian Foley bfoley
Mon Nov 27 02:26:56 CET 2006


On Sun, Nov 26, 2006 at 09:15:26PM +0100, Guillaume POIRIER wrote:
> Hi,
> 
> >I suspect Shark and simg4 are probably as good or better for
> >profiling anyway.
> 
> Well, one thing that no Apple CHUD tools seem to be able to do is
> tracking down access to un-initialized memory, which valgrind does to
> the best of my knowledge.

Yes, I think Apple's tools are the way to go. I'm just not terribly
familiar with using them for optimisation at the single-instruction
level like this.

Valgrind's great for certain kinds of profiling, but it's not much use
here. The sorts of things we care about here are pipeline stalls,
functional unit utilisation, and of course cache behaviour. Thanks to
its translate-everything-into-uops approach, Valgrind can only really
give meaningful information about the cache behaviour.

Besides, my main development machine is running MacOS, which doesn't
have a fully working Valgrind port anyway. The only access I have to
Linux/PPC is through emulation using qemu, and that's slow and not
cycle-accurate.

Cheers,
Brian.



