[MPlayer-dev-eng] [PATCH] (new version) AltiVec: dct64 for mp3lib, IMDCT for liba52, detection code

Romain Dolbeau dolbeau at irisa.fr
Sat Jan 18 18:07:38 CET 2003


Daniel Egger wrote:

> This is strange because a Radeon supports both of them.

The Radeon surely does. Apple's driver, and in particular
the YUV overlay part, apparently does not :-(

> This is completely upside down; I wouldn't remotely rely on a profile
> dump where decoding the whole movie takes less time than copying chunks
> of memory.

Yet most of the functions I expected do show up in the
profile, so they are being profiled.

Also, the machine where the profile was generated is
an 800 MHz PPC7450 w/o L3 cache, so as soon as you're
out of the 128KB L2, memory accesses are very expensive.
(It's regular PC133 memory, not even DDR, on this box...)

If the heavy computations run in-cache and the memory
copies go off-cache, then the copies will eat up
plenty of time. If you do (copy, compute_on_copy),
the copy pulls the data into the cache, so most of the
memory latency shows up in the copy instead of in the
computations.
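
A rough C sketch of the copy-then-compute pattern described above.
This is not MPlayer code: the names copy_then_compute, compute_on_copy,
phase_times and CHUNK are made up for illustration, the 64KB chunk size
is an assumption, and the byte sum only stands in for the real decoding
work. It accumulates the CPU time of each phase separately, so one can
see which phase the memory latency gets charged to.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

#define CHUNK 65536  /* assumed working-set size, meant to fit in L2 */

struct phase_times {
    double copy_s;      /* CPU seconds spent copying chunks */
    double compute_s;   /* CPU seconds spent computing on the copies */
    unsigned long sum;  /* result of the stand-in computation */
};

/* Stand-in for the heavy computation: sum the bytes of one chunk. */
static unsigned long compute_on_copy(const unsigned char *buf, size_t n)
{
    unsigned long s = 0;
    size_t i;

    for (i = 0; i < n; i++)
        s += buf[i];
    return s;
}

/* Walk a large source buffer chunk by chunk: copy, then compute on the copy. */
static struct phase_times copy_then_compute(const unsigned char *src, size_t n)
{
    static unsigned char work[CHUNK];
    struct phase_times t = { 0.0, 0.0, 0 };
    size_t off;

    for (off = 0; off < n; off += CHUNK) {
        size_t len = (n - off < CHUNK) ? (n - off) : CHUNK;
        clock_t t0, t1, t2;

        t0 = clock();
        memcpy(work, src + off, len);         /* misses on src are paid here */
        t1 = clock();
        t.sum += compute_on_copy(work, len);  /* work should now be cached */
        t2 = clock();

        t.copy_s    += (double)(t1 - t0) / CLOCKS_PER_SEC;
        t.compute_s += (double)(t2 - t1) / CLOCKS_PER_SEC;
    }
    return t;
}
```

Calling copy_then_compute() on a source buffer much larger than the
cache and comparing copy_s with compute_s should give a first idea.
clock() is coarse on some systems, and an optimizing compiler or a
hardware prefetcher can change the picture, so the numbers are only
indicative.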

I don't have a G4 w/ L3 to verify this theory :-(

> You need to compile the whole application with profiling AND link it
> against a proper libc.

I added -pg to config.mak (after removing all the .a and .o
files, of course); isn't that enough? I'm not sure what a
proper libc for profiling is on Mac OS X...

-- 
Romain Dolbeau


