[Mplayer-users] Question: K6-2 and fastmemcpy

Christoph Lampert lampert at math.chalmers.se
Thu Apr 19 17:40:37 CEST 2001

>On Thu, 19 Apr 2001 16:05:46 +0200 (MET DST), Christoph Lampert wrote:
>>>>On my Duron MMX2 optimized version has v2-v1=318426
>>>>but version with standard memcpy has v2-v1=517387
>>>on my K6-2 450MHz, running linux 2.2.16:
>>>v1 = 30557687593183 v2 = 30557688368732 v2-v1=775549
>>Is that clock-cycles?  Cool, on my K6 200 MHz (so MMX, but no MMX2)
>>the regular memcpy-version takes only 330000 in the mean...
>>Too bad I can't overclock it to 1GHz. 
>It indicates how many CPU cycles were spent per tested block. CPU frequency
>doesn't matter in this case, and I intentionally didn't say my CPU frequency.

Yeah, I know that, but it tells me that from K6 to Duron, AMD did not increase
efficiency in terms of, say, transferred bytes per cycle. In fact, the ratio got
worse, but with optimized code (prefetch, that is) you can get it back to about
the same level.
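
To put numbers on that, here is a small sketch in Python. The block size is
not stated in this thread, so only ratios between cycle counts mean anything,
and I am assuming all machines ran the same test block. The constant and
function names are made up for illustration.

```python
# Cycle counts (v2 - v1) quoted in this thread.
# Assumption: every machine copied the same test block.
DURON_MMX2 = 318426    # Duron, MMX2-optimized copy
DURON_MEMCPY = 517387  # Duron, standard memcpy
K6_MEMCPY = 330000     # K6 200 MHz, standard memcpy (rough mean)


def relative_cost(cycles, baseline):
    """Return cycles / baseline; above 1 means more cycles than the baseline."""
    return cycles / baseline


print(f"Duron memcpy vs. K6 memcpy: {relative_cost(DURON_MEMCPY, K6_MEMCPY):.2f}x")
print(f"Duron MMX2   vs. K6 memcpy: {relative_cost(DURON_MMX2, K6_MEMCPY):.2f}x")
```

Under that assumption the standard memcpy needs about 1.57 times as many
cycles on the Duron as on the K6, while the MMX2 version lands at about 0.96.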

Which is still great, since it is not CPU frequency but memory transfer speed
that is the bottleneck. So I should have written _with my "old" 66 MHz
memory bus_ somewhere.
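
A rough way to see this is to turn the cycle counts into wall-clock time,
which is only possible for the two machines whose clock speed was given in
this thread. This is a sketch, not a measurement: the helper name is
hypothetical and the clock speeds are taken at face value.

```python
def cycles_to_ms(cycles, cpu_mhz):
    """Convert a cycle count to milliseconds at the given clock speed in MHz."""
    return cycles / (cpu_mhz * 1e6) * 1e3


# Figures quoted in this thread (standard memcpy on both machines).
k6_ms = cycles_to_ms(330000, 200)    # K6 at 200 MHz
k6_2_ms = cycles_to_ms(775549, 450)  # K6-2 at 450 MHz

print(f"K6   200 MHz: {k6_ms:.2f} ms")
print(f"K6-2 450 MHz: {k6_2_ms:.2f} ms")
```

Both come out at roughly 1.7 ms per block even though the clock speeds differ
by more than a factor of two, which is what one would expect if the copy is
waiting on memory rather than on the CPU.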

>Example: 2 NOP insns will execute on K6 in 1 CPU cycle at any CPU frequency.
>The numbers are interesting only for relative measurement (MMX and non-MMX
>optimized versions).
Yes, but we are not talking NOPs, are we? We are talking memcpy, which should
consist of as few "nops" as possible :-) 

Of course the code is just for relative measurement... and my results are 
not interesting to anybody else, because the K6 has no alternative to standard
memcpy. But still... counting cycles is fun!


Dipl. math. Christoph Lampert (complex analysis, integral formulae)
Email: gruel at gmx.de                |     Email: lampert at math.chalmers.se

Mplayer-users mailing list
Mplayer-users at lists.sourceforge.net
