[Mplayer-users] fastmemcpy benchmark

Nick Kurshev nickols_k at mail.ru
Sun Apr 22 16:10:53 CEST 2001


Hello, Arpi!

On Sun, 22 Apr 2001 04:34:22 +0200 (CEST), Arpi wrote:

>> My results on celeron-2 600 overclocked to 900, 256MB PC133 ram:
>> ./fastmem-k6:  Illegal instruction
>> ./fastmem-k7:  v2-v1=55224677
>> ./fastmem-mmx: v2-v1=58553622
>> ./fastmem-sse: v2-v1=54496382
>> 
>> The MMX version uses the standard glibc memcpy()
>> (so it isn't really MMX code, unless glibc has MMX asm).
>> 
>> I think these results will differ a lot with video memory!
>> I'll make some tests with G400's memory as destination.
>It's done. The test allocates a 1024*768*2-byte buffer in video RAM
>(64k aligned) and copies data from system memory into it
>100 times, measuring CPU clocks and microseconds (timeGetTime):
>
>./fastmem.sh: line 4:  6172 Illegal instruction     ./fastmem-k6
>k7 : v2-v1=590389086 = 654561us  (152.774fps)
>mmx: v2-v1=636227889 = 705310us  (141.782fps)
>sse: v2-v1=593803067 = 658278us  (151.912fps)
>
>I ran it several times, and the results are nearly the same each time.
>The only interesting thing here is that the k7 version is
>faster copying to video mem, while sse is faster to system ram.
>(running on a P3... why doesn't the k7 version stop with Illegal instruction?)
>
Because the K7 version contains no illegal instructions. I intentionally used only opcodes that are valid 
on both K7 and P3 CPUs. The K6 version does contain an instruction that is illegal on your CPU: FEMMS. 
(The 3DNow! PREFETCH opcode, for some reason, works on your Celeron-II, as we saw two weeks ago.)
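
A side note on the quoted numbers: the fps column appears to be simply "copies per second", i.e. 100 copies
divided by the elapsed time. The sketch below reproduces that arithmetic; the function names and the
FRAME_BYTES macro are made up for illustration and are not part of the benchmark source.

```c
#include <assert.h>
#include <stdint.h>

/* Buffer size from the quoted test: 1024x768 pixels, 2 bytes each. */
#define FRAME_BYTES (1024u * 768u * 2u)

/* Copies per second for `copies` copies that took `elapsed_us` microseconds. */
static double copies_per_second(unsigned copies, uint64_t elapsed_us)
{
    assert(elapsed_us > 0);
    return (double)copies * 1e6 / (double)elapsed_us;
}

/* Copy throughput in bytes per second for the same run. */
static double bytes_per_second(unsigned copies, uint64_t elapsed_us)
{
    return copies_per_second(copies, elapsed_us) * (double)FRAME_BYTES;
}

/* CPU clocks per microsecond, i.e. the effective clock rate in MHz. */
static double clocks_per_us(uint64_t clocks, uint64_t elapsed_us)
{
    assert(elapsed_us > 0);
    return (double)clocks / (double)elapsed_us;
}
```

With the k7 line (590389086 clocks, 654561us) this gives about 152.77 copies per second, roughly 240 MB/s
into video memory, and about 902 clocks per microsecond, which matches a Celeron running at 900 MHz.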

The only thing you forgot is SFENCE. It has no effect on fastmemcpy itself, but it matters for the code 
that runs after fastmemcpy, and it speeds up mplayer slightly (3-5%), at least on my Duron.
Please see the patch below:

--- fastmemcpy.old	Sat Apr 21 21:49:28 2001
+++ fastmemcpy.h	Sun Apr 22 14:08:06 2001
@@ -139,6 +139,9 @@
 #if defined( HAVE_3DNOW ) && !defined( HAVE_MMX2 )
 		__asm__ __volatile__ ("femms":::"memory");
 #else
+                /* since movntq is weakly-ordered, a "sfence"
+		 * is needed to become ordered again. */
+		__asm__ __volatile__ ("sfence":::"memory");
 		__asm__ __volatile__ ("emms":::"memory");
 #endif
 	}
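
For anyone who wants to see the same idea outside of fastmemcpy.h, here is a minimal sketch of a streaming
copy that ends with a store fence. Note the assumptions: it uses SSE2 compiler intrinsics (movntdq via
_mm_stream_si128) instead of the MMX movntq inline asm in the patch, and stream_copy() is a hypothetical
helper, not the actual fastmemcpy code.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides _mm_sfence() */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from fastmemcpy.h): copy len bytes with
 * non-temporal stores, then fence so the stores become ordered again. */
static void *stream_copy(void *dst, const void *src, size_t len)
{
    uint8_t *d = dst;
    const uint8_t *s = src;

    /* _mm_stream_si128 needs a 16-byte aligned destination, so copy
     * single bytes until d is aligned. */
    while (len > 0 && ((uintptr_t)d & 15) != 0) {
        *d++ = *s++;
        len--;
    }

    /* Main loop: unaligned load, weakly-ordered non-temporal store. */
    for (; len >= 16; len -= 16, s += 16, d += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_stream_si128((__m128i *)d, v);
    }

    /* Copy the remaining tail (fewer than 16 bytes) with normal stores. */
    memcpy(d, s, len);

    /* Order the streaming stores above before any store that follows. */
    _mm_sfence();
    return dst;
}
```

The fence sits at the very end so that every caller gets ordered stores without having to remember it,
which is the same reason the patch puts "sfence" inside fastmemcpy rather than at each call site.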

Best regards! Nick


_______________________________________________
Mplayer-users mailing list
Mplayer-users at lists.sourceforge.net
http://lists.sourceforge.net/lists/listinfo/mplayer-users


