[Mplayer-users] [mplayer PATCH] SSE fastmemcpy improvements

Nick Kurshev nickols_k at mail.ru
Fri Apr 20 23:10:16 CEST 2001


On Fri, 20 Apr 2001 18:30:00 +0200, Felix Bünemann wrote:

>commented out line 102 to line 120:
>#if 0
>	if((unsigned long)from) & 15)
>	else 

Sorry! One bracket is missing. The correct version should be:
-if((unsigned long)from) & 15)
+if(((unsigned long)from) & 15)
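
To show the corrected condition in isolation, here is a minimal sketch of the alignment test. The helper name is_misaligned16 is hypothetical and not part of fastmemcpy.h, and uintptr_t is used here as an assumption in place of the patch's unsigned long:

```c
#include <stdint.h>

/* Hypothetical helper: nonzero if p is NOT on a 16-byte boundary. */
static int is_misaligned16(const void *p)
{
    /* Cast the pointer to an integer first, then mask the low 4 bits. */
    return (((uintptr_t)p) & 15) != 0;
}
```

The outer parentheses make the whole masked value the condition of the if.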

>But with new code x11, xv and sdl out crash, dga works but doesn't use 
>Here is a backtrace from x11 out:
>Start playing...
>Program received signal SIGSEGV, Segmentation fault.
>[Switching to Thread 1024 (LWP 6647)]
>0x80a3c28 in draw_frame (src=0x81d8d78) at fastmemcpy.h:130
>130                     __asm__ __volatile__ (

It points to MOVAPS. So on the P3, prefetch does not align to the cache line.
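
MOVAPS requires a 16-byte aligned memory operand and faults otherwise, while MOVUPS accepts any address. A rough sketch of choosing between the two follows. It uses compiler intrinsics instead of the inline asm in fastmemcpy.h, and the helper name copy16 is hypothetical:

```c
#include <stdint.h>
#include <string.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper: copy 16 bytes, picking the load by source alignment.
 * The destination is written with an unaligned store to keep the sketch safe. */
static void copy16(void *to, const void *from)
{
#if defined(__SSE__)
    __m128 v;
    if (((uintptr_t)from) & 15)
        v = _mm_loadu_ps((const float *)from); /* MOVUPS: any address */
    else
        v = _mm_load_ps((const float *)from);  /* MOVAPS: needs 16-byte alignment */
    _mm_storeu_ps((float *)to, v);
#else
    memcpy(to, from, 16); /* fallback on targets without SSE */
#endif
}
```

If the aligned load were used unconditionally, a misaligned source pointer would crash in the same way as in the backtrace above.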

>Test results (each 10 runs after compile):
>System is PIII 750 Coppermine, 124MHz FSB so CPU is at 930MHz and 512MB PC133 
>SD-RAM (timing 3-3-3) with linux 2.2.18 with fxsr and xmm support (regarding 
>to /proc/cpuinfo), test command issued was:
>perl -e 'for($i = 0; $i < 11; $i++) { system "./fastmembench"; }'
>v1 = 28231618005329 v2 = 28231618285403 v2-v1=280074
>v1 = 28367967036396 v2 = 28367967316508 v2-v1=280112
>old sse:
>v1 = 28510748533056 v2 = 28510748820014 v2-v1=286958
>new sse:
>v1 = 28416198780569 v2 = 28416199068093 v2-v1=287524

Very strange results, even unbelievable. Maybe it would be better to increase the size of the array to 1 000 000?
Could you slightly modify the test.c program and replace rdtsc with rdpmc?
On the P2, rdtsc has a bug: it reports the TSC wrongly when non-integer instructions are
executed. For example, for a standalone emms insn it indicates 1000 CPU cycles, while for NOP it indicates 0.
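
A rough sketch of such a measurement with the larger array is below. It is not the test.c from the patch: the names read_ticks and bench_copy_min are hypothetical, plain memcpy stands in for the routine under test, and it still reads the TSC, because rdpmc only works from user space when the operating system enables it. Taking the minimum over several runs is my own assumption about how to reduce noise:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#if defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>
#endif

#define ARRAY_SIZE 1000000 /* array size suggested above */

/* Hypothetical tick source: TSC on x86, clock() elsewhere. */
static uint64_t read_ticks(void)
{
#if defined(__i386__) || defined(__x86_64__)
    return __rdtsc();
#else
    return (uint64_t)clock();
#endif
}

/* Hypothetical harness: stores the smallest v2-v1 over `runs` copies in
 * *best_out. Returns 0 on success, -1 on bad arguments, allocation failure
 * or a wrong copy. */
static int bench_copy_min(int runs, uint64_t *best_out)
{
    unsigned char *src, *dst;
    uint64_t best = UINT64_MAX;
    int i, ok;

    if (runs <= 0 || !best_out)
        return -1;
    src = malloc(ARRAY_SIZE);
    dst = malloc(ARRAY_SIZE);
    if (!src || !dst) {
        free(src);
        free(dst);
        return -1;
    }
    memset(src, 0x5a, ARRAY_SIZE);
    memset(dst, 0, ARRAY_SIZE);
    for (i = 0; i < runs; i++) {
        uint64_t v1 = read_ticks();
        memcpy(dst, src, ARRAY_SIZE); /* routine under test goes here */
        uint64_t v2 = read_ticks();
        if (v2 - v1 < best)
            best = v2 - v1;
    }
    ok = memcmp(dst, src, ARRAY_SIZE) == 0; /* also keeps the copy observable */
    free(src);
    free(dst);
    *best_out = best;
    return ok ? 0 : -1;
}
```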

OK. The patch can be applied after changing the condition to 'if(((unsigned long)from) & 15)'. But the benchmarks
require additional investigation.
Anyway, SFENCE should be applied and, if you want, new comments.
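
As a sketch of where SFENCE belongs, assuming the copy loop uses non-temporal (MOVNTPS) stores, which are weakly ordered: the fence goes once after the loop, before any later store that other code may depend on. The helper stream_copy is hypothetical and written with intrinsics, not the inline asm from the patch:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper. Returns 0 on success, -1 if the preconditions
 * (16-byte aligned `to`, length a multiple of 16) are not met. */
static int stream_copy(void *to, const void *from, size_t len)
{
    if ((((uintptr_t)to) & 15) || (len & 15))
        return -1;
#if defined(__SSE__)
    {
        float *d = (float *)to;
        const float *s = (const float *)from;
        size_t i;
        for (i = 0; i < len / 16; i++) {
            __m128 v = _mm_loadu_ps(s + 4 * i); /* source may be unaligned */
            _mm_stream_ps(d + 4 * i, v);        /* MOVNTPS: bypasses the cache */
        }
        _mm_sfence(); /* order the streamed stores before any later store */
    }
#else
    memcpy(to, from, len); /* fallback on targets without SSE */
#endif
    return 0;
}
```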

Best regards! Nick

Mplayer-users mailing list
Mplayer-users at lists.sourceforge.net