[MPlayer-dev-eng] swscale question

Tue Oct 30 20:23:36 CET 2001

Hi

On Tuesday 30 October 2001 19:14, Nick Kurshev wrote:
> Hello, Michael!
>
> I've been looking at your code and have some questions for you:
> 1. For what reason did you add the "normal" asm optimization?
>  #endif
> 	//NO MMX just normal asm ...
> 	asm volatile(
> 		"xorl %%eax, %%eax		\n\t" // i
> 		"xorl %%ebx, %%ebx		\n\t" // xx
> 		"xorl %%ecx, %%ecx		\n\t" // 2*xalpha
> 		"1:				\n\t"
> 		"movzbl  (%0, %%ebx), %%edi	\n\t" //src[xx]
> 		"movzbl 1(%0, %%ebx), %%esi	\n\t" //src[xx+1]
> For which CPU is it optimized (Pentium, Pentium MMX, PPro, K6 or K7)?
Hmm, mine ;) ... (a P3 at 500 MHz).
It was written before the MMX2 code.

> IMHO we should not ignore the optimizing possibilities of gcc, which
> produces sufficiently optimized code for the targeted architectures.
> (Even if you've won 1-2% on your CPU, it doesn't mean that
> we'll get the same speedup on every CPU.)
I fully agree with using gcc if it outputs sane code, although a simple
./mplayer -vo x11 -pp 0 -zoom -xy 2 ~/ff.mpg  -benchmark shows:
gcc: 16.532 sec
asm: 12.467 sec
Looking at the output of gcc, it seems there is at least one partial register
stall (a loss of about 5 cycles on PPro, P2 and P3).
The C functions are not really optimized (this one does 2 multiplies
although 1 would be enough, ...).
gcc did not use add/adc either, ...
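
To make the "2 multiplies vs. 1" and add/adc points concrete, here is a minimal C sketch of a horizontal linear scaler with a 16.16 fixed-point source position. It is not the actual swscale code: the function names, the 7-bit weight (0..127) and the weights summing to 128 are assumptions chosen so that all three variants give identical results. The third variant keeps the integer and fractional parts of the position in separate variables, which is the pattern an add/adc pair implements in asm; whether a given compiler turns it into add/adc is not guaranteed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch. The caller must make sure src holds at least
 * (((dst_w - 1) * x_inc) >> 16) + 2 samples. */

/* Variant 1: two multiplies per output sample. */
static void hscale_two_mul(uint16_t *dst, int dst_w,
                           const uint8_t *src, uint32_t x_inc)
{
    uint32_t xpos = 0;                         /* 16.16 source position */
    assert(dst && src);
    for (int i = 0; i < dst_w; i++) {
        uint32_t xx = xpos >> 16;              /* integer sample index */
        int xalpha = (int)((xpos & 0xFFFF) >> 9); /* top 7 fraction bits */
        dst[i] = (uint16_t)(src[xx] * (128 - xalpha) + src[xx + 1] * xalpha);
        xpos += x_inc;
    }
}

/* Variant 2: one multiply, using
 * a*(128 - w) + b*w == (a << 7) + (b - a)*w. */
static void hscale_one_mul(uint16_t *dst, int dst_w,
                           const uint8_t *src, uint32_t x_inc)
{
    uint32_t xpos = 0;
    assert(dst && src);
    for (int i = 0; i < dst_w; i++) {
        uint32_t xx = xpos >> 16;
        int xalpha = (int)((xpos & 0xFFFF) >> 9);
        dst[i] = (uint16_t)((src[xx] << 7) + (src[xx + 1] - src[xx]) * xalpha);
        xpos += x_inc;
    }
}

/* Variant 3: one multiply, with the position split into a 16-bit fraction
 * and an integer index. The fraction is added first; its carry is then
 * added into the index together with the integer step (add, then adc). */
static void hscale_carry(uint16_t *dst, int dst_w,
                         const uint8_t *src, uint32_t x_inc)
{
    const uint32_t inc_int = x_inc >> 16;
    const uint16_t inc_frac = (uint16_t)(x_inc & 0xFFFF);
    uint32_t xx = 0;                           /* integer part */
    uint16_t frac = 0;                         /* fractional part */
    assert(dst && src);
    for (int i = 0; i < dst_w; i++) {
        int xalpha = frac >> 9;
        dst[i] = (uint16_t)((src[xx] << 7) + (src[xx + 1] - src[xx]) * xalpha);
        uint16_t next = (uint16_t)(frac + inc_frac);
        uint32_t carry = next < frac;          /* 1 if the fraction wrapped */
        frac = next;
        xx += inc_int + carry;
    }
}
```

All three functions take the same arguments, so they can be swapped under the same -benchmark run to see what a particular gcc version makes of each.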

> On the other hand, besides gcc there are other compilers which
> can produce better code than gcc does now, but I hope that in the
> future gcc will be improved enough to match them.
>
> 2. Although your code is reasonably well scheduled, the first lines could be
> scheduled better (and they are the first thing everyone looks at):
I can't see a difference with -benchmark; did you try it?

...
>
> Friendly! Nick

Michael