[FFmpeg-devel] [PATCH] M68K: Optimized MUL64/MULH/MULL functions for 68060

Måns Rullgård mans
Sun Aug 2 00:43:22 CEST 2009


ami_stuff <ami_stuff at o2.pl> writes:

>> >     :"=d"(lo), "=d"(hi)
>> 
>> Those should be marked early-clobber (&).
>
> Ok.
>
>> >     :"0"(a), "1"(b)
>> 
>> Do these have to be the same regs?  Allowing different registers
>> theoretically gives the compiler better room for optimal register
>> allocation.  On the other hand, it gives the compiler more room to
>> mess up.
>
> It looks like GCC 4.4.1 generates better code with defined registers
> (two fewer move.l instructions):

See below.

>> >     :"d2", "d3", "d4", "d5");
>> 
>> Avoid using hardcoded registers, and prefer explicitly declared temp
>> variables.
>
> Hmm, I don't know how to do it

int t1, t2, t3, t4;
asm("..." : "=&d"(t1), "=&d"(t2), "=&d"(t3), "=&d"(t4));

> and what code GCC will generate after this change.

Try and see.
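
A slightly fuller sketch of the temp-variable style may help here. The
helper name scale3_add and the 3*a + b computation are hypothetical and
are not taken from the patch; the m68k branch is untested and only
illustrates the constraint syntax, with a plain C fallback for other
targets.

```c
#include <stdint.h>

/* Hypothetical helper (not from the patch): returns 3*a + b.
 * Assumes small inputs so the result fits in 32 bits. */
static inline int32_t scale3_add(int32_t a, int32_t b)
{
#if defined(__mc68000__) || defined(__m68k__)
    int32_t r, t;   /* r = result, t = scratch; gcc picks the registers */
    __asm__ ("move.l %2, %1\n\t"   /* t = a      (t written before b is read) */
             "add.l  %1, %1\n\t"   /* t = 2a                                  */
             "move.l %3, %0\n\t"   /* r = b      (r written before a's last use) */
             "add.l  %1, %0\n\t"   /* r = b + 2a                              */
             "add.l  %2, %0"       /* r = b + 3a                              */
             : "=&d"(r), "=&d"(t)  /* early-clobber: never share a reg with an input */
             : "d"(a), "d"(b));    /* inputs may live in any data register    */
    return r;
#else
    return 3 * a + b;              /* portable fallback, same result */
#endif
}
```

No register appears in a clobber list, so when this is inlined into a
larger function the compiler is free to use whichever data registers
happen to be available at that point.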

> Now the output asm code looks perfect without any unneeded
> instructions.

That's because you're looking at this function in isolation.  When
inlined in a larger function, those registers may well already be in
use with some others free.

>> Out of interest, what does gcc do when left to its own devices?
>
> You mean what the output asm code looks like without the inline asm?
> In this situation GCC uses the slow _muldi3.

Oh...
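
For reference, a minimal sketch of the plain C forms that the inline asm
replaces. The names mul64_ref and mulh_ref are hypothetical, chosen so
they do not clash with the real macros. This is the kind of code for
which, per the reply quoted above, gcc falls back to the library
multiply on this target.

```c
#include <stdint.h>

/* Full 32x32 -> 64-bit signed product, written in plain C. */
static inline int64_t mul64_ref(int32_t a, int32_t b)
{
    return (int64_t)a * (int64_t)b;
}

/* High 32 bits of the 64-bit product.
 * Assumes >> on a negative value is an arithmetic shift, as it is with gcc. */
static inline int32_t mulh_ref(int32_t a, int32_t b)
{
    return (int32_t)(mul64_ref(a, b) >> 32);
}
```

Comparing the output of these against the asm versions over a range of
inputs is a cheap way to check the asm for correctness.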

>> > #define MULL(a,b,s)	(MUL64(a, b) >> s)
>> 
>> Can gcc really be trusted with this?
>
> inline int MULL(int a, int b, unsigned s){
>     return MUL64(a,b)>>s;
> }
>
> Here is output from asm-optimized function:
>
> #NO_APP
> [...]
> #NO_APP
> 	lea (-32,a0),a1
> 	tst.l a1
> 	jlt L2
> 	move.l a1,d1
> 	asr.l d1,d0
> 	movem.l (sp)+,#60
> 	rts
> L2:
> 	move.l d0,d2
> 	add.l d2,d2
> 	moveq #31,d0
> 	sub.l a0,d0
> 	lsl.l d0,d2
> 	move.l d1,d0
> 	move.l a0,d3
> 	lsr.l d3,d0
> 	or.l d2,d0

That's quite a lot for a right shift.  We also happen to know the
shift is always a constant and less than 32.  GCC will of course
theoretically have this information when the function is inlined, so
we should be looking at code generated by such a call, not this
function compiled standalone.
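
A sketch of the kind of test case meant here: a small caller that
passes a constant shift, so the shift amount is known once mull_c is
inlined. The names mul64_c, mull_c, fixmul16 and FRAC_BITS are
hypothetical, and the plain C multiply stands in for the asm MUL64.

```c
#include <stdint.h>

/* Stand-in for the asm MUL64: full 32x32 -> 64-bit signed product. */
static inline int64_t mul64_c(int32_t a, int32_t b)
{
    return (int64_t)a * (int64_t)b;
}

/* Same shape as the inline MULL quoted above.
 * Assumes >> on a negative value is an arithmetic shift, as with gcc. */
static inline int32_t mull_c(int32_t a, int32_t b, unsigned s)
{
    return (int32_t)(mul64_c(a, b) >> s);
}

#define FRAC_BITS 16   /* hypothetical constant shift, less than 32 */

/* Non-static on purpose, so that gcc -O2 -S emits a body to inspect. */
int32_t fixmul16(int32_t a, int32_t b)
{
    return mull_c(a, b, FRAC_BITS);
}
```

Compiling this with gcc -O2 -S and reading the body of fixmul16 shows
what the shift costs at a real call site, which is the number that
matters, not the standalone MULL listing.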

-- 
Måns Rullgård
mans at mansr.com