[Ffmpeg-devel] [PATCH] Snow mmx+sse2 asm optimizations

Robert Edele yartrebo
Wed Mar 8 01:07:06 CET 2006


On Tue, 2006-03-07 at 15:34 -0800, Loren Merritt wrote:
> On Tue, 7 Mar 2006, Robert Edele wrote:
> > On Mon, 2006-03-06 at 02:06 +0100, Michael Niedermayer wrote:
> >> On Sun, Mar 05, 2006 at 06:09:09PM -0500, Robert Edele wrote:
> >>> +        ::
> >>> +        "m"(b0),"m"(b1),"m"(b2),"m"(b3),"m"(b4),"m"(b5),"d"(end_w2):
> >>> +        "%"REG_a"","%"REG_b"","%"REG_c"");
> >>
> >> this code is not valid, REG_d is changed but neither output nor on the clobber list
> >
> > REG_d is on the input list, so GCC recognizes it as clobbered? GCC
> > also refuses to let me put REG_d into the clobber list. I believe the
> > code is good as is?
> 
> If it's both input and clobbered, put it on the output list with "+d".

Putting "+d"(end_w2) or "=d"(end_w2) in the output list causes the
program to crash. The variable is not used after the asm block.
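For reference, here is a minimal sketch of the "+d" form Loren describes. It is
not the Snow code: sum_ints, count, total and tmp are hypothetical names made up
for illustration, and the asm path assumes GCC on x86-64 with 64-bit longs (a
plain C loop is used otherwise). The register that the asm modifies is passed as
a read-write operand on a scratch copy, so GCC knows its value changes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from snow_mmx.patch): sums n ints.
 * The loop counter is pinned to edx/rdx and is modified by the asm. */
static long sum_ints(const int *src, long n)
{
    long total = 0;
    long count = n;   /* scratch copy: the asm counts this down to zero */

    assert(src != NULL || n <= 0);
    if (n <= 0)
        return 0;

#if defined(__GNUC__) && defined(__x86_64__) && defined(__LP64__)
    {
        long tmp;     /* early-clobber scratch register chosen by GCC */
        __asm__ volatile(
            "1:                         \n\t"
            "movslq (%[src]), %[tmp]    \n\t" /* load *src, sign-extended */
            "add    %[tmp], %[total]    \n\t" /* total += *src            */
            "add    $4, %[src]          \n\t" /* src++                    */
            "dec    %[count]            \n\t" /* modifies rdx             */
            "jnz    1b                  \n\t"
            /* Read-write operands: "+d" tells GCC that rdx is both an
             * input and changed by the asm, so it needs no clobber entry. */
            : [count] "+d"(count), [src] "+r"(src),
              [total] "+r"(total), [tmp] "=&r"(tmp)
            :
            : "cc", "memory");
    }
#else
    /* Portable fallback so the sketch builds on other targets. */
    while (count-- > 0)
        total += *src++;
#endif
    return total;
}
```

Calling sum_ints on an array holding 1, 2, 3, 4 with n = 4 should return 10.
The sketch uses named operands ([count], [src], ...) on purpose: when an output
operand is added to an asm statement that refers to its operands by number, the
existing inputs shift up by one, so every %0, %1, ... in the template has to be
renumbered. Whether that is related to the crash above is only a guess.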

Looking at that code also turned up a potential AMD64 bug (ints which
should have been longs), which I have fixed. An updated copy of the
patch is attached to this e-mail.

Sincerely,
Robert Edele
-------------- next part --------------
A non-text attachment was scrubbed...
Name: snow_mmx.patch
Type: text/x-patch
Size: 93275 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20060307/b8a22c57/attachment.bin>