[MPlayer-dev-eng] Improved remove-logo filter

Trent Piepho xyzzy at speakeasy.org
Tue Oct 31 22:13:29 CET 2006


On Tue, 31 Oct 2006, Guillaume POIRIER wrote:

> Hi Trent,
>
> Were you able to work on improving your patch?

What was there to do?  Change the asm to use "+" constraints
even though it doesn't always work with gcc 2.7.2?

>
> On 9/15/06, Trent Piepho <xyzzy at speakeasy.org> wrote:
> > On Thu, 14 Sep 2006, Loren Merritt wrote:
> > > On Thu, 14 Sep 2006, Trent Piepho wrote:
> > > > So I tried writing the inner loop (over one line) in asm so accumulator
> > > > would be kept in mm1 for the whole loop.  Gcc still spills and loads
> > > > accumulator for no reason on each outer loop (for each line).  This ended
> > > > up being about the same speed.
> > >
> > > Not that there's anything wrong with writing the loops in asm, but you
> > > don't have to do that just to keep the accumulator in an mmreg. "y"
> > > constraints are not needed, unless you _want_ gcc to load/spill values.
> >
> > Good idea, I hadn't thought of trying that.  It only works as long as gcc
> > doesn't touch the mmx register.  Which is true I think, even if you enable
> > -mmmx gcc won't generate any code that uses mmx registers unless you
> > explicitly write some (with asm, builtins or vector types).  If it was
> > another general purpose register, that wouldn't work.
> >
> > > Or with both loops as one asm block, you can bring back "+y"(accumulator)
> > > instead of the explicit movd.
> >
> > I had that originally, since I wrote gcc 3.1+ code with symbolic register
> > names.  I translated it back to gcc 2.95 asm, and the gcc 2.95 code
> > benchmarked at the same speed.  So I figured there was no point in having
> > two versions.
> >
> > > > : "=m" (accumulator), "=r" (i), "=g" (j), "=r" (mask), "=r" (image)
> > > > : "m" (accumulator), "1" (i), "2" (j), "3" (mask), "4" (image),
> > > >   "g" (logo_mask->width), "g" (stride)
> > >
> > >    : "+m" (accumulator), "+r" (i), "+g" (j), "+r" (mask), "+r" (image)
> > >    : "g" (logo_mask->width), "g" (stride)
> >
> > I've read several places that you can't use "+" to indicate an input/output
> > argument in inline asm, it only works in machine descriptions.  I think it
> > may have changed for newer versions of gcc.
> >
> > I've tried it before, and gcc doesn't complain about it, but it doesn't
> > always work.  With broken constraints you will often get lucky and have
> > everything work, and then some random change to some piece of unrelated
> > code will have the optimizer make a choice that breaks the asm.  So, it's
> > very hard to make a test case that shows it, but I had gcc not load the
> > value into a "+r" constraint, so I decided to believe the docs and use "=r"
> > / "0" instead.
>
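
For reference, here is a minimal sketch of the two constraint spellings
discussed above.  It is not taken from the patch: the function names
add_matched and add_plus are made up for illustration, and a plain addl
stands in for the real MMX loop.  On non-x86 targets it falls back to
ordinary C so that it still compiles.

```c
/* Hypothetical helper: add n to acc using the "=r" / "0" pairing.
 * Operand 0 is a write-only output; the "0" input constraint tells gcc
 * to load acc into the same register before the asm runs. */
static int add_matched(int acc, int n)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("addl %2, %0"
             : "=r" (acc)            /* operand 0: output */
             : "0" (acc), "r" (n)    /* operand 1 shares a register with operand 0 */
             : "cc");                /* addl modifies the flags */
#else
    acc += n;                        /* fallback so the sketch builds elsewhere */
#endif
    return acc;
}

/* Hypothetical helper: the same operation written with a "+r"
 * read-write operand, as suggested in the quoted review. */
static int add_plus(int acc, int n)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("addl %1, %0"
             : "+r" (acc)            /* operand 0: read and written */
             : "r" (n)               /* operand 1: input only */
             : "cc");
#else
    acc += n;
#endif
    return acc;
}
```

With a compiler that handles both forms correctly, add_matched(40, 2) and
add_plus(40, 2) should give the same result.  The first form is the
"=r" / "0" pairing I settled on above.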
