[MPlayer-dev-eng] Improved remove-logo filter

Fri Nov 3 00:22:29 CET 2006

On Thu, 2 Nov 2006, Michael Niedermayer wrote:
> > > >
> > > > Sorry, I meant 2.95, but I'm not sure if that's the case.  Some older gcc
> >
> > I'm now more sure of how this worked.  In 2.7.2, '+' wasn't allowed.  In
> > 2.95, it was allowed but didn't work correctly.  I'm not sure when, or if,
> > it was fixed.
>
> well, fact is mplayer uses "+" frequently and mplayer works very well with
> 2.95.3+; there were some big bugs in 2.95.2 related to asm, though I dunno
> if they were related to "+"

Comments like these make me think that not everything works that well:
//FIXME this is fragile gcc either runs out of registers or misscompiles it (for
example if "+a"(bit) or "+m"(*state) is used
//Note "+bm" and "+mb" are buggy too (with gcc 3.2.2 at least) and cant be used

And then all the hard coded registers...

If one of the gcc developers says that there were problems with '+m' in asm
constraints, I'd be inclined to believe him.  FreeBSD people seem to have
found the same thing:
http://www.mail-archive.com/cvs-all@freebsd.org/msg49149.html

> > > > You might also want to look at these threads:
> > > > http://marc.theaimsgroup.com/?l=linux-kernel&m=107475162200773&w=2
> >
> > > > > does not prevent %0 == %1. If you want an output to not be able to use the
> > > > > same register or memory location as a random input then you must use "=&...".
> > > > > I am not sure if that could cause any problems with your code as I didn't look
> > > > > at it, just the constraints quoted above, in which "=m" (accumulator) and
> > > > > "m" (accumulator) could be in the same memory location or a different one,
> > > > > or "=m" (accumulator) and "g" (stride) could be in the same memory location
> > > >
> > > > How could accumulator be in two different memory locations?
> > >
> > > an optimizing compiler can make a copy, for example it could copy it to
> > > the stack, gcc may or may not be capable of that but that doesn't matter
> > > for the validity of the code ...
> >
> > It isn't allowed to do that.
>
> wrong, an optimizing C compiler is allowed to do anything which doesn't
> violate the C standard

This is gcc's extended asm construct; that's rather beyond the realm of the
C standard.

> > There would be no way to write something like a
> > spin-lock.
>
> I am no spinlock expert but I think you cannot write a spinlock in pure
> ISO/ANSI C, you need asm for that and that's then no longer related to
> what a C compiler can or cannot do

Of course not.  That's why you need to use asm.  And in order to be able
to write one in asm, you need to be able to modify a non-copy of a
memory location.
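
To make that concrete, here is a minimal sketch of a spin-lock built on a
memory operand.  It assumes x86 and AT&T syntax; the names spin_t,
spin_swap, spin_trylock, spin_lock and spin_unlock are made up for
illustration, and the non-x86 branches fall back to gcc's __atomic
builtins only so that the sketch compiles elsewhere.  The point is that
the asm has to operate on the lock word itself, never on a copy of it.

```c
/* Hypothetical spin-lock sketch: 0 = free, 1 = held. */
typedef struct { volatile int held; } spin_t;

/* Atomically store 1 into the lock word and return the previous value. */
static inline int spin_swap(spin_t *l)
{
    int old = 1;
#if defined(__i386__) || defined(__x86_64__)
    /* xchg with a memory operand is implicitly locked on x86.  The lock
     * word appears as "=m" output and "m" input, so the instruction
     * works on the variable's real address. */
    __asm__ volatile ("xchgl %0, %1"
                      : "=r" (old), "=m" (l->held)
                      : "0" (old), "m" (l->held)
                      : "memory");
#else
    /* Assumption: portable fallback, not part of the discussion. */
    old = __atomic_exchange_n(&l->held, 1, __ATOMIC_ACQUIRE);
#endif
    return old;
}

/* Returns 1 if the lock was taken, 0 if somebody else holds it. */
static inline int spin_trylock(spin_t *l)
{
    return spin_swap(l) == 0;
}

static inline void spin_lock(spin_t *l)
{
    while (!spin_trylock(l))
        ;  /* busy-wait until the previous value was 0 */
}

static inline void spin_unlock(spin_t *l)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile ("" ::: "memory");  /* compiler barrier only */
    l->held = 0;
#else
    __atomic_store_n(&l->held, 0, __ATOMIC_RELEASE);
#endif
}
```

If the compiler were free to hand the asm a temporary copy of l->held,
two threads could both see 0 and both take the lock.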

> > > > Before you say gcc could allot some new memory and copy the variables in
> > > > and out, please read the message from Richard Henderson that I linked to
> > > > that explains why this can't be done.
> > >
> > > yes I read it, I also read Linus's reply which basically said
> > > "Please fix the compiler"
> >
> > Linus is saying allow "+m"(x) constraints by making them the same as
> > "=m"(x):"m"(x).
>
> ... with the additional comment "since you say it is equivalent anyway."
>
> which I've no problem with either, but please don't twist the facts.
> fact seems to be gcc, due to limitations of its architecture, cannot
> copy or move a variable to another memory location, and with this
> additional limitation of the current gcc implementation
> "=m"(a), "0"(a) is equivalent to "=m"(a), "m"(a)

Did you read the same thread I did?

"=m"(x) : "0"(y)  isn't allowed, as x and y can't be in the same location.

"=m"(x) : "0"(x) will generate a warning and may not work correctly!  Go
try it if you don't want to believe me.  It requires that the optimizer
figure out that both the input and the output refer to the same thing.
That is impossible with arbitrarily complex expressions.

Here's the relevant part.  I don't see how this can be interpreted to mean
that "=m":"0" is preferred:

	if you write "=m"(x) : "0"(x) it *requires* that the optimizer be
	run and that it *must* identify the common sub-expression.  Failure
	to do so means that the compiler has to assume we have the x/y
	situation above, which of course results in a diagnostic.

	Obviously such a computational requirement is impossible with
	arbitrarily complex expressions, so that begs the question of "how
	complex is too complex".  Drawing an arbitrary line that you can
	explain to users is impossible.  It is easier to simply disallow it
	entirely.

> this is NOT guaranteed by the gcc docs and neither is it guaranteed to
> work in the future, so it MUST NOT be used in code, OTOH gcc as long
> as it has this limitation can change "=m"(a), "0"(a) and "+m"(a) to
> "=m"(a), "m"(a) if that has any benefit for it

	Given the lvalue semantics, you get *exactly* the same effect from
	"=m"(x) : "m"(x).  Since this works for any version of gcc, at any
	optimization level, on any arbitrarily complex expression, we
	strongly recommend (ahem) that code be modified to use this form.

You're saying that this means "=m":"m" must not be used?  Because that's
not how I interpret it.
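
For reference, a minimal sketch of the recommended form, assuming x86 and
AT&T syntax.  bump_counter is a hypothetical name used only for
illustration, and the plain C branch is there just so the sketch builds
on other targets.

```c
/* Hypothetical helper: increment *counter in place via a memory operand. */
static inline void bump_counter(int *counter)
{
#if defined(__i386__) || defined(__x86_64__)
    /* The "=m" output and the "m" input name the same lvalue, so the
     * instruction reads and writes the variable itself, not a copy.
     * incl modifies the flags, hence the "cc" clobber. */
    __asm__ volatile ("incl %0"
                      : "=m" (*counter)
                      : "m" (*counter)
                      : "cc");
#else
    /* Assumption: non-x86 fallback, only to keep the sketch portable. */
    ++*counter;
#endif
}
```

The input operand is never named in the template; it is there only to
tell gcc that the old value of *counter is read.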

> >  He's not saying gcc should be able to copy variables in
> > and out of a temporary memory location for asm constructs.  In fact, he
> > specifically says something like "=m"(x) :  "0"(y) that would require two
> > different variables to be in the same place in memory shouldn't be allowed.
>
> in the thread you linked Linus just agrees that the meaning of
> "=m"(x) :  "0"(y) is unclear
> not that it matters that much what he says, but it's just not what
> you claim ...

He's agreeing with this statement:

	what in the world does "=m"(x) : "0"(y) mean?  Logically, this
	makes no sense.  The only way it can be resolved is to create a new
	memory, copy y in, and copy x out.  But that violates the lvalue
	promises we've made that make memory constraints useful for atomic
	operations.

That's not saying the meaning is unclear; it's saying that the construct is
broken.  And if you try it, gcc will give you an error.