[FFmpeg-devel] [PATCH] VC-1 MMX DSP functions

Sat Jul 7 22:38:16 CEST 2007

On Sat, 7 Jul 2007, Zuxy Meng wrote:
> > > and changed libavcodec/i386/vc1dsp_mmx.c line 252-253 to:
> > >
> > >          : "g"((intptr_t)sstr), "g"((intptr_t)dstr),
> > > "r"((intptr_t)offset), "r"((intptr_t)3*offset), \
> > >            "g"((intptr_t)rnd)
> >
> > The crash occurs in vc1_put_shift2_mmx, so line
> >        : "m"(src_stride), "m"(dst_stride),
> > becomes:
> >        : "m"((intptr_t)src_stride), "m"((intptr_t)dst_stride),
>
> "m" can't be used here since (intptr_t)src_stride is no longer
> directly addressable.

In other words, "m" can only be used with an lvalue, and a cast is not an
lvalue (usually).  You can't write "(intptr_t)src_stride = 42;" for the same
reason.

The reason this code is crashing on x86-64, is because of this:

+        "add  %3, %1                       \n\t"
+        "add  %4, %2                       \n\t"

One of those add instructions turns into something like:
	addq 32(%rsp), %rdx

Notice that it's "addq", not "addl".  It's a 64-bit add because %rdx is
64-bit.  But src_stride is only 32-bits!

Your best bet to fix this, would be to change the function prototype so that
dst_stride and src_stride are long or intptr_t, and the same with the calling
function.  Since the strides are going to be added to a pointer, it makes
sense to have them be intptr_t instead of int.  It makes no difference on
ILP32 archs, but on I32 LP64 it avoids converting the stride back and forth
from 32 to 64 bit.

You could also change:
       : "m"(src_stride), "m"(dst_stride),
to:
       : "g"((intptr_t)src_stride), "g"((intptr_t)dst_stride),

gcc has an extension that a cast that doesn't actually do anything is still an
lvalue.  On ia32, the (intptr_t) cast has no effect, so gcc will be able to
use a memory reference because of the "g".  For example:

foo(int x) {
    asm volatile("# %0" : : "g"((intptr_t)x));
}
Becomes:
        pushl   %ebp    #
        movl    %esp, %ebp      #,
        # 8(%ebp)       # x
	leave
	ret

You can see that gcc just provided a memory reference to the function's
argument, it didn't need to copy it or put it in a register.  On x86-64,
gcc would probably code something like:
        pushq   %rbp
        movq    %rsp, %rbp
        movl    16(%rbp), %eax   <- convert int to intptr_t
        # %rax                   <- must use a register now
        leave
        ret