[Ffmpeg-devel] x86/SSE fix to ensure 16-byte alignment on local variables

Zuxy Meng zuxy.meng
Sat Nov 11 07:39:34 CET 2006


Hi,

2006/11/11, Rich Felker <dalias at aerifal.cx>:
> On Fri, Nov 10, 2006 at 10:30:20AM +0100, Guillaume Poirier wrote:
> > Hi,
> >
> > Christophe Mutricy wrote:
> > >>Attached the patch against today's svn from the files Thorsten sent.
> > >
> > >
> > > Arghh. I managed to send the wrong one.
> > > Here's the good one.
> >
> > I think I can safely say that this patch is rejected in its current
> > form. It leads to unreadable code (at least, to my eyes) to support a
> > corner case.
> >
> > I'm not even sure if a patch that would do something like:
> >
> > #ifndef BROKEN_STACK
> >     DECLARE_ALIGNED_16(DCTELEM, d1[64]);
> > #else
> >     // TJ: force alignment to 16.
> >     //DCTELEM is short
> >     //DECLARE_ALIGNED_16(DCTELEM, d1[64]);
> >     DCTELEM d1_[64+8];
> >     DCTELEM* d1 = (DCTELEM*)((((unsigned long)d1_)+0xf)&~0xf);
> >     // end force
> > #endif
> >
> > would be accepted. The maintainer would have to tell.
>
> With gcc, BROKEN_STACK is _always_ the case. gcc simply does not align
> the stack. However the code above is wrong in any case. It should
> read:
>
> DCTELEM d1_buf[64+8];
> DCTELEM *d1 = d1_buf + (-(unsigned)d1_buf & 15)/sizeof(DCTELEM);
>
> or similar. This is the ONLY reliable way to get aligned data on the
> stack since gcc developers refuse to fix their broken crap and make
> gcc actually align the stack...

IIRC the stack is aligned to a 16-byte boundary on x86-64 by default,
so the hack is only needed on x86-32, where fixing this properly
would require an ABI change.
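
For reference, here is a rough sketch of the manual workaround quoted
above, written so it can be compiled on its own. It assumes DCTELEM is
a short (as the quoted comment says) and uses uintptr_t instead of the
unsigned / unsigned long casts; align_up() and demo_aligned_block() are
hypothetical names for illustration, not anything in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef short DCTELEM;  /* assumption: matches the "DCTELEM is short" comment */

/* Round p up to the next multiple of align (align must be a power of two). */
void *align_up(void *p, size_t align)
{
    uintptr_t addr = (uintptr_t)p;
    assert(align != 0 && (align & (align - 1)) == 0);
    return (void *)((addr + (align - 1)) & ~(uintptr_t)(align - 1));
}

/* Hypothetical demo: over-allocate by 16 bytes, align by hand, and
 * return the remaining misalignment of the pointer (0 means aligned). */
int demo_aligned_block(void)
{
    DCTELEM d1_buf[64 + 8];              /* 8 extra shorts = 16 bytes of slack */
    DCTELEM *d1 = align_up(d1_buf, 16);
    int i;

    /* The 64 usable elements must stay inside the padded buffer. */
    assert(d1 >= d1_buf && d1 + 64 <= d1_buf + 64 + 8);

    for (i = 0; i < 64; i++)
        d1[i] = (DCTELEM)i;

    return (int)((uintptr_t)d1 & 15);
}
```

The padding has to be at least alignment minus one bytes; 8 shorts is
the 16 bytes used in the quoted code, which is slightly more than
strictly needed.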

ICC aligns the stack by using %ebx to save the original (possibly
unaligned) stack pointer, but that costs another valuable
general-purpose register.

The upcoming gcc 4.2.0 has a function attribute
force_align_arg_pointer which might help, but I haven't tried it.
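
An unverified sketch of how that attribute might be applied, assuming
it realigns the stack on entry so that an aligned(16) local is then
honored. REALIGN_STACK and local_misalignment() are hypothetical names
for illustration; the attribute is only requested on 32-bit x86 here,
and the macro expands to nothing elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Only request realignment on 32-bit x86 with a GNU-compatible compiler. */
#if defined(__GNUC__) && defined(__i386__)
#define REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define REALIGN_STACK
#endif

typedef short DCTELEM;  /* assumption: DCTELEM is short, as quoted above */

/* Returns the misalignment of a 16-byte-aligned local (0 means aligned). */
REALIGN_STACK int local_misalignment(void)
{
    DCTELEM d1[64] __attribute__((aligned(16)));
    volatile uintptr_t addr;   /* volatile so the address is actually inspected */
    int i;

    for (i = 0; i < 64; i++)
        d1[i] = (DCTELEM)i;

    addr = (uintptr_t)d1;
    return (int)(addr & 15);
}
```

The idea would be to mark only the entry points that can be called
with a misaligned stack (callbacks, exported functions), so the
realignment cost is not paid on every call.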

The following three gcc bugs are related to this issue:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13685
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27537
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28069
-- 
Zuxy
Beauty is truth,
While truth is beauty.
PGP KeyID: E8555ED6



