[FFmpeg-devel] Question about -fPIC usage for some files

Michael Niedermayer michaelni
Sat Feb 9 02:28:39 CET 2008


On Fri, Feb 08, 2008 at 04:48:43PM -0500, Alexander Strange wrote:
> On Feb 8, 2008 4:26 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> > On Fri, Feb 08, 2008 at 01:03:56PM -0800, Trent Piepho wrote:
> > > On Fri, 8 Feb 2008, Thorsten Jordan wrote:
> > > > > Why does it fail with pic for you?
> > > > the same problem that was discussed several times on this list, gcc
> > > > fails to generate the code because it runs out of registers (ebx is used
> > > > with -fPIC):
> > >
> > > Since version 3 something, gcc can use other registers besides ebx, and
> > > might not use ebx at all if the function doesn't do anything that requires
> > > access to the pic pointer.  If a function accesses no globals, does not
> > > > take the address of a function, or call a function in another shared
> > > library, it shouldn't need to load the pic register.
> > >
> > > The real problem isn't ebx, it's accessing globals.  In non-PIC code, a
> > > memory reference to a global takes zero registers.  In PIC code, it takes
> > > one register.  In some cases multiple global references can share the same
> > > register(s), so gcc doesn't always need one per global.  But this could
> > > still easily add a half dozen extra registers to an asm block.
> >
> > [...]
> >
> > Why does it fail?
> > This is not a compiler being presented with a situation too complex to solve.
> > This is a compiler failing to fit a grain of sand in a bus.
> 
> apply_welch_window_sse2 has six constraints:
>         :"+&r"(i), "+&r"(j)\
>         :"r"(w_data+n2), "r"(w_data+len-2-n2),\
>          "r"(data+n2), "r"(data+len-2-n2)\
> 
> It only fails if that's inlined; I think the VLA use is breaking it:
>     double tmp[len + lag + 2];

btw, it seems that whoever wrote this missed a quite obvious optimization
opportunity, that is:
len-n2 = n2
which leads to:
         :"+&r"(i), "+&r"(j)\
         :"r"(w_data+n2), "r"(w_data+n2-2),\
          "r"(data+n2), "r"(data+n2-2)\

This reduces the number of registers by 2, as the -2*sizeof can be added as
an offset to all uses ...
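
A minimal scalar sketch of the idea, not the actual SSE2 routine: the
function names, the welch() helper and the int element type are made up
for illustration, and len is assumed to be a multiple of 4 so that
len-n2 = n2 holds and the pair-wise loop ends exactly at the midpoint.
The first variant keeps four base pointers like the quoted constraints,
the second keeps two and folds the -2 into the index:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical Welch weight for sample k of len; zero at both ends. */
static double welch(int k, int len)
{
    double c = 2.0 / (len - 1.0);
    double x = 1.0 - c * k;          /* runs from 1 down to -1 */
    return 1.0 - x * x;
}

/* Four base pointers, mirroring the four "r" inputs quoted above. */
static void window_four_bases(const int *data, double *w_data, int len)
{
    assert(len % 4 == 0);            /* simplification for this sketch */
    int n2 = len / 2;
    const int *d_lo = data + n2;
    const int *d_hi = data + len - 2 - n2;
    double    *w_lo = w_data + n2;
    double    *w_hi = w_data + len - 2 - n2;

    for (int i = -n2, j = n2; i < 0; i += 2, j -= 2) {
        double w0 = welch(n2 + i, len), w1 = welch(n2 + i + 1, len);
        w_lo[i]     = d_lo[i]     * w0;
        w_lo[i + 1] = d_lo[i + 1] * w1;
        w_hi[j]     = d_hi[j]     * w1;   /* mirrored pair, weights swapped */
        w_hi[j + 1] = d_hi[j + 1] * w0;
    }
}

/* Two base pointers: since len-n2 = n2, the "hi" bases are just the
 * "lo" bases minus 2, so the -2 becomes a constant offset. */
static void window_two_bases(const int *data, double *w_data, int len)
{
    assert(len % 4 == 0);
    int n2 = len / 2;
    const int *d = data + n2;
    double    *w = w_data + n2;

    for (int i = -n2, j = n2; i < 0; i += 2, j -= 2) {
        double w0 = welch(n2 + i, len), w1 = welch(n2 + i + 1, len);
        w[i]     = d[i]     * w0;
        w[i + 1] = d[i + 1] * w1;
        w[j - 2] = d[j - 2] * w1;
        w[j - 1] = d[j - 1] * w0;
    }
}
```

Both variants touch the same elements in the same order; only the number
of live base pointers differs, which is what matters for the register
pressure in the asm block.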

I'd test and commit it if I had an SSE2 CPU here ...

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I wish the Xiph folks would stop pretending they've got something they
do not.  Somehow I fear this will remain a wish. -- Måns Rullgård