[FFmpeg-devel] Why 'You can only build one library type at once on MinGW'?

Fri May 11 22:35:57 CEST 2007

On Fri, 11 May 2007, Michael Niedermayer wrote:
> On Thu, May 10, 2007 at 10:09:04PM -0700, Trent Piepho wrote:
> > There's no opcode variant for 64-bit displacements.  The mod field in the
> > ia32/x86-64 operand format is 2 bits.  0 = no displacement, 1 = one byte
> > displacement, 2 = 32-bit displacement, 3 = operand is register, not memory
> > address.
> >
> > In x86-64, this isn't changed.  There is no option for a 64 bit displacement.
>
> yes, if you have more then 2^32 of something to address then x86-64 will have
> its problems with that no relation to PIC here
> if all the libs a process needs fit in 2^32 then everything is fine (in
> theory, that is it is easyly possible but the loader doesnt do it)
>
> if a single lib doesnt fit in 2^32 then things cannot be addressed with
> displacements but the address rather has to be build by loading a 64bit
> constant into a register, this iam sure you will agree can be relocated
> to any 64bit address

The situation on x86-64 is that a single library will fit into 4GB.  In fact,
gcc doesn't support creating objects larger than 4GB.  It is possible to use
32-bit displacements with SIB addressing in non-PIC, non-relocatable objects.
In PIC objects, it's possible to use 32-bit displacements plus RIP-relative
addressing to compute a position-independent address without thunking.

All objects in total may be greater than 4GB, or may be loaded at an address
above 4GB even if the total is less.  32-bit displacements still work.
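
As a rough illustration of the mod encoding quoted earlier in the thread,
here is a small sketch of how a decoder could work out the displacement size
from a ModRM byte.  The function name disp_bytes is made up for this example,
it assumes 32- or 64-bit addressing (not 16-bit), and it also handles the two
mod=0 special cases that the short summary leaves out:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of displacement bytes that follow a ModRM
 * byte (and its SIB byte, if present) with 32- or 64-bit addressing.
 * `sib` is only examined when rm == 4, i.e. when a SIB byte exists. */
static int disp_bytes(uint8_t modrm, uint8_t sib)
{
    unsigned mod = modrm >> 6;   /* top two bits */
    unsigned rm  = modrm & 7;    /* bottom three bits */

    switch (mod) {
    case 0:
        if (rm == 5)
            return 4;            /* disp32 only; RIP-relative in 64-bit mode */
        if (rm == 4 && (sib & 7) == 5)
            return 4;            /* SIB with no base register: disp32 */
        return 0;                /* plain register-indirect, no displacement */
    case 1:
        return 1;                /* one-byte displacement */
    case 2:
        return 4;                /* 32-bit displacement */
    default:
        return 0;                /* mod == 3: operand is a register */
    }
}
```

For instance, "mov 0x4c(%ecx),%eax" encodes as 8b 41 4c, and disp_bytes(0x41, 0)
gives 1.  In none of the cases is there an 8-byte displacement.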

> > I imagine that AMD figured out that making displacements 64 bits instead of 32
> > bits was a net loosing proposition.  Using position independent code when
> > necessary is probably faster than bloating all code with 64-bit displacements
>
> as if PIC was anything else then bloating all the code with 64bit displacements
> they are just stored in the GOT and need an extra indirection and extra register
> to be read
> (note the RIP relative addressing is limited to 2^32 so no it cannot be used,
> an extra register must be used, after all we do speak about the corner case of
> more than 2^32 bit addressing and the compiler does NOT know if the address
> of something after linking will be within 2^32 bit so i dont think it can
> selectively choose RIP relative addressing)

You're wrong about that.  RIP-relative addressing _is_ used.  A single object
is limited to 4GB, so any address in that object can be described as a 32-bit
displacement from the instruction pointer.  The resulting address, 32-bit
displacement + 64-bit instruction pointer, can and usually will be above 4GB.
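
To make the arithmetic concrete, here is a minimal sketch of how the target of
a RIP-relative operand is formed.  The function names and the sample addresses
are invented for illustration.  The displacement is signed, so it reaches 2GB
in either direction from the instruction, wherever the instruction is mapped:

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of a RIP-relative operand: the 32-bit displacement is
 * sign-extended to 64 bits and added to the address of the NEXT instruction. */
static uint64_t rip_relative_target(uint64_t next_insn_addr, int32_t disp32)
{
    return next_insn_addr + (uint64_t)(int64_t)disp32;
}

/* Can `target` be reached from `next_insn_addr` with a signed 32-bit
 * displacement?  Relies on the usual two's-complement conversion. */
static int reachable_with_disp32(uint64_t next_insn_addr, uint64_t target)
{
    int64_t delta = (int64_t)(target - next_insn_addr);
    return delta >= INT32_MIN && delta <= INT32_MAX;
}
```

With a made-up load address such as 0x00007f3a12345000, a displacement of 0x4c
gives 0x00007f3a1234504c: far above 4GB, yet encoded in four bytes.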

> > so text relocations are possible.  Even on ia32, were no instruction bloating
> > is necessary, PIC is almost always better than text relocations.
>
> its as much better as 10% fewer registers and 2x slower memory accesses due to
> the double indirection over the GOT

And not dirtying every single page of the library doing relocations.

I also suspect you don't know how PIC works on ia32.  There is no double
indirection.  Once the PIC register is loaded, it can be used throughout an
entire function; it doesn't need to be reloaded for each access.

C code:
static int last;
static __attribute__((noinline)) int next(void) { return ++last; }

non-PIC asm for next():
  mov    0x2c,%eax
  inc    %eax
  mov    %eax,0x2c
  ret

PIC asm for next():
  call   25 <next+0x5>		;\
  pop    %ecx			;| = load ecx as pic register
  add    $0x3,%ecx		;/  (only done once per function)

  mov    0x4c(%ecx),%eax	; code for ++last
  inc    %eax
  mov    %eax,0x4c(%ecx)
  ret

The PIC code has a 3-instruction thunk sequence to load %eip into a register.
Once this is done, the exact same instructions are used, with the addressing
changed from "0x2c" to "0x4c(%ecx)".  On x86-64, no thunking is needed and it
would look something like this:
  mov    0x4c(%rip),%eax
  inc    %eax
  mov    %eax,0x4c(%rip)
  ret
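
If you want to check the "only done once per function" point yourself, a
variant of the C example with a second static is enough.  The names calls,
next_counted and call_count are made up for this sketch.  Compiling it with
something like "gcc -m32 -O2 -fPIC -S" should show a single load of the PIC
register followed by two accesses relative to it, though the exact output
depends on the gcc version (newer ones tend to call a shared
__x86.get_pc_thunk helper instead of the inline call/pop pair):

```c
#include <assert.h>

static int last;    /* same counter as in the example above */
static int calls;   /* hypothetical second static, for illustration */

/* Touches two statics; the PIC register only needs to be set up once. */
__attribute__((noinline)) int next_counted(void)
{
    ++calls;
    return ++last;
}

/* Accessor so the second counter can be observed from outside. */
int call_count(void)
{
    return calls;
}
```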