[Ffmpeg-devel] Getting rid of inlining failure warnings

Panagiotis Issaris takis.issaris
Thu Nov 9 16:20:49 CET 2006


Hi,

On Thu, 2006-11-09 at 15:41 +0100, Michael Niedermayer wrote:
>[...]
> rdtsc and emms should be smaller if inlined than if called, if you are brave
> and after verifying that the instructions are smaller than or equal to a call
> then submit a bugreport to the gcc devels

Here's a bit of test code I used to verify this. If this makes
sense, I'll try to write a bug report stating that GCC sometimes refuses
to inline a function, claiming it has grown too large, even though inlining
it would have _decreased_ the code size.

Inlined version (inlinerdtsc.c):
#include <stdio.h>
static inline long long read_time(void) {
        long long l;
        asm volatile(   "rdtsc\n\t"
                : "=A" (l)
        );
        return l;
}
int main()
{
    long long l = read_time();
    printf("%Ld\n", l);
}

Non-inlined version (rdtsc.c):
#include <stdio.h>
static __attribute__ ((noinline)) long long read_time(void) {
        long long l;
        asm volatile(   "rdtsc\n\t"
                : "=A" (l)
        );
        return l;
}
int main() {
    long long l = read_time();
    printf("%Ld\n", l);
}

With a command line equivalent to the one FFmpeg uses here (minus the
FFmpeg-specific options):
gcc -c -I. -fomit-frame-pointer -g -Wdeclaration-after-statement -Wall
-Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls
-Winline -O3  rdtsc.c

The inlined version is indeed smaller:
size inlinerdtsc.o 
   text    data     bss     dec     hex filename
     51       0       0      51      33 inlinerdtsc.o
size rdtsc.o 
   text    data     bss     dec     hex filename
     68       0       0      68      44 rdtsc.o

I do not think this is specific to this short block of code: the
generated assembly shows that rdtsc is only 2 bytes long, while
the call instruction alone already occupies 5 bytes:

Not inlined:
00000000 <read_time>:
   0:   0f 31                   rdtsc  
   2:   c3                      ret    
   3:   8d b6 00 00 00 00       lea    0x0(%esi),%esi
   9:   8d bc 27 00 00 00 00    lea    0x0(%edi),%edi

00000010 <main>:
  10:   8d 4c 24 04             lea    0x4(%esp),%ecx
  14:   83 e4 f0                and    $0xfffffff0,%esp
  17:   ff 71 fc                pushl  0xfffffffc(%ecx)
  1a:   51                      push   %ecx
  1b:   83 ec 18                sub    $0x18,%esp
  1e:   e8 dd ff ff ff          call   0 <read_time>
  23:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
  2a:   89 44 24 04             mov    %eax,0x4(%esp)
  2e:   89 54 24 08             mov    %edx,0x8(%esp)
  32:   e8 fc ff ff ff          call   33 <main+0x23>
  37:   83 c4 18                add    $0x18,%esp
  3a:   31 c0                   xor    %eax,%eax
  3c:   59                      pop    %ecx
  3d:   8d 61 fc                lea    0xfffffffc(%ecx),%esp
  40:   c3                      ret    

Inlined:
00000000 <main>:
   0:   8d 4c 24 04             lea    0x4(%esp),%ecx
   4:   83 e4 f0                and    $0xfffffff0,%esp
   7:   ff 71 fc                pushl  0xfffffffc(%ecx)
   a:   51                      push   %ecx
   b:   83 ec 18                sub    $0x18,%esp
   e:   0f 31                   rdtsc  
  10:   89 44 24 04             mov    %eax,0x4(%esp)
  14:   89 54 24 08             mov    %edx,0x8(%esp)
  18:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
  1f:   e8 fc ff ff ff          call   20 <main+0x20>
  24:   83 c4 18                add    $0x18,%esp
  27:   31 c0                   xor    %eax,%eax
  29:   59                      pop    %ecx
  2a:   8d 61 fc                lea    0xfffffffc(%ecx),%esp
  2d:   c3                      ret    


> but they will probably close it with some comment like gcc can't count
> instructions in an asm (probably claiming that it is fundamentally impossible
> to do or some other similar ridiculous statement)
> 
> in the meanwhile rdtsc & emms could be marked always_inline but that could
> cause other random functions to fail to be inlined (this should be checked;
> a diff of "nm foobar.o" before and afterwards should give a definite answer
> though)
I'll have a look at this, but the other patch I posted does not touch
the rdtsc & emms functions. I'm still unsure whether they can influence each
other, in the sense that I might have removed some inline specifiers
whose removal would not have been needed if rdtsc had been forced
inline. Could something like that happen?
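
For reference, a minimal sketch of what forcing the inline could look like,
assuming GCC's always_inline attribute is used as suggested above. The names
FORCE_INLINE and time_loop are hypothetical and not taken from FFmpeg. Unlike
the test code above, this sketch uses separate "=a"/"=d" outputs instead of
"=A", on the assumption that the same source should also build on x86-64; on
other targets it simply returns 0.

```c
#include <stdint.h>

/* Hypothetical helper macro (not an FFmpeg name): force inlining on
 * GCC-compatible compilers, fall back to plain "static inline" elsewhere. */
#if defined(__GNUC__)
#define FORCE_INLINE static inline __attribute__((always_inline))
#else
#define FORCE_INLINE static inline
#endif

FORCE_INLINE uint64_t read_time(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t lo, hi;
    /* rdtsc leaves the counter in EDX:EAX; two 32-bit outputs work on
     * both 32-bit and 64-bit x86. */
    __asm__ volatile("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    return 0; /* assumption: no cycle counter available on this target */
#endif
}

/* Hypothetical caller: ticks spent in a small busy loop. */
uint64_t time_loop(int iterations)
{
    volatile int sink = 0; /* volatile keeps the loop from being removed */
    uint64_t start = read_time();
    for (int i = 0; i < iterations; i++)
        sink += i;
    return read_time() - start;
}
```

Whether this makes other functions in the same file fail to inline would
still have to be checked with the nm diff mentioned above.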

> 
> also functions which are called just from one spot or just from one spot 
> per filetype should be marked as always_inline

With friendly regards,
Takis
-- 
vCard: http://www.issaris.org/pi.vcf
Public key: http://www.issaris.org/pi.key




