[Ffmpeg-devel] Getting rid of inlining failure warnings
Panagiotis Issaris
takis.issaris
Thu Nov 9 16:20:49 CET 2006
Hi,
On Thu, 2006-11-09 at 15:41 +0100, Michael Niedermayer wrote:
>[...]
> rdtsc and emms should be smaller if inlined than if called, if you are brave
> and after verifying that the instructions are smaller than or equal to a call
> then submit a bug report to the gcc devels
Here's a little bit of test code I used to verify this. If this makes
sense, I'll try to write a bug report stating that GCC sometimes refuses
to inline a function, claiming it has grown too large, even though inlining
it would have _decreased_ the code size. The first version (inlinerdtsc.c)
lets GCC inline read_time(); the second (rdtsc.c) forbids it with noinline.
#include <stdio.h>
static inline long long read_time(void) {
    long long l;
    asm volatile( "rdtsc\n\t"
                  : "=A" (l)
                );
    return l;
}
int main()
{
    long long l = read_time();
    printf("%Ld\n", l);
}

#include <stdio.h>
static __attribute__ ((noinline)) long long read_time(void) {
    long long l;
    asm volatile( "rdtsc\n\t"
                  : "=A" (l)
                );
    return l;
}
int main() {
    long long l = read_time();
    printf("%Ld\n", l);
}
With a command line equal to what FFmpeg uses here (minus the
FFmpeg-specific parts):
gcc -c -I. -fomit-frame-pointer -g -Wdeclaration-after-statement -Wall \
    -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls \
    -Winline -O3 rdtsc.c
The inlined version is indeed smaller:
size inlinerdtsc.o
   text    data     bss     dec     hex filename
     51       0       0      51      33 inlinerdtsc.o
size rdtsc.o
   text    data     bss     dec     hex filename
     68       0       0      68      44 rdtsc.o
I do not think this is specific to this short block of code: the
generated assembly shows that rdtsc is only 2 bytes long, while
the call instruction alone already occupies 5 bytes:
Not inlined:
00000000 <read_time>:
   0:   0f 31                   rdtsc
   2:   c3                      ret
   3:   8d b6 00 00 00 00       lea    0x0(%esi),%esi
   9:   8d bc 27 00 00 00 00    lea    0x0(%edi),%edi

00000010 <main>:
  10:   8d 4c 24 04             lea    0x4(%esp),%ecx
  14:   83 e4 f0                and    $0xfffffff0,%esp
  17:   ff 71 fc                pushl  0xfffffffc(%ecx)
  1a:   51                      push   %ecx
  1b:   83 ec 18                sub    $0x18,%esp
  1e:   e8 dd ff ff ff          call   0 <read_time>
  23:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
  2a:   89 44 24 04             mov    %eax,0x4(%esp)
  2e:   89 54 24 08             mov    %edx,0x8(%esp)
  32:   e8 fc ff ff ff          call   33 <main+0x23>
  37:   83 c4 18                add    $0x18,%esp
  3a:   31 c0                   xor    %eax,%eax
  3c:   59                      pop    %ecx
  3d:   8d 61 fc                lea    0xfffffffc(%ecx),%esp
  40:   c3                      ret
Inlined:
00000000 <main>:
   0:   8d 4c 24 04             lea    0x4(%esp),%ecx
   4:   83 e4 f0                and    $0xfffffff0,%esp
   7:   ff 71 fc                pushl  0xfffffffc(%ecx)
   a:   51                      push   %ecx
   b:   83 ec 18                sub    $0x18,%esp
   e:   0f 31                   rdtsc
  10:   89 44 24 04             mov    %eax,0x4(%esp)
  14:   89 54 24 08             mov    %edx,0x8(%esp)
  18:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
  1f:   e8 fc ff ff ff          call   20 <main+0x20>
  24:   83 c4 18                add    $0x18,%esp
  27:   31 c0                   xor    %eax,%eax
  29:   59                      pop    %ecx
  2a:   8d 61 fc                lea    0xfffffffc(%ecx),%esp
  2d:   c3                      ret
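
To make the byte counting explicit, here is a small sketch that only redoes
the arithmetic from the two listings. The constant and function names are
made up for illustration, and the numbers are simply read off the objdump
output for this one 32-bit x86 build, so they say nothing about other
compilers or targets.

```c
#include <assert.h>

/* Byte counts read off the two listings (32-bit x86). */
enum {
    RDTSC_BYTES   = 2,  /* 0f 31 */
    RET_BYTES     = 1,  /* c3 */
    CALL_BYTES    = 5,  /* e8 plus a 32-bit relative offset */
    PADDING_BYTES = 13  /* the two lea fillers at 0x3..0xf */
};

/* Bytes main() shrinks by for each call replaced with an inlined rdtsc. */
static int saved_per_call_site(void)
{
    return CALL_BYTES - RDTSC_BYTES;
}

/* Bytes occupied by the out-of-line read_time(), padding included. */
static int out_of_line_copy(void)
{
    return RDTSC_BYTES + RET_BYTES + PADDING_BYTES;
}

/* Cross-check the counts against the offsets in the listings. */
static void check_listing_arithmetic(void)
{
    int main_noinline = 0x41 - 0x10; /* main spans 0x10..0x40 */
    int main_inline   = 0x2e - 0x00; /* main spans 0x00..0x2d */

    assert(main_noinline - main_inline == saved_per_call_site());
    assert(out_of_line_copy() == 0x10); /* main starts at 0x10 */
}
```

So every call site gets 3 bytes shorter when rdtsc is inlined, and on top of
that the separate copy of read_time() with its alignment padding goes away.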
> but they will probably close it with some comment like gcc can't count
> instructions in an asm (probably claiming that it is fundamentally impossible
> to do or some other similar ridiculous statement)
>
> in the meanwhile rdtsc & emms could be marked always_inline but that could
> cause other random functions to fail to be inlined (this should be checked;
> a diff of "nm foobar.o" before and afterwards should give a definite answer
> though)
I'll have a look at this, but the other patch I posted does not touch
the rdtsc & emms functions. I'm still unsure whether the two changes can
influence each other: I might have removed some inline specifiers that
would not have needed removing if rdtsc had been forced inline. Could
something like that happen?
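
For reference, forcing the inline as suggested could look roughly like the
sketch below. FORCE_INLINE and read_time_forced are names I made up here,
not what FFmpeg uses. I also assume GCC's always_inline attribute and read
the two halves of the counter through "=a" and "=d" instead of "=A",
because "=A" only denotes the EDX:EAX pair on 32-bit x86. The non-x86
branch is just a placeholder so the file still compiles elsewhere.

```c
#include <stdint.h>

/* Assumption: GCC-style attribute syntax; other compilers get plain inline. */
#if defined(__GNUC__)
#define FORCE_INLINE static inline __attribute__((always_inline))
#else
#define FORCE_INLINE static inline
#endif

/* Hypothetical forced-inline variant of read_time(). */
FORCE_INLINE uint64_t read_time_forced(void)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t lo, hi;
    /* rdtsc leaves the low half in EAX and the high half in EDX. */
    __asm__ volatile("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    return 0; /* placeholder: no rdtsc on this target */
#endif
}
```

Whether this makes GCC give up on inlining some other function would still
have to be checked by comparing the "nm" output before and after.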
>
> also functions which are called just from one spot or just from one spot
> per filetype should be marked as always_inline
With friendly regards,
Takis
--
vCard: http://www.issaris.org/pi.vcf
Public key: http://www.issaris.org/pi.key