[FFmpeg-devel] Subject: Re: swscale-test segfault with 64-bit icc 11.1

Winterton, Richard richard.winterton
Wed Jul 21 01:08:32 CEST 2010


Hi,

I believe I was able to duplicate the issue described, reproducing the segmentation fault with a small snippet. I checked with a compiler engineer and he replied with the following:

The compiler is unable to detect which stack locations the user uses in inline asm, and so cannot avoid them. As a workaround, you can use -mno-red-zone to disable the optimization where we use the area below the stack pointer (the red zone) in leaf functions, but this disables the red zone for all other leaf functions as well, and may cost performance.

I can look into a modification of the assembly to work around the problem if you still have the issue.
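
As a rough idea of what such a modification could look like, here is a minimal sketch, not the actual swscale change. It assumes the System V x86-64 ABI (128-byte red zone, first argument in %rdi) and a GCC-compatible inline asm syntax. The names add_one() and call_below_red_zone() are hypothetical and only stand in for the real MMX2 code and its caller. The idea is to move %rsp below the red zone before the call inside the asm, so the pushed return address cannot land on a slot the compiler is using:

```c
#include <stdio.h>

/* Hypothetical stand-in for the code that the asm block calls. */
static long add_one(long x)
{
    return x + 1;
}

/* Hypothetical helper: calls fn(arg) from inside inline asm without
 * writing into the red zone of the enclosing function. */
static long call_below_red_zone(long (*fn)(long), long arg)
{
#if defined(__x86_64__) && defined(__GNUC__) && !defined(_WIN32)
    long ret;
    __asm__ volatile(
        "mov  %%rsp, %%rbx \n\t" /* remember the original stack pointer */
        "sub  $128, %%rsp  \n\t" /* step past the 128-byte red zone */
        "and  $-16, %%rsp  \n\t" /* keep the 16-byte alignment required at a call */
        "call *%[fn]       \n\t" /* return address is pushed below the red zone */
        "mov  %%rbx, %%rsp \n\t" /* restore the stack pointer */
        : "=a"(ret), "+D"(arg)   /* result in rax, argument in rdi */
        : [fn] "r"(fn)
        /* rbx is our scratch register; the rest are caller-saved integer
         * and SSE registers that the callee is allowed to overwrite. */
        : "rbx", "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11",
          "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7",
          "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14",
          "xmm15", "memory", "cc");
    return ret;
#else
    /* Other targets: no assumption about a red zone, just call directly. */
    return fn(arg);
#endif
}
```

Calling call_below_red_zone(add_one, 41) should give 42 whether or not the compiler keeps locals in the red zone. Unlike -mno-red-zone, this only costs a few instructions around the one asm block that makes a call.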

Thanks
Rich

> On Sat, Jul 17, 2010 at 04:50:10PM -0300, Ramiro Polla wrote:
> Hi,
>
> swscale-test segfaults when built with 64-bit icc 11.1 (20100414). The
> function that fails is hyscale_fast_MMX2(). Here's a disassembly of
> the function:
>     a4b0:       53                      push   %rbx
>     a4b1:       48 8b 87 c8 30 00 00    mov    0x30c8(%rdi),%rax
>     a4b8:       4c 8b 9f a8 30 00 00    mov    0x30a8(%rdi),%r11
>     a4bf:       48 89 74 24 d8          mov    %rsi,-0x28(%rsp)
>     a4c4:       45 89 ca                mov    %r9d,%r10d
>     a4c7:       48 89 54 24 e0          mov    %rdx,-0x20(%rsp)
>     a4cc:       41 f7 da                neg    %r10d
>     a4cf:       83 bf 10 31 00 00 00    cmpl   $0x0,0x3110(%rdi)
>     a4d6:       48 89 4c 24 e8          mov    %rcx,-0x18(%rsp)
>     a4db:       48 89 44 24 d0          mov    %rax,-0x30(%rsp)
>     a4e0:       48 8b 87 00 31 00 00    mov    0x3100(%rdi),%rax
>     a4e7:       4c 89 5c 24 f0          mov    %r11,-0x10(%rsp)
>     a4ec:       48 89 44 24 f8          mov    %rax,-0x8(%rsp)
>     a4f1:       0f 84 05 01 00 00       je     a5fc <hyscale_fast_MMX2+0x14c>
>     a4f7:       0f ef ff                pxor   %mm7,%mm7
>     a4fa:       48 8b 4c 24 e8          mov    -0x18(%rsp),%rcx
>     a4ff:       48 8b 7c 24 d8          mov    -0x28(%rsp),%rdi
>     a504:       48 8b 54 24 f0          mov    -0x10(%rsp),%rdx
>     a509:       48 8b 5c 24 d0          mov    -0x30(%rsp),%rbx
>     a50e:       48 31 c0                xor    %rax,%rax
>     a511:       0f 18 01                prefetchnta (%rcx)
>     a514:       0f 18 41 20             prefetchnta 0x20(%rcx)
>     a518:       0f 18 41 40             prefetchnta 0x40(%rcx)
>     a51c:       8b 33                   mov    (%rbx),%esi
>     a51e:       ff 54 24 f8             callq  *-0x8(%rsp)
>     a522:       8b 34 03                mov    (%rbx,%rax,1),%esi
>     a525:       48 01 f1                add    %rsi,%rcx
>     a528:       48 01 c7                add    %rax,%rdi
>     a52b:       48 31 c0                xor    %rax,%rax
>     a52e:       8b 33                   mov    (%rbx),%esi
>     a530:       ff 54 24 f8             callq  *-0x8(%rsp)
> [...]
>
> Since no C functions are called inside hyscale_fast_MMX2(), the
> compiler decides it's OK to use -0x8(%rsp) instead of properly
> sub'ing rsp, as it supposedly won't get overwritten. But in this case
> we call the mmx2 code from inside the asm, and the first callq pushes
> its return address (a522) over -0x8(%rsp). The second callq therefore
> goes to a522, and when that code runs again, it tries to call whatever
> value was in the next slot on the stack. gcc does the same thing, but
> it seems it leaves -0x8(%rsp) alone and uses the stack at -0x10(%rsp)
> and below.
>
> Is this a compiler bug (as in should it detect a call inside asm)?
> Could (or should) we hint to the compiler that a call is being made
> inside the asm block (I don't even know if this is possible)?
I would suggest that you ask Intel (after checking the manual).
It's surely possible to work around this in various ways, but this
feels unclean.
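
The clobbering Ramiro describes can be modeled in portable C. This is only a toy sketch: ToyStack and toy_call_indirect() are hypothetical names, the stack is an array of 8-byte slots, and 0xF00D is a made-up placeholder for the real function pointer. The return addresses a522 and a534 are the addresses following the two 4-byte callq instructions in the disassembly.

```c
#include <stdint.h>
#include <stdio.h>

enum { SLOTS = 8 };

/* Toy stack: mem[sp - 1] models -0x8(%rsp), mem[sp - 2] models
 * -0x10(%rsp), and so on. sp plays the role of %rsp. */
typedef struct {
    uint64_t mem[SLOTS];
    int sp;
} ToyStack;

/* Models "callq *-0x8(%rsp)" followed by the callee's ret.
 * Returns the address that the call jumped to. */
static uint64_t toy_call_indirect(ToyStack *s, uint64_t return_addr)
{
    uint64_t target = s->mem[s->sp - 1]; /* load the pointer from -0x8(%rsp) */
    s->sp -= 1;                          /* callq pushes the return address... */
    s->mem[s->sp] = return_addr;         /* ...into that very same slot */
    s->sp += 1;                          /* ret pops it; the memory keeps the value */
    return target;
}

/* Runs the two calls from the disassembly and reports the second target. */
static uint64_t toy_second_call_target(void)
{
    ToyStack s = { {0}, SLOTS };
    s.mem[s.sp - 1] = 0xF00D;            /* compiler parks the pointer in the red zone */

    uint64_t first = toy_call_indirect(&s, 0xa522);
    uint64_t second = toy_call_indirect(&s, 0xa534);

    printf("first call target:  0x%llx\n", (unsigned long long)first);
    printf("second call target: 0x%llx\n", (unsigned long long)second);
    return second;
}
```

In this model the first call reaches the placeholder 0xF00D as intended, while the second call jumps to 0xa522, the return address left behind by the first call.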

[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The bravest are surely those who have the clearest vision
of what is before them, glory and danger alike, and yet
notwithstanding go out to meet it. -- Thucydides


_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel at mplayerhq.hu
https://lists.mplayerhq.hu/mailman/listinfo/ffmpeg-devel