[FFmpeg-devel] [RFC] clobbers for XMM registers
Thu Sep 30 18:52:37 CEST 2010
On Thursday, September 30, 2010, Måns Rullgård <mans at mansr.com> wrote:
> "Ronald S. Bultje" <rsbultje at gmail.com> writes:
>> 2010/9/28 Måns Rullgård <mans at mansr.com>:
>>> Michael Niedermayer <michaelni at gmx.at> writes:
>>>> On Tue, Sep 28, 2010 at 09:36:40AM -0400, Ronald S. Bultje wrote:
>>>>> On Tue, Sep 28, 2010 at 8:34 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
>>>>> > you want to execute code from vp3dsp_sse2.c on a pre SSE cpu?
>>>>> All _sse2 files are template files that are included in dsputil_mmx.c
>>>>> or similar.
>>>> we could add the flags to dsputil_mmx then
>>> That would allow the compiler to use SSE instructions in functions
>>> that should be MMX only.
>> I'm gonna start kicking this subject until it's solved. Come on guys,
>> keep this moving. Why don't we make it (the clobbering) a macro and
>> only enable this on x86-64. Don't forget all xmm registers are
>> caller-save on x86-32 and x86-64 has no issues with marking clobbers
> The issue is not fundamentally about caller vs callee saved
> registers. It is about telling the compiler which registers are
> clobbered, so that it can save and restore them if necessary.
> The missing clobber lists caused the FFT to fail with suncc, despite
> all the used registers being caller-saved. Apparently the compiler
> was using them for something outside the asm block.
>> (and even if it did, -msse is fine, there is no single x86-64 CPU that
>> does not support SSE). We could consider making it as simple as :::
>> CLOBBER_IF_X86_64("%xmm6", "%xmm7",) "%eax" which evaluates to the
>> string in it (including commas) on x86-64 and nothing on x86-32 (and
>> omit the comma if that's the only thing in the clobberlist).
> We obviously need a conditional of some kind, but it should be tested
> in configure and applied whenever the compiler recognises xmm registers.
> It is, however, not quite as straightforward as you make it out.
> Stray commas are not allowed, nor is an empty list.
> One possible solution is to have the macro always include "cc". Most
> of the asm blocks do clobber the condition flags, and for any that do
> not, it is unlikely to make any difference. It also seems that
> including the stack pointer in the clobber list is ignored, although
> relying on this seems dubious at best.
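For concreteness, here is a rough sketch of what such a macro could look
like. The names HAVE_XMM_CLOBBERS, CLOBBER_XMM and copy16 are made up for
illustration, and the preprocessor test merely stands in for the configure
check suggested above; none of this is existing FFmpeg code. The macro
always ends in "cc", so the clobber list is never empty and never has a
stray comma.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for a configure test (assumption): does the compiler
 * accept xmm register names in an asm clobber list? */
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#  define HAVE_XMM_CLOBBERS 1
#else
#  define HAVE_XMM_CLOBBERS 0
#endif

/* Hypothetical macro. The caller passes the xmm names with a trailing
 * comma; the macro appends "cc" so the resulting list is always valid. */
#if HAVE_XMM_CLOBBERS
#  define CLOBBER_XMM(...) __VA_ARGS__ "cc"
#else
#  define CLOBBER_XMM(...) "cc"
#endif

/* Copy 16 bytes through xmm6, declaring the register as clobbered. */
static void copy16(uint8_t *dst, const uint8_t *src)
{
#if HAVE_XMM_CLOBBERS
    __asm__ volatile(
        "movdqu (%1), %%xmm6 \n\t"
        "movdqu %%xmm6, (%0) \n\t"
        : /* no outputs */
        : "r"(dst), "r"(src)
        : CLOBBER_XMM("%xmm6",), "memory");
#else
    /* Plain C fallback where the asm path is not available. */
    memcpy(dst, src, 16);
#endif
}
```

With xmm support the clobber list expands to "%xmm6", "cc", "memory";
without it, to "cc", "memory".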
asm blocks always clobber cc whether or not you put it in the list, so
the "cc" clobber is a no-op.
This is one reason to avoid asm one-liners in headers, since this
slightly hurts code quality (like by inhibiting combining add/test