[FFmpeg-devel] [RFC] Instruction set selection on x86 (was: Re: [PATCH] some SIMD write-combining for h264)

Måns Rullgård mans
Mon Jan 18 14:19:35 CET 2010

Alexander Strange <astrange at ithinksw.com> writes:

> On Jan 17, 2010, at 8:27 PM, Måns Rullgård wrote:
>> Alexander Strange <astrange at ithinksw.com> writes:
>>> On Sun, Jan 17, 2010 at 7:54 PM, Carl Eugen Hoyos <cehoyos at ag.or.at> wrote:
>>>> Alexander Strange <astrange <at> ithinksw.com> writes:
>>>>>>>> also what sets __MMX__ ? we have our own defines for that
>>>>>>> It's a gcc builtin define, set based on ./configure --cpu=x adding
>>>>>>> -march.  HAVE_MMX is for the build and not the host cpu family, and
>>>>>>> this is inlined asm, so it can't use it.
>>>>>> Huh?  Host... build???
>>>>> Oh, that was supposed to be "target"...
>>>>> Anyway, this is MMX being used like the cmov/clz inlines, so it depends on the
>>>>> given --cpu and not on the build system's capabilities.
>>>> Could you explain once more why this shouldn't be HAVE_MMX?
>>>> If the user passes --disable-mmx to configure, he imo expects MMX to be disabled.
>>>> Carl Eugen
>>> HAVE_MMX isn't enough to enable it - './configure --cpu=i586' enables
>>> HAVE_MMX, but i586 doesn't have it.
>> Not anymore.
> I think that was wrong. --cpu is the minimum required cpu, not the
> only expected cpu, but that turned off building dsputil mmx which is
> runtime-cpudetected. (even if runtime-cpudetection is disabled,
> actually)

How the heck do you want it then?  I am of the opinion that runtime
detection is a flawed idea from the start.  I'm not under the illusion
that I'll be able to change the minds of everyone, so I have an
alternative proposal.

First we need to look at the precise requirements:

1. With --cpu=foo code shall be built which will run on CPU foo, but have
   all optimised functions in place for activation on more capable CPUs.

2. With --disable-{mmx,sse,...} code shall be tuned for --cpu, but all
   {mmx,sse,...} shall be disabled in both hand-written asm and compiled
   code.  This implies adding compiler flags such as -mno-mmx in some cases.

This means that for each of the extensions, we will have three options:

1. Not allowed.
2. Allowed only in functions activated at runtime.
3. Allowed everywhere.

The current HAVE_* preprocessor symbols are boolean, so a change is
needed.  I see two possibilities here.

1. Use two symbols for each extension, one indicating option 2 above,
   the other option 3 (the latter obviously implying the former).

2. Extend the system to allow any (non-negative) number for feature
   macros, with the values 0, 1, and 2 indicating the options above.
   This would require more changes to the configure script, but might
   prove useful in future.
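
As a rough illustration of possibility 2, here is a minimal sketch of
how a three-valued HAVE_MMX could be consumed in C.  The EXT_* names,
the MMX_*_OK helpers, select_impl() and the default value are all
hypothetical, invented for this example; in practice the value would
come from configure.  The two derived macros show that possibility 1
(two booleans per extension) can be expressed on top of it.

```c
/* Hypothetical level names; the proposal only specifies the values 0, 1, 2. */
#define EXT_DISABLED 0  /* 1. Not allowed. */
#define EXT_RUNTIME  1  /* 2. Allowed only in functions activated at runtime. */
#define EXT_ALWAYS   2  /* 3. Allowed everywhere. */

/* Assumption: configure would normally define this.  The default below
 * exists only so the sketch compiles on its own. */
#ifndef HAVE_MMX
#define HAVE_MMX EXT_RUNTIME
#endif

/* Possibility 1 derived from possibility 2: two booleans per extension,
 * the second implying the first. */
#define MMX_RUNTIME_OK (HAVE_MMX >= EXT_RUNTIME)
#define MMX_INLINE_OK  (HAVE_MMX >= EXT_ALWAYS)

enum impl { IMPL_C, IMPL_MMX };

/* Pick an implementation.  cpu_has_mmx stands in for the result of a
 * runtime CPU check; it is only consulted at the "runtime" level. */
static enum impl select_impl(int cpu_has_mmx)
{
#if MMX_INLINE_OK
    (void)cpu_has_mmx;          /* every target CPU has MMX, no check needed */
    return IMPL_MMX;
#elif MMX_RUNTIME_OK
    return cpu_has_mmx ? IMPL_MMX : IMPL_C;
#else
    (void)cpu_has_mmx;          /* MMX disabled: never select it */
    return IMPL_C;
#endif
}
```

With the default level used here, select_impl() follows the runtime
check.  Building with -DHAVE_MMX=2 would select MMX unconditionally,
and -DHAVE_MMX=0 would never select it.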

Did I miss anything?

Måns Rullgård
mans at mansr.com

