[FFmpeg-devel] [PATCH] WIP AVX support

Jason Garrett-Glaser jason
Thu Feb 3 12:47:10 CET 2011


2011/2/3 M?ns Rullg?rd <mans at mansr.com>:
> Jason Garrett-Glaser <jason at x264.com> writes:
>
>> 2011/2/3 M?ns Rullg?rd <mans at mansr.com>:
>>> Jason Garrett-Glaser <jason at x264.com> writes:
>>>
>>>> Need someone who does configure/etc stuff to finish this.
>>>> ---
>>>> ?configure ? ? ? ? ? | ? ?2 ++
>>>> ?libavutil/cpu.h ? ? | ? ?1 +
>>>> ?libavutil/x86/cpu.c | ? 15 +++++++++++++++
>>>> ?3 files changed, 18 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/configure b/configure
>>>> index 46f4e44..c6ceaa8 100755
>>>> --- a/configure
>>>> +++ b/configure
>>>> @@ -223,6 +223,7 @@ Advanced options (experts only):
>>>> ? ?--disable-mmx2 ? ? ? ? ? disable MMX2 optimizations
>>>> ? ?--disable-sse ? ? ? ? ? ?disable SSE optimizations
>>>> ? ?--disable-ssse3 ? ? ? ? ?disable SSSE3 optimizations
>>>> + ?--disable-avx ? ? ? ? ? ?disable AVX optimizations
>>>> ? ?--disable-armv5te ? ? ? ?disable armv5te optimizations
>>>> ? ?--disable-armv6 ? ? ? ? ?disable armv6 optimizations
>>>> ? ?--disable-armv6t2 ? ? ? ?disable armv6t2 optimizations
>>>> @@ -1181,6 +1182,7 @@ mmx_deps="x86"
>>>> ?mmx2_deps="mmx"
>>>> ?sse_deps="mmx"
>>>> ?ssse3_deps="sse"
>>>> +avx_deps="ssse3"
>>>>
>>>> ?aligned_stack_if_any="ppc x86"
>>>> ?fast_64bit_if_any="alpha ia64 mips64 parisc64 ppc64 sparc64 x86_64"
>>>
>>> I suppose we'll need to test for AVX support in gcc/yasm. ?Do
>>> something similar to the ssse3 test.
>>
>> We don't do that in yasm though. ?We just test if SSSE3 is supported:
>> if not, we error out.
>>
>> AVX was added in yasm 0.7.0. ?Should we just fail with older versions,
>> or disable HAVE_AVX?
>
> I didn't know AVX specs had been around that long. ?0.7 is so old it's
> reasonable to require it. ?Still, why the SSSE3 dependency? ?Does AVX
> code naturally use SSSE3 instructions as well?

Any "AVX" function we write that doesn't use ymm registers will
probably mix AVX and SSE code (x264 already has written and uses an
abstraction layer for this).  AVX implies SSE.

>>>> diff --git a/libavutil/cpu.h b/libavutil/cpu.h
>>>> index 71cc265..2ddde26 100644
>>>> --- a/libavutil/cpu.h
>>>> +++ b/libavutil/cpu.h
>>>> @@ -36,6 +36,7 @@
>>>> ?#define AV_CPU_FLAG_SSSE3 ? ? ? ?0x0080 ///< Conroe SSSE3 functions
>>>> ?#define AV_CPU_FLAG_SSE4 ? ? ? ? 0x0100 ///< Penryn SSE4.1 functions
>>>> ?#define AV_CPU_FLAG_SSE42 ? ? ? ?0x0200 ///< Nehalem SSE4.2 functions
>>>> +#define AV_CPU_FLAG_AVX ? ? ? ? ?0x0400 ///< Sandy Bridge AVX functions
>>>> ?#define AV_CPU_FLAG_IWMMXT ? ? ? 0x0100 ///< XScale IWMMXT
>>>> ?#define AV_CPU_FLAG_ALTIVEC ? ? ?0x0001 ///< standard
>>>
>>> BTW, shouldn't that be s/functions/instructions/?
>>
>> I didn't change the existing code. ?That should be a separate patch.
>>
>>>> diff --git a/libavutil/x86/cpu.c b/libavutil/x86/cpu.c
>>>> index 4b6cb0d..a34e18f 100644
>>>> --- a/libavutil/x86/cpu.c
>>>> +++ b/libavutil/x86/cpu.c
>>>> @@ -35,6 +35,12 @@
>>>> ? ? ? ? ? ? "=c" (ecx), "=d" (edx)\
>>>> ? ? ? ? ? : "0" (index));
>>>>
>>>> +#define xgetbv(index,eax,ebx,ecx,edx)\
>>>> + ? ?__asm__ volatile\
>>>> + ? ? ? ?("xgetbv\n\t"\
>>>> + ? ? ? ? : "=a" (eax), (edx)\
>>>> + ? ? ? ? : "c" (index));
>>>
>>> This should be something like
>>>
>>> ?__asm__ ("xgetbv" : "=a"(eax), "=d"(edx) : "c"(index));
>>>
>>> The volatile should not be needed since all effects of the block are
>>> expressed in the constraints. ?You should also make the macro
>>> arguments match how you're calling it below.
>>
>> I'm just copying cpuid. ?Changing the style should be in a separate patch.
>
> What you wrote will not compile, or at any rate should not, and I see
> no reason to copy an ugly style.

I'll fix that then, but patches welcome to fix cpuid...

Jason



More information about the ffmpeg-devel mailing list