[FFmpeg-devel] gcc: Remove auto-vectorization limitation.

Martin Storsjö martin at martin.st
Wed May 21 15:09:22 EEST 2025


On Wed, 21 May 2025, Andreas Rheinhardt wrote:

> Jiawei:
>> This patch modifies the FFmpeg build system to remove the explicit disabling
>> of GCC's auto-vectorization feature.
>>
>> Modern GCC versions (>= 10.0) have demonstrated stable auto-vectorization
>> capabilities through extensive optimizations in loop analysis and SIMD
>> code generation. The explicit -fno-tree-vectorize flag, originally added
>> in commit 973859f [1] (2009) to work around early GCC vectorization
>> instability, is no longer necessary.
>>
>> Key improvements justifying this change:
>> 1. Enhanced heuristics for loop vectorization cost models
>> 2. Mature handling of alignment and memory access patterns
>> 3. Robust fallback mechanisms for unsupported architectures
>>
>> This change allows FFmpeg to benefit from automated SIMD optimizations
>> when built with -O3 optimization level, particularly improving
>> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V (RVV) architectures.
>>
>> [1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
>>
>> ---
>>  configure | 1 -
>>  1 file changed, 1 deletion(-)
>>
>> diff --git a/configure b/configure
>> index 3730b0524c..b9e95ce4ec 100755
>> --- a/configure
>> +++ b/configure
>> @@ -7656,7 +7656,6 @@ if enabled icc; then
>>              disable aligned_stack
>>      fi
>>  elif enabled gcc; then
>> -    check_optflags -fno-tree-vectorize
>>      check_cflags -Werror=format-security
>>      check_cflags -Werror=implicit-function-declaration
>>      check_cflags -Werror=missing-prototypes
>
> FYI: The last discussion about auto-vectorization is here:
> https://ffmpeg.org/pipermail/ffmpeg-devel/2022-July/299405.html
> It contains a report about a failing build with vectorization enabled:
> https://ffmpeg.org/pipermail/ffmpeg-devel/2022-July/299421.html
> I don't know whether this is still reproducible with the latest GCC.

The issue reported last time, a build failure when compiling for i686
mingw32 with --cpu=haswell, seems to have gone away as of
182663a58a7a099e02e76da3b0f96d63e5c26a6d, where we made the whole
problematic x86 inline cabac assembly noinline on i386. (That inline
assembly block has caused problems in a large number of cases anyway.)
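
For anyone unfamiliar with the pattern, here is a minimal sketch of what
"noinline" means in this context. It is not the FFmpeg cabac code: the
function names are hypothetical, the body is plain C standing in for an
inline assembly block, and it uses the GCC/Clang attribute spelling
directly rather than any FFmpeg macro. The point is only that the callee
stays a real call, so the optimizer handles the caller's loop without
merging the callee's body into it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a function whose body would be a large
 * inline asm block. The noinline attribute (GCC/Clang syntax) tells the
 * compiler not to inline this function into its callers. */
__attribute__((noinline))
static int decode_bit_sketch(uint32_t *state)
{
    /* Placeholder work: one xorshift step, returning the low bit. */
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return (int)(x & 1);
}

/* Hypothetical caller: the loop is optimized on its own, and each
 * iteration still makes an out-of-line call to decode_bit_sketch(). */
static int count_set_bits_sketch(uint32_t seed, size_t n)
{
    int count = 0;
    for (size_t i = 0; i < n; i++)
        count += decode_bit_sketch(&seed);
    return count;
}
```

Building something like this with and without -fno-tree-vectorize at -O3
and comparing the generated assembly is one way to see what the flag
changes for a given loop.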

// Martin


