[FFmpeg-devel] gcc: Remove auto-vectorization limitation.
Jiawei
jiawei at iscas.ac.cn
Wed May 21 13:26:07 EEST 2025
On 2025/5/21 17:04, Zhao Zhili wrote:
>
>> On May 21, 2025, at 14:17, Jiawei <jiawei at iscas.ac.cn> wrote:
>>
>> This patch modifies the FFmpeg build system to remove the explicit disabling
>> of GCC's auto-vectorization feature.
>>
>> Modern GCC versions (>= 10.0) have demonstrated stable auto-vectorization
>> capabilities through extensive optimizations in loop analysis and SIMD
>> code generation. The explicit -fno-tree-vectorize flag originally added
>> in commit 973859f (2009) to work around early GCC vectorization instability
>> is no longer necessary.
> This isn’t the whole story.
>
> The flag was added by 973859f in 2009.
> Then it was reverted by cb8646af in 2016.
> Shortly after that, the revert was reverted again by fd6dbc5 in 2016.
>
>> Key improvements justifying this change:
>> 1. Enhanced heuristics for loop vectorization cost models
>> 2. Mature handling of alignment and memory access patterns
>> 3. Robust fallback mechanisms for unsupported architectures
>>
>> This change allows FFmpeg to benefit from automated SIMD optimizations
>> when built with -O3 optimization level, particularly improving
>> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V (RVV) architectures.
> Those flags can only be enabled in tightly controlled environments (e.g., built and run on the same
> machine), while FFmpeg has hand-written assembly, runtime CPU probing and dynamic binding/dispatch.
>
> Those auto-vectorization and ARCH flags can be enabled manually, but be careful.
Thank you for pointing this out. I am using x86_64 with AVX2 and RISC-V
with RVV. When I enable the vector extension with -O3 -mavx (or
-march=rv64gcv for RISC-V), configure still adds the
`-fno-tree-vectorize` option automatically.
The resulting code still contains vector load/store instructions, but
no vector arithmetic operations.
GCC provides this explicit option to control whether vectorized
instructions are generated, so it is possible to build with -O3 and
still not auto-vectorize.
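
A minimal sketch for checking this effect outside of FFmpeg. The file
name vec.c and the function add_scaled are hypothetical and are not
taken from the FFmpeg tree; the exact assembly depends on the GCC
version and target.

```c
#include <stddef.h>

/* Hypothetical example, not FFmpeg code.
 * restrict tells the compiler that the arrays do not overlap, which
 * makes this loop an easy candidate for auto-vectorization. */
void add_scaled(float *restrict dst, const float *restrict a,
                const float *restrict b, float scale, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + scale * b[i];
}

/* Compare the generated assembly with and without the flag, e.g.:
 *   gcc -O3 -mavx -S vec.c -o vec.s
 *   gcc -O3 -mavx -fno-tree-vectorize -S vec.c -o novec.s
 * Adding -fopt-info-vec makes GCC report the loops it vectorized. */
```

With the first command I would expect packed arithmetic in the loop
body, and with the second only scalar arithmetic, but this should be
verified on the toolchain in question.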
>
>> [1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
>>
>> ---
>> configure | 1 -
>> 1 file changed, 1 deletion(-)
>>
>> diff --git a/configure b/configure
>> index 3730b0524c..b9e95ce4ec 100755
>> --- a/configure
>> +++ b/configure
>> @@ -7656,7 +7656,6 @@ if enabled icc; then
>> disable aligned_stack
>> fi
>> elif enabled gcc; then
>> - check_optflags -fno-tree-vectorize
>> check_cflags -Werror=format-security
>> check_cflags -Werror=implicit-function-declaration
>> check_cflags -Werror=missing-prototypes
>> --
>> 2.43.0
>>