[FFmpeg-devel] [PATCH] enable auto vectorization for gcc 7 and higher

Andreas Rheinhardt andreas.rheinhardt at outlook.com
Wed Jul 27 20:49:41 EEST 2022


James Almer:
> On 7/27/2022 2:34 PM, Swinney, Jonathan wrote:
>> I recognize that this patch is going to be somewhat controversial. I'm
>> submitting it mostly to see what the opinions are and evaluate
>> options. I am working on improving performance for aarch64. On that
>> architecture, there are fewer hand-written assembly implementations of
>> hot functions than there are for x86_64 and allowing gcc to
>> auto-vectorize yields noticeable improvements.
>>
>> Gcc vectorization has improved recently and it hasn't been evaluated
>> on the mailing list for a few years. This is the latest discussion I
>> found in my searches:
>> http://ffmpeg.org/pipermail/ffmpeg-devel/2016-May/193977.html
> 
> Every time this was done, it was inevitably reverted after complaints and
> crash reports started piling up because gcc can't really handle all the
> inline code our codebase has, among other things.
> 
>>
>> If the community is not comfortable accepting a patch like this
>> outright, would you be willing to accept a new option to the configure
>> script, something like --enable-auto-vectorization?
> 
> --extra-cflags can be used for this.
> 

No, it can't, because what is given via --extra-cflags is inserted at
the start of CFLAGS, so the automatically added -fno-tree-vectorize,
which comes later on the command line, overrides it.

- Andreas
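
The ordering argument above can be illustrated with a small model. This is only a sketch, assuming that for a conflicting -fX / -fno-X pair gcc honours whichever comes last on the command line. The function name effective_flags and the CONFIGURE_FLAGS list are hypothetical stand-ins, not anything taken from FFmpeg's configure script.

```python
def effective_flags(cflags):
    """Model "last occurrence wins" for -fX / -fno-X style flags.

    Returns a dict mapping each feature name to True (enabled) or
    False (disabled). Flags that do not start with -f are ignored.
    """
    state = {}
    for flag in cflags:
        if flag.startswith("-fno-"):
            state[flag[len("-fno-"):]] = False
        elif flag.startswith("-f"):
            state[flag[len("-f"):]] = True
    return state


# Hypothetical stand-in for the flags the build system adds on its own.
CONFIGURE_FLAGS = ["-O3", "-fno-tree-vectorize"]
user_flags = ["-ftree-vectorize"]

# User flags placed at the start of CFLAGS: the later
# -fno-tree-vectorize is the one that takes effect.
prepended = user_flags + CONFIGURE_FLAGS
print(effective_flags(prepended)["tree-vectorize"])  # False

# If the user flags came last instead, they would take effect.
appended = CONFIGURE_FLAGS + user_flags
print(effective_flags(appended)["tree-vectorize"])   # True
```

Under that assumption, passing -ftree-vectorize through --extra-cflags has no effect as long as the flag ends up ahead of the automatically added -fno-tree-vectorize.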

