[FFmpeg-devel] [PATCH] enable auto vectorization for gcc 7 and higher

Hendrik Leppkes h.leppkes at gmail.com
Thu Jul 28 00:07:49 EEST 2022


On Wed, Jul 27, 2022 at 11:02 PM Martin Storsjö <martin at martin.st> wrote:
>
> On Wed, 27 Jul 2022, Hendrik Leppkes wrote:
>
> > On Wed, Jul 27, 2022 at 7:39 PM James Almer <jamrial at gmail.com> wrote:
> >>
> >> On 7/27/2022 2:34 PM, Swinney, Jonathan wrote:
> >>> I recognize that this patch is going to be somewhat controversial. I'm submitting it mostly to see what the opinions are and evaluate options. I am working on improving performance for aarch64. On that architecture, there are fewer hand-written assembly implementations of hot functions than there are for x86_64, and allowing gcc to auto-vectorize yields noticeable improvements.
> >>>
> >>> Gcc vectorization has improved recently and it hasn't been evaluated on the mailing list for a few years. This is the latest discussion I found in my searches: http://ffmpeg.org/pipermail/ffmpeg-devel/2016-May/193977.html
> >>
> >> Every time this was done, it was inevitably reverted after complaints and
> >> crash reports started piling up because gcc can't really handle all the
> >> inline code our codebase has, among other things.
> >>
> >
> > No need to wait for issues, I just tested, and the same issues still
> > persist that have existed for years with GCC now. They don't seem to
> > care to make it compatible with inline asm, which might be fair
> > enough, but it means it just can't work here.
> >
> > In file included from libavcodec/cabac_functions.h:49,
> >                 from libavcodec/h264_cabac.c:36:
> > libavcodec/h264_cabac.c: In function 'ff_h264_decode_mb_cabac':
> > libavcodec/x86/cabac.h:199:5: error: 'asm' operand has impossible constraints
>
> This particular bit of inline assembly has historically been very
> problematic in many configurations (although primarily on i386 I think) -
> see e.g. 8990c5869e27fcd43b53045f87ba251f42e7d293. Would something like
> that be enough for that build configuration to succeed, or are there many
> other cases that break?
>

I can test tomorrow, but if we start influencing optimizer decisions
just to enable another optimizer flag, such a change would need to be
backed by (positive!) performance numbers and _very_ thorough testing.
As we all know, proving that something is not an issue is practically
impossible, since the number of combinations is effectively infinite.

- Hendrik
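
As a rough illustration of the kind of comparison being asked for above, the following C sketch builds the same loop twice, once with vectorization requested and once with it disabled, then checks that both produce the same result and prints the CPU time of each. The names `sad_vec`, `sad_scalar` and `compare_sad_builds` are hypothetical and are not FFmpeg code. It assumes GCC's `optimize` function attribute, which the GCC manual describes as a debugging aid rather than something for production builds, and whether the loop is actually vectorized still depends on the base optimization level and the target.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* GCC only: toggle -ftree-vectorize per function. On other compilers the
 * macros expand to nothing and both functions are built identically. */
#if defined(__GNUC__) && !defined(__clang__)
#define VEC_ON  __attribute__((optimize("tree-vectorize"), noinline))
#define VEC_OFF __attribute__((optimize("no-tree-vectorize"), noinline))
#else
#define VEC_ON
#define VEC_OFF
#endif

/* Hypothetical kernel: sum of absolute differences, vectorization requested. */
static VEC_ON unsigned sad_vec(const uint8_t *a, const uint8_t *b, size_t n)
{
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (unsigned)abs(a[i] - b[i]);
    return sum;
}

/* Same kernel with vectorization disabled. */
static VEC_OFF unsigned sad_scalar(const uint8_t *a, const uint8_t *b, size_t n)
{
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (unsigned)abs(a[i] - b[i]);
    return sum;
}

/* Runs both builds over identical inputs.
 * Returns 0 if the results match, 1 if they differ, -1 on allocation failure. */
int compare_sad_builds(size_t n, int iterations)
{
    assert(n > 0 && iterations > 0);

    uint8_t *a = malloc(n);
    uint8_t *b = malloc(n);
    if (!a || !b) {
        free(a);
        free(b);
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        a[i] = (uint8_t)(i * 7);
        b[i] = (uint8_t)(i * 13 + 5);
    }

    unsigned r_vec = 0, r_scalar = 0;

    clock_t t0 = clock();
    for (int k = 0; k < iterations; k++) {
        a[0] = (uint8_t)k;          /* change the input so calls are not hoisted */
        r_vec += sad_vec(a, b, n);
    }
    clock_t t1 = clock();
    for (int k = 0; k < iterations; k++) {
        a[0] = (uint8_t)k;          /* same input sequence as the first loop */
        r_scalar += sad_scalar(a, b, n);
    }
    clock_t t2 = clock();

    printf("tree-vectorize:    %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
    printf("no-tree-vectorize: %.3f s\n", (double)(t2 - t1) / CLOCKS_PER_SEC);

    free(a);
    free(b);
    return r_vec == r_scalar ? 0 : 1;
}
```

Calling `compare_sad_builds(1 << 20, 200)` from a small `main` built with `gcc -O2` gives one data point for one loop on one machine. It says nothing about the inline asm failures quoted above, which only show up when building the real code.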

