[FFmpeg-devel] [PATCH v2 1/5] libavfilter/x86/vf_gblur: add ff_postscale_slice_avx512()

Thu Aug 12 11:41:30 EEST 2021

James Wrote: 
> On 8/11/2021 10:11 AM, Kieran Kunhya wrote:
> > On Wed, 11 Aug 2021 at 13:31, James Almer <jamrial at gmail.com> wrote:
> >
> >> You can disable AVX512 at both runtime and compile time. I don't
> >> think that because there's one CPU arch out there that sees a hit in
> >> performance for one instruction set we should stop applying code
> >> other CPUs will benefit from.
> >>
> >
> > Gramner suggests using the ice lake avx-512 subset as the minimum
> > baseline which I think is a good idea.
> >
> > Kieran
> 
> I'm fine with that, yes.
> 

Ice Lake supports nearly the most AVX512 subsets of any arch, second only to
Tiger Lake for now. Quite a few projects have already chosen Ice Lake as
the minimum baseline. But in my opinion, AVX512 subsets such as F, VL, DQ
and BW are supported by a number of architectures that have already been
released, and they can help us improve performance in most cases. If we keep
the status quo, more CPUs would benefit from it.

I agree with James. The usage of AVX512 is not mandatory. Use whatever you prefer.

By the way, after implementing the newly designed gblur algorithm with
AVX512, which showed a clear performance improvement, I also provided an
AVX2 version, since that instruction set is supported by more CPUs.

Here are some performance comparisons. It is better to look at the ratio
between the two fps figures: a different CPU will change the absolute fps,
but the ratio should stay approximately the same.

1. 1080p sigma=10:steps=1:
        old gblur with avx2: 45 fps
        new gblur with avx512: 109 fps
2. 1080p sigma=10:steps=3:
        old gblur with avx2: 19 fps
        new gblur with avx512: 84 fps
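
For reference, the ratios implied by the figures above can be worked out with
a small script. This is only an illustrative sketch: the dictionary layout and
the helper name speedup() are made up for this mail and are not part of the
patch.

```python
# fps figures copied from the comparison above (1080p, sigma=10)
results = {
    "steps=1": {"old_avx2": 45, "new_avx512": 109},
    "steps=3": {"old_avx2": 19, "new_avx512": 84},
}


def speedup(old_fps, new_fps):
    """Return how many times faster the new implementation is."""
    return new_fps / old_fps


# One ratio per test case; this is the number that should carry over
# roughly unchanged between CPUs, unlike the absolute fps.
ratios = {
    name: speedup(r["old_avx2"], r["new_avx512"])
    for name, r in results.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

That works out to roughly 2.4x for steps=1 and 4.4x for steps=3.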

Hopefully the info above is helpful.

Best regards,
Jianhua