[FFmpeg-devel] [PATCH v1] avfilter/vf_vaguedenoiser: use fabsf() instead of FFABS()

Fri Nov 8 04:26:27 EET 2019

On Fri, Nov 8, 2019 at 10:09 AM Limin Wang <lance.lmwang at gmail.com> wrote:
>
> On Wed, Nov 06, 2019 at 08:10:53PM +0100, Carl Eugen Hoyos wrote:
> > Am Mi., 6. Nov. 2019 um 12:04 Uhr schrieb Limin Wang <lance.lmwang at gmail.com>:
> > >
> > > On Wed, Nov 06, 2019 at 11:11:08AM +0100, Carl Eugen Hoyos wrote:
> > > > Am Mi., 6. Nov. 2019 um 10:31 Uhr schrieb <lance.lmwang at gmail.com>:
> > > > >
> > > > > From: Limin Wang <lance.lmwang at gmail.com>
> > > > >
> > > > > Signed-off-by: Limin Wang <lance.lmwang at gmail.com>
> > > > > ---
> > > > >  libavfilter/vf_vaguedenoiser.c | 6 +++---
> > > > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/libavfilter/vf_vaguedenoiser.c b/libavfilter/vf_vaguedenoiser.c
> > > > > index a68f7626e6..75a58c363b 100644
> > > > > --- a/libavfilter/vf_vaguedenoiser.c
> > > > > +++ b/libavfilter/vf_vaguedenoiser.c
> > > > > @@ -335,7 +335,7 @@ static void hard_thresholding(float *block, const int width, const int height,
> > > > >
> > > > >      for (y = 0; y < height; y++) {
> > > > >          for (x = 0; x < width; x++) {
> > > > > -            if (FFABS(block[x]) <= threshold)
> > > > > +            if (fabsf(block[x]) <= threshold)
> > > > >                  block[x] *= frac;
> > > > >          }
> > > > >          block += stride;
> > > > > @@ -359,7 +359,7 @@ static void soft_thresholding(float *block, const int width, const int height, c
> > > > >      for (y = 0; y < height; y++) {
> > > > >          const int x0 = (y < h) ? w : 0;
> > > > >          for (x = x0; x < width; x++) {
> > > > > -            const float temp = FFABS(block[x]);
> > > > > +            const float temp = fabsf(block[x]);
> > > > >              if (temp <= threshold)
> > > > >                  block[x] *= frac;
> > > > >              else
> > > > > @@ -380,7 +380,7 @@ static void qian_thresholding(float *block, const int width, const int height,
> > > > >
> > > > >      for (y = 0; y < height; y++) {
> > > > >          for (x = 0; x < width; x++) {
> > > > > -            const float temp = FFABS(block[x]);
> > > > > +            const float temp = fabsf(block[x]);
> > > > >              if (temp <= threshold) {
> > > > >                  block[x] *= frac;
> > > > >              } else {
> > > >
> > > > Please add a sentence to the commit message that explains why this
> > > > change is a good idea.
> > >
> > > block is of type float, so I think it is better to use fabsf(), isn't it?
> >
> > Looking at the definition of FFABS(), I don't think this is correct.
>
> Below is an old commit log that describes this. What was the result of that discussion?
>
>
> commit 8507b98c10d948653375400e2b0a3d4389f74be4
> Author: Ganesh Ajjanagadde <gajjanagadde at gmail.com>
> Date:   Mon Oct 12 01:30:22 2015 -0400
>
>     avfilter,swresample,swscale: use fabs, fabsf instead of FFABS
>
>     It is well known that fabs and fabsf are at least as fast and sometimes
>     faster than the FFABS macro, at least on the gcc+glibc combination.
>     For instance, see the reference:
>     http://patchwork.sourceware.org/patch/6735/.
>     This was a patch to glibc in order to remove their usages of a macro.
>
>     The reason essentially boils down to fabs using the __builtin_fabs of
>     the compiler, while FFABS needs to infer to not use a branch and to
>     simply change the sign bit. Usually the inference works, but sometimes
>     it does not. This may be easily checked by looking at the asm.
>
>     This also has the added benefit of reducing macro usage, which has
>     problems with side-effects.
>
>     Note that avcodec is not handled here, as it is huge and
>     most things there are integer arithmetic anyway.
>
>     Tested with FATE.
>
>     Reviewed-by: Clément Bœsch <u at pkh.me>
>     Signed-off-by: Ganesh Ajjanagadde <gajjanagadde at gmail.com>
>
>
>
> >

Is there any performance data for this filter after changing FFABS to fabsf?
In my opinion, if FFABS is not a performance bottleneck in this filter, it may
be better to keep the existing code.
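
One way to get such data is a small standalone timing harness. The following is
only a rough sketch, not code from the patch: hard_threshold_ffabs(),
hard_threshold_fabsf(), fill_test_block() and time_variant() are hypothetical
names, the parameter list is an assumption modelled on the loop shown in the
quoted diff, and the threshold and frac values are arbitrary.

```c
#include <math.h>
#include <string.h>
#include <time.h>

/* FFABS as defined in libavutil/common.h, reproduced here so the sketch
 * builds without the FFmpeg headers. */
#define FFABS(a) ((a) >= 0 ? (a) : (-(a)))

/* Hypothetical stand-ins modelled on the hard_thresholding() loop in the
 * quoted diff; the exact parameter list is an assumption. */
static void hard_threshold_ffabs(float *block, int width, int height,
                                 int stride, float threshold, float frac)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            if (FFABS(block[x]) <= threshold)
                block[x] *= frac;
        block += stride;
    }
}

static void hard_threshold_fabsf(float *block, int width, int height,
                                 int stride, float threshold, float frac)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            if (fabsf(block[x]) <= threshold)
                block[x] *= frac;
        block += stride;
    }
}

typedef void (*threshold_fn)(float *, int, int, int, float, float);

/* Fill buf with a deterministic mix of positive and negative values
 * in the range [-10.0, 10.0]. */
static void fill_test_block(float *buf, int n)
{
    for (int i = 0; i < n; i++)
        buf[i] = (float)((i * 37) % 201 - 100) / 10.0f;
}

/* Return the CPU time in seconds spent running fn `iterations` times.
 * Each iteration works on a fresh copy of src, so the memcpy() cost is
 * included equally in both measurements. */
static double time_variant(threshold_fn fn, const float *src, float *work,
                           int width, int height, int iterations)
{
    clock_t start = clock();

    for (int i = 0; i < iterations; i++) {
        memcpy(work, src, sizeof(*src) * width * height);
        fn(work, width, height, width, 2.0f, 0.5f);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_variant() once per variant with a large iteration count, built
with the same optimization level as the real filter, should show whether the
difference is measurable. Comparing the generated assembly (for example with
gcc -O2 -S) is probably more reliable than timing for a loop this small.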

swresample and swscale are a different matter: they are basic components that
other parts of FFmpeg depend on, so I think using fabsf there makes sense for
the potential performance gain.
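
On the macro-versus-function question raised earlier in the thread, the
following sketch may help show where the two actually differ. It reproduces
the FFABS definition from libavutil/common.h; count_small_ffabs(),
count_small_fabsf() and counted() are hypothetical helpers written only for
this illustration.

```c
#include <math.h>

/* FFABS as defined in libavutil/common.h, reproduced here so the sketch
 * builds without the FFmpeg headers. */
#define FFABS(a) ((a) >= 0 ? (a) : (-(a)))

/* Hypothetical helper: count the values whose magnitude is at or below
 * the threshold, using the macro. */
static int count_small_ffabs(const float *block, int n, float threshold)
{
    int count = 0;

    for (int i = 0; i < n; i++)
        if (FFABS(block[i]) <= threshold)
            count++;
    return count;
}

/* Same comparison, using the C library function. */
static int count_small_fabsf(const float *block, int n, float threshold)
{
    int count = 0;

    for (int i = 0; i < n; i++)
        if (fabsf(block[i]) <= threshold)
            count++;
    return count;
}

/* Hypothetical helper: return v and record that it was evaluated.
 * Passing it to FFABS() shows the macro evaluating its argument twice. */
static float counted(float v, int *calls)
{
    (*calls)++;
    return v;
}
```

For a "<= threshold" comparison like the one in this filter, both versions
select the same elements. The differences are elsewhere: FFABS(-0.0f) keeps
the sign bit set because -0.0f >= 0 is true, while fabsf(-0.0f) clears it,
and FFABS(counted(v, &calls)) calls counted() twice where fabsf() calls it
once. That double evaluation is the side-effect problem the old commit log
mentions.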