[FFmpeg-devel] [PATCH] lavc/aacenc_utils: unroll abs_pow34_v loop

Reimar Döffinger Reimar.Doeffinger at gmx.de
Sat Mar 19 17:09:43 CET 2016


On Sat, Mar 19, 2016 at 12:42:09PM +0100, Clément Bœsch wrote:
> On Fri, Mar 18, 2016 at 10:12:14PM -0700, Ganesh Ajjanagadde wrote:
> > -static inline void abs_pow34_v(float *av_restrict out, const float *av_restrict in, const int size)
> > -{
> > -    int i;
> > -    for (i = 0; i < size; i++) {
> > -        float a = fabsf(in[i]);
> > -        out[i] = sqrtf(a * sqrtf(a));
> > -    }
> > -}
> > -
> >  static inline float pos_pow34(float a)
> >  {
> >      return sqrtf(a * sqrtf(a));
> >  }
> >  
> > +static inline void abs_pow34_v(float *av_restrict out, const float *av_restrict in, const int size)
> > +{
> > +    av_assert2(!(size % 4));
> > +    for (int i = 0; i < size; i+=4) {
> > +        float a0 = fabsf(in[i]);
> > +        float a1 = fabsf(in[i+1]);
> > +        float a2 = fabsf(in[i+2]);
> > +        float a3 = fabsf(in[i+3]);
> > +        out[i  ] = pos_pow34(a0);
> > +        out[i+1] = pos_pow34(a1);
> > +        out[i+2] = pos_pow34(a2);
> > +        out[i+3] = pos_pow34(a3);
> > +    }
> > +}
> > +
> 
> I'm curious (and lazy), is GCC able to unroll by itself if you hint it
> with a loop such as:
> 
>     int i;
>     for (i = 0; i < (size & ~3); i++) {
>         float a = fabsf(in[i]);
>         out[i] = sqrtf(a * sqrtf(a));
>     }

I haven't been able to figure this one out for sure,
but at least for the other one, Debian gcc 5.3.1
already unrolls and vectorizes for me, though it emits
a bit of extra code to handle cases where size is not
a multiple of 4.
So I suspect "which gcc?" is probably an important
question.

