[FFmpeg-devel] [PATCH] iirfilter: Use local variables for state in loop for FILTER_O2().

Måns Rullgård mans
Sun Jan 30 19:51:07 CET 2011


Justin Ruggles <justin.ruggles at gmail.com> writes:

> 4% faster 2nd order ff_iir_filter_flt().
> ---
>  libavcodec/iirfilter.c |   12 +++++++-----
>  1 files changed, 7 insertions(+), 5 deletions(-)
>
> I spent most of yesterday and this morning trying to make an asm
> version of the float biquad filter, but nothing I came up with was
> faster than what gcc did with the C version.  I did, however, manage
> to speed up the C version by about 4% by adding local variables
> inside the inner loop for the 2 states.
>
> diff --git a/libavcodec/iirfilter.c b/libavcodec/iirfilter.c
> index bc63c39..dd593dd 100644
> --- a/libavcodec/iirfilter.c
> +++ b/libavcodec/iirfilter.c
> @@ -261,11 +261,13 @@ av_cold struct FFIIRFilterState* ff_iir_filter_init_state(int order)
>      const type *src0 = src;                                             \
>      type       *dst0 = dst;                                             \
>      for (i = 0; i < size; i++) {                                        \
> -        float in = *src0   * c->gain  +                                 \
> -                   s->x[0] * c->cy[0] +                                 \
> -                   s->x[1] * c->cy[1];                                  \
> -        CONV_##fmt(*dst0, s->x[0] + in + s->x[1] * c->cx[1])            \
> -        s->x[0] = s->x[1];                                              \
> +        float s0 = s->x[0];                                             \
> +        float s1 = s->x[1];                                             \
> +        float in = *src0 * c->gain  +                                   \
> +                   s0    * c->cy[0] +                                   \
> +                   s1    * c->cy[1];                                    \
> +        CONV_##fmt(*dst0, in + s0 + s1 * c->cx[1])                      \
> +        s->x[0] = s1;                                                   \
>          s->x[1] = in;                                                   \
>          src0 += sstep;                                                  \
>          dst0 += dstep;                                                  \

Why do you load and store the struct values inside the loop?  Wouldn't
it be better to load the x[] values into locals before the loop and
write them back after it?  You might try doing the same with the
c->cx[] and c->cy[] coefficients as well.
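
A minimal sketch of that suggestion, written as plain functions rather
than as the FILTER_O2() macro.  The struct layouts and the function
names filter_o2_inloop() and filter_o2_hoisted() are hypothetical
stand-ins, not the actual iirfilter.c code, and the CONV_##fmt step is
reduced to a plain float store.  The first function mirrors the patch
as posted; the second keeps state and coefficients in locals for the
whole loop.  One likely reason the manual hoist matters is that the
compiler may be unable to prove that stores through dst do not alias
c or s, so it has to reload them on every iteration.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real coefficient and state structs;
 * only the fields used by the 2nd-order case are modelled. */
struct coeffs { float gain; float cx[2]; float cy[2]; };
struct state  { float x[2]; };

/* State is loaded from and stored to the struct on every iteration,
 * as in the patch under review. */
static void filter_o2_inloop(const struct coeffs *c, struct state *s,
                             size_t size,
                             const float *src, ptrdiff_t sstep,
                             float *dst, ptrdiff_t dstep)
{
    for (size_t i = 0; i < size; i++) {
        float s0 = s->x[0];
        float s1 = s->x[1];
        float in = *src * c->gain + s0 * c->cy[0] + s1 * c->cy[1];
        *dst = in + s0 + s1 * c->cx[1];
        s->x[0] = s1;
        s->x[1] = in;
        src += sstep;
        dst += dstep;
    }
}

/* State and coefficients are read once before the loop; the state is
 * written back once after it. */
static void filter_o2_hoisted(const struct coeffs *c, struct state *s,
                              size_t size,
                              const float *src, ptrdiff_t sstep,
                              float *dst, ptrdiff_t dstep)
{
    const float gain = c->gain;
    const float cx1  = c->cx[1];
    const float cy0  = c->cy[0];
    const float cy1  = c->cy[1];
    float s0 = s->x[0];
    float s1 = s->x[1];

    for (size_t i = 0; i < size; i++) {
        float in = *src * gain + s0 * cy0 + s1 * cy1;
        *dst = in + s0 + s1 * cx1;
        s0 = s1;   /* shift the delay line in registers */
        s1 = in;
        src += sstep;
        dst += dstep;
    }

    s->x[0] = s0;
    s->x[1] = s1;
}
```

Both functions evaluate the same expressions in the same order, so
they should produce the same output and leave the same final state.
Whether the hoisted form is actually faster would need to be measured.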

-- 
Måns Rullgård
mans at mansr.com


