[FFmpeg-devel] [PATCH] iirfilter: Use local variables for state in loop for FILTER_O2().
Måns Rullgård
mans
Sun Jan 30 19:51:07 CET 2011
Justin Ruggles <justin.ruggles at gmail.com> writes:
> 4% faster 2nd order ff_iir_filter_flt().
> ---
> libavcodec/iirfilter.c | 12 +++++++-----
> 1 files changed, 7 insertions(+), 5 deletions(-)
>
> I spent most of yesterday and this morning trying to write an asm
> version of the float biquad filter, but nothing I came up with was
> faster than what gcc did with the C version. I did, however, manage
> to speed up the C version by about 4% by adding local variables
> inside the inner loop for the 2 states.
>
> diff --git a/libavcodec/iirfilter.c b/libavcodec/iirfilter.c
> index bc63c39..dd593dd 100644
> --- a/libavcodec/iirfilter.c
> +++ b/libavcodec/iirfilter.c
> @@ -261,11 +261,13 @@ av_cold struct FFIIRFilterState* ff_iir_filter_init_state(int order)
> const type *src0 = src; \
> type *dst0 = dst; \
> for (i = 0; i < size; i++) { \
> - float in = *src0 * c->gain + \
> - s->x[0] * c->cy[0] + \
> - s->x[1] * c->cy[1]; \
> - CONV_##fmt(*dst0, s->x[0] + in + s->x[1] * c->cx[1]) \
> - s->x[0] = s->x[1]; \
> + float s0 = s->x[0]; \
> + float s1 = s->x[1]; \
> + float in = *src0 * c->gain + \
> + s0 * c->cy[0] + \
> + s1 * c->cy[1]; \
> + CONV_##fmt(*dst0, in + s0 + s1 * c->cx[1]) \
> + s->x[0] = s1; \
> s->x[1] = in; \
> src0 += sstep; \
> dst0 += dstep; \
Why do you load and store the struct values inside the loop? Wouldn't it
be better to load the x[] values into locals before the loop and write
them back after? You might try doing the same with the c->cx[] and
c->cy[] coefficients as well.
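
A minimal sketch of that suggestion, written as a plain function rather
than the FILTER_O2() macro. The struct and function names here
(biquad_coeffs, biquad_state, biquad_filter_flt) are hypothetical
stand-ins, not the actual iirfilter.c definitions, and the float output
path is assumed to be a plain store:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real coefficient and state structs. */
struct biquad_coeffs { float gain; float cx[2]; float cy[2]; };
struct biquad_state  { float x[2]; };

static void biquad_filter_flt(const struct biquad_coeffs *c,
                              struct biquad_state *s, size_t size,
                              const float *src, ptrdiff_t sstep,
                              float *dst, ptrdiff_t dstep)
{
    /* Load the state and the coefficients once, before the loop. */
    float s0 = s->x[0];
    float s1 = s->x[1];
    const float gain = c->gain;
    const float cx1  = c->cx[1];
    const float cy0  = c->cy[0];
    const float cy1  = c->cy[1];
    size_t i;

    for (i = 0; i < size; i++) {
        /* Same arithmetic as the patched loop body, on locals only. */
        float in = *src * gain + s0 * cy0 + s1 * cy1;
        *dst = in + s0 + s1 * cx1;
        s0 = s1;
        s1 = in;
        src += sstep;
        dst += dstep;
    }

    /* Write the state back once, after the loop. */
    s->x[0] = s0;
    s->x[1] = s1;
}
```

Whether this is any faster than the per-iteration locals would need to
be measured; the compiler may not be able to keep s->x[] in registers on
its own if it cannot prove that dst does not alias the state.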
--
Måns Rullgård
mans at mansr.com