[FFmpeg-devel] Fixpoint FFT optimization, with MDCT and IMDCT wrappers for audio optimization

Sun Jul 29 02:49:52 CEST 2007

On 7/27/07, Michael Niedermayer <michaelni at gmx.at> wrote:
> Hi
>
> On Fri, Jul 27, 2007 at 09:56:34PM -0400, Marc Hoffman wrote:
> > Hi,
> >
> > On 7/27/07, Michael Niedermayer <michaelni at gmx.at> wrote:
> > > Hi
> > >
> > > On Fri, Jul 27, 2007 at 05:40:14PM -0400, mmh wrote:
> > > [...]
> > > > +/*
>
> >
> > >
> > > also several iterations have trivial (1/-1/0) w* these should not be done
> > > in this generic way
> > ?
>
> these things called twiddle factors, which are represented by wwr and wwi in
> your code have values like 1/-1/0, well 32768/-32768/0 as its fixedpoint
> in some iterations and its very lame to do a full complex multiplication
> for w being one of 1,-1,i,-i
>

So you want to see the trivial stage pulled?  Is the use of N/2
acceptable to you here?

/*
  This is a fixedpoint inplace 32bit FFT which accepts 3 arguments:

  @param X   - input signal in format 1.31
  @param W   - phase factors in 1.31 format
  @param lgN - log_2(N) where N is the size of the input data set.

*/
void ff_fft32 (FFTContext *c, FFTComplex32 *X)
{
    FFTComplex16 *W = c->exptab16;
    int lgN    = c->nbits;
    int N      = 1<<lgN;
    int s, j, k, m, w, ws, hm;
    int64_t wwr,wwi;
    long tr,ti;

    for (j=0;j<N/2;j++) {
        tr = X[2*j+1].re;
        ti = X[2*j+1].im;
        X[2*j+1].re = X[2*j].re - tr;
        X[2*j+1].im = X[2*j].im - ti;
        X[2*j].re   = X[2*j].re + tr;
        X[2*j].im   = X[2*j].im + ti;
    }

    for (s=2; s<=lgN; s++) {
        m  = 1<<s;
        ws = 1<<(lgN-s);
        w=0;
	hm = m>>1;
        for (j=0; j<hm; j++) {
            wwr=W[w].re;
            wwi=W[w].im;
            w+=ws;
            for (k=j; k<N; k+=m) {
                int k2 = k+hm;

                tr        = (X[k2].re*wwr - X[k2].im*wwi + 0x4000)>>15;
                ti        = (X[k2].re*wwi + X[k2].im*wwr + 0x4000)>>15;

                X[k2].re  = (X[k].re - tr);
                X[k2].im  = (X[k].im - ti);

                X[k].re   = (X[k].re + tr);
                X[k].im   = (X[k].im + ti);
            }
        }
    }
}