[FFmpeg-devel] Fwd: Fixpoint FFT optimization, with MDCT and IMDCT wrappers for audio optimization

Marc Hoffman mmhoffm
Tue Aug 21 23:02:18 CEST 2007


Hi,

On 8/19/07, Justin Ruggles <justinruggles at bellsouth.net> wrote:
>
>                      256    512   1024   2048    4096
> -------------------------------------------------------
> (1) fft32           46.1   101.0  205.0  494.4  1169.3
> (2) fftr2            7.6    16.0   34.1  101.3   258.5
> (3) fft ffmpeg       7.4    16.4   33.8   82.4   187.7
> (4) fftr4/2          6.7    13.6   28.2   74.0   169.8
>
> (1) fixedpoint 32bit simple RAD2 fft
> (2) simple rad2 first stage rad4
> (3) current lavc RAD2, first 2 stages, and middle pulled
> (4) rad4/2 with last stage rad2 if needed

I'm adding a split-radix FFT to the list (SRFFTR in the timings
below).  It's a lot harder to optimize because of the extra pointer
manipulation and control structures.  I did manage to unroll the W0
multipliers out of the loop, but I have not yet gotten a clean
iteration space.

mmh at yoda$ for i in 8 9 10 11 12 ; do echo $i; fft $i; done
8
256
FFT32:time: 42.3 us/transform [total time=1.39 s its=32768]
FFTR2:time: 7.0 us/transform [total time=1.82 s its=262144]
FFT-ffmpeg:time: 6.3 us/transform [total time=1.65 s its=262144]
FFTR4/2:time: 5.2 us/transform [total time=1.37 s its=262144]
SRFFTR:time: 7.1 us/transform [total time=1.87 s its=262144]
9
512
FFT32:time: 94.7 us/transform [total time=1.55 s its=16384]
FFTR2:time: 15.0 us/transform [total time=1.96 s its=131072]
FFT-ffmpeg:time: 14.2 us/transform [total time=1.86 s its=131072]
FFTR4/2:time: 11.5 us/transform [total time=1.51 s its=131072]
SRFFTR:time: 15.1 us/transform [total time=1.98 s its=131072]
10
1024
FFT32:time: 206.6 us/transform [total time=1.69 s its=8192]
FFTR2:time: 34.4 us/transform [total time=1.13 s its=32768]
FFT-ffmpeg:time: 31.7 us/transform [total time=1.04 s its=32768]
FFTR4/2:time: 26.4 us/transform [total time=1.73 s its=65536]
SRFFTR:time: 39.1 us/transform [total time=1.28 s its=32768]
11
2048
FFT32:time: 454.7 us/transform [total time=1.86 s its=4096]
FFTR2:time: 86.0 us/transform [total time=1.41 s its=16384]
FFT-ffmpeg:time: 70.1 us/transform [total time=1.15 s its=16384]
FFTR4/2:time: 62.9 us/transform [total time=1.03 s its=16384]
SRFFTR:time: 97.5 us/transform [total time=1.60 s its=16384]
12
4096
FFT32:time: 1004.5 us/transform [total time=1.03 s its=1024]
FFTR2:time: 217.5 us/transform [total time=1.78 s its=8192]
FFT-ffmpeg:time: 155.8 us/transform [total time=1.28 s its=8192]
FFTR4/2:time: 129.6 us/transform [total time=1.06 s its=8192]
SRFFTR:time: 242.3 us/transform [total time=1.99 s its=8192]
mmh at yoda$



