[FFmpeg-devel] [PATCH] Faster SSE FFT/MDCT
Sun May 13 18:33:41 CEST 2007
On 5/12/07, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Fri, May 11, 2007 at 04:51:21PM +0800, Zuxy Meng wrote:
> > 2007/5/11, Zuxy Meng <zuxy.meng at gmail.com>:
> > >The patch unrolls some loops, utilizing all 8 xmm registers. fft-test
> > >shows ~10% speed up in (I)FFT and ~5% speed up in (I)MDCT on my
> > >Dothan. Of course with x86-64 we can unroll one more time but I don't
> > >have a test bench....
> > >
> > >Full test passed on x86, and a test on x86-64 would be prudent: I used
> > >xmm8 to save a memory access.
> > I just unrolled another loop in imdct. Now IMDCT is about 8% faster
> > than SVN head. Please ignore the last patch and try this one
> > instead :-)
> looks ok if it passes regression tests
Regression tests do pass here on AMD64.
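
For readers following the thread, here is a minimal sketch of the kind of
loop unrolling the patch description refers to. It is not the code from the
patch, which is SSE code inside FFmpeg. It is plain C, and the function names
(butterfly_rolled, butterfly_unrolled4) and the simple add/subtract butterfly
are assumptions chosen only for illustration. The idea is that the unrolled
body keeps more independent values live at once, which is what lets a
compiler or hand-written SIMD code make use of more registers per iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical butterfly pass: (a[i], b[i]) -> (a[i] + b[i], a[i] - b[i]).
 * Rolled version: one element pair per iteration. */
void butterfly_rolled(float *a, float *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float s = a[i] + b[i];
        float d = a[i] - b[i];
        a[i] = s;
        b[i] = d;
    }
}

/* Same pass unrolled by four. All eight inputs are loaded before any
 * result is stored, so eight values are live at once. This sketch
 * assumes n is a multiple of 4 and does not handle a remainder. */
void butterfly_unrolled4(float *a, float *b, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        float a0 = a[i],     b0 = b[i];
        float a1 = a[i + 1], b1 = b[i + 1];
        float a2 = a[i + 2], b2 = b[i + 2];
        float a3 = a[i + 3], b3 = b[i + 3];

        a[i]     = a0 + b0;  b[i]     = a0 - b0;
        a[i + 1] = a1 + b1;  b[i + 1] = a1 - b1;
        a[i + 2] = a2 + b2;  b[i + 2] = a2 - b2;
        a[i + 3] = a3 + b3;  b[i + 3] = a3 - b3;
    }
}
```

Both functions perform the same arithmetic on each element, so they should
produce identical results. Whether the unrolled form is actually faster
depends on the compiler and CPU, which is why the thread relies on fft-test
and the regression tests rather than on reasoning alone.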