[FFmpeg-devel] [PATCH]: Align branch target for fft_sse.c, fft_3dn.c and fft_3dn2.c
Zuxy Meng
zuxy.meng
Sun May 27 08:17:48 CEST 2007
Hi,
2007/5/27, Loren Merritt <lorenm at u.washington.edu>:
> On Sat, 26 May 2007, Zuxy Meng wrote:
>
> > This patch aligns the loop entry to a 16-byte boundary, as
> > recommended by Intel's and AMD's manuals. But I don't see any
> > noticeable improvement on my Dothan. Can anyone test it on other CPUs?
>
> Core2:
> 256-point fft got 11% faster.
> 2048-point fft got 2% slower.
> Total time to decode vorbis is unchanged.
Thanks! It seems that when branch prediction works correctly, the
misalignment penalty is well hidden in the pipeline, so I'm dropping
the patch.
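
For anyone reading this later, here is a minimal sketch of the arithmetic behind aligning a branch target. The assembler pads with filler bytes until the loop entry sits on a 16-byte boundary (in GNU as, `.p2align 4` requests this). The helper name `pad_to_align` is hypothetical and is not taken from the patch; it only shows how many padding bytes a given entry address would need.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only (not part of the patch).
 * Returns the number of padding bytes needed to move addr up to the next
 * multiple of align. align must be a power of two, e.g. 16 here. */
static uintptr_t pad_to_align(uintptr_t addr, uintptr_t align)
{
    /* A power of two has exactly one bit set. */
    assert(align != 0 && (align & (align - 1)) == 0);

    /* addr & (align - 1) is the offset past the previous boundary;
     * the outer mask maps an already aligned address to 0 instead of align. */
    return (align - (addr & (align - 1))) & (align - 1);
}
```

For example, a loop entry at the made-up address 0x1001 would need 15 bytes of padding to reach 0x1010, while one already at 0x1010 needs none. That padding costs code size and is only a win if the fetch penalty it removes is actually visible, which is consistent with the mixed Core2 numbers above.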
--
Zuxy
Beauty is truth,
While truth is beauty.
PGP KeyID: E8555ED6