[FFmpeg-devel] Fwd: Fixpoint FFT optimization, with MDCT and IMDCT wrappers for audio optimization
Fri Aug 24 23:22:38 CEST 2007
On 24 August 2007, Loren Merritt wrote:
> On Thu, 23 Aug 2007, Mike Giacomelli wrote:
> >> Well, for the first part, you reduce the number of multiplications; split
> >> radix would do that. But I guess you can make x operations faster with
> >> low-level optimizations than half as many of the same operations, to which
> >> the same low-level optimizations could be applied.
> > Most targets without FPUs also have very slow integer multiplications.
> > On ARM, for instance, the 32x32->64 multiplies needed for fixed-point
> > ops can be 4 or more times slower than a simple add, as I recall.
> As opposed to recent x86 chips, where 32x32 mul is 9 times slower than add?
Moreover, at least the ARM9E and ARM11 cores execute a 32x32->64 MAC in 3 cycles
(which means they also get an extra addition per multiplication for free).
And if you only need 32x16->(high 32 bits of the result), such a MAC operation
executes in a single cycle (with some result latency, though). So modern
ARM cores are pretty fast at multiplication.
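
To make the two operations concrete, here is a rough portable C sketch of
what each MAC computes. The function names are made up for illustration and
are not from the patch or from FFmpeg. Whether a given compiler turns them
into single MAC instructions (I would expect something like SMLAL and SMLAWB
on ARMv5TE and later) is an assumption and should be checked in the generated
assembly.

```c
#include <stdint.h>

/* 32x32->64 multiply-accumulate: the full 64-bit product of a and b is
 * added to a 64-bit accumulator. The addition comes "for free" with the
 * multiply on cores that have a long MAC instruction. */
static int64_t mac_32x32_64(int64_t acc, int32_t a, int32_t b)
{
    return acc + (int64_t)a * (int64_t)b;
}

/* 32x16 multiply-accumulate keeping only the high 32 bits of the 48-bit
 * product. Assumes >> on a negative value is an arithmetic shift (true on
 * common compilers, but implementation-defined in C) and that the final
 * addition does not overflow. */
static int32_t mac_32x16_hi32(int32_t acc, int32_t a, int16_t b)
{
    int64_t prod = (int64_t)a * (int64_t)b; /* fits in 48 bits */
    return acc + (int32_t)(prod >> 16);     /* drop the low 16 bits */
}
```

The second form is the interesting one for a fixed-point FFT when the twiddle
factors fit in 16 bits: the low 16 bits of the product are thrown away, so
the result stays in a single 32-bit register.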