[Ffmpeg-devel] Fixed vs. Floating Point AAC

Michael Niedermayer michaelni
Thu Mar 9 00:37:05 CET 2006


On Wed, Mar 08, 2006 at 11:39:10AM -0500, Rich Felker wrote:
> On Wed, Mar 08, 2006 at 01:01:47PM +0100, Michael Niedermayer wrote:
> > Hi
> > 
> > On Wed, Mar 08, 2006 at 04:14:25AM -0500, Rich Felker wrote:
> > > On Sun, Mar 05, 2006 at 02:11:54PM -0800, Mike Melanson wrote:
> > > > Hi,
> > > > 	Pursuant to the recent discussion of a possible AAC 
> > > > 	re-implementation, do you think it should rely on fixed or floating point 
> > > > numbers? I know the prevailing sentiment will probably be fixed point. So I 
> > > 
> > > Definitely integer (fixed point).
> > 
> > it should support both, fixed point is absolutely needed for embeded systems
> > (and richs K6), and floating point is more accurate and faster on modern
> > cpus, and floating point is much easier to implement
> Would you like to provide benches showing it's faster? Testing of

my claim is based on my attempts to optimize the mp3 decoder in ffmpeg:
float-based code was simply faster than integer-based code in pretty much
every case on a P3, unless i used "low-precision" fixed-point calculations,
but with those i was able to hear the difference even with cheap Sennheiser
headphones. this was quite some time ago, so i can't provide many details.
anyway, it doesn't matter: we need fixed point for embedded systems anyway,
and IMO an initial implementation using floats would be simpler than a
fixed-point one, as overflows and so on won't need to be considered.
and in the end it's the decision of whoever writes the codec; if it's
me, the initial implementation will be pure floats no matter what you
argue, as it's simpler that way

but let's do a few theoretical calculations to show the issues with fixed
point and audio.
say we want to maintain approximately 16 bits of precision (we don't want
the least significant bit to be totally random, but neither do we need it
to be always correct), so let's say an error of 1/65536 is ok.
if we now have a linear, orthogonal transform of N points, then due to
the orthogonality we know the sum of squares before and after the transform
will be equal, so a naive guess would be that we need to keep the same
number of fractional bits to maintain the same precision.
but what about the dynamic range? if all samples are 1.0 (max), then the
dc component would have a value of N^0.5, which for, let's say, N=1024 would
be 32 = 2^5, so we would need 5 integer bits on top of the 16 fractional
bits, 21 bits in total. where's the problem now? 21 bits * 21 bits =
42 bits, and that doesn't fit in 32 bits, so no fast 32*32->32 bit multiplies
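
the bit counting above can be sketched in C. this is only an
illustration of the estimate, not code from ffmpeg: the names ceil_log2,
coeff_bits and product_bits are made up here, and the sign bit is left
out of the count, just as in the estimate itself.

```c
#include <assert.h>

/* Smallest b such that 2^b >= n. Hypothetical helper, n must be >= 1. */
unsigned ceil_log2(unsigned n)
{
    unsigned b = 0;
    assert(n >= 1);
    while ((1ull << b) < n)
        b++;
    return b;
}

/*
 * Magnitude bits needed for the largest coefficient of an orthonormal
 * n-point transform whose inputs are at most 1.0 and carry frac_bits
 * fractional bits. The DC term can grow to sqrt(n), which costs
 * ceil(log2(n) / 2) integer bits. The sign bit is not counted.
 */
unsigned coeff_bits(unsigned n, unsigned frac_bits)
{
    unsigned int_bits = (ceil_log2(n) + 1) / 2;
    return int_bits + frac_bits;
}

/* Bits in the exact product of two such values. */
unsigned product_bits(unsigned n, unsigned frac_bits)
{
    return 2 * coeff_bits(n, frac_bits);
}
```

with n = 1024 and frac_bits = 16, coeff_bits() gives 21 and
product_bits() gives 42, which is the overflow of a 32 bit result
described above.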

you will have to do one of the following:
use floats
use low-precision integers
use 32*32->64 >> 32 -> 32 and hope the target cpu can do this quickly
use full 32*32->64 multiplies, which will probably be slow
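
as a rough sketch of the two integer shortcuts in that list (the helper
names mulh32 and mul_lowprec are made up for illustration, and the
16 bit operand shift in the low-precision variant is just one possible
choice, not something taken from an existing decoder):

```c
#include <stdint.h>

/*
 * 32*32->64 >> 32 -> 32: widen to 64 bits, multiply, keep the high word.
 * Many CPUs can produce the high half of the product directly.
 * Right-shifting a negative value is implementation-defined in C;
 * common compilers perform an arithmetic shift.
 */
int32_t mulh32(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * b) >> 32);
}

/*
 * Low-precision variant: throw away the low 16 bits of each operand
 * first, so that the product fits in 32 bits and a plain 32*32->32
 * multiply is enough. The discarded bits are the precision loss.
 */
int32_t mul_lowprec(int32_t a, int32_t b)
{
    return (a >> 16) * (b >> 16);
}
```

for a = b = 0x7FFFFFFF, mulh32() returns 0x3FFFFFFF while
mul_lowprec() returns 0x3FFF0001, so the two already disagree from
the 15th bit down.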

