[FFmpeg-devel] [RFC] AAC Encoder

Tue Aug 19 18:26:43 CEST 2008

Michael Niedermayer wrote:

>>Regarding the highpass method, if the highpass freq is properly chosen, 
>>it's a method that has demonstrated its usefulness. Its main drawback is 
>>usually the computation time compared to spectrum-based methods, which 
>>can often be done using data that is already computed.
> 
> 
> And how does one properly choose it?
> Is it content dependent? If so, I really have my doubts about it being a
> good choice of algorithm.

The highpass freq should be chosen in order to:

a) filter out low freq content so that short blocks are not triggered by 
low freq transients, as humans are not really sensitive to "low" freq 
attacks. Some codecs like mp3 and atrac even allow mixed block 
sizes, where low freqs are coded with a long block while higher freqs 
are coded with short blocks. Usually, a 2 or 3 kHz highpass is fine 
for this.

b) filter low freqs so you are not fooled by the period of a low freq 
signal (you could end up with fake surges within some of the short windows)

(LAME uses fs/4 as a highpass value)
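
To make points a) and b) concrete, here is a minimal Python sketch of a
time domain highpass detector. It is not taken from LAME or any other
encoder: the first-order filter, the 8 sub-blocks per 1024-sample
block, the threshold of 4.0 and all function names are assumptions
chosen for illustration only. The "steady" test signal holds a 30 Hz
rumble plus a quiet 5 kHz tone and contains no attack at all.

```python
import math

def highpass(samples, cutoff_hz, sample_rate):
    """First-order RC highpass (illustrative; real encoders use steeper filters)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out, prev_x, prev_y = [], 0.0, 0.0
    for x in samples:
        prev_y = alpha * (prev_y + x - prev_x)
        prev_x = x
        out.append(prev_y)
    return out

def peak_energy_ratio(samples, n_sub=8, eps=1e-12):
    """Largest energy jump between two consecutive sub-blocks."""
    size = len(samples) // n_sub
    energies = [sum(s * s for s in samples[i * size:(i + 1) * size])
                for i in range(n_sub)]
    return max(cur / (prev + eps) for prev, cur in zip(energies, energies[1:]))

def has_attack(samples, sample_rate, cutoff_hz=3000.0, threshold=4.0):
    """Hypothetical decision rule: cutoff_hz=None skips the highpass."""
    if cutoff_hz is not None:
        samples = highpass(samples, cutoff_hz, sample_rate)
    return peak_energy_ratio(samples) > threshold

SR, N = 44100, 1024
tone = [0.1 * math.sin(2 * math.pi * 5000 * n / SR) for n in range(N)]
rumble = [math.sin(2 * math.pi * 30 * n / SR) for n in range(N)]

# Stationary signal: low freq rumble plus a quiet tone, no real attack.
steady = [t + r for t, r in zip(tone, rumble)]

# Same tone with a synthetic click: 64 samples alternating +0.8 / -0.8.
click = list(tone)
for n in range(600, 664):
    click[n] += 0.8 if n % 2 == 0 else -0.8

print("steady, no highpass:", has_attack(steady, SR, cutoff_hz=None))
print("steady, 3 kHz highpass:", has_attack(steady, SR))
print("click, 3 kHz highpass:", has_attack(click, SR))
```

Without the highpass, the slow rise of the 30 Hz period across the
sub-blocks looks like an energy surge. With it, only the click stands
out.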

From a practical POV, when there is a transient (this is the same for 
audio or video), the energy is spread over a large part of the 
spectrum, so a "too high" highpass is not really a big deal.

Of course, if the highpass is too high, you might miss a few cases which 
do not really qualify as transients, but would still be more efficiently 
coded as short blocks (like fatboy.wav or some harpsichord samples).
In those harder cases, other methods like detection of a PE (perceptual 
entropy) surge help. (LAME uses both the time domain highpass method and 
PE surge detection)

There are also frequency based methods, which mainly perform a regular 
time to freq transform over a long block and look at how the energy is 
distributed over the spectrum in order to detect attacks. (You could try 
plotting a spectrogram of castanet.wav with something like Audacity and 
you would quickly grasp it, even though it's probably intuitive enough if 
you compare it to how the DCT of a big image block looks in case of sharp 
transitions like the ones in some anime content.)
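
As a rough illustration of the frequency based idea, the Python sketch
below computes the power spectrum of one block and measures how evenly
the energy is spread, using spectral flatness (geometric mean divided by
arithmetic mean). Flatness is only one possible measure and is my
assumption here, not something a particular encoder is known to use.
The naive DFT, the 256-sample block and the function names are likewise
illustrative.

```python
import cmath
import math

def power_spectrum(block):
    """Naive DFT power for bins 1 .. N/2-1 (DC and Nyquist skipped)."""
    n = len(block)
    spec = []
    for k in range(1, n // 2):
        acc = 0j
        for i, x in enumerate(block):
            acc += x * cmath.exp(-2j * math.pi * k * i / n)
        spec.append(abs(acc) ** 2)
    return spec

def spectral_flatness(block, eps=1e-12):
    """Near 1.0 when energy is spread evenly, near 0.0 when it is concentrated."""
    power = [p + eps for p in power_spectrum(block)]
    geometric = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic = sum(power) / len(power)
    return geometric / arithmetic

N = 256

# A single-sample impulse: the sharpest possible attack.
impulse = [0.0] * N
impulse[100] = 1.0

# A stationary sine sitting exactly on DFT bin 16.
sine = [math.sin(2 * math.pi * 16 * n / N) for n in range(N)]

print("impulse flatness:", spectral_flatness(impulse))
print("sine flatness:", spectral_flatness(sine))
```

Note that stationary noise also has a flat spectrum, so a measure like
this cannot tell an attack from noise on its own.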

Please note that block switching by itself is not enough to prevent all 
cases of transient smearing, and that the full AAC standard also 
provides additional tools to deal with it (TNS and gain_control).

-- 
Gabriel Bouvigne
www.mp3-tech.org
personal page: http://gabriel.mp3-tech.org