[FFmpeg-devel] [RFC] AAC Encoder
Gabriel Bouvigne
bouvigne
Mon Aug 18 13:54:54 CEST 2008
Michael Niedermayer wrote:
>>static av_always_inline int quant(float coef, const float Q)
>>{
>> return av_clip((int)(pow(fabsf(coef) * Q, 0.75) + 0.4054), 0, 8191);
>>}
>
>
> converting float to int by casting is rather slow on x86
> also i do not see why the clipping against 0 is done
>
> and where does the 0.4054 come from? How has this value been selected?
It's a magic number: we want the quantization noise to be minimal for the
actual x value, while we are in fact quantizing x^0.75, so the rounding
offset has to compensate for that.
I don't remember how it was determined, but the exact same 0.4054f value
is used within LAME:
http://lame.cvs.sourceforge.net/lame/lame/libmp3lame/takehiro.c?&view=markup
>> switch(last_window_sequence){
>> case ONLY_LONG_SEQUENCE:
>> win[ch] = switch_to_eight ? LONG_START_SEQUENCE : ONLY_LONG_SEQUENCE;
>> grouping[ch] = 0;
>> break;
>> case LONG_START_SEQUENCE:
>> win[ch] = EIGHT_SHORT_SEQUENCE;
>> grouping[ch] = pch->next_grouping[ch];
>> break;
>> case LONG_STOP_SEQUENCE:
>> win[ch] = ONLY_LONG_SEQUENCE;
>> grouping[ch] = 0;
>> break;
>> case EIGHT_SHORT_SEQUENCE:
>> win[ch] = switch_to_eight ? EIGHT_SHORT_SEQUENCE : LONG_STOP_SEQUENCE;
>> grouping[ch] = switch_to_eight ? pch->next_grouping[ch] : 0;
>> break;
>> }
>> pch->next_grouping[ch] = window_grouping[attack_n];
>> }
>
>
> How much quality is lost by using this compared to RD optimal switching?
Very little. Block switching decisions are relatively easy in the vast
majority of cases. Moreover, an RD decision is not well suited to this:
a psy-model working in the frequency domain (and thus providing the
masking threshold on which the perceptual distortion computation is
based) is not that good at estimating the time-domain smearing that
results from the wrong window size.
>> //determine scalefactors - 5.6.2 "Scalefactor determination"
>> for(ch = 0; ch < chans; ch++){
>> prev_scale = -1;
>> for(w = 0; w < cpe->ch[ch].ics.num_windows; w++){
>> for(g = 0; g < cpe->ch[ch].ics.num_swb; g++){
>> g2 = w*16 + g;
>
>
>> cpe->ch[ch].zeroes[w][g] = pch->band[ch][g2].thr >= pch->band[ch][g2].energy;
>
>
> how much quality is lost compared to full RD decision ? it's just a matter of
> checking how many bits this would need which is likely negligible speed wise.
> (assuming you can unentangle the threshold check into a distortion
> computation)
Quite a bit, and that is why the 3GPP doc recommends using this
scalefactor determination as a starting point for a "full search" of the sf
value, i.e. the direct computation is just a heuristic to speed up
scalefactor determination. (The non-linear quantizer used for the coeffs
hinders a potential "full direct" scalefactor computation.)
--
Gabriel Bouvigne
www.mp3-tech.org
personal page: http://gabriel.mp3-tech.org