[FFmpeg-devel] [PATCH] AAC Decoder - Round 2.

Michael Niedermayer michaelni
Fri Jun 27 22:38:32 CEST 2008


On Fri, Jun 27, 2008 at 03:35:06PM +0100, Robert Swain wrote:
> 2008/6/23 Michael Niedermayer <michaelni at gmx.at>:
> > On Mon, Jun 23, 2008 at 02:10:56PM +0100, Robert Swain wrote:
> >> 2008/6/20 Michael Niedermayer <michaelni at gmx.at>:
> >> > On Thu, Jun 19, 2008 at 04:22:57PM +0100, Robert Swain wrote:
> >> > [...]
> >> >> +
> >> >> +    for (g = 0; g < ics->num_window_groups; g++) {
> >> >> +        for (i = 0; i < ics->max_sfb; i++) {
> >> >> +            if (cb[g][i] == NOISE_HCB) {
> >> >> +                for (group = 0; group < ics->group_len[g]; group++) {
> >> >> +                    float energy = 0;
> >> >> +                    float scale = 1.;// / (float)(offsets[i+1] - offsets[i]);
> >> >> +                    for (k = offsets[i]; k < offsets[i+1]; k++)
> >> >> +                        energy += (float)icoef[group*128+k] * icoef[group*128+k];
> >> >> +                    scale *= sf[g][i] / sqrt(energy);
> >> >
> >> > are you sure that the random values have to be normalized like that?
> >> > I suspect energy is supposed tp be a constant.
> >>
> >> That's how it is in the spec. From section 4.6.13 Perceptual Noise
> >> Substitution (PNS):
> >
> > Ive checked the spec before my reply, and i belive your code is wrong.
> >
> >> The energy information for percpetual noise substitution decoding is
> >> represented by a "noise energy" value indicating the overall power of
> >> the substituted spectral coefficients in steps of 1.5 dB. If noise
> >> substitution coding is active for a particular group and scalefactor
> >> band, a noise energy value is transmitted instead of the scalefactor
> >> of the respective channel.
> >
> > Doesnt say that the output from the random number generator should be choped
> > up in bands and each independantly renormalized.
> >
> > Heres what the spec says:
> >    /* Decode noise energies for this group */
> >    for (sfb=0; sfb<max_sfb; sfb++)
> >        if (is_noise(g,sfb))
> >            noise_nrg[g][sfb] = nrg += dpcm_noise_nrg[g][sfb];
> >    /* Do perceptual noise substitution decoding */
> >    for (b=0; b<window_group_length[g]; b++) {
> >        for (sfb=0; sfb<max_sfb; sfb++) {
> >            if (is_noise(g,sfb)) {
> >                offs = swb_offset[sfb];
> >                size = swb_offset[sfb+1] - offs;
> >                /* Generate random vector */
> >                gen_rand_vector( &spec[g][b][sfb][0], size );
> >                scale = 1/(size * sqrt(MEAN_NRG));
> >                scale *= 2.0^(0.25*noise_nrg [g][sfb]);
> >                /* Scale random vector to desired target energy */
> >                for (i=0; i<len; i++)
> >                    spec[g][b][sfb][i] *= scale;
> >            }
> >        }
> >    }
> >    ...
> >    The function gen_rand_vector( addr, size ) generates a vector of length <size> with signed random values of
> >    average energy MEAN_NRG per random value. A suitable random number generator can be realized using one
> >    multiplication/accumulation per random value.
> >
> > No weird renormalization!
> > also the size factor is commented out in our code, i guess to cancel the
> > incorrect normalization mostly out.
> 
> OK, I understand what you mean now. I see that the current code we
> have calculates the energy of the band in question (hence there's no
> need for /size) and scales to that energy rather than scaling to
> 1/(size * MEAN_NRG). I have a few questions:
> 

> - Is the av_random() & 0xFFFF code OK or should the 32-bit random
> value rather be scaled to 16-bit? I suspect it should be scaled (i.e.
> >> 16).

Why dont you use the whole 32bit value? Or was the array int16_t in
which case >>16 is as good as &0xFFFF given the random number generator
is good.


> - How does one analytically calculate the mean energy of the noise we
> generate such that this value can be defined as a constant and used in
> our code? 

How do you define energy?
What is the distribution of the values av_random() outputs?



> After assuming white noise and dabbling about a bit with
> some maths it seems it should be 1/3 * maximum possible energy i.e.
> #define MEAN_NRG sqrt(3.0) * 32768.0 or something like that. Is this
> correct? 

If you know how you define energy then you can easily check this by
calculating the exact nummerical energy of many values from av_random()
your calculation should be close to the value returned ...


If you further assume random() has a flat equi distibuted output than
a simple 
for(i=0; i<=0xFFFF; i++) 
    find_energy(&energy, i);
will give you the exact value 
... the analytical one, assuming you solved the correct integral ;) is
actually just an approximation for the continuous case ...

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Observe your enemies, for they first find out your faults. -- Antisthenes
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080627/7773afc9/attachment.pgp>



More information about the ffmpeg-devel mailing list