[FFmpeg-devel] [PATCH] AAC decoder

Robert Swain robert.swain
Mon May 26 14:16:12 CEST 2008


2008/5/26 Robert Swain <robert.swain at gmail.com>:
> 2008/5/25 Michael Niedermayer <michaelni at gmx.at>:
>> On Fri, May 23, 2008 at 04:06:10PM +0100, Robert Swain wrote:
>>> 2008/5/21 Robert Swain <robert.swain at gmail.com>:
>>> > 2008/4/2 Michael Niedermayer <michaelni at gmx.at>:
>>> >> On Tue, Apr 01, 2008 at 04:56:48PM +0200, Andreas ?man wrote:
>>> >>> Andreas ?man wrote:
>>> >
>>> > [...]
>>> >
>>> >>> +/**
>>> >>> + * Decode scale_factor_data
>>> >>> + * reference: Table 4.47
>>> >>> + */
>>> >>> +static int scale_factor_data(AACContext * ac, GetBitContext * gb, float mix_gain, unsigned int global_gain, ics_struct * ics, const int cb[][64], float sf[][64]) {
>>> >>
>>> >> Maybe rename the function to what the comment says
>>> >>
>>> >>
>>> >>> +    int g, i;
>>> >>> +    unsigned int intensity = 100; // normalization for intensity_tab lookup table
>>> >>> +    int noise = global_gain - 90;
>>> >>> +    int noise_flag = 1;
>>> >>> +    ics->intensity_present = 0;
>>> >>> +    for (g = 0; g < ics->num_window_groups; g++) {
>>> >>> +        for (i = 0; i < ics->max_sfb; i++) {
>>> >>> +            if (cb[g][i] == ZERO_HCB) {
>>> >>> +                sf[g][i] = 0.;
>>> >>
>>> >>> +            } else if (cb[g][i] == INTENSITY_HCB || cb[g][i] == INTENSITY_HCB2) {
>>> >>> +                ics->intensity_present = 1;
>>> >>> +                intensity += get_vlc2(gb, ac->mainvlc.table, 7, 3) - 60;
>>> >>> +                if(intensity > 255) {
>>> >>> +                    av_log(ac->avccontext, AV_LOG_ERROR,
>>> >>> +                           "Intensity (%d) out of range", intensity);
>>> >>> +                    return -1;
>>> >>> +                }
>>> >>> +                sf[g][i] = ac->intensity_tab[intensity];
>>> >>> +            } else if (cb[g][i] == NOISE_HCB) {
>>> >>> +                if (noise_flag) {
>>> >>> +                    noise_flag = 0;
>>> >>> +                    noise += get_bits(gb, 9) - 256;
>>> >>> +                } else {
>>> >>> +                    noise += get_vlc2(gb, ac->mainvlc.table, 7, 3) - 60;
>>> >>> +                }
>>> >>> +                sf[g][i] = pow(2.0, 0.25 * noise) * ac->sf_scale;
>>> >>> +            } else {
>>> >>> +                global_gain += get_vlc2(gb, ac->mainvlc.table, 7, 3) - 60;
>>> >>> +                if(global_gain > 255) {
>>> >>> +                    av_log(ac->avccontext, AV_LOG_ERROR,
>>> >>> +                           "Global gain (%d) out of range", global_gain);
>>> >>> +                    return -1;
>>> >>> +                }
>>> >>> +                sf[g][i] = ac->pow2sf_tab[global_gain];
>>> >>> +            }
>>> >>
>>> >> The identical code is implemented here 3 times, 2 times with seperate
>>> >> LUTs and once by directly calling pow()
>>> >> The differenes look like bugs. like forgetting to check validity or
>>> >> performing the scaling needed for the float2int16 transform.
>>> >>
>>> >> if(cb[g][i] == NOISE_HCB && noise_flag-- > 0)
>>> >>    gain[index] += get_bits(gb, 9) - 256;
>>> >> else
>>> >>    gain[index] += get_vlc2(gb, ac->mainvlc.table, 7, 3) - 60;
>>> >> if(gain[index] > 255)
>>> >>    ...
>>> >> sf[g][i] = ac->pow2sf_tab[gain[index]];
>>> >
>>> > I tried this and audibly it sounds fine on a few files I tested. I
>>> > want to compare the two tables and pow() * ac->sf_scale call before
>>> > submitting a patch though.
>>>
>>> The intensity case is different to the noise and global cases. See
>>> attached patch.
>>
>> It may be different, but it has to be scaled by sf_scale somewhere.
>> intensity_tab is not and i dont see it anywhere afterwards either.
>> if intensity_tab would be scaled then the difference would disapear.
>> all just IMHO and after a quick look ...
>
> It doesn't seem to be scaled by sf_scale anywhere as far as I can see.
> I guess this is a bug, right? Andreas said he fixed a couple of issues
> regarding intensity stereo in aac.c r2035 and r2036 and he's tested it
> against FAAD so either FAAD is broken too or there is no bug, somehow.
>
> The tables are still different as intensity uses pow(0.5, (i-100)/4.)
> and the other cases use pow(2.0, (i-100)/4.). I'm trying to find some
> form of documentation for this in the specification.
>
> Also, the cases use some weird offsets: noise = global_gain - 90;
> intensity = 100; I'm not sure what effect these have.
>
> Unless you're happy with the clean up patch I presented and/or have
> some other suggestions, I get the feeling I'm going to have to dive
> into the spec and may end up rewriting this code to be more consistent
> and concise. Hrm. I'll be optimistic and hope that all will become
> clear and only minor to no modifications are needed. :)

I've looked at the spec and the intensity case is special. The other
non-ZERO_HCB cases should have their values in [0, 255] but the
intensity case doesn't actually code scale factors. To quote the spec:

[QUOTE]

The directional information for the intensity stereo decoding is
represented by an "intensity stereo position" value indicating the
relation between left and right channel scaling. If intensity stereo
coding is active for a particular group and scalefactor band, an
intensity stereo position value is transmitted instead of the
scalefactor of the right channel. Intensity positions are coded just
like scalefactors, i.e. by Huffman coding of differential values with
two differences:

- There is no first value that is sent as PCM. Instead, the
differential decoding is started assuming the last
intensity stereo position value to be zero.
- Differential decoding is done separately between scalefactors and
intensity stereo positions. In other words, the scalefactor decoder
ignores interposed intensity stereo position values and vice versa
(see subclause ?)

[/QUOTE]
(I edited the formatting slightly and the subclauses and many other
things don't display properly, if at all in my copy of the spec...)

I suspect the first difference has some bearing on the intensity being
initialised to 100. And, now obviously, as the coded values aren't
actually scale factors but intensity stereo positions, it makes sense
that they aren't multiplied by sf_scale at any point in the code.

The intensity case is merely in the scale factor decoding function for
convenience as it is coded in the same way. So, what should be the
plan of action from here?

- Only validate appropriate cases to the range [0, 255]
- Split out the intensity case entirely into a separate function or
just comment that it codes intensity stereo positions rather than
scale factors?
- Anything else? :)

Rob



More information about the ffmpeg-devel mailing list