[FFmpeg-devel] [PATCH] E-AC-3 spectral extension

Justin Ruggles justin.ruggles
Wed Jun 3 03:19:23 CEST 2009


Michael Niedermayer wrote:
> On Sun, May 17, 2009 at 02:23:34PM -0400, Justin Ruggles wrote:
>> Hi,
>>
>> I was recently made aware that some French TV station(s) will soon (if
>> not already) start using E-AC-3 streams in their broadcasts which
>> utilize spectral extension.  I was also given some samples (thanks j-b
>> and Anthony), which I uploaded to mphq:
>> http://samples.mplayerhq.hu/A-codecs/AC3/eac3/csi_miami_*
>>
>> So I decided to revisit my SPX patch.  The previous version was done
>> with all integer arithmetic, but it turns out that it's really not
>> accurate enough for spectral extension processing.  The resulting decoded
>> output had a max bandwidth of about 2kHz less when using 24-bit fixed
>> point vs. floating point, and was only slightly higher than without any
>> SPX processing at all.  Making just the square roots floating point
>> raised the bandwidth about 1kHz, and making the rest (noise/signal
>> scaling, spx coords, and notch filters) floating point added about
>> another 1kHz.
>> [...]
>> +            decode_band_structure(gbc, blk, s->eac3, 0,
>> +                                  s->spx_start_subband, spx_end_subband,
>> +                                  ff_eac3_default_spx_band_struct,
>> +                                  s->spx_band_struct, &s->num_spx_bands,
>> +                                  s->spx_band_sizes);
>> +        } else {
>> +            for (ch = 1; ch <= fbw_channels; ch++) {
>> +                s->channel_in_spx[ch] = 0;
>> +                s->first_spx_coords[ch] = 1;
>> +            }
>>          }
>> -        /* TODO: parse spectral extension strategy info */
>>      }
>>  
>> -    /* TODO: spectral extension coordinates */
>> +    /* spectral extension coordinates */
>> +    if (s->spx_in_use) {
>> +        for (ch = 1; ch <= fbw_channels; ch++) {
>> +            if (s->channel_in_spx[ch]) {
>> +                if (s->first_spx_coords[ch] || get_bits1(gbc)) {
>> +                    int bin;
>> +                    float spx_blend;
>> +                    int master_spx_coord;
>> +                    s->first_spx_coords[ch] = 0;
>> +                    spx_blend = get_bits(gbc, 5) / 32.0f;
>> +                    master_spx_coord = get_bits(gbc, 2) * 3;
>> +                    bin = s->spx_start_freq;
>> +                    for (bnd = 0; bnd < s->num_spx_bands; bnd++) {
>> +                        int bandsize;
>> +                        int spx_coord_exp, spx_coord_mant;
>> +                        float nratio, sblend, nblend, spx_coord;
>> +
>> +                        /* calculate blending factors */
>> +                        bandsize = s->spx_band_sizes[bnd];
>> +                        nratio = ((float)((bin + (bandsize >> 1))) / s->spx_end_freq) - spx_blend;
>> +                        nratio = av_clipf(nratio, 0.0f, 1.0f);
> 
>> +                        nblend = sqrt(       nratio);
>> +                        sblend = sqrt(1.0f - nratio);
>> +                        nblend *= 1.73205077648f; // scale noise to give unity variance
> 
> nblend = sqrt( 3*nratio);

Fixed.  I also used *0.03125f instead of /32.0f for spx_blend.

> 
>> +                        bin += bandsize;
>> +
>> +                        /* decode spx coordinates */
>> +                        spx_coord_exp  = get_bits(gbc, 4);
>> +                        spx_coord_mant = get_bits(gbc, 2);
> 
>> +                        if (spx_coord_exp == 15)
>> +                            spx_coord = spx_coord_mant / 4.0f;
>> +                        else
>> +                            spx_coord = (spx_coord_mant + 4) / 8.0f;
> 
> multiply is faster than divide

Fixed, and I also merged the *32.0 with the /4.0 and /8.0.
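
A minimal sketch of the multiply-only form follows. The function name is
hypothetical, and the *32.0 scaling is left out because it is not part of
the quoted hunk. All the divisors are powers of two, so each multiplication
by the reciprocal gives exactly the same result as the original division.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: decodes one SPX coordinate from its 4-bit exponent,
 * 2-bit mantissa and the master coordinate (0, 3, 6 or 9 in the hunk). */
static float decode_spx_coord(int spx_coord_exp, int spx_coord_mant,
                              int master_spx_coord)
{
    float spx_coord;

    assert(spx_coord_exp  >= 0 && spx_coord_exp  <= 15);
    assert(spx_coord_mant >= 0 && spx_coord_mant <= 3);
    assert(master_spx_coord >= 0);

    if (spx_coord_exp == 15)
        spx_coord = spx_coord_mant * 0.25f;        /* was / 4.0f */
    else
        spx_coord = (spx_coord_mant + 4) * 0.125f; /* was / 8.0f */

    /* Dividing by 2^n is the same as lowering the float exponent by n. */
    return ldexpf(spx_coord, -(spx_coord_exp + master_spx_coord));
}
```

For example, exponent 1, mantissa 3 and master coordinate 3 give
(7/8) / 16 = 0.0546875.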

> 
>> +                        spx_coord /= (float)(1 << (spx_coord_exp + master_spx_coord));
> 
> the float cast looks useless

fixed here and a couple other places.

> 
> [...]
>> +        /* Copy coeffs from normal bands to extension bands */
>> +        bin = s->spx_start_freq;
>> +        for (i = 0; i < num_copy_sections; i++) {
>> +            memcpy(&s->transform_coeffs[ch][bin],
>> +                   &s->transform_coeffs[ch][s->spx_copy_start_freq],
>> +                   copy_sizes[i]*sizeof(float));
>> +            bin += copy_sizes[i];
>> +        }
> 
> can't that memcpy be merged with some of the other processing?

I thought I might be able to, but no.  The rules for how the copying is
done make it more efficient to do it this way.  The copy band is a
multiple of 12 like the spx bands, but not necessarily the same size, and
the way that it wraps around would make it awkward to mix with the other
purely per-band calculations.  This way the wrapping boundaries are
calculated once for all channels and are kept separate from the spx band
structure.

I did try merging the copying and energy calculation without memcpy but
it was slower and much uglier.
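
To illustrate why the wrap points do not line up with the spx bands, here
is a rough sketch of how the copy sections could be precomputed once. It is
one reading of the wrap rule, not the code in the patch: it assumes the
source region is [copy_start, spx_start), that a band which would not fit
in the rest of the region restarts at copy_start, and that the first band
is always flagged because it sits at the transition from the normal bands.
The function name and parameter layout are hypothetical.

```c
#include <assert.h>

static int imin(int a, int b) { return a < b ? a : b; }

/* Fills copy_sizes[] with the length of each contiguous copy and
 * wrapflag[] with 1 for every band that starts at a wrap point.
 * Returns the number of copy sections. */
static int spx_copy_sections(int copy_start, int spx_start,
                             const int *band_sizes, int num_bands,
                             int *copy_sizes, int max_sections,
                             int *wrapflag)
{
    int bin = copy_start, num_sections = 0;
    int bnd, i, copysize;

    assert(copy_start < spx_start && num_bands > 0);

    for (bnd = 0; bnd < num_bands; bnd++) {
        int bandsize = band_sizes[bnd];
        wrapflag[bnd] = (bnd == 0);

        /* The band does not fit: close the section and wrap around. */
        if (bin + bandsize > spx_start) {
            assert(num_sections < max_sections);
            copy_sizes[num_sections++] = bin - copy_start;
            bin = copy_start;
            wrapflag[bnd] = 1;
        }
        /* A band larger than the source region wraps in mid-band. */
        for (i = 0; i < bandsize; i += copysize) {
            if (bin == spx_start) {
                assert(num_sections < max_sections);
                copy_sizes[num_sections++] = bin - copy_start;
                bin = copy_start;
            }
            copysize = imin(bandsize - i, spx_start - bin);
            bin += copysize;
        }
    }
    assert(num_sections < max_sections);
    copy_sizes[num_sections++] = bin - copy_start;
    return num_sections;
}
```

With a 36-bin source region and bands of 12, 12 and 24 bins, the third band
does not fit after the first two, so the result is two sections of 24 bins
each, neither of which matches a band size.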

> 
>> +
>> +        /* Calculate RMS energy for each SPX band. */
>> +        bin = s->spx_start_freq;
>> +        for (bnd = 0; bnd < s->num_spx_bands; bnd++) {
>> +            int bandsize = s->spx_band_sizes[bnd];
>> +            float accum = 0.0f;
>> +            for (i = 0; i < bandsize; i++) {
>> +                float coeff = s->transform_coeffs[ch][bin++];
>> +                accum += coeff * coeff;
>> +            }
>> +            rms_energy[bnd] = sqrt(accum / (float)bandsize);
>> +        }
>> +
>> +        /* Apply a notch filter at transitions between normal and extension
>> +           bands and at all wrap points. */
>> +        if (s->spx_atten_code[ch] >= 0) {
>> +            const float *atten_tab = ff_eac3_spx_atten_tab[s->spx_atten_code[ch]];
>> +            bin = s->spx_start_freq - 2;
>> +            for (bnd = 0; bnd < s->num_spx_bands; bnd++) {
>> +                if (wrapflag[bnd]) {
>> +                    float *coeffs = &s->transform_coeffs[ch][bin];
>> +                    coeffs[0] *= atten_tab[0];
>> +                    coeffs[1] *= atten_tab[1];
>> +                    coeffs[2] *= atten_tab[2];
>> +                    coeffs[3] *= atten_tab[1];
>> +                    coeffs[4] *= atten_tab[0];
>> +                }
>> +                bin += s->spx_band_sizes[bnd];
>> +            }
>> +        }
>> +
>> +        /* Apply noise-blended coefficient scaling based on previously
>> +           calculated RMS energy, blending factors, and SPX coordinates for
>> +           each band. */
>> +        bin = s->spx_start_freq;
>> +        for (bnd = 0; bnd < s->num_spx_bands; bnd++) {
>> +            float nscale = s->spx_noise_blend[ch][bnd] * rms_energy[bnd];
>> +            float sscale = s->spx_signal_blend[ch][bnd];
>> +            for (i = 0; i < s->spx_band_sizes[bnd]; i++) {
>> +                float noise  = nscale * (((int)av_lfg_get(&s->dith_state))/(float)(1<<31));
> 
> the 1<<31 factor can be merged into nscale

Fixed, and the function is now 10% faster.
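
The idea is simply to fold the constant into the per-band scale so the
inner loop does one multiply per noise sample. A self-contained sketch,
with assumptions: ToyRng is a hypothetical stand-in for the av_lfg_get()
generator, and the way signal and noise are combined is a guess, since the
quoted hunk stops before that line.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the decoder's dither generator: a plain
 * 32-bit linear congruential generator, not the real lagged Fibonacci one. */
typedef struct {
    uint32_t state;
} ToyRng;

static uint32_t toy_rng_get(ToyRng *r)
{
    r->state = r->state * 1664525u + 1013904223u;
    return r->state;
}

/* Scales one band in place: coeff = coeff * signal_blend + noise. */
static void spx_scale_band(float *coeffs, int bandsize, float noise_blend,
                           float signal_blend, float rms, ToyRng *rng)
{
    /* The 1/2^31 normalisation is applied once per band, not per sample. */
    float nscale = noise_blend * rms * (1.0f / 2147483648.0f);
    float sscale = signal_blend;
    int i;

    assert(bandsize >= 0);

    for (i = 0; i < bandsize; i++) {
        /* Reinterpret the 32 random bits as a signed value. */
        float noise = nscale * (float)(int32_t)toy_rng_get(rng);
        coeffs[i] = coeffs[i] * sscale + noise;
    }
}
```

Writing the constant as 2147483648.0f also sidesteps the signed overflow in
the integer expression 1<<31.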


New patch attached.

Thanks,
Justin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: eac3_spx_4.diff
Type: text/x-patch
Size: 19424 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090602/94770df8/attachment.bin>


