[FFmpeg-devel] [PATCH] E-AC-3 spectral extension

Michael Niedermayer michaelni
Sun May 31 02:31:04 CEST 2009


On Sun, May 17, 2009 at 02:23:34PM -0400, Justin Ruggles wrote:
> Hi,
> 
> I was recently made aware that some French TV station(s) will soon (if
> not already) start using E-AC-3 streams in their broadcasts which
> utilize spectral extension.  I was also given some samples (thanks j-b
> and Anthony), which I uploaded to mphq:
> http://samples.mplayerhq.hu/A-codecs/AC3/eac3/csi_miami_*
> 
> So I decided to revisit my SPX patch.  The previous version was done
> with all integer arithmetic, but it turns out that it's really not
> accurate enough for spectral extension processing.  The resulting decoded
> output had a max bandwidth of about 2kHz less when using 24-bit fixed
> point vs. floating point, and was only slightly higher than without any
> SPX processing at all.  Making just the square roots floating point
> raised the bandwidth about 1kHz, and making the rest (noise/signal
> scaling, spx coords, and notch filters) floating point added about
> another 1kHz.
> 
> I was able to compare the output to Nero's E-AC-3 decoder (thanks
> madshi), and the results are very close considering that AC-3 uses
> random noise for zero-bit mantissas:

> stddev:  131.16 PSNR: 53.96

i wouldnt call 131.16 close


> PEAQ ODG: -0.44

what is PEAQ ODG ?


btw, have you tested your code with our trasher / some fuzzer to make sure
it doesn't segfault?


> 
> One thing I'm unsure about is whether I should add optional runtime
> generation of the attenuation table rather than always hardcoding it.

i think due to the relatively small size there is little point


[...]

> diff --git a/libavcodec/ac3dec.c b/libavcodec/ac3dec.c
> index c176cb3..e6d7a9d 100644
> --- a/libavcodec/ac3dec.c
> +++ b/libavcodec/ac3dec.c
> @@ -825,14 +825,94 @@ static int decode_audio_block(AC3DecodeContext *s, int blk)
>  
>      /* spectral extension strategy */
>      if (s->eac3 && (!blk || get_bits1(gbc))) {
> -        if (get_bits1(gbc)) {
> -            ff_log_missing_feature(s->avctx, "Spectral extension", 1);
> -            return -1;
> +        s->spx_in_use = get_bits1(gbc);
> +        if (s->spx_in_use) {
> +            int begf, endf;
> +            int spx_end_subband;
> +
> +            /* determine which channels use spx */
> +            if (s->channel_mode == AC3_CHMODE_MONO) {
> +                s->channel_in_spx[1] = 1;
> +            } else {
> +                for (ch = 1; ch <= fbw_channels; ch++)
> +                    s->channel_in_spx[ch] = get_bits1(gbc);
> +            }
> +
> +            s->spx_copy_start_freq = get_bits(gbc, 2) * 12 + 25;
> +            begf = get_bits(gbc, 3);
> +            endf = get_bits(gbc, 3);
> +            s->spx_start_subband = begf < 6 ? begf+2 : 2*begf-3;
> +            spx_end_subband      = endf < 4 ? endf+5 : 2*endf+3;
> +            if (s->spx_start_subband >= spx_end_subband) {
> +                av_log(s->avctx, AV_LOG_ERROR, "invalid spectral extension range (%d >= %d)\n",
> +                       s->spx_start_subband, spx_end_subband);
> +                return -1;
> +            }
> +            s->spx_start_freq    = s->spx_start_subband * 12 + 25;
> +            s->spx_end_freq      = spx_end_subband      * 12 + 25;
> +            if (s->spx_copy_start_freq >= s->spx_start_freq) {
> +                av_log(s->avctx, AV_LOG_ERROR, "invalid spectral extension copy start bin (%d >= %d)\n",
> +                       s->spx_copy_start_freq, s->spx_start_freq);
> +                return -1;
> +            }

you know, i always have a bad feeling when various variables are updated
first and checked afterwards but left in an invalid state anyway

are you sure this is all free of buffer overflows? (i've not checked, so
it may very well be ok ...)
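
One way to avoid leaving the context half-updated is to compute everything
in locals and only store it once all checks pass. The following is a minimal
sketch, not the patch itself: SpxRange and spx_set_range() are hypothetical
names standing in for the SPX fields of AC3DecodeContext, and the three
arguments are the raw 2-bit and 3-bit codes read from the bitstream.

```c
/* Hypothetical stand-in for the SPX fields of AC3DecodeContext. */
typedef struct SpxRange {
    int copy_start_freq;
    int start_subband;
    int start_freq;
    int end_freq;
} SpxRange;

/* Returns 0 and updates *r on success.
 * Returns -1 and leaves *r untouched if the range is invalid. */
static int spx_set_range(SpxRange *r, int copy_code, int begf, int endf)
{
    /* Same formulas as in the quoted hunk, but kept in locals. */
    int copy_start    = copy_code * 12 + 25;
    int start_subband = begf < 6 ? begf + 2 : 2 * begf - 3;
    int end_subband   = endf < 4 ? endf + 5 : 2 * endf + 3;
    int start_freq    = start_subband * 12 + 25;
    int end_freq      = end_subband   * 12 + 25;

    if (start_subband >= end_subband)
        return -1;
    if (copy_start >= start_freq)
        return -1;

    /* All checks passed: commit to the context in one place. */
    r->copy_start_freq = copy_start;
    r->start_subband   = start_subband;
    r->start_freq      = start_freq;
    r->end_freq        = end_freq;
    return 0;
}
```

With this shape, a failed check cannot leave stale-but-invalid values behind
for later blocks to index with.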



> +            decode_band_structure(gbc, blk, s->eac3, 0,
> +                                  s->spx_start_subband, spx_end_subband,
> +                                  ff_eac3_default_spx_band_struct,
> +                                  s->spx_band_struct, &s->num_spx_bands,
> +                                  s->spx_band_sizes);
> +        } else {
> +            for (ch = 1; ch <= fbw_channels; ch++) {
> +                s->channel_in_spx[ch] = 0;
> +                s->first_spx_coords[ch] = 1;
> +            }
>          }
> -        /* TODO: parse spectral extension strategy info */
>      }
>  
> -    /* TODO: spectral extension coordinates */
> +    /* spectral extension coordinates */
> +    if (s->spx_in_use) {
> +        for (ch = 1; ch <= fbw_channels; ch++) {
> +            if (s->channel_in_spx[ch]) {
> +                if (s->first_spx_coords[ch] || get_bits1(gbc)) {
> +                    int bin;
> +                    float spx_blend;
> +                    int master_spx_coord;
> +                    s->first_spx_coords[ch] = 0;
> +                    spx_blend = get_bits(gbc, 5) / 32.0f;
> +                    master_spx_coord = get_bits(gbc, 2) * 3;
> +                    bin = s->spx_start_freq;
> +                    for (bnd = 0; bnd < s->num_spx_bands; bnd++) {
> +                        int bandsize;
> +                        int spx_coord_exp, spx_coord_mant;
> +                        float nratio, sblend, nblend, spx_coord;
> +
> +                        /* calculate blending factors */
> +                        bandsize = s->spx_band_sizes[bnd];
> +                        nratio = ((float)((bin + (bandsize >> 1))) / s->spx_end_freq) - spx_blend;
> +                        nratio = av_clipf(nratio, 0.0f, 1.0f);

> +                        nblend = sqrt(       nratio);
> +                        sblend = sqrt(1.0f - nratio);
> +                        nblend *= 1.73205077648f; // scale noise to give unity variance

nblend = sqrt( 3*nratio);


> +                        bin += bandsize;
> +
> +                        /* decode spx coordinates */
> +                        spx_coord_exp  = get_bits(gbc, 4);
> +                        spx_coord_mant = get_bits(gbc, 2);

> +                        if (spx_coord_exp == 15)
> +                            spx_coord = spx_coord_mant / 4.0f;
> +                        else
> +                            spx_coord = (spx_coord_mant + 4) / 8.0f;

multiply is faster than divide
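
For illustration, a sketch of the same branch written with multiplies. The
function name spx_coord_mantissa() is hypothetical. Since 4 and 8 are powers
of two, 0.25f and 0.125f are exact in binary floating point, so the results
should be bit-identical to the divisions.

```c
/* Hypothetical helper: mantissa part of an SPX coordinate.
 * exp is the 4-bit exponent code, mant the 2-bit mantissa code. */
static float spx_coord_mantissa(int exp, int mant)
{
    if (exp == 15)
        return mant * 0.25f;        /* same value as mant / 4.0f       */
    else
        return (mant + 4) * 0.125f; /* same value as (mant + 4) / 8.0f */
}
```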


> +                        spx_coord /= (float)(1 << (spx_coord_exp + master_spx_coord));

the float cast looks useless
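
A sketch of the line without the cast, wrapped in a hypothetical helper
spx_scale() so it can be tried in isolation. Because the left operand is a
float, the int shift result is converted to float by the usual arithmetic
conversions. The shift stays within int range here since the exponent code is
at most 15 and master_spx_coord at most 9.

```c
/* Hypothetical helper: scales an SPX coordinate by 2^-(exp + master). */
static float spx_scale(float coord, int exp, int master)
{
    coord /= 1 << (exp + master); /* int operand is converted implicitly */
    return coord;
}
```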


[...]
> +        /* Copy coeffs from normal bands to extension bands */
> +        bin = s->spx_start_freq;
> +        for (i = 0; i < num_copy_sections; i++) {
> +            memcpy(&s->transform_coeffs[ch][bin],
> +                   &s->transform_coeffs[ch][s->spx_copy_start_freq],
> +                   copy_sizes[i]*sizeof(float));
> +            bin += copy_sizes[i];
> +        }

can't that memcpy be merged with some of the other processing?
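
As a rough idea of what such a merge could look like, here is a sketch that
copies a run of coefficients and accumulates their energy in the same pass.
The helper copy_and_sum_squares() is hypothetical, and the sketch deliberately
ignores how copy sections line up with SPX band boundaries, which the quoted
code does not show.

```c
/* Hypothetical helper: copies n coefficients from src to dst and returns
 * the sum of their squares, so a later RMS step need not re-read them.
 * src and dst must not overlap. */
static float copy_and_sum_squares(float *dst, const float *src, int n)
{
    float accum = 0.0f;
    for (int i = 0; i < n; i++) {
        float c = src[i];
        dst[i]  = c;
        accum  += c * c;
    }
    return accum;
}
```

The caller would still have to split or combine the partial sums wherever a
band spans more than one copy section.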


> +
> +        /* Calculate RMS energy for each SPX band. */
> +        bin = s->spx_start_freq;
> +        for (bnd = 0; bnd < s->num_spx_bands; bnd++) {
> +            int bandsize = s->spx_band_sizes[bnd];
> +            float accum = 0.0f;
> +            for (i = 0; i < bandsize; i++) {
> +                float coeff = s->transform_coeffs[ch][bin++];
> +                accum += coeff * coeff;
> +            }
> +            rms_energy[bnd] = sqrt(accum / (float)bandsize);
> +        }
> +
> +        /* Apply a notch filter at transitions between normal and extension
> +           bands and at all wrap points. */
> +        if (s->spx_atten_code[ch] >= 0) {
> +            const float *atten_tab = ff_eac3_spx_atten_tab[s->spx_atten_code[ch]];
> +            bin = s->spx_start_freq - 2;
> +            for (bnd = 0; bnd < s->num_spx_bands; bnd++) {
> +                if (wrapflag[bnd]) {
> +                    float *coeffs = &s->transform_coeffs[ch][bin];
> +                    coeffs[0] *= atten_tab[0];
> +                    coeffs[1] *= atten_tab[1];
> +                    coeffs[2] *= atten_tab[2];
> +                    coeffs[3] *= atten_tab[1];
> +                    coeffs[4] *= atten_tab[0];
> +                }
> +                bin += s->spx_band_sizes[bnd];
> +            }
> +        }
> +
> +        /* Apply noise-blended coefficient scaling based on previously
> +           calculated RMS energy, blending factors, and SPX coordinates for
> +           each band. */
> +        bin = s->spx_start_freq;
> +        for (bnd = 0; bnd < s->num_spx_bands; bnd++) {
> +            float nscale = s->spx_noise_blend[ch][bnd] * rms_energy[bnd];
> +            float sscale = s->spx_signal_blend[ch][bnd];
> +            for (i = 0; i < s->spx_band_sizes[bnd]; i++) {
> +                float noise  = nscale * (((int)av_lfg_get(&s->dith_state))/(float)(1<<31));

the 1<<31 factor can be merged into nscale
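
A sketch of that, assuming the random values arrive as signed 32-bit
integers. spx_noise() and its rnd[] input are hypothetical and stand in for
the av_lfg_get() calls in the inner loop. The constant is written as a float
literal because 1<<31 does not fit in a signed 32-bit int.

```c
#include <stdint.h>

/* Hypothetical helper: fills dst with scaled noise.
 * rnd[] holds signed 32-bit random values; nscale is the per-band
 * noise scale (noise blend factor times RMS energy). */
static void spx_noise(float *dst, const int32_t *rnd, int n, float nscale)
{
    /* Fold the 2^-31 normalization into the scale once per band
     * instead of dividing once per coefficient. */
    nscale *= 1.0f / 2147483648.0f;
    for (int i = 0; i < n; i++)
        dst[i] = nscale * rnd[i];
}
```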


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Observe your enemies, for they first find out your faults. -- Antisthenes