[FFmpeg-devel] [PATCH] RealAudio 14.4K encoder

Michael Niedermayer michaelni
Sat May 22 16:00:51 CEST 2010


On Sat, May 22, 2010 at 03:18:45PM +0200, Francesco Lavra wrote:
[...]
> > > +        if (index == low)
> > > +            return table[high] + error > value ? low : high;
> > > +        if (error > 0) {
> > > +            high = index;
> > > +        } else {
> > > +            low = index;
> > > +        }
> > > +    }
> > > +}
> > > +
> > > +
> > 
> > > +/**
> > > + * Orthogonalizes a vector to another vector
> > > + *
> > > + * @param v vector to orthogonalize
> > > + * @param u vector against which orthogonalization is performed
> > > + */
> > > +static void orthogonalize(float *v, const float *u)
> > 
> > missing const
> 
> Vector v is not constant.
> Or do you mean something like (float * const v, const float * const u)?

I probably meant that I am stupid.
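
For reference (not part of the patch): the extra const in Francesco's question
would apply to the pointer parameters themselves, while the const already in the
prototype applies to the data u points at. A tiny illustration:

static void takes_const_data(float *v, const float *u)
{
    v[0] = u[0];    /* fine: the elements of v are writable        */
    /* u[0] = 0;       error: u points to const float              */
}

static void takes_const_pointers(float *const v, const float *const u)
{
    v[0] = u[0];    /* still fine: only the pointers are const     */
    /* v++;            error: v itself may not be modified         */
}

So (float * const v, const float * const u) only promises not to reassign the
parameters inside the function; the prototype as posted already expresses
everything that matters here.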


[...]
> > [...]
> > > +/**
> > > + * Searches the adaptive codebook for the best entry and gain and removes its
> > > + * contribution from input data
> > > + *
> > > + * @param adapt_cb array from which the adaptive codebook is extracted
> > > + * @param work array used to calculate LPC-filtered vectors
> > > + * @param coefs coefficients of the LPC filter
> > > + * @param data input data
> > > + * @return index of the best entry of the adaptive codebook
> > > + */
> > > +static int adaptive_cb_search(const int16_t *adapt_cb, float *work,
> > > +                              const float *coefs, float *data)
> > > +{
> > > +    int i, j, best_vect;
> > > +    float score, gain, best_score, best_gain;
> > > +    float exc[BLOCKSIZE];
> > > +
> > > +    gain = best_score = 0;
> > > +    for (i = BLOCKSIZE / 2; i <= BUFFERSIZE; i++) {
> > > +        for (j = 0; j < FFMIN(BLOCKSIZE, i); j++)
> > > +            exc[j] = adapt_cb[BUFFERSIZE - i + j];
> > > +        if (i < BLOCKSIZE)
> > > +            for (j = 0; j < BLOCKSIZE - i; j++)
> > > +                exc[i + j] = adapt_cb[BUFFERSIZE - i + j];
> > > +        get_match_score(work, coefs, exc, NULL, NULL, data, &score, &gain);
> > > +        if (score > best_score) {
> > > +            best_score = score;
> > > +            best_vect = i;
> > > +            best_gain = gain;
> > > +        }
> > > +    }
> > > +    if (!best_score)
> > > +        return 0;
> > > +
> > > +    /**
> > > +     * Re-calculate the filtered vector from the vector with maximum match score
> > > +     * and remove its contribution from input data.
> > > +     */
> > 
> > > +    for (j = 0; j < FFMIN(BLOCKSIZE, best_vect); j++)
> > > +        exc[j] = adapt_cb[BUFFERSIZE - best_vect + j];
> > > +    if (i < BLOCKSIZE)
> > > +        for (j = 0; j < BLOCKSIZE - best_vect; j++)
> > > +            exc[best_vect + j] = adapt_cb[BUFFERSIZE - best_vect + j];
> > 
> > code duplication
> 
> Will fix it if you like the floating point approach.
> 
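For example, the wrap-around copy quoted above (and the identical code inside
the search loop) could go into a small helper; just a sketch, reusing the
patch's adapt_cb, BUFFERSIZE, BLOCKSIZE and FFMIN:

/* Sketch only: build the adaptive-codebook excitation vector for a given
 * lag, repeating the available samples when lag < BLOCKSIZE, exactly as
 * the two duplicated loops do. */
static void get_adaptive_vector(float *exc, const int16_t *adapt_cb, int lag)
{
    int j;

    for (j = 0; j < FFMIN(BLOCKSIZE, lag); j++)
        exc[j] = adapt_cb[BUFFERSIZE - lag + j];
    if (lag < BLOCKSIZE)
        for (j = 0; j < BLOCKSIZE - lag; j++)
            exc[lag + j] = adapt_cb[BUFFERSIZE - lag + j];
}

adaptive_cb_search() would call it once per candidate lag and once more for
best_vect before re-filtering; as a side effect that also gets rid of the stale
"if (i < BLOCKSIZE)" test in the re-calculation, where i has already run past
BUFFERSIZE.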
> > > +    ff_celp_lp_synthesis_filterf(work, coefs, exc, BLOCKSIZE, LPC_ORDER);
> > 
> > 
> > 
> > > +    for (i = 0; i < BLOCKSIZE; i++)
> > > +        data[i] -= best_gain * work[i];
> > > +    return (best_vect - BLOCKSIZE / 2 + 1);
> > > +}
> > > +
> > > +
> > 
> > > +/**
> > > + * Searches the two fixed codebooks for the best entry and gain
> > > + *
> > > + * @param work array used to calculate LPC-filtered vectors
> > > + * @param coefs coefficients of the LPC filter
> > > + * @param data input data
> > > + * @param cba_idx index of the best entry of the adaptive codebook
> > > + * @param cb1_idx pointer to variable where the index of the best entry of the
> > > + *        first fixed codebook is returned
> > > + * @param cb2_idx pointer to variable where the index of the best entry of the
> > > + *        second fixed codebook is returned
> > > + */
> > > +static void fixed_cb_search(float *work, const float *coefs, float *data,
> > > +                            int cba_idx, int *cb1_idx, int *cb2_idx)
> > > +{
> > > +    int i, j, ortho_cb1;
> > > +    float score, gain, best_score, best_gain;
> > > +    float cba_vect[BLOCKSIZE], cb1_vect[BLOCKSIZE];
> > > +    float vect[BLOCKSIZE];
> > > +
> > > +    /**
> > > +     * The filtered vector from the adaptive codebook can be retrieved from
> > > +     * work, because this function is called just after adaptive_cb_search().
> > > +     */
> > > +    if (cba_idx)
> > > +        memcpy(cba_vect, work, sizeof(cba_vect));
> > > +
> > > +    *cb1_idx = gain = best_score = best_gain = 0;
> > > +    for (i = 0; i < FIXED_CB_SIZE; i++) {
> > > +        for (j = 0; j < BLOCKSIZE; j++)
> > > +            vect[j] = ff_cb1_vects[i][j];
> > > +        get_match_score(work, coefs, vect, cba_idx ? cba_vect: NULL, NULL, data,
> > > +                        &score, &gain);
> > > +        if (score > best_score) {
> > > +            best_score = score;
> > > +            *cb1_idx = i;
> > > +            best_gain = gain;
> > > +        }
> > > +    }
> > > +
> > > +    /**
> > > +     * Re-calculate the filtered vector from the vector with maximum match score
> > > +     * and remove its contribution from input data.
> > > +     */
> > > +    if (best_gain) {
> > > +        for (j = 0; j < BLOCKSIZE; j++)
> > > +            vect[j] = ff_cb1_vects[*cb1_idx][j];
> > > +        ff_celp_lp_synthesis_filterf(work, coefs, vect, BLOCKSIZE, LPC_ORDER);
> > > +        if (cba_idx)
> > > +            orthogonalize(work, cba_vect);
> > > +        for (i = 0; i < BLOCKSIZE; i++)
> > > +            data[i] -= best_gain * work[i];
> > > +        memcpy(cb1_vect, work, sizeof(cb1_vect));
> > > +        ortho_cb1 = 1;
> > > +    } else
> > > +        ortho_cb1 = 0;
> > > +
> > 
> > > +    *cb2_idx = best_score = best_gain = 0;
> > > +    for (i = 0; i < FIXED_CB_SIZE; i++) {
> > > +        for (j = 0; j < BLOCKSIZE; j++)
> > > +            vect[j] = ff_cb2_vects[i][j];
> > > +        get_match_score(work, coefs, vect, cba_idx ? cba_vect : NULL,
> > > +                        ortho_cb1 ? cb1_vect : NULL, data, &score, &gain);
> > > +        if (score > best_score) {
> > > +            best_score = score;
> > > +            *cb2_idx = i;
> > > +            best_gain = gain;
> > > +        }
> > > +    }
> > 
> > duplicate
> 
> Ditto.
[...]
> > > +    }
> > > +
> > > +    /**
> > > +     * Calculate the zero-input response of the LPC filter and subtract it from
> > > +     * input data.
> > > +     */
> > > +    memset(data, 0, sizeof(data));
> > > +    ff_celp_lp_synthesis_filterf(work + LPC_ORDER, coefs, data, BLOCKSIZE,
> > > +                                 LPC_ORDER);
> > > +    for (i = 0; i < BLOCKSIZE; i++) {
> > > +        zero[i] = work[LPC_ORDER + i];
> > > +        data[i] = sblock_data[i] - zero[i];
> > > +    }
> > > +
> > > +    /**
> > > +     * Codebook search is performed without taking into account the contribution
> > > +     * of the previous subblock, since it has been just subtracted from input
> > > +     * data.
> > > +     */
> > > +    memset(work, 0, LPC_ORDER * sizeof(*work));
> > > +
> > > +    cba_idx = adaptive_cb_search(ractx->adapt_cb, work + LPC_ORDER, coefs,
> > > +                                 data);
> > > +    if (cba_idx) {
> > > +        /**
> > > +         * The filtered vector from the adaptive codebook can be retrieved from
> > > +         * work, see implementation of adaptive_cb_search().
> > > +         */
> > > +        memcpy(cba, work + LPC_ORDER, sizeof(cba));
> > > +
> > > +        ff_copy_and_dup(cba_vect, ractx->adapt_cb, cba_idx + BLOCKSIZE / 2 - 1);
> > > +        m[0] = (ff_irms(cba_vect) * rms) >> 12;
> > > +    }
> > > +    fixed_cb_search(work + LPC_ORDER, coefs, data, cba_idx, &cb1_idx, &cb2_idx);
> > > +    for (i = 0; i < BLOCKSIZE; i++) {
> > > +        cb1[i] = ff_cb1_vects[cb1_idx][i];
> > > +        cb2[i] = ff_cb2_vects[cb2_idx][i];
> > > +    }
> > > +    ff_celp_lp_synthesis_filterf(work + LPC_ORDER, coefs, cb1, BLOCKSIZE,
> > > +                                 LPC_ORDER);
> > > +    memcpy(cb1, work + LPC_ORDER, sizeof(cb1));
> > > +    m[1] = (ff_cb1_base[cb1_idx] * rms) >> 8;
> > > +    ff_celp_lp_synthesis_filterf(work + LPC_ORDER, coefs, cb2, BLOCKSIZE,
> > > +                                 LPC_ORDER);
> > > +    memcpy(cb2, work + LPC_ORDER, sizeof(cb2));
> > > +    m[2] = (ff_cb2_base[cb2_idx] * rms) >> 8;
> > > +
> > > +    /**
> > > +     * Gain quantization is performed taking the NUM_BEST_GAINS best entries
> > > +     * obtained from floating point data and calculating for each entry the
> > > +     * actual encoding error with fixed point data.
> > > +     */
> > > +    for (i = 0; i < NUM_BEST_GAINS; i++) {
> > > +        best_errors[i] = FLT_MAX;
> > > +        indexes[i] = -1;
> > > +    }
> > > +    for (n = 0; n < 256; n++) {
> > > +        g[1] = ((ff_gain_val_tab[n][1] * m[1]) >> ff_gain_exp_tab[n]) / 4096.0;
> > > +        g[2] = ((ff_gain_val_tab[n][2] * m[2]) >> ff_gain_exp_tab[n]) / 4096.0;
> > > +        error = 0;
> > > +        if (cba_idx) {
> > > +            g[0] = ((ff_gain_val_tab[n][0] * m[0]) >> ff_gain_exp_tab[n]) /
> > > +                   4096.0;
> > > +            for (i = 0; i < BLOCKSIZE; i++) {
> > > +                data[i] = zero[i] + g[0] * cba[i] + g[1] * cb1[i] +
> > > +                          g[2] * cb2[i];
> > > +                error += (data[i] - sblock_data[i]) *
> > > +                         (data[i] - sblock_data[i]);
> > > +            }
> > > +        } else {
> > > +            for (i = 0; i < BLOCKSIZE; i++) {
> > > +                data[i] = zero[i] + g[1] * cb1[i] + g[2] * cb2[i];
> > > +                error += (data[i] - sblock_data[i]) *
> > > +                         (data[i] - sblock_data[i]);
> > > +            }
> > > +        }
> > 
> > > +        for (i = 0; i < NUM_BEST_GAINS; i++)
> > > +            if (error < best_errors[i]) {
> > > +                best_errors[i] = error;
> > > +                indexes[i] = n;
> > > +                break;
> > > +            }
> > 
> > this does not keep the 5 best
> > it only guarantees to keep the single best
> 
> Why? Perhaps you missed the break statement?

If we feed in the values 9, 8, 7, 6, 5, 4, 3, 2, 1, then the list will just
contain 1 afterwards: each new value is smaller than best_errors[0], so it
overwrites slot 0 and breaks out of the loop, and the remaining slots never
get filled.
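
A minimal sketch of a fix, reusing the patch's best_errors[] / indexes[]
arrays (initialized to FLT_MAX and -1) and assuming <string.h> is available;
the arrays are kept sorted by error and worse entries are shifted out:

        /* Sketch only: insert (error, n) so that best_errors[] always
         * holds the NUM_BEST_GAINS smallest errors seen so far, in
         * ascending order, with indexes[] kept in sync. */
        for (i = 0; i < NUM_BEST_GAINS; i++) {
            if (error < best_errors[i]) {
                memmove(&best_errors[i + 1], &best_errors[i],
                        (NUM_BEST_GAINS - 1 - i) * sizeof(best_errors[0]));
                memmove(&indexes[i + 1], &indexes[i],
                        (NUM_BEST_GAINS - 1 - i) * sizeof(indexes[0]));
                best_errors[i] = error;
                indexes[i]     = n;
                break;
            }
        }

With the 9, 8, 7, ..., 1 sequence above this ends up holding 1, 2, 3, 4, 5
instead of just 1.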



> 
> > you are testing your changes in terms of PSNR, aren't you?
> > if not, we need to go back to the last patch and test each change
> > individually.
> > I very much prefer naive and slow code compared to optimized but
> > untested and thus buggy code. We already have a vorbis and aac encoder
> > </rant>
> 
> I did test each individual change by measuring the resulting average
> encoding error. Now I have re-tested them with tiny_psnr. Here are the
> results with 7 different samples.
> 
> Fixed point, without orthogonalization, with brute force gain
> quantization
> stddev:  849.73 PSNR: 37.74 bytes:   200320/   200334
> stddev:  983.24 PSNR: 36.48 bytes:   144000/   144014
> stddev:  835.19 PSNR: 37.89 bytes:   745280/   745294
> stddev: 3737.95 PSNR: 24.88 bytes:  5370880/  5370880
> stddev: 2605.75 PSNR: 28.01 bytes:   814400/   814400
> stddev: 3634.44 PSNR: 25.12 bytes:   432640/   432640
> stddev: 2853.26 PSNR: 27.22 bytes:  1741440/  1741440
> 
> Floating point, without orthogonalization, with gain quantization done
> the fast way
> stddev:  940.92 PSNR: 36.86 bytes:   200320/   200334
> stddev: 1010.57 PSNR: 36.24 bytes:   144000/   144014
> stddev:  904.31 PSNR: 37.20 bytes:   745280/   745294
> stddev: 3753.33 PSNR: 24.84 bytes:  5370880/  5370880
> stddev: 2612.23 PSNR: 27.99 bytes:   814400/   814400
> stddev: 3638.47 PSNR: 25.11 bytes:   432640/   432640
> stddev: 2855.30 PSNR: 27.22 bytes:  1741440/  1741440

You change 2 things relative to the previous test (fixed point -> floating
point, and brute-force -> fast gain quantization); this makes it hard to be
certain which change causes the quality loss.


> 
> Floating point, with orthogonalization, with gain quantization done the
> fast way
> stddev:  818.14 PSNR: 38.07 bytes:   200320/   200334
> stddev:  986.48 PSNR: 36.45 bytes:   144000/   144014
> stddev:  811.68 PSNR: 38.14 bytes:   745280/   745294
> stddev: 3762.86 PSNR: 24.82 bytes:  5370880/  5370880
> stddev: 2635.10 PSNR: 27.91 bytes:   814400/   814400
> stddev: 3647.02 PSNR: 25.09 bytes:   432640/   432640
> stddev: 2862.79 PSNR: 27.19 bytes:  1741440/  1741440

Some files lose quality when orthogonalization is enabled; that's odd but
possible.
Assuming there is no bug in the orthogonalization, you could try running the
gain quantization with both sets of codebook entries, the ones found with and
the ones found without orthogonalization, and keeping the better result; that
should always be at least as good. And/or avoid codebook choices that would
need quantization factors that are far away from the available values.
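
Roughly what I mean, purely a sketch with hypothetical helpers (neither of
which exists in the patch): search_codebooks() stands for adaptive_cb_search()
plus fixed_cb_search() with orthogonalization switched on or off, and
quantize_gains() stands for the existing 256-entry gain loop, returning the
smallest squared error it finds:

/* Hypothetical sketch of "search both ways, keep whichever quantizes
 * better"; the two helpers are placeholders, not functions from the patch. */
typedef struct {
    int cba_idx, cb1_idx, cb2_idx;
} CodebookChoice;

void  search_codebooks(const float *data, int use_ortho, CodebookChoice *out);
float quantize_gains(const float *data, const CodebookChoice *c, int *gain_idx);

static float encode_subblock_both_ways(const float *data,
                                       CodebookChoice *best, int *best_gain)
{
    CodebookChoice with, without;
    int gw, gwo;
    float ew, ewo;

    search_codebooks(data, 1, &with);
    search_codebooks(data, 0, &without);
    ew  = quantize_gains(data, &with,    &gw);
    ewo = quantize_gains(data, &without, &gwo);

    /* keep whichever candidate set actually quantizes better */
    if (ew <= ewo) {
        *best      = with;
        *best_gain = gw;
        return ew;
    }
    *best      = without;
    *best_gain = gwo;
    return ewo;
}

Whether the gain from that outweighs the extra search cost is something the
PSNR numbers above could answer.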

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The real ebay dictionary, page 1
"Used only once"    - "Some unspecified defect prevented a second use"
"In good condition" - "Can be repaird by experienced expert"
"As is" - "You wouldnt want it even if you were payed for it, if you knew ..."