[FFmpeg-devel] [PATCH] TwinVQ decoder
Reimar Döffinger
Reimar.Doeffinger
Sun Aug 2 16:30:14 CEST 2009
On Sun, Aug 02, 2009 at 03:18:02PM +0200, Vitor Sessak wrote:
> static void memset_float(float *buf, float val, int size)
Document that it is not performance relevant and thus not optimized at all.
> static void add_vec(int cont, const float *buf1, const float *buf2, float *buf3)
Same.
Maybe
> const float *cos_tab = ff_cos_tabs[av_log2(size_s)-1];
cos_tab += 2;
> for (i=0; i < size_s/2; i++) {
> lpc[i] = eval_lpc_spectrum(cos_vals, cos_tab[4*i + 2],
> mtab->n_lsp);
> lpc[size_s-i-1] = eval_lpc_spectrum(cos_vals, -cos_tab[4*i + 2],
> mtab->n_lsp);
*cos_tab and -*cos_tab
cos_tab += 4;
Or at least a tmp variable for the value of cos_tab[4*i + 2].
IMO it hurts readability a lot when the reader has to find out
by themselves if the terms are identical (the compiler probably
won't care).
> get_cos(i*4+2, part, cos_tab, 2*size),
I guess the answer is "no", but since cos_tab is usually accessed
with 4*i+2 as index I wonder if it would make sense to rearrange it
so you can use e.g. compact_cos_tab[i] directly...
> for (i=step; i <= size - 2*step; i += step) {
Missing spaces.
> if ((out[i + step] + out[i - step] > 1.95*out[i]) ||
> (out[i - step] <= out[i + step])) {
Useless ().
And how about
> if (out[i + step] + out[i - step] > 1.95*out[i] ||
> out[i + step] >= out[i - step]) {
> interpolate(out + i-step + 1, out[i], out[i-step], step - 1);
> interpolate(out+i-step +1, out[i-step/2], out[i-step ], step/2-1);
> interpolate(out+i-step/2+1, out[i ], out[i-step/2], step/2-1);
> interpolate(out +size - 2*step+1, out[size-step], out[size-2*step], step-1);
You don't place spaces particularly consistently...
> /**
> * Rearrange the LSP coefficients so that they have a minimum distance of
> * min_dist. This function does it exactly as described in section of 3.2.4
> * of the G.729 specification (but interestingly is different from what the
> * reference decoder actually does).
> */
> static void rearrange_lsp(int order, float *lsp, float min_dist)
> {
> int i;
> for (i=1; i<order; i++)
> if (lsp[i] - lsp[i-1] < min_dist) {
> float new_prev = (lsp[i] + lsp[i-1] - min_dist) * 0.5;
> float new_cur = (lsp[i] + lsp[i-1] + min_dist) * 0.5;
Uh, that is not the same as the old code... Which one is right now?
This formula could be simplified to
float mindist2 = mindist * 0.5;
float avg = (lsp[i] + lsp[i-1]) * 0.5;
lsp[i-1] = avg - mindist2;
lsp[i ] = avg + mindist2;
That wasn't possible with the formula you had before, since the new value
of lsp[i] depended on the new value of lsp[i-1];
> /* sort lsp in ascending order. float bubble agorithm,
> O(n) if data already sorted, O(n^2) - otherwise */
> for (i=0; i<lp_order-1; i++)
> for (j=i; j>=0 && lsp[j] > lsp[j+1]; j--)
Inconsistent spaces.
> const float *cb = mtab->lspcodebook;
> const float *cb2 = mtab->lspcodebook + (1 << mtab->lsp_bit1)*mtab->n_lsp;
> const float *cb3 = mtab->lspcodebook + (1 << mtab->lsp_bit1)*mtab->n_lsp +
> (1 << mtab->lsp_bit2)*mtab->n_lsp;
Huh?
cb2 = cb + (1 << mtab->lsp_bit1)*mtab->n_lsp;
cb3 = cb2 + (1 << mtab->lsp_bit2)*mtab->n_lsp;
Seems nicer/simpler...
> case 5:
> wsize = size_m;
> next_wsize = mtab->size;
> break;
Indentation?
> sizes[0] = (block_size-wsize)/2;
> sizes[1] = wsize;
> sizes[2] = (block_size-wsize)/2;
sizes[2] = sizes[0]; ?
I think that sizes is not really a good name anyway and making
it an array has no real advantage.
> for (j=0; j < mtab->fmode[FT_MEDIUM].sub; j++) {
> if (wtype == 4 && !j) {
> sub_wtype = 4;
> } else if (wtype == 7 && j == mtab->fmode[FT_MEDIUM].sub-1) {
> sub_wtype = 7;
> } else
> sub_wtype = 8;
Hm. I'd try to at least make it easier to optimize by writing it as e.g.
sub_wtype = 8;
if (!j || j == mtab->fmode[FT_MEDIUM].sub-1) {
if (!j && wtype == 4) sub_wtype = 4;
if ( j && wtype == 7) sub_wtype = 7;
}
Or something like that...
Not sure though.
> for (i=0; i < fw_cb_len; i++)
> for (j=0; j < bark_n_coef; j++) {
> int idx = j + i*bark_n_coef;
> int idx = 0;
> for (i=0; i < fw_cb_len; i++)
> for (j=0; j < bark_n_coef; j++, idx++) {
is the same and might avoid one multiply per iteration if the compiler is stupid?
> float st = (use_hist) ?
Useless ()
> for (j=0; j < mtab->fmode[ftype].sub; j++)
> tctx->dsp.vector_fmul(chunk + j*block_size, tctx->tmp_buf,
> block_size);
Could do
chunk += block_size;
Not sure if it is really better, it splits the complexity a bit across
two code lines...
> init_get_bits(&gb, buf, buf_size * 8);
I haven't checked, but are you regularly checking that you don't actually
overread?
> if ((num_blocks == 1) ||
> (ftype == FT_LONG && num_vect % num_blocks) ||
> (ftype != FT_LONG && num_vect & 1 ) ||
> (i == line_len[1])) {
Quite a few () too many...
More information about the ffmpeg-devel
mailing list