[FFmpeg-devel] [PATCH] TwinVQ decoder
Vitor Sessak
vitor1001
Sun Aug 2 18:25:16 CEST 2009
Reimar Doeffinger wrote:
> On Sun, Aug 02, 2009 at 03:18:02PM +0200, Vitor Sessak wrote:
>> static void memset_float(float *buf, float val, int size)
>
> Document that it is not performance relevant and thus not optimized at all.
>
>> static void add_vec(int cont, const float *buf1, const float *buf2, float *buf3)
>
> Same.
Done
> Maybe
>> const float *cos_tab = ff_cos_tabs[av_log2(size_s)-1];
>
> cos_tab += 2;
>> for (i=0; i < size_s/2; i++) {
>> lpc[i] = eval_lpc_spectrum(cos_vals, cos_tab[4*i + 2],
>> mtab->n_lsp);
>> lpc[size_s-i-1] = eval_lpc_spectrum(cos_vals, -cos_tab[4*i + 2],
>> mtab->n_lsp);
>
> *cos_tab and -*cos_tab
> cos_tab += 4;
>
> Or at least a tmp variable for the value of cos_tab[4*i + 2].
> IMO it hurts readability a lot when the reader has to find out
> by themselves if the terms are identical (the compiler probably
> won't care).
Done with a tmp var.
>> get_cos(i*4+2, part, cos_tab, 2*size),
>
> I guess the answer is "no", but since cos_tab is usually accessed
> with 4*i+2 as index I wonder if it would make sense to rearrange it
> so you can use e.g. compact_cos_tab[i] directly...
Thanks, I checked and the other values of the table are never used. Now
the code reads just get_cos(i, part, cos_tab, 2*size) and the cos tables
are 4x smaller.
>> for (i=step; i <= size - 2*step; i += step) {
>
> Missing spaces.
where?
>> if ((out[i + step] + out[i - step] > 1.95*out[i]) ||
>> (out[i - step] <= out[i + step])) {
>
> Useless ().
> And how about
>
>> if (out[i + step] + out[i - step] > 1.95*out[i] ||
>> out[i + step] >= out[i - step]) {
I like it, changed.
>> interpolate(out + i-step + 1, out[i], out[i-step], step - 1);
>> interpolate(out+i-step +1, out[i-step/2], out[i-step ], step/2-1);
>> interpolate(out+i-step/2+1, out[i ], out[i-step/2], step/2-1);
>> interpolate(out +size - 2*step+1, out[size-step], out[size-2*step], step-1);
>
> You don't place spaces particularly consistently...
I hate breaking lines more than I like consistent spaces :(
>> /**
>> * Rearrange the LSP coefficients so that they have a minimum distance of
>> * min_dist. This function does it exactly as described in section of 3.2.4
>> * of the G.729 specification (but interestingly is different from what the
>> * reference decoder actually does).
>> */
>> static void rearrange_lsp(int order, float *lsp, float min_dist)
>> {
>> int i;
>> for (i=1; i<order; i++)
>> if (lsp[i] - lsp[i-1] < min_dist) {
>> float new_prev = (lsp[i] + lsp[i-1] - min_dist) * 0.5;
>> float new_cur = (lsp[i] + lsp[i-1] + min_dist) * 0.5;
>
> Uh, that is not the same as the old code... Which one is right now?
You spotted a bug. Before was wrong (not that the output changed a lot)...
> This formula could be simplified to
> float mindist2 = mindist * 0.5;
> float avg = (lsp[i] + lsp[i-1]) * 0.5;
> lsp[i-1] = avg - mindist2;
> lsp[i ] = avg + mindist2;
> That wasn't possible with the formula you had before, since the new value
> of lsp[i] depended on the new value of lsp[i-1];
done
>> /* sort lsp in ascending order. float bubble agorithm,
>> O(n) if data already sorted, O(n^2) - otherwise */
>> for (i=0; i<lp_order-1; i++)
>> for (j=i; j>=0 && lsp[j] > lsp[j+1]; j--)
>
> Inconsistent spaces.
>
>> const float *cb = mtab->lspcodebook;
>> const float *cb2 = mtab->lspcodebook + (1 << mtab->lsp_bit1)*mtab->n_lsp;
>> const float *cb3 = mtab->lspcodebook + (1 << mtab->lsp_bit1)*mtab->n_lsp +
>> (1 << mtab->lsp_bit2)*mtab->n_lsp;
>
> Huh?
> cb2 = cb + (1 << mtab->lsp_bit1)*mtab->n_lsp;
> cb3 = cb2 + (1 << mtab->lsp_bit2)*mtab->n_lsp;
>
> Seems nicer/simpler...
indeed
>> case 5:
>> wsize = size_m;
>> next_wsize = mtab->size;
>> break;
>
> Indentation?
oops, fixed
>> sizes[0] = (block_size-wsize)/2;
>> sizes[1] = wsize;
>> sizes[2] = (block_size-wsize)/2;
>
> sizes[2] = sizes[0]; ?
>
> I think that sizes is not really a good name anyway and making
> it an array has no real advantage.
It has no disadvantage either (this part of the code is not very speed
critical). Also, since it is the sizes of chunks, I find it more
readable this way.
>> for (j=0; j < mtab->fmode[FT_MEDIUM].sub; j++) {
>> if (wtype == 4 && !j) {
>> sub_wtype = 4;
>> } else if (wtype == 7 && j == mtab->fmode[FT_MEDIUM].sub-1) {
>> sub_wtype = 7;
>> } else
>> sub_wtype = 8;
>
> Hm. I'd try to at least make it easier to optimize by writing it as e.g.
> sub_wtype = 8;
> if (!j || j == mtab->fmode[FT_MEDIUM].sub-1) {
> if (!j && wtype == 4) sub_wtype = 4;
> if ( j && wtype == 7) sub_wtype = 7;
> }
>
> Or something like that...
> Not sure though.
I prefer as you suggested, changed.
>> for (i=0; i < fw_cb_len; i++)
>> for (j=0; j < bark_n_coef; j++) {
>> int idx = j + i*bark_n_coef;
>
>> int idx = 0;
>> for (i=0; i < fw_cb_len; i++)
>> for (j=0; j < bark_n_coef; j++, idx++) {
>
> is the same and might avoid one multiply per iteration if the compiler is stupid?
>
>> float st = (use_hist) ?
>
> Useless ()
>
>> for (j=0; j < mtab->fmode[ftype].sub; j++)
>> tctx->dsp.vector_fmul(chunk + j*block_size, tctx->tmp_buf,
>> block_size);
>
> Could do
> chunk += block_size;
> Not sure if it is really better, it splits the complexity a bit across
> two code lines...
All three done.
>
>> init_get_bits(&gb, buf, buf_size * 8);
Oops, I completely forgot that. Fortunately it is pretty trivial for
twinvq (the number of bits per frame is fixed). Done.
New version attached.
-Vitor
-------------- next part --------------
A non-text attachment was scrubbed...
Name: twinvq.c
Type: text/x-csrc
Size: 39021 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090802/036975f7/attachment.c>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: twinvq_data.h
Type: text/x-chdr
Size: 13509 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090802/036975f7/attachment.h>
More information about the ffmpeg-devel
mailing list