[FFmpeg-devel] [PATCH 2/2] Add SIPR decoder for 5k0, 6k5 and 8k5 modes
Vitor Sessak
vitor1001
Thu Jan 14 05:07:36 CET 2010
Michael Niedermayer wrote:
> On Sun, Jan 10, 2010 at 04:43:18PM -0500, Vitor Sessak wrote:
>> Vitor Sessak wrote:
>>> Ronald S. Bultje wrote:
>>>> Hi Vitor,
>>>>
>>>> On Jan 7, 2010, at 10:23 PM, Vitor Sessak <vitor1001 at gmail.com> wrote:
>>>>> in: { 2, 2.2, 7, 9.5, 10 }
>>>>> out: { 2, 4, 7, 9.5, 10 }
>>>>> ff_acelp_reorder_lsf(): { 2, 4, 7, 9.5, 11.5}
>>>>> sane: { 2, 4, 7, 9.5, 11.5}
>>>>>
>>>>> here we see a problem
>>>> Ah, I see what you mean, min_dist is not enforced for the last lsf. Very
>>>> interesting. For WMAVoice, that's not the case, it's exactly like the int
>>>> version. OK, I guess that makes the discussion moot then.
>>>>
>>>> Does output quality audibly differ when adding in this "feature"? (It
>>>> might just be a decoder bug, WMAVoice is full of them.) Deviating from
>>>> binary can sometimes be OK...
>>>>
>>>> If the output doesn't change or worsens, patch OK (but please explain
>>>> this difference in a doxy); if it improves, it might be worth using that
>>>> instead.
>>> No, the output gets completely garbled if I do like in
>>> ff_acelp_reorder_lsf().
>>> Does anyone else wants to comment/review this codec? If nobody comment in
>>> a couple of days I'll commit it.
>> Committed.
>>
>> Now the patch that adds 16k support. I'm undecided if I should put this
>> code in a separated file...
>
> Considering that files always grow and that later spliting is not always
> easy, id say yes you should unless theres some disadvantage (like speed loss)
> also if split the interface between them should be documented
>
>
>> -Vitor
>
>> sipr.c | 257 ++++++++++++++++++++++++++++--
>> siprdata.h | 520 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 761 insertions(+), 16 deletions(-)
>> daa9f55138b73a225f13548e38a5b1504fc39283 sipr_16k.diff
>> Index: libavcodec/siprdata.h
>> ===================================================================
>> --- libavcodec/siprdata.h (revision 21125)
>> +++ libavcodec/siprdata.h (working copy)
>
> [...]
>
> I assume the tables arent duplicates of anything else?
I've checked it some time ago, together with the other modes.
> [...]
>> @@ -159,25 +194,57 @@
>> int gc_index[5]; ///< fixed-codebook gain indexes
>> } SiprParameters;
>>
>> +/**
>> + * Convert an lsf vector into an lsp vector.
>> + *
>> + * @param lsf input lsf vector
>> + * @param lsp output lsp vector
>> + */
>> +static void lsf2lsp(float *lsf, double *lsp)
>> +{
>> + int i;
>>
>> + for (i = 0; i < LP_FILTER_ORDER_16k; i++)
>> + lsp[i] = cos(lsf[i]);
>> +}
>> +
>
> shouldnt that go into some common code?
Since it returns doubles and takes floats as input, I'm not sure it will
be very wildly used.
> also i assume cosf() is worse quality wise?
No, changed.
>> -static void dequant(float *out, const int *idx, const float *cbs[])
>> +static void dequant(float *out, const int *idx, const float *cbs[],
>> + int mode_16k)
>> {
>> int i;
>> - int stride = 2;
>> - int num_vec = 5;
>
>> + int stride = mode_16k ? 3 : 2;
>> + int num_vec = 7 - stride;
>>
>> for (i = 0; i < num_vec; i++)
>> memcpy(out + stride*i, cbs[i] + stride*idx[i], stride*sizeof(float));
>>
>> + if (mode_16k)
>> + memcpy(out + 12, cbs[4] + 4*idx[4], 4*sizeof(float));
>
> this is probably more efficient and readable like:
> if(mode_16k){
> for (i = 0; i < 4; i++)
> memcpy(out + 3*i, cbs[i] + 3*idx[i], 3*sizeof(float));
> memcpy(out + 12, cbs[4] + 4*idx[4], 4*sizeof(float));
> }else
> for (i = 0; i < 5; i++)
> memcpy(out + 2*i, cbs[i] + 2*idx[i], 2*sizeof(float));
Split in a different function per file.
>> }
>>
>> +static void lsf_decode_fp_16k(float* lsf_history, float* isp_new,
>> + const int* parm, int ma_pred)
>> +{
>> + int i;
>> + float isp_q[LP_FILTER_ORDER_16k];
>> +
>> + dequant(isp_q, parm, lsf_codebooks_16k, 1);
>> +
>> + for (i = 0; i < LP_FILTER_ORDER_16k; i++) {
>
>> + isp_new[i] = (1 - qu[ma_pred]) * isp_q[i] +
>> + qu[ma_pred] * lsf_history[i] + mean_lsf_16k[i];
>
> cosmetic suggestion:
> isp_new[i] = (1 - qu[ma_pred]) * isp_q[i]
> + qu[ma_pred] * lsf_history[i] + mean_lsf_16k[i];
> or
Did this one.
[...]
>> +}
>> +
>> +static void decode_frame_16k(SiprContext *ctx, SiprParameters *params,
>> + float *out_data)
>> +{
>> + int frame_size = ctx->m.subframe_count * L_SUBFR_16k;
>> + float *synth = ctx->synth_buf + LP_FILTER_ORDER_16k;
>> + float lsf_new[LP_FILTER_ORDER_16k];
>> + double lsp_new[LP_FILTER_ORDER_16k];
>> + float Az[2][LP_FILTER_ORDER_16k];
>> + float fixed_vector[L_SUBFR_16k];
>> + float pitch_fac, gain_code;
>> +
>> + int i;
>> + int pitch_delay_3x, pitch_lag;
>> + int pitch_delay_frac;
>> +
>> + float *excitation = ctx->excitation + 292;
>> +
>> + lsf_decode_fp_16k(ctx->lsf_history, lsf_new, params->vq_indexes,
>> + params->ma_pred_switch);
>> +
>> + ff_set_min_dist_lsf(lsf_new, LSFQ_DIFF_MIN / 2, LP_FILTER_ORDER_16k);
>> +
>> + lsf2lsp(lsf_new, lsp_new);
>> +
>> + ff_acelp_lp_decodef(Az[0], Az[1], lsp_new, ctx->lsp_history_16k,
>> + LP_FILTER_ORDER_16k);
>> +
>> + memcpy(ctx->lsp_history_16k, lsp_new, LP_FILTER_ORDER_16k * sizeof(double));
>> +
>> + memcpy(synth - LP_FILTER_ORDER_16k, ctx->synth,
>> + LP_FILTER_ORDER_16k * sizeof(*synth));
>> +
>> + for (i = 0; i < ctx->m.subframe_count; i++) {
>> + int i_subfr = i * L_SUBFR_16k;
>> + AMRFixed f;
>> + float gain_corr_factor;
>> +
>> + if (!i)
>> + pitch_delay_3x = dec_delay3_1st(params->pitch_delay[i]);
>> + else
>> + pitch_delay_3x = dec_delay3_2nd(params->pitch_delay[i],
>> + PITCH_MIN, PITCH_MAX,
>> + ctx->pitch_lag_prev);
>> +
>> + pitch_delay_frac = (pitch_delay_3x+1)%3 - 1;
>> + pitch_lag = (pitch_delay_3x+1)/3;
>> +
>> + pitch_fac = gain_pitch_cb_16k[params->gp_index[i]];
>> +
>> + ff_acelp_interpolatef(&excitation[i_subfr],
>> + &excitation[i_subfr] - (pitch_delay_3x+2)/3 + 1,
>> + sinc_win, 3, (pitch_delay_3x+2)%3 + 1,
>> + LP_FILTER_ORDER, L_SUBFR_16k);
>
> i would suspect that the code can be changed so it doesnt need pitch_delay_3x
> split twice in different ways, but i might be wrong.
I thought a few minutes about it and found no obvious way and this code
is not really speed critical...
> At least % and / are slow and should be avoided, a multiply and >> could
> likely replace the /3 accurately for the limited range of input and the
> % can be done from - pitch_lag*3
done
>> +
>> +
>> + memset(fixed_vector, 0, sizeof(fixed_vector));
>> +
>> + ff_decode_10_pulses_35bits(params->fc_indexes[i], &f,
>> + ff_fc_4pulses_8bits_tracks_13, 5, 4);
>> +
>> + f.pitch_lag = pitch_lag;
>> + f.pitch_fac = FFMIN(pitch_fac, 1.0);
>> +
>> + ff_set_fixed_vector(fixed_vector, &f, 1.0, L_SUBFR_16k);
>> +
>> + gain_corr_factor = gain_cb_16k[params->gc_index[i]];
>> + gain_code = gain_corr_factor *
>> + ff_acelp_decode_gain_codef(sqrt(L_SUBFR_16k), fixed_vector,
>
>> + 19.0 - 15.0/(log2f(10.0)*0.05),
>
> you dont need log2fm log2 should be fine for a constant in case there is
> no standard #defined one
Fixed
>> + pred_16k, ctx->energy_history,
>> + L_SUBFR_16k, 2);
>> +
>> + ctx->energy_history[1] = ctx->energy_history[0];
>> + ctx->energy_history[0] = 20.0*log10(gain_corr_factor);
>
> log10 not log10f, is that intended?
No, fixed.
> [...]
>> @@ -533,20 +752,21 @@
>> ctx->m = modes[ctx->mode];
>> av_log(avctx, AV_LOG_DEBUG, "Mode: %s\n", ctx->m.mode_name);
>>
>> + for (i = 0; i < LP_FILTER_ORDER_16k; i++)
>> + ctx->lsp_history_16k[i] = cos((i+1) * M_PI / (LP_FILTER_ORDER_16k + 1));
>> +
>> for (i = 0; i < LP_FILTER_ORDER; i++)
>> ctx->lsp_history[i] = cos((i+1) * M_PI / (LP_FILTER_ORDER + 1));
>
> cant these use the same array?
No, different types (double/float).
-Vitor
More information about the ffmpeg-devel
mailing list