[FFmpeg-devel] [PATCH] QCELP decoder
Michael Niedermayer
michaelni
Sun Oct 5 02:09:22 CEST 2008
On Fri, Oct 03, 2008 at 03:48:52PM -0700, Kenan Gillet wrote:
> Hi,
>
> here is a round 2 of the patch based on feedback from Vitor and Diego.
> It includes:
> - some spelling/grammar fixes,
> - some cosmetics,
> - changes to output float instead of int,
> - bug fixes uncovered from the change of output,
> - improvements to the pitch pre/synthesis filters.
>
> Kenan
[...]
> +typedef enum
> +{
> + BLANK = 0,
is this supposed to mean silence? if so it should be named accordingly
[...]
> +static int qcelp_find_frame_end(const uint8_t *buf, const int buf_size) {
> + // Let's try and see if this packet holds exactly one frame
> +
> + switch (buf_size) {
> + case 35: // RATE_FULL
> + case 17: // RATE_HALF
> + case 8: // RATE_QUARTER
> + case 4: // RATE_OCTAVE
> +
> + // the first byte describing the type of frame is missing.
> + // TODO: not tested, it needs samples.
> + case 34: // RATE_FULL
> + case 16: // RATE_HALF
> + case 7: // RATE_QUARTER
> + case 3: // RATE_OCTAVE
> + return buf_size;
> +
> + case 2:
> + case 1:
> + case 0:
> + return END_NOT_FOUND;
> + }
> +
> + /*
> + * If we reach this point it means the packet holds a multiset of
> + * frames, each one of them in codec frame format, all with the same
> + * framerate, as described in:
> + *
> + * http://tools.ietf.org/html/draft-mckay-qcelp-02
> + *
> + * TODO: not tested, it needs samples.
> + */
> +
> + switch (buf[0]) {
> + case RATE_FULL:
> + return 35;
> + case RATE_HALF:
> + return 17;
> + case RATE_QUARTER:
> + return 8;
> + case RATE_OCTAVE:
> + return 4;
> + }
> +
> + return END_NOT_FOUND;
> +}
The code above looks wrong and messy. Either
A. a parser is needed, in which case it must function with any amount of data;
that is, its input may be a single byte or arbitrary pieces of the
bitstream (the code above clearly cannot handle these), or
B. no parser is needed (in which case this file is clearly unneeded), or
C. something else is going on, but then it needs to be documented and
explained.
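To illustrate option A, here is a minimal standalone sketch of a splitter that keeps state between calls and therefore tolerates input cut at any byte. It is not written against the FFmpeg parser API. The names (FrameSplitter, sk_feed, the SK_RATE_* codes) are hypothetical, the rate values are only inferred from the table order in the patch, and it assumes every frame starts with its rate byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rate codes, inferred from the table order in the patch. */
enum { SK_RATE_OCTAVE = 1, SK_RATE_QUARTER = 2, SK_RATE_HALF = 3, SK_RATE_FULL = 4 };

typedef struct {
    int remaining; /* bytes still missing from the current frame; 0 = between frames */
} FrameSplitter;

/* Total frame size in bytes (rate byte included), or -1 for an unknown rate. */
static int sk_frame_size(uint8_t rate)
{
    switch (rate) {
    case SK_RATE_FULL:    return 35;
    case SK_RATE_HALF:    return 17;
    case SK_RATE_QUARTER: return 8;
    case SK_RATE_OCTAVE:  return 4;
    }
    return -1;
}

/*
 * Feed a chunk of any size, even a single byte.
 * Returns the number of frames that end inside this chunk,
 * or -1 if a frame starts with an unknown rate byte.
 */
static int sk_feed(FrameSplitter *s, const uint8_t *buf, size_t size)
{
    int frames = 0;
    size_t i = 0;

    while (i < size) {
        size_t take;

        if (s->remaining == 0) {
            /* We are at a frame boundary: the next byte is a rate byte. */
            int len = sk_frame_size(buf[i]);
            if (len < 0)
                return -1;
            s->remaining = len;
        }
        /* Consume as much of the current frame as this chunk contains. */
        take = size - i < (size_t)s->remaining ? size - i : (size_t)s->remaining;
        i            += take;
        s->remaining -= (int)take;
        if (s->remaining == 0)
            frames++;
    }
    return frames;
}
```

Because the only state is the number of bytes still owed to the current frame, feeding 21 bytes one at a time or all at once yields the same frame count.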
[...]
> +#define QCELP_RATE_FULL_BITMAP \
> +{62,2},{62,1},{62,0},{61,6},{61,5},{61,4},{61,3},{61,2},\
> +{61,1},{61,0},{60,5},{60,4},{60,3},{60,2},{60,1},{60,0},\
> +{64,5},{64,4},{64,3},{64,2},{64,1},{64,0},{63,5},{63,4},\
> +{63,3},{63,2},{63,1},{63,0},{62,6},{62,5},{62,4},{62,3},\
> +{ 0,0},{16,3},{16,2},{16,1},{16,0},{52,0},{48,6},{48,5},\
> +{48,4},{48,3},{48,2},{48,1},{48,0},{56,2},{56,1},{56,0},\
> +{33,3},{33,2},{33,1},{33,0},{ 1,0},{17,3},{17,2},{17,1},\
> +{17,0},{32,6},{32,5},{32,4},{32,3},{32,2},{32,1},{32,0},\
> +{19,0},{34,6},{34,5},{34,4},{34,3},{34,2},{34,1},{34,0},\
> +{ 2,0},{18,3},{18,2},{18,1},{18,0},{33,6},{33,5},{33,4},\
> +{49,2},{49,1},{49,0},{57,2},{57,1},{57,0},{35,6},{35,5},\
> +{35,4},{35,3},{35,2},{35,1},{35,0},{ 3,0},{19,2},{19,1},\
> +{36,5},{36,4},{36,3},{36,2},{36,1},{36,0},{ 4,0},{20,3},\
> +{20,2},{20,1},{20,0},{53,0},{49,6},{49,5},{49,4},{49,3},\
> +{22,2},{22,1},{22,0},{37,6},{37,5},{37,4},{37,3},{37,2},\
> +{37,1},{37,0},{ 5,0},{21,3},{21,2},{21,1},{21,0},{36,6},\
> +{39,2},{39,1},{39,0},{ 7,0},{23,2},{23,1},{23,0},{38,6},\
> +{38,5},{38,4},{38,3},{38,2},{38,1},{38,0},{ 6,0},{22,3},\
> +{24,0},{54,0},{50,6},{50,5},{50,4},{50,3},{50,2},{50,1},\
> +{50,0},{58,2},{58,1},{58,0},{39,6},{39,5},{39,4},{39,3},\
> +{ 9,0},{25,3},{25,2},{25,1},{25,0},{40,6},{40,5},{40,4},\
> +{40,3},{40,2},{40,1},{40,0},{ 8,0},{24,3},{24,2},{24,1},\
> +{42,3},{42,2},{42,1},{42,0},{10,0},{26,3},{26,2},{26,1},\
> +{26,0},{41,6},{41,5},{41,4},{41,3},{41,2},{41,1},{41,0},\
> +{59,1},{59,0},{43,6},{43,5},{43,4},{43,3},{43,2},{43,1},\
> +{43,0},{11,0},{27,2},{27,1},{27,0},{42,6},{42,5},{42,4},\
> +{44,1},{44,0},{12,0},{28,3},{28,2},{28,1},{28,0},{55,0},\
> +{51,6},{51,5},{51,4},{51,3},{51,2},{51,1},{51,0},{59,2},\
> +{45,5},{45,4},{45,3},{45,2},{45,1},{45,0},{13,0},{29,3},\
> +{29,2},{29,1},{29,0},{44,6},{44,5},{44,4},{44,3},{44,2},\
> +{31,2},{31,1},{31,0},{46,6},{46,5},{46,4},{46,3},{46,2},\
> +{46,1},{46,0},{14,0},{30,3},{30,2},{30,1},{30,0},{45,6},\
> +{65,1},{65,0},{47,6},{47,5},{47,4},{47,3},{47,2},{47,1},\
> +{47,0},{15,0}
> +
> +#define QCELP_RATE_HALF_BITMAP \
> +{62,2},{62,1},{62,0},{61,6},{61,5},{61,4},{61,3},{61,2},\
> +{61,1},{61,0},{60,5},{60,4},{60,3},{60,2},{60,1},{60,0},\
> +{64,5},{64,4},{64,3},{64,2},{64,1},{64,0},{63,5},{63,4},\
> +{63,3},{63,2},{63,1},{63,0},{62,6},{62,5},{62,4},{62,3},\
> +{ 0,0},{16,3},{16,2},{16,1},{16,0},{52,0},{48,6},{48,5},\
> +{48,4},{48,3},{48,2},{48,1},{48,0},{56,2},{56,1},{56,0},\
> +{49,5},{49,4},{49,3},{49,2},{49,1},{49,0},{57,2},{57,1},\
> +{57,0},{32,6},{32,5},{32,4},{32,3},{32,2},{32,1},{32,0},\
> +{58,1},{58,0},{33,6},{33,5},{33,4},{33,3},{33,2},{33,1},\
> +{33,0},{ 1,0},{17,3},{17,2},{17,1},{17,0},{53,0},{49,6},\
> +{34,1},{34,0},{ 2,0},{18,3},{18,2},{18,1},{18,0},{54,0},\
> +{50,6},{50,5},{50,4},{50,3},{50,2},{50,1},{50,0},{58,2},\
> +{55,0},{51,6},{51,5},{51,4},{51,3},{51,2},{51,1},{51,0},\
> +{59,2},{59,1},{59,0},{34,6},{34,5},{34,4},{34,3},{34,2},\
> +{35,6},{35,5},{35,4},{35,3},{35,2},{35,1},{35,0},{ 3,0},\
> +{19,3},{19,2},{19,1},{19,0}
> +
> +#define QCELP_RATE_4THR_BITMAP \
> +{62,2},{62,1},{62,0},{61,6},{61,5},{61,4},{61,3},{61,2},\
> +{61,1},{61,0},{60,5},{60,4},{60,3},{60,2},{60,1},{60,0},\
> +{64,5},{64,4},{64,3},{64,2},{64,1},{64,0},{63,5},{63,4},\
> +{63,3},{63,2},{63,1},{63,0},{62,6},{62,5},{62,4},{62,3},\
> +{19,3},{19,2},{19,1},{19,0},{18,3},{18,2},{18,1},{18,0},\
> +{17,3},{17,2},{17,1},{17,0},{16,3},{16,2},{16,1},{16,0},\
> +{65,1},{65,0},{20,3},{20,2},{20,1},{20,0}
> +
> +#define QCELP_RATE_8THR_BITMAP \
> +{76,3},{66,0},{67,0},{68,0},{76,2},{69,0},{70,0},{71,0},\
> +{76,1},{72,0},{73,0},{74,0},{76,0},{75,0},{16,1},{16,0},\
> +{65,3},{65,2},{65,1},{65,0}
> +
> +static const QCELPBitmap QCELP_REFERENCE_FRAME[464] = {QCELP_RATE_FULL_BITMAP,
> + QCELP_RATE_HALF_BITMAP,
> + QCELP_RATE_4THR_BITMAP,
> + QCELP_RATE_8THR_BITMAP};
> +/**
> + * position of the bitmapping data for each pkt type in
> + * the big REFERENCE FRAME array
> + */
> +static const QCELPBitmap *qcelp_unpacking_bitmpap_per_rate[6] = {
> + NULL, /*!< for BLANK */
> + QCELP_REFERENCE_FRAME + 444, /*!< for RATE_OCTAVE */
> + QCELP_REFERENCE_FRAME + 390, /*!< for RATE_QUARTER */
> + QCELP_REFERENCE_FRAME + 266, /*!< for RATE_HALF */
> + QCELP_REFERENCE_FRAME, /*!< for RATE_FULL */
> + NULL /*!< for I_F_Q */
> +};
4 arrays and 1 array of pointers to the 4 arrays would be cleaner IMHO
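A minimal sketch of that layout, assuming a two-byte entry type. SketchBitmap and its field names are stand-ins, since the definition of QCELPBitmap is not quoted here, and the tables are truncated to a few entries for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for QCELPBitmap; the real field names are not shown in the mail. */
typedef struct {
    uint8_t index;
    uint8_t bitpos;
} SketchBitmap;

/* Truncated tables: a real version would list every entry of the
 * corresponding QCELP_RATE_*_BITMAP macro. */
static const SketchBitmap sk_rate_full_bitmap[]    = { {62,2}, {62,1}, {62,0} };
static const SketchBitmap sk_rate_half_bitmap[]    = { {62,2}, {62,1} };
static const SketchBitmap sk_rate_quarter_bitmap[] = { {62,2} };
static const SketchBitmap sk_rate_octave_bitmap[]  = { {76,3}, {66,0} };

#define SK_ELEMS(a) (sizeof(a) / sizeof((a)[0]))

/* One pointer per rate; no hand-computed offsets such as +266 or +444. */
static const SketchBitmap *const sk_bitmap_per_rate[6] = {
    NULL,                   /* BLANK        */
    sk_rate_octave_bitmap,  /* RATE_OCTAVE  */
    sk_rate_quarter_bitmap, /* RATE_QUARTER */
    sk_rate_half_bitmap,    /* RATE_HALF    */
    sk_rate_full_bitmap,    /* RATE_FULL    */
    NULL                    /* I_F_Q        */
};

/* The lengths come from the compiler instead of magic numbers. */
static const size_t sk_bitmap_len_per_rate[6] = {
    0,
    SK_ELEMS(sk_rate_octave_bitmap),
    SK_ELEMS(sk_rate_quarter_bitmap),
    SK_ELEMS(sk_rate_half_bitmap),
    SK_ELEMS(sk_rate_full_bitmap),
    0
};
```

With separate arrays, adding or removing an entry in one table cannot silently shift the start of another.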
> +
> +/**
> + * position of the transmission codes inside the universal frame
> + */
> +
> +#define QCELP_CBSIGN0_POS 0
> +#define QCELP_CBGAIN0_POS 16
> +#define QCELP_CINDEX0_POS 32
> +#define QCELP_PLAG0_POS 48
> +#define QCELP_PFRAC0_POS 52
> +#define QCELP_PGAIN0_POS 56
> +#define QCELP_LSPV0_POS 60
> +#define QCELP_RSRVD_POS 65 /*!< on all but rate 1/2 packets */
> +#define QCELP_LSP0_POS 66 /*!< only in rate 1/8 packets */
> +#define QCELP_CBSEED_POS 76 /*!< only in rate 1/8 packets */
instead of the funny 'something = data + QCELP*_POS; something' pattern, please
make this use context->something.
The "translation" tables should not contain indexes into an abstract byte
array but into a struct IMHO.
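One way this could look, as a rough sketch: the struct below mirrors the sizes implied by the QCELP_*_POS defines, and the table entries are built with offsetof() so they name fields rather than raw byte positions. SketchFrame, SK_BIT and sk_set_bit are hypothetical names, not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame layout; field sizes follow the QCELP_*_POS defines. */
typedef struct {
    uint8_t cbsign[16];
    uint8_t cbgain[16];
    uint8_t cindex[16];
    uint8_t plag[4];
    uint8_t pfrac[4];
    uint8_t pgain[4];
    uint8_t lspv[5];
    uint8_t reserved;
    uint8_t lsp[10];   /* only in rate 1/8 packets */
    uint8_t cbseed;    /* only in rate 1/8 packets */
} SketchFrame;

typedef struct {
    uint8_t offset;    /* byte offset of the target field element */
    uint8_t bitpos;    /* bit position inside that element */
} SketchBitmap;

/* Table entries refer to struct fields, not to magic byte indexes. */
#define SK_BIT(field, n, bit) { offsetof(SketchFrame, field) + (n), (bit) }

static const SketchBitmap sk_example_bitmap[] = {
    SK_BIT(lspv,   2, 2), SK_BIT(lspv,   2, 1),
    SK_BIT(cbgain, 0, 3), SK_BIT(cbsign, 0, 0),
};

/* Store one unpacked bit into the frame according to a table entry. */
static void sk_set_bit(SketchFrame *f, const SketchBitmap *b, int value)
{
    ((uint8_t *)f)[b->offset] |= (uint8_t)((value & 1) << b->bitpos);
}
```

The decoding functions could then read frame->lspv[i] or frame->cbgain[i] directly instead of computing pointers into a byte array.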
[...]
> +static const double qcelp_rnd_fir_coefs[22] = {
> + 0 , -1.344519e-1, 1.735384e-2, -6.905826e-2,
> + 2.434368e-2, -8.210701e-2, 3.041388e-2, -9.251384e-2,
> + 3.501983e-2, -9.918777e-2, 3.749518e-2, 8.985137e-1,
> + 3.749518e-2, -9.918777e-2, 3.501983e-2, -9.251384e-2,
> + 3.041388e-2, -8.210701e-2, 2.434368e-2, -6.905826e-2,
> + 1.735384e-2, -1.344519e-1
> +}; /*!< Start reading from [1]. */
> +
> +#define QCELP_LSP_SPREAD_FACTOR 0.02
> +#define QCELP_LSP_OCTAVE_PREDICTOR 0.90625
a comment of "29/32" could be useful
[...]
> +static void interpolate_lspf(float *interpolated_lspf, const float curr_weight,
> + const float *curr_lspf, const float *prev_lspf) {
> + int i;
> +
> + for (i=0;i<10;i++)
> + interpolated_lspf[i] = (1.0 - curr_weight) * prev_lspf[i]
> + + curr_weight * curr_lspf[i];
this can be vertically aligned
[...]
> +
> +/**
> + * Decodes the 10 quantized LSP frequencies from the LSPV/LSP
> + * transmission codes of any framerate and check for badly
> + * received packets.
> + *
> + * TIA/EIA/IS-733 2.4.3.2.6.2-2, 2.4.8.7.3
> + */
> +static int decode_lspf(QCELPContext *q, float *lspf) {
> + const uint8_t *lspv;
> + int i;
> + float predictor;
> +
> + if (q->rate == RATE_OCTAVE) {
> + q->octave_count++;
> +
> + lspv=q->data+QCELP_LSP0_POS;
> + for (i=0; i<10; i++) {
> + lspf[i] = (i + 1) / (float)(10 + 1)
> + + (lspv[i] ? QCELP_LSP_SPREAD_FACTOR * QCELP_LSP_OCTAVE_PREDICTOR
> + : -QCELP_LSP_SPREAD_FACTOR * QCELP_LSP_OCTAVE_PREDICTOR);
> + }
> +
> + // Check the stability of the LSP frequencies.
> + if (lspf[0] < QCELP_LSP_SPREAD_FACTOR)
> + lspf[0] = QCELP_LSP_SPREAD_FACTOR;
this condition can never be true
> + for (i = 1; i < 10; i++) {
> + if (lspf[i] < lspf[i-1] + QCELP_LSP_SPREAD_FACTOR)
> + lspf[i] = lspf[i-1] + QCELP_LSP_SPREAD_FACTOR;
> + }
neither can this
> + if (lspf[9] > 1.0 - QCELP_LSP_SPREAD_FACTOR)
> + lspf[9] = 1.0 - QCELP_LSP_SPREAD_FACTOR;
nor this
> + for (i = 9; i > 0; i--) {
> + if (lspf[i-1] > lspf[i] - QCELP_LSP_SPREAD_FACTOR)
> + lspf[i-1] = lspf[i] - QCELP_LSP_SPREAD_FACTOR;
> + }
and even the duplicated code cannot be true
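The reasoning, as far as the quoted code goes: in the RATE_OCTAVE branch each lspf[i] is (i+1)/11 plus or minus 0.02 * 0.90625 = 0.018125, so neighbours are at least 1/11 - 2 * 0.018125 (about 0.0547) apart, lspf[0] is at least about 0.0728 and lspf[9] at most about 0.9272. A small brute-force sketch over all 1024 bit patterns can confirm that none of the clamps fires. The SK_* names are local to this sketch.

```c
#define SK_LSP_SPREAD_FACTOR    0.02
#define SK_LSP_OCTAVE_PREDICTOR 0.90625 /* 29/32 */

/*
 * Count how often any of the quoted stability clamps would trigger,
 * over every combination of the ten 1-bit LSP codes.
 */
static int sk_count_clamp_hits(void)
{
    const double delta = SK_LSP_SPREAD_FACTOR * SK_LSP_OCTAVE_PREDICTOR;
    int hits = 0;

    for (int bits = 0; bits < 1024; bits++) {
        float lspf[10];

        /* Same formula as the RATE_OCTAVE branch of decode_lspf(). */
        for (int i = 0; i < 10; i++)
            lspf[i] = (i + 1) / 11.0 + (((bits >> i) & 1) ? delta : -delta);

        if (lspf[0] < SK_LSP_SPREAD_FACTOR)
            hits++;
        /* The backward loop in the patch tests the same gap condition. */
        for (int i = 1; i < 10; i++)
            if (lspf[i] < lspf[i-1] + SK_LSP_SPREAD_FACTOR)
                hits++;
        if (lspf[9] > 1.0 - SK_LSP_SPREAD_FACTOR)
            hits++;
    }
    return hits;
}
```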
> +
> + // Low-pass filter the LSP frequencies
> + if (q->octave_count < 10) {
> + interpolate_lspf(lspf, 1 - 0.125, lspf, q->prev_lspf);
> + } else {
> + interpolate_lspf(lspf, 1 - 0.9, lspf, q->prev_lspf);
> + }
> +
> + } else if (q->rate == I_F_Q) {
> + predictor = QCELP_LSP_OCTAVE_PREDICTOR * -QCELP_LSP_SPREAD_FACTOR;
> + if (q->erasure_count > 1) {
> + predictor *= (q->erasure_count < 4 ? 0.9 : 0.7);
> + }
> + for (i=0; i<10; i++) {
> + lspf[i] = (i + 1) / (float)(10 + 1) + predictor;
instead of casting literal constants to float please use a . like 11.0
> + }
> +
> + // Low-pass filter the LSP frequencies
> + interpolate_lspf(lspf, 1 - 0.875, lspf, q->prev_lspf);
> + } else {
> + q->octave_count = 0;
> +
> + lspv=q->data+QCELP_LSPV0_POS;
> +
> + lspf[0]= qcelp_lspvq1[lspv[0]].x / 10000.0;
> + lspf[1]=lspf[0]+qcelp_lspvq1[lspv[0]].y / 10000.0;
> + lspf[2]=lspf[1]+qcelp_lspvq2[lspv[1]].x / 10000.0;
> + lspf[3]=lspf[2]+qcelp_lspvq2[lspv[1]].y / 10000.0;
> + lspf[4]=lspf[3]+qcelp_lspvq3[lspv[2]].x / 10000.0;
> + lspf[5]=lspf[4]+qcelp_lspvq3[lspv[2]].y / 10000.0;
> + lspf[6]=lspf[5]+qcelp_lspvq4[lspv[3]].x / 10000.0;
> + lspf[7]=lspf[6]+qcelp_lspvq4[lspv[3]].y / 10000.0;
> + lspf[8]=lspf[7]+qcelp_lspvq5[lspv[4]].x / 10000.0;
> + lspf[9]=lspf[8]+qcelp_lspvq5[lspv[4]].y / 10000.0;
this looks like it should be a loop
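A sketch of the loop form, assuming the five codebooks can be reached through an array of pointers. The element type and the table contents below are placeholders, not the real qcelp_lspvq1..5 data.

```c
#include <stdint.h>

/* Stand-in for the element type of qcelp_lspvq1..5 (not shown in the mail). */
typedef struct {
    int x, y;
} SketchLspVq;

/* Placeholder values only; the real codebooks are larger and different. */
static const SketchLspVq sk_vq1[] = { {100, 200}, {300, 400} };
static const SketchLspVq sk_vq2[] = { {150, 250}, {350, 450} };
static const SketchLspVq sk_vq3[] = { {110, 210}, {310, 410} };
static const SketchLspVq sk_vq4[] = { {120, 220}, {320, 420} };
static const SketchLspVq sk_vq5[] = { {130, 230}, {330, 430} };

static const SketchLspVq *const sk_lspvq[5] = {
    sk_vq1, sk_vq2, sk_vq3, sk_vq4, sk_vq5
};

/* Each codebook entry contributes two increments to a running sum. */
static void sk_decode_lspf(float *lspf, const uint8_t *lspv)
{
    float acc = 0.0f;

    for (int i = 0; i < 5; i++) {
        const SketchLspVq *e = &sk_lspvq[i][lspv[i]];

        acc = lspf[2*i    ] = acc + e->x / 10000.0;
        acc = lspf[2*i + 1] = acc + e->y / 10000.0;
    }
}
```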
[...]
> +/**
> + * Converts codebook transmission codes to GAIN and INDEX.
> + *
> + * TIA/EIA/IS-733 2.4.6.2
> + */
> +static int decode_gain_and_index(QCELPContext *q, float *gain, int *index) {
> + int i, g1[16];
> + const uint8_t *cbgain, *cbsign, *cindex;
> + float ga[16], gain_memory;
> +
> + cbsign=q->data+QCELP_CBSIGN0_POS;
> + cbgain=q->data+QCELP_CBGAIN0_POS;
> + cindex=q->data+QCELP_CINDEX0_POS;
> +
> + switch (q->rate) {
> + case RATE_FULL:
> + case RATE_HALF:
> + for (i=0; i<16; i++) {
> + if (q->rate == RATE_HALF && i>=4) break;
> +
> + g1[i]=4*cbgain[i];
> + if (q->rate == RATE_FULL && i > 0 && !((i+1) & 3)) {
> + g1[i] += av_clip((g1[i-1] + g1[i-2] + g1[i-3]) / 3, 6, 38) - 6;
> + if (g1[i] > 60)
> + g1[i] = 60;
> + }
> +
> + gain[i]=qcelp_g12ga[g1[i]];
> +
> + if (cbsign[i]) {
> + gain[i] = -gain[i];
> + index[i]= (cindex[i]-89) & 127;
> + } else {
> + index[i] = cindex[i];
> + }
> + }
> +
> + q->prev_g1[0] = g1[i-2];
> + q->prev_g1[1] = g1[i-1];
> + q->last_codebook_gain=gain[i-1];
> +
> + break;
> + case RATE_QUARTER:
> + for (i=0; i<5; i++) {
> + g1[i]=4*cbgain[i];
> +
> + if (i>0 && FFABS(g1[i] - g1[i-1]) > 40) return -1;
> + if (i >= 2 && FFABS(g1[i] - 2*g1[i-1] + g1[i-2]) > 48) return -1;
inconsistent formatting
> + ga[i]=qcelp_g12ga[g1[i]];
> + }
> +
> + q->prev_g1[0] = g1[3];
so is the whitespace surrounding the =
> + q->prev_g1[1] = g1[4];
> + q->last_codebook_gain=ga[4];
> +
> + // Provide smoothing of the energy of the unvoiced excitation
> + gain[0]= ga[0];
> + gain[1]=0.6*ga[0]+0.4*ga[1];
> + gain[2]= ga[1];
> + gain[3]=0.2*ga[1]+0.8*ga[2];
> + gain[4]=0.8*ga[2]+0.2*ga[3];
> + gain[5]= ga[3];
> + gain[7]=0.4*ga[3]+0.6*ga[4];
> + gain[7]= ga[4];
> + break;
> + case RATE_OCTAVE:
> + g1[0] = -4 + 2 * cbgain[0]
> + + av_clip((q->prev_g1[0] + q->prev_g1[1]) / 2, 4, 58);
> +
> + ga[0]=qcelp_g12ga[g1[0]];
> + gain_memory=FFABS(q->last_codebook_gain);
> +
> + q->last_codebook_gain =
> + gain[7] =
> + ga[0] = 0.5*gain_memory + 0.5*ga[0];
0.5*(A+B)
> +
> + // This interpolation is done to produce smoother background noise.
> + for (i = 0; i < 7; i++)
> + gain[i]=(0.875-0.125*i)*gain_memory+(0.125+0.125*i)*ga[0];
for (i = 1; i < 8; i++)
gain[i-1] = gain_memory + 0.125*i*(ga[0] - gain_memory);
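The two forms agree because, with k = i + 1, (0.875 - 0.125*i) = 1 - 0.125*k and (0.125 + 0.125*i) = 0.125*k, so the sum collapses to gain_memory + 0.125*k*(ga[0] - gain_memory). A small sketch to check this numerically; the function names are made up for the comparison.

```c
/* The interpolation as written in the patch. */
static void sk_interp_original(float *gain, float gain_memory, float target)
{
    for (int i = 0; i < 7; i++)
        gain[i] = (0.875 - 0.125*i)*gain_memory + (0.125 + 0.125*i)*target;
}

/* The suggested form: one multiply less per sample. */
static void sk_interp_rewritten(float *gain, float gain_memory, float target)
{
    for (int i = 1; i < 8; i++)
        gain[i-1] = gain_memory + 0.125*i*(target - gain_memory);
}

/* Largest absolute difference between two vectors of length n. */
static float sk_max_abs_diff(const float *a, const float *b, int n)
{
    float max = 0.0f;

    for (int i = 0; i < n; i++) {
        float d = a[i] - b[i];
        if (d < 0)
            d = -d;
        if (d > max)
            max = d;
    }
    return max;
}
```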
> +
> + q->prev_g1[0] = q->prev_g1[1];
> + q->prev_g1[1] = g1[0];
> + break;
> + case I_F_Q:
> + g1[0] = q->prev_g1[1];
> + switch (q->erasure_count) {
> + case 1 : break;
> + case 2 : g1[0] -= 1; break;
> + case 3 : g1[0] -= 2; break;
> + default: g1[0] -= 6;
> + }
> + gain[0] = qcelp_g12ga[g1[0] < 0 ? g1[0] : 0];
> +
> + gain_memory=FFABS(q->last_codebook_gain);
> + q->last_codebook_gain =
> + gain[3] =
> + ga[0] = 0.5*gain_memory + 0.5*gain[0];
> +
> + for (i = 0; i < 3; i++) {
> + gain[i]= (0.75 - 0.25 * i) * gain_memory + (0.25 + 0.25 * i) * ga[0];
> + }
> + q->prev_g1[0] = q->prev_g1[1];
> + q->prev_g1[1] = g1[0];
this code looks rather similar and likely can be factorized
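For instance, both the RATE_OCTAVE and the I_F_Q case average the previous gain magnitude with the new gain and then ramp linearly toward that average, over 8 and 4 subframes respectively. A possible shared helper, sketched under the assumption that only the subframe count differs (sk_ramp_gain is a hypothetical name):

```c
/*
 * Fill gain[0..n-1] with a linear ramp from |last_codebook_gain| toward the
 * smoothed target 0.5 * (|last_codebook_gain| + new_gain); gain[n-1] ends up
 * at the target. Returns the target so the caller can store it as the new
 * last_codebook_gain.
 */
static float sk_ramp_gain(float *gain, int n,
                          float last_codebook_gain, float new_gain)
{
    float gain_memory = last_codebook_gain < 0 ? -last_codebook_gain
                                               :  last_codebook_gain;
    float target      = 0.5f * (gain_memory + new_gain);

    for (int i = 1; i <= n; i++)
        gain[i-1] = gain_memory + (target - gain_memory) * i / n;

    return target;
}
```

The RATE_OCTAVE case would call it with n = 8 and the I_F_Q case with n = 4; the prev_g1 update is identical in both and could move after the switch.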
> + break;
> + }
> + return 0;
> +}
> +
> +/**
> + * Computes the scaled codebook vector Cdn From INDEX and GAIN
> + * For all rates.
> + *
> + * The specification misses some information here.
> + *
> + * TIA/EIA/IS-733 has an omission on the codebook index determination
> + * formula for RATE_FULL and RATE_HALF frames at section 2.4.8.1.1. It says
> + * you have to subtract the decoded index parameter from the given scaled
> + * codebook vector index 'n' to get the desired circular codebook index, but
> + * it does not mention that you have to clamp 'n' to [0-9] in order to get
> + * RI-compliant results.
> + *
> + * The reason for this mistake seems to be the fact they forget to mention you
> + * have to do these calculations per codebook subframe and adjust given
> + * equation values accordingly.
> + *
> + * @param q the context
> + * @param gain array holding the 4 pitch subframe gain values
> + * @param index array holding the 4 pitch subframe index values
> + * other than RATE_FULL or RATE_HALF
> + * @param cnd_vector array for the generated scaled codebook vector
> + */
> +static void compute_svector(const QCELPContext *q, const float *gain,
> + const int *index, float *cdn_vector) {
> + int i,j;
> + uint16_t cbseed;
> + float rnd[160];
> +
> + switch (q->rate) {
> + case RATE_FULL:
> +
> + for (i=0; i<16; i++) {
> + for (j=0; j<10; j++) {
> + cdn_vector[10*i+j]= gain[i]*QCELP_FULLRATE_CODEBOOK((j-index[i]) & 127);
> + }
> + }
> + break;
> + case RATE_HALF:
> +
> + for (i=0; i<4; i++) {
> + for (j=0; j<40; j++) {
> + cdn_vector[40*i+j]= gain[i]*QCELP_HALFRATE_CODEBOOK((j-index[i]) & 127);
> + }
> + }
> + break;
> + case RATE_QUARTER:
> + cbseed=(0x0003 & q->data[QCELP_LSPV0_POS + 4])<<14 |
> + (0x003F & q->data[QCELP_LSPV0_POS + 3])<< 8 |
> + (0x0060 & q->data[QCELP_LSPV0_POS + 2])<< 1 |
> + (0x0007 & q->data[QCELP_LSPV0_POS + 1])<< 3 |
> + (0x0038 & q->data[QCELP_LSPV0_POS + 0])>> 3 ;
> + for (i=0; i<160; i++) {
> + cbseed=(521*cbseed+259) & 65535;
the & is unneeded
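cbseed is declared uint16_t, so storing the result already reduces it modulo 65536. A quick exhaustive check, as a sketch with made-up function names:

```c
#include <stdint.h>

/* As in the patch: explicit mask before the store. */
static uint16_t sk_next_seed_masked(uint16_t cbseed)
{
    return (521 * cbseed + 259) & 65535;
}

/* Without the mask: the conversion to uint16_t truncates on its own. */
static uint16_t sk_next_seed_plain(uint16_t cbseed)
{
    return 521 * cbseed + 259;
}

/* Number of seeds for which the two variants disagree. */
static int sk_seed_mismatches(void)
{
    int mismatches = 0;

    for (int s = 0; s < 65536; s++)
        if (sk_next_seed_masked((uint16_t)s) != sk_next_seed_plain((uint16_t)s))
            mismatches++;
    return mismatches;
}
```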
> + rnd[i] = QCELP_SQRT1887 / 32768.0 * (((cbseed + 32768) & 65535) - 32768);
QCELP_SQRT1887 / 32768.0 * (int16_t)cbseed;
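The add/mask/subtract sequence in the patch just reinterprets the 16-bit seed as a signed value, which is what the cast expresses. A sketch that compares both over all seeds; note that converting an out-of-range value to int16_t is implementation-defined in C, although it wraps as expected on two's complement targets.

```c
#include <stdint.h>

/* Sign reinterpretation as written in the patch. */
static int sk_signed_via_mask(uint16_t cbseed)
{
    return ((cbseed + 32768) & 65535) - 32768;
}

/* The same via a cast (implementation-defined, wraps on usual targets). */
static int sk_signed_via_cast(uint16_t cbseed)
{
    return (int16_t)cbseed;
}

/* Number of seeds for which the two variants disagree. */
static int sk_sign_mismatches(void)
{
    int mismatches = 0;

    for (int s = 0; s < 65536; s++)
        if (sk_signed_via_mask((uint16_t)s) != sk_signed_via_cast((uint16_t)s))
            mismatches++;
    return mismatches;
}
```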
[...]
> +/**
> + * Apply generic gain control, filtered by a first order IIR filter
> + * for the final stage gain control.
> + *
> + * @param q if not null, apply harcoded coef infinite impulse response filter
> + * @param in vector to control gain off
> + * @param out gain-controled output vector
> + *
> + * TIA/EIA/IS-733 2.4.8.3-4/5, 2.4.8.6
> + */
> +static void apply_gain_ctrl(QCELPContext *q, const float *in, float *out) {
> + int i;
> + float scalefactors[4];
> +
> + for (i = 0; i < 4; i++)
> + scalefactors[i] = sqrt(compute_subframe_energy(in , i) /
> + compute_subframe_energy(out, i));
> +
> + if (q) {
> + scalefactors[0]=0.9375*q->prev_iirf_scalefactor + 0.0625*scalefactors[0];
> +
> + for (i=1;i<4;i++)
> + scalefactors[i]=0.9375*scalefactors[i-1]+0.0625*scalefactors[i];
> + q->prev_iirf_scalefactor=scalefactors[3];
> + }
q is always NULL
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Those who are too smart to engage in politics are punished by being
governed by those who are dumber. -- Plato