[FFmpeg-devel] [PATCH] QCELP decoder

Sun Nov 16 00:10:56 CET 2008

On Fri, Nov 14, 2008 at 03:32:51PM -0800, Kenan Gillet wrote:
> Hi,
> On Fri, Nov 14, 2008 at 2:27 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> > On Fri, Nov 14, 2008 at 12:17:50PM -0800, Kenan Gillet wrote:
> >>
> >> On Nov 14, 2008, at 2:14 AM, Michael Niedermayer wrote:
> > [...]
> >> >
> >> >
> >> >> @@ -152,11 +537,140 @@
> >> >>     return -1;
> >> >> }
> >> >>
> >> >> +/*
> >> >> + * Determine the framerate from the frame size and/or the first byte of the frame.
> >> >> + *
> >> >> + * @param avctx the AV codec context
> >> >> + * @param buf_size length of the buffer
> >> >> + * @param buf the buffer
> >> >> + *
> >> >> + * @return the framerate on success, RATE_UNKNOWN otherwise.
> >> >> + */
> >> >> +static int determine_framerate(AVCodecContext *avctx,
> >> >> +                               const int buf_size,
> >> >> +                               uint8_t **buf) {
> >> >> +    qcelp_packet_rate framerate;
> >> >> +
> >> >> +    if ((framerate = buf_size2framerate(buf_size)) >= 0) {
> >> >> +        if (framerate != **buf) {
> >> >
> >> > I am not sure, but didn't you at some point reorder the enum?
> >> > if so, how can this code be correct both before and afterwards?
> >>
> >>
> >> I reordered the enum on 09/07/2008, way before submitting my first
> >> patch, from
> >>      RATE_FULL   = 0,
> >>      RATE_HALF   = 1,
> >>      RATE_QUARTER= 2,
> >>      RATE_OCTAVE = 3,
> >>      I_F_Q,          /*!< insufficient frame quality */
> >>      BLANK,
> >>      RATE_UNKNOWN
> >> to
> >>      SILENCE = 0,
> >>      RATE_OCTAVE,
> >>      RATE_QUARTER,
> >>      RATE_HALF,
> >>      RATE_FULL,
> >>      I_F_Q,          /*!< insufficient frame quality */
> >>      RATE_UNKNOWN
> >> in order to reflect the rate byte in the QCELP frame.
> >>
> >> and I changed on the 10/27/2008 to
> >>      RATE_UNKNOWN = -2,
> >>      I_F_Q,             /*!< insufficient frame quality */
> >>      SILENCE,
> >>      RATE_OCTAVE,
> >>      RATE_QUARTER,
> >>      RATE_HALF,
> >>      RATE_FULL
> >> when you asked me to change the
> >> switch (framerate)
> >>    case RATE_FULL:
> >>    case RATE_QUARTER:
> >>    case RATE_OCTAVE:
> >> }
> >> to (framerate >= RATE_QUARTER)
> >
> > that doesn't answer
> > how the changing enum interacts with
> > if(framerate != [some byte from the bitstream])
> >
> 
> basically, the first byte of the frame corresponds to the rate in the enum.
> the first byte can be
> 0 => SILENCE
> 1 => RATE_OCTAVE
> 2 => RATE_QUARTER
> 3 => RATE_HALF
> 4 => RATE_FULL
> 
> if it is not one of those, then we should have an I_F_Q.
> The SoC code determined the framerate by looking at the bufsize
> and then had a warning if the framerate byte (1st byte of the frame)
> differed.
> I suppose it was to handle frames which would not contain this first byte.
> I have not seen any files with such a feature though, and neither
> has Reynaldo [1].
> 
> We could simplify to just looking at the framerate and checking
> that the buffer contains enough data for the corresponding rate.
> 
> what do you think?
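To make the proposed simplification concrete, here is a rough sketch (not code from the patch) that takes the rate from the first byte alone and only checks that the buffer is long enough. The function name rate_from_first_byte and the payload_bytes table are hypothetical, and the payload sizes in it (3, 7, 16 and 34 bytes after the rate byte) are an assumption that should be checked against the spec. The enum is the 10/27/2008 ordering quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    RATE_UNKNOWN = -2,
    I_F_Q,             /* insufficient frame quality */
    SILENCE,
    RATE_OCTAVE,
    RATE_QUARTER,
    RATE_HALF,
    RATE_FULL
} qcelp_packet_rate;

/* ASSUMPTION: number of payload bytes that follow the rate byte,
 * indexed by the rate byte value (SILENCE .. RATE_FULL). */
static const int payload_bytes[] = { 0, 3, 7, 16, 34 };

/* Hypothetical helper: trust the rate byte, reject short buffers. */
static qcelp_packet_rate rate_from_first_byte(const uint8_t *buf, int buf_size)
{
    assert(buf != NULL);

    if (buf_size < 1 || buf[0] > RATE_FULL)
        return I_F_Q;                       /* not a valid rate byte */
    if (buf_size - 1 < payload_bytes[buf[0]])
        return I_F_Q;                       /* not enough data for that rate */
    return (qcelp_packet_rate)buf[0];
}
```

With this shape the comparison against the enum only works as long as SILENCE stays at 0 and the rates keep the bitstream order, which is exactly the dependency being questioned above.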
> 
> Reynaldo, any thoughts on this one?
> 
> 
> >
> > [...]
> >
> >> attached the round 11:
> >
> > OutOfMissingAttachmentJokesException
> >
> 
> lol
> 
> it is very bizarre because the email in my sent box has the attachment,
> but it seems to have been stripped out somewhere along the way :(
> I'll be using the web interface of gmail from now on.
> 
> hopefully the round 11 will be attached this time
[...]
> Index: libavcodec/qcelpdata.h
> ===================================================================
> --- libavcodec/qcelpdata.h	(revision 15824)
> +++ libavcodec/qcelpdata.h	(working copy)
> @@ -23,8 +23,54 @@
>  #define AVCODEC_QCELPDATA_H
>  
>  #include <stdint.h>
> +#include <stddef.h>
>  
> +#include "qcelp.h"
> +#include "bitstream.h"
> +

>  /**
> + * @file qcelpdata.h
> + *
> + * QCELP unpacking tables and structures,
> + * QCELP decoding tables and structures
> + *
> + * @author Reynaldo H. Verdejo Pinochet
> + */
> +
> +typedef struct {
> +    GetBitContext     gb;

> +    qcelp_packet_rate framerate;

does the number of frames per second change, or the bits per second?
if the latter, bitrate is the proper term IMO

> +
> +/// @defgroup qcelp_unpacked_data_frame QCELP unpacked data frame
> +/// @{
> +    uint8_t           cbsign[16];

> +    uint8_t           cbgain[16];
> +    uint8_t           cindex[16];
> +    uint8_t           plag[4];
> +    uint8_t           pfrac[4];
> +    uint8_t           pgain[4];
> +    uint8_t           lspv[10];               /*!< LSP for RATE_OCTAVE, LSPV for other rates */
> +    uint8_t           reserved;               /*!< on all but rate 1/2 packets */
> +/// @}
> +
> +    uint8_t           erasure_count;
> +    uint8_t           octave_count;           /*!< count the consecutive RATE_OCTAVE frames */
> +    float             prev_lspf[10];
> +    float             predictor_lspf[10];     /*!< LSP predictor,
> +                                                  only use for RATE_OCTAVE and I_F_Q */
> +    float             pitch_synthesis_filter_mem[303];
> +    float             pitch_pre_filter_mem[303];
> +    float             rnd_fir_filter_mem[180];
> +    float             formant_mem[170];
> +    float             last_codebook_gain;
> +    int               prev_g1[2];
> +    int               prev_framerate;
> +    float             prev_pitch_gain[4];
> +    uint8_t           prev_pitch_lag[4];
> +    uint16_t          first16bits;
> +} QCELPContext;

I somehow think this struct does not belong in qcelpdata.h
but rather in qcelpdec.c

[...]
> @@ -406,7 +462,7 @@
>    100.000/QCELP_SCALE, 112.250/QCELP_SCALE, 125.875/QCELP_SCALE, 141.250/QCELP_SCALE,
>    158.500/QCELP_SCALE, 177.875/QCELP_SCALE, 199.500/QCELP_SCALE, 223.875/QCELP_SCALE,
>    251.250/QCELP_SCALE, 281.875/QCELP_SCALE, 316.250/QCELP_SCALE, 354.875/QCELP_SCALE,
> -  398.125/QCELP_SCALE, 446.625/QCELP_SCALE, 501.125/QCELP_SCALE, 563.375/QCELP_SCALE,
> +  398.125/QCELP_SCALE, 446.625/QCELP_SCALE, 501.125/QCELP_SCALE, 562.375/QCELP_SCALE,
>    631.000/QCELP_SCALE, 708.000/QCELP_SCALE, 794.375/QCELP_SCALE, 891.250/QCELP_SCALE,
>   1000.000/QCELP_SCALE};
>  

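if this is an intended bugfix taken from the spec or the reference decoder,
then it's ok of course
(562.375 does fit the ratio of about 1.122 between neighbouring entries,
563.375 does not)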

> @@ -483,4 +539,19 @@
>    -9.918777e-2, 3.749518e-2,  8.985137e-1
>  };
>  
> +/**
> + * This spread factor is used, for framerate 1/8,
> + * to force the LSP frequencies to be at least 80 Hz apart.
> + *
> + * TIA/EIA/IS-733 2.4.3.3.2
> + */
> +#define QCELP_LSP_SPREAD_FACTOR 0.02

this is also used for I_F_Q, so the comment should not mention only framerate 1/8

> +
> +/**
> + * predictor coefficient for the conversion of LSP codes to LSP frequencies
> + * for RATE_OCTAVE and I_F_Q
> + * TIA/EIA/IS-733 2.4.3.2.7-2
> + */
> +#define QCELP_LSP_OCTAVE_PREDICTOR 29.0/32

inconsistent naming: 1/8 vs. RATE_OCTAVE

> +
>  #endif /* AVCODEC_QCELPDATA_H */
> Index: libavcodec/qcelpdec.c
> ===================================================================
> --- libavcodec/qcelpdec.c	(revision 15824)
> +++ libavcodec/qcelpdec.c	(working copy)
> @@ -69,6 +69,202 @@
>  }
>  
>  /**
> + * Decodes the 10 quantized LSP frequencies from the LSPV/LSP
> + * transmission codes of any framerate and checks for badly received packets.
> + *
> + * @param q the context
> + * @param lspf line spectral pair frequencies
> + *
> + * @return 0 on success, -1 if the packet is badly received
> + *
> + * TIA/EIA/IS-733 2.4.3.2.6.2-2, 2.4.8.7.3
> + */
> +static int decode_lspf(QCELPContext *q,
> +                       float *lspf) {
> +    int i;
> +    float tmp_lspf;
> +
> +    if (q->framerate == RATE_OCTAVE ||
> +        q->framerate == I_F_Q) {
> +        float smooth;
> +        const float *predictors = (q->prev_framerate != RATE_OCTAVE &&
> +                                   q->prev_framerate != I_F_Q ? q->prev_lspf
> +                                                              : q->predictor_lspf);
> +
> +        if (q->framerate == RATE_OCTAVE) {
> +            q->octave_count++;
> +
> +            for (i = 0; i < 10; i++) {

> +                lspf[i] = (i + 1) / 11.;
> +                q->predictor_lspf[i]  =
> +                             lspf[i] += (q->lspv[i] ?  QCELP_LSP_SPREAD_FACTOR
> +                                                    : -QCELP_LSP_SPREAD_FACTOR)
> +                                      + (predictors[i] - lspf[i]) * QCELP_LSP_OCTAVE_PREDICTOR;

q->predictor_lspf[i] =
             lspf[i] =  (q->lspv[i] ? QCELP_LSP_SPREAD_FACTOR : -QCELP_LSP_SPREAD_FACTOR)
                      + predictors[i] * QCELP_LSP_OCTAVE_PREDICTOR
                      + (i + 1) * ((1 - QCELP_LSP_OCTAVE_PREDICTOR)/11);
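For reference, the two forms are algebraically the same: with c = (i + 1)/11, k = QCELP_LSP_OCTAVE_PREDICTOR and s the signed spread factor, c + s + (p - c)*k = s + p*k + c*(1 - k). A small sketch to check that numerically follows; the function names are made up for the comparison, the macro values are the ones from the patch, and the parentheses around 29.0 / 32 are my addition.

```c
#include <assert.h>
#include <math.h>

#define QCELP_LSP_SPREAD_FACTOR    0.02
#define QCELP_LSP_OCTAVE_PREDICTOR (29.0 / 32)   /* parenthesised here */

/* Form used in the patch: start from (i + 1)/11, then add the rest. */
static double lspf_patch_form(int i, int bit, double predictor)
{
    double lspf = (i + 1) / 11.;

    assert(i >= 0 && i < 10);
    lspf += (bit ?  QCELP_LSP_SPREAD_FACTOR
                 : -QCELP_LSP_SPREAD_FACTOR)
          + (predictor - lspf) * QCELP_LSP_OCTAVE_PREDICTOR;
    return lspf;
}

/* Form suggested in the review: constants folded, no read-modify-write. */
static double lspf_review_form(int i, int bit, double predictor)
{
    assert(i >= 0 && i < 10);
    return (bit ? QCELP_LSP_SPREAD_FACTOR : -QCELP_LSP_SPREAD_FACTOR)
         + predictor * QCELP_LSP_OCTAVE_PREDICTOR
         + (i + 1) * ((1 - QCELP_LSP_OCTAVE_PREDICTOR) / 11);
}

/* Largest absolute difference over all indices, both bit values and
 * a grid of predictor values in [0, 1]. */
static double max_form_difference(void)
{
    double max_diff = 0.0;
    int i, bit, step;

    for (i = 0; i < 10; i++)
        for (bit = 0; bit < 2; bit++)
            for (step = 0; step <= 20; step++) {
                double p = step / 20.0;
                double d = fabs(lspf_patch_form(i, bit, p) -
                                lspf_review_form(i, bit, p));
                if (d > max_diff)
                    max_diff = d;
            }
    return max_diff;
}
```

Any difference left is floating point rounding only; the (i + 1) * constant term could also go into a small table if that turns out to matter.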

> +            }
> +            smooth = (q->octave_count < 10 ? .875 : 0.1);
> +        } else {
> +            float erasure_coeff;
> +
> +            assert(q->framerate == I_F_Q);
> +
> +            if (q->erasure_count > 1)
> +                erasure_coeff = (q->erasure_count < 4 ? 0.9 : 0.7);
> +            else
> +                erasure_coeff = 1.0;
> +

> +            for (i = 0; i < 10; i++) {
> +                lspf[i] = (i + 1) / 11.;
> +                q->predictor_lspf[i] = (predictors[i] - lspf[i]) * erasure_coeff;
> +                lspf[i] += QCELP_LSP_OCTAVE_PREDICTOR * q->predictor_lspf[i];

this code looks a little strange compared to the RATE_OCTAVE case:
predictor_lspf is a difference here, while above it is a sum

> +            }
> +            smooth = 0.125;
> +        }
> +
> +        // Check the stability of the LSP frequencies.
> +        lspf[0] = FFMAX(lspf[0], QCELP_LSP_SPREAD_FACTOR);
> +        for (i = 1; i < 10; i++)
> +            lspf[i] = FFMAX(lspf[i], (lspf[i-1] + QCELP_LSP_SPREAD_FACTOR));
> +
> +        lspf[9] = FFMIN(lspf[9], (1.0 - QCELP_LSP_SPREAD_FACTOR));
> +        for (i = 9; i > 0; i--)
> +            lspf[i-1] = FFMIN(lspf[i-1], (lspf[i] - QCELP_LSP_SPREAD_FACTOR));
> +
> +        // Low-pass filter the LSP frequencies.
> +        weighted_vector_sumf(lspf, lspf, q->prev_lspf, smooth, 1.0 - smooth, 10);
> +    } else {
> +        q->octave_count = 0;
> +
> +        tmp_lspf = 0.;
> +        for (i = 0; i < 5 ; i++) {
> +            lspf[2*i+0] = tmp_lspf += qcelp_lspvq[i][q->lspv[i]][0] * 0.0001;
> +            lspf[2*i+1] = tmp_lspf += qcelp_lspvq[i][q->lspv[i]][1] * 0.0001;
> +        }
> +

> +        // Check for badly received packets.
> +        if (q->framerate == RATE_QUARTER) {
> +            if (lspf[9] <= .70 || lspf[9] >=  .97)
> +                return -1;
> +            for (i = 3; i < 10; i++)
> +                if (FFABS(lspf[i] - lspf[i-2]) < .08)
> +                    return -1;

fabs() might be faster than FFABS(), which is really meant for integers
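A sketch of the same check with the libm call, as a standalone function so it can be tried outside the decoder. The function name is hypothetical and fabsf() assumes C99; plain fabs() would do as well. FFABS() is, as far as I recall, just the ternary macro ((a) >= 0 ? (a) : (-(a))), while fabs()/fabsf() usually compile to a single sign-bit operation.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical standalone version of the RATE_QUARTER LSP sanity check
 * from the patch, using fabsf() instead of FFABS().
 * Returns 0 if the frequencies look sane, -1 for a badly received packet. */
static int check_quarter_rate_lspf(const float *lspf)
{
    int i;

    assert(lspf != NULL);

    if (lspf[9] <= .70 || lspf[9] >= .97)
        return -1;
    for (i = 3; i < 10; i++)
        if (fabsf(lspf[i] - lspf[i - 2]) < .08)
            return -1;
    return 0;
}
```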

[...]
> +        subframes_count = q->framerate == RATE_FULL ? 16
> +                                                    : q->framerate == RATE_HALF ? 4
> +                                                                                : 5;

something tells me a switch or if/else would be more readable
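Something along these lines, as a sketch only. The helper name is hypothetical (the patch computes the value inline), the enum is the one quoted earlier in the thread, and the default of 5 simply mirrors the last branch of the nested ?: above.

```c
typedef enum {
    RATE_UNKNOWN = -2,
    I_F_Q,             /* insufficient frame quality */
    SILENCE,
    RATE_OCTAVE,
    RATE_QUARTER,
    RATE_HALF,
    RATE_FULL
} qcelp_packet_rate;

/* Hypothetical helper: same mapping as the nested ternary, one case per line. */
static int subframes_count_for_rate(qcelp_packet_rate rate)
{
    switch (rate) {
    case RATE_FULL:
        return 16;
    case RATE_HALF:
        return 4;
    default:            /* every other rate that reaches this code */
        return 5;
    }
}
```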

[...]

> +        ga[0] = qcelp_g12ga[g1[0]];
> +        gain_memory = q->last_codebook_gain;
> +
> +        q->last_codebook_gain =
> +                      gain[i] = 0.5 * (gain_memory + ga[0]);

the use of ga[0] as temporary seems unneeded

[...]
> +static int codebook_sanity_check_for_rate_quarter(const uint8_t *cbgain) {
> +    int i, prev_diff, diff;
> +
> +    prev_diff = diff= cbgain[1] - cbgain[0];
> +    if (FFABS(diff) > 10)
> +        return -1;
> +    for (i = 2; i < 5; i++) {
> +        diff = cbgain[i] - cbgain[i-1];
> +        if (FFABS(diff) > 10)
> +            return -1;
> +        else if (FFABS(diff - prev_diff) > 12)
> +            return -1;
> +        prev_diff = diff;
> +    }
> +    return 0;
> +}

static int codebook_sanity_check_for_rate_quarter(const uint8_t *cbgain) {
    int i, prev_diff=0;

    for (i = 1; i < 5; i++) {
        int diff = cbgain[i] - cbgain[i-1];
        if (FFABS(diff) > 10)
            return -1;
        else if (FFABS(diff - prev_diff) > 12)
            return -1;
        prev_diff = diff;
    }
    return 0;
}
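The reason prev_diff = 0 is safe for the first iteration: once FFABS(diff) <= 10 has passed, FFABS(diff - 0) > 12 can never trigger, so i = 1 behaves like the special case in the patch. A sketch that compares both versions over a set of sample gains, for anyone who wants to check; the two function names and the sample values are made up for the comparison and FFABS is defined locally.

```c
#include <stdint.h>

#define FFABS(a) ((a) >= 0 ? (a) : (-(a)))

/* Version from the patch, first difference handled outside the loop. */
static int sanity_check_patch(const uint8_t *cbgain)
{
    int i, prev_diff, diff;

    prev_diff = diff = cbgain[1] - cbgain[0];
    if (FFABS(diff) > 10)
        return -1;
    for (i = 2; i < 5; i++) {
        diff = cbgain[i] - cbgain[i - 1];
        if (FFABS(diff) > 10)
            return -1;
        else if (FFABS(diff - prev_diff) > 12)
            return -1;
        prev_diff = diff;
    }
    return 0;
}

/* Version from the review, prev_diff starts at 0. */
static int sanity_check_review(const uint8_t *cbgain)
{
    int i, prev_diff = 0;

    for (i = 1; i < 5; i++) {
        int diff = cbgain[i] - cbgain[i - 1];
        if (FFABS(diff) > 10)
            return -1;
        else if (FFABS(diff - prev_diff) > 12)
            return -1;
        prev_diff = diff;
    }
    return 0;
}

/* Count inputs on which the two versions disagree, over every
 * 5-element combination of a few arbitrary sample gains. */
static int count_mismatches(void)
{
    static const uint8_t samples[] = { 0, 5, 10, 11, 12, 13, 22, 23, 40 };
    const int n = sizeof(samples) / sizeof(samples[0]);
    uint8_t cbgain[5];
    int idx[5] = { 0 }, mismatches = 0, k;

    for (;;) {
        for (k = 0; k < 5; k++)
            cbgain[k] = samples[idx[k]];
        if (sanity_check_patch(cbgain) != sanity_check_review(cbgain))
            mismatches++;
        /* odometer-style increment of the 5 indices */
        for (k = 0; k < 5 && ++idx[k] == n; k++)
            idx[k] = 0;
        if (k == 5)
            break;
    }
    return mismatches;
}
```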

> +
> +/**
>   * Computes the scaled codebook vector Cdn From INDEX and GAIN
>   * for all rates.
>   *
> @@ -242,6 +438,64 @@
>  }
>  

>  /**
> + * Apply pitch synthesis filter and pitch prefilter to the scaled codebook vector.
> + * TIA/EIA/IS-733 2.4.5.2
> + *
> + * @param q the context
> + * @param cdn_vector the scaled codebook vector
> + */
> +static void apply_pitch_filters(QCELPContext *q,
> +                                float *cdn_vector) {
> +    int         i;
> +    float       gain[4];
> +    const float *v_synthesis_filtered, *v_pre_filtered;
> +
> +    if (q->framerate >= RATE_HALF ||
> +       (q->framerate == I_F_Q && (q->prev_framerate >= RATE_HALF))) {
> +
> +        if (q->framerate >= RATE_HALF) {
> +
> +            // Compute gain & lag for the whole frame.
> +            for (i = 0; i < 4; i++) {

> +                gain[i] = q->plag[i] ? (q->pgain[i] + 1) / 4.0 : 0.0;

*0.25 may be faster than /4.0 with some poorly optimizing compilers
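Since 0.25 is a power of two the result is bit-identical, so nothing is lost by multiplying. A minimal sketch; the helper names are hypothetical and the loop simply covers every uint8_t value rather than assuming how many bits pgain really has.

```c
#include <stdint.h>

/* As in the patch: division by 4.0. */
static float pitch_gain_div(uint8_t plag, uint8_t pgain)
{
    return plag ? (pgain + 1) / 4.0 : 0.0;
}

/* Suggested form: multiplication by 0.25. */
static float pitch_gain_mul(uint8_t plag, uint8_t pgain)
{
    return plag ? (pgain + 1) * 0.25 : 0.0;
}

/* Returns 1 if both forms agree exactly for every possible pgain byte. */
static int pitch_gain_forms_match(void)
{
    int pgain;

    for (pgain = 0; pgain < 256; pgain++)
        if (pitch_gain_div(1, (uint8_t)pgain) != pitch_gain_mul(1, (uint8_t)pgain))
            return 0;
    return 1;
}
```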

[...]
> +/*
> + * Determine the framerate from the frame size and/or the first byte of the frame.
> + *
> + * @param avctx the AV codec context
> + * @param buf_size length of the buffer
> + * @param buf the buffer
> + *
> + * @return the framerate on success,
> + *         I_F_Q  if the framerate cannot be satisfactorily determined
> + *
> + * TIA/EIA/IS-733 2.4.8.7.1
> + */
> +static int determine_framerate(AVCodecContext *avctx,
> +                               const int buf_size,
> +                               uint8_t **buf) {
> +    qcelp_packet_rate framerate;
> +
> +    if ((framerate = buf_size2framerate(buf_size)) >= 0) {

> +        if (framerate > **buf) {
> +            av_log(avctx, AV_LOG_WARNING, "Claimed framerate and buffer size mismatch.\n");
> +            framerate = **buf;

I am not sure if this is a good idea

[...]

>  static void warn_insufficient_frame_quality(AVCodecContext *avctx,
>                                              const char *message) {
>      av_log(avctx, AV_LOG_WARNING, "Frame #%d, IFQ: %s\n", avctx->frame_number, message);
>  }
>  
> +static int qcelp_decode_frame(AVCodecContext *avctx,
> +                              void *data,
> +                              int *data_size,
> +                              uint8_t *buf,
> +                              const int buf_size) {
> +    QCELPContext      *q = avctx->priv_data;
> +    float             *outbuffer = data;
> +    int               i;
> +    float             quantized_lspf[10], lpc[10];
> +    float             gain[16];
> +    float             *formant_mem;
> +
> +    if ((q->framerate = determine_framerate(avctx, buf_size, &buf)) == I_F_Q) {
> +        warn_insufficient_frame_quality(avctx, "Framerate cannot be determined.");
> +        goto erasure;
> +    }
> +
> +    if (q->framerate == RATE_OCTAVE &&
> +       (q->first16bits = AV_RB16(buf)) == 0xFFFF) {
> +        warn_insufficient_frame_quality(avctx, "Framerate is 1/8 and first 16 bits are on.");
> +        goto erasure;
> +    }
> +
> +    if (q->framerate > SILENCE) {
> +        const QCELPBitmap *bitmaps     = qcelp_unpacking_bitmaps_per_rate[q->framerate];
> +        const QCELPBitmap *bitmaps_end = qcelp_unpacking_bitmaps_per_rate[q->framerate]
> +                                       + qcelp_bits_per_rate[q->framerate];
> +        uint8_t           *unpacked_data = (uint8_t *)q;
> +

> +        init_get_bits(&q->gb, buf, qcelp_bits_per_rate[q->framerate]);

qcelp_bits_per_rate does not seem correct here nor does its name seem
to match what it contains

[...]
> +    for (i = 0; i < 160; i++)
> +        *outbuffer++ = av_clipf(*formant_mem++, QCELP_CLIP_LOWER_BOUND, QCELP_CLIP_UPPER_BOUND);

are there artifacts or some loud trash without this clipping?

[...]
> +/**
> + * Computes the Pa or Qa coefficients needed for LSP to LPC conversion.
> + * We only need to calculate the 6 first elements of the polynomial.
> + *
> + * @param lspf line spectral pair frequencies
> + * @param v_poly polynomial input/output as a vector
> + *
> + * TIA/EIA/IS-733 2.4.3.3.5-1/2
> + */
> +static void lsp2poly(const float *lspf,
> +                     float *v_poly) {
> +    float val, *v;
> +    int   i;
> +
> +    // optimization to simplify calculation in loop
> +    v_poly++;
> +
> +    for (i = 0; i < 10; i += 2) {
> +        val = -2 * cos(M_PI * *lspf);
> +        lspf += 2;
> +        v = v_poly + FFMIN(4, i);
> +
> +        if (i < 4) {
> +            v[2] = v[0];
> +            v[1] = v[0] * val + v[-1];
> +        }
> +        for ( ; v > v_poly; v--)
> +            v[0] = v[0]
> +                 + v[-1] * val
> +                 + v[-2];
> +        v[0] += v[-1] * val;
> +    }
> +}
> +
> +/**
> + * Reconstructs LPC coefficients from the line spectral pair frequencies
> + * and performs bandwidth expansion.
> + *
> + * @param lspf line spectral pair frequencies
> + * @param lpc linear predictive coding coefficients
> + *
> + * @note: bandwith_expansion_coeff could be precalculated into a table
> + *        but it seems to be slower on x86
> + *
> + * TIA/EIA/IS-733 2.4.3.3.5
> + */
> +void qcelp_lspf2lpc(const float *lspf,
> +                    float *lpc) {
> +    float pa[6], qa[6];
> +    int   i;
> +    float bandwith_expansion_coeff = -QCELP_BANDWITH_EXPANSION_COEFF;
> +

> +    pa[0] = 0.5;
> +    pa[1] = 0.5;
> +    lsp2poly(lspf, pa);
> +
> +    qa[0] = 0.5;
> +    qa[1] = -0.5;
> +    lsp2poly(lspf + 1, qa);

it should be faster to apply the 0.5 + 0.5x and 0.5 - 0.5x factors after building
the polynomials

anyway, see ff_acelp_lsp2lpc

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Many that live deserve death. And some that die deserve life. Can you give
it to them? Then do not be too eager to deal out death in judgement. For
even the very wise cannot see all ends. -- Gandalf