[FFmpeg-devel] [PATCH] G.722 decoder
Martin Storsjö
martin
Mon Aug 23 11:49:50 CEST 2010
On Fri, 6 Aug 2010, Martin Storsj? wrote:
> Hi,
>
> The G.722 decoder patch that Kenan Gillet took a far way through the
> review process last year seemed to have stalled, I'm trying to pick it up
> and get it committed.
>
> On Fri, 10 Apr 2009, Michael Niedermayer wrote:
>
> > On Thu, Apr 09, 2009 at 08:14:58AM -0700, Kenan Gillet wrote:
> > [...]
> > > + * @note For the 56000bps and 48000bps bitrates, the respectively 7 and 6 bits
> > > + * codeword might be packed, so unpacking might be needed either
> > > + * internally or as a separate parser.
> >
> > do non packed cases exist anywhere in the wild?
> > do packed cases exist anywhere in the wild?
> >
> > it seems to me that filling 30% of the bits with 0 is not particlarely
> > likely in a real world environment
>
> According to the spec, the point of the lower bitrate versions is to be
> able to use 1 or 2 bits per byte as an auxillary data channel. The encoder
> always outputs the 64 kbps mode, but you can choose to overwrite the
> lowest two bits if you want to.
>
> The decoder is able to cope with these bits being overwritten with random
> bits, but you get slightly better results if you tell the decoder that
> this has happened and so it should disregard those bits.
>
> With that in mind, as far as I can interpret the spec, it's the lowest
> bits that should be skipped for those modes, but the source for this
> decoder seems to think otherwise. The implementation in spandsp (that
> has some common heritage with this code base, iirc, does things this
> way, too.
>
> > > + /**
> > > + * The band[0] and band[1] correspond respectively to the lower band and higher band.
> > > + */
> > > + struct G722Band {
> >
> > > + int16_t s_predictor; ///< predictor output value
> >
> > ok that makes sense
> >
> >
> > > + int32_t s_zero; ///< zero section output signal
> >
> > what is that?
> >
> >
> > > + int8_t part_reconst_mem[2]; ///< partially reconstructed signal memory
> >
> > and that?
>
> Tried to clarify these doxy comments, to the best of my understanding of
> the spec.
>
> > > + int16_t prev_qtzd_reconst; ///< previous quantized reconstructed signal
> >
> > this is not true, this uses a different dequantization table in some cases
> > i even wonder if this is a bug
>
> No, it's explicitly stated that it should be done this way in the spec. I
> figure it's part of the "scalability" feature - the predictor updating is
> always done using the 4-bit dequantizer so that it evolves in the same way
> regardless of the mode it is decoded in. Each individual output sample is
> calculated using the more precise dequantizer if that is chosen, but the
> predictor is always updated using the lower precision dequantizer.
>
> >
> > > + int16_t pole_mem[2]; ///< second-order pole section coefficient buffer
> > > + int32_t diff_mem[6]; ///< quantizer difference signal memory
> > > + int16_t zero_mem[6]; ///< Seventh-order zero section coefficient buffer
> >
> > > + int16_t log_factor; ///< delayed logarithmic quantizer factor
> >
> > log what? log_e log_2 log_10 ?
>
> log_2, clarified
>
> >
> > > + int16_t scale_factor; ///< delayed quantizer scale factor
> > > + } band[2];
> > > +} G722Context;
> > > +
> > > +
> > > +static const int8_t sign_lookup[2] = { -1, 1 };
> > > +
> > > +static const int16_t ilb[32] = {
> > > + 2048, 2093, 2139, 2186, 2233, 2282, 2332, 2383,
> > > + 2435, 2489, 2543, 2599, 2656, 2714, 2774, 2834,
> > > + 2896, 2960, 3025, 3091, 3158, 3228, 3298, 3371,
> > > + 3444, 3520, 3597, 3676, 3756, 3838, 3922, 4008
> > > +};
> >
> > > +static const int16_t wh[2] = { 798, -214 };
> >
> > 2 letters is better than 1 ... well sometimes its not much better though
>
> Tried to give this table a better name
>
> > > +static const int16_t qm2[4] = { -7408, -1616, 7408, 1616 };
> >
> > > +/**
> > > + * qm3[index] == wl[rl42[index]]
> > > + */
> > > +static const int16_t qm3[16] = {
> > > + -60, 3042, 1198, 538, 334, 172, 58, -30,
> > > + 3042, 1198, 538, 334, 172, 58, -30, -60
> > > +};
> > > +static const int16_t qm4[16] = {
> > > + 0, -20456, -12896, -8968, -6288, -4240, -2584, -1200,
> > > + 20456, 12896, 8968, 6288, 4240, 2584, 1200, 0
> > > +};
> >
> > what is qm?
> > i know these are dequantization tables, and at that poorly designed ones
> > (yeah the duplicate entries)
> > but i cant relate "qm" to "dequantization"
>
> Tried to give better names to these tables, too
>
> > > +
> > > +/**
> > > + * quadrature mirror filter (QMF) coefficients
> > > + *
> > > + * ITU-T G.722 Table 11
> > > + */
> > > +static const int16_t qmf_coeffs[12] = {
> > > + 3, -11, 12, 32, -210, 951, 3876, -805, 362, -156, 53, -11,
> > > +};
> > > +
> > > +
> >
> > > +/**
> > > + * adaptive predictor
> > > + *
> > > + * @note On x86 using the MULL macro in a loop is slower than not using the macro.
> > > + */
> > > +static void do_adaptive_prediction(struct G722Band *band, const int cur_diff)
> >
> > missing doxy for cur_diff
>
> Added some sort of doxy for this parameter
>
> > [...]
> > > +static int inline scale(const int log_factor, int shift)
> > > +{
> > > + const int wd1 = ilb[(log_factor >> 6) & 31];
> > > + shift -= log_factor >> 11;
> > > + return (shift < 0 ? wd1 << -shift : wd1 >> shift) << 2;
> > > +}
> >
> > i belive we already have some integer exp2()
>
> Didn't find any common method to do shifts where the shift can be
> negative - only found av_shr_i in libavutil/integer.h, that works on
> arbitrary precision integers.
>
> > also this one lacks documentation and a name that is related to what it
> > does if there is a reason why common code cannot be used
>
> Renamed it to a slightly more descriptive name
>
> > [...]
> > > +static void apply_qmf(int16_t *prev_samples, int *xout1, int *xout2)
> > > +{
> > > + int i;
> > > +
> > > + *xout1 = 0;
> > > + *xout2 = 0;
> > > + for (i = 0; i < 12; i++) {
> > > + MAC16(*xout2, prev_samples[2*i ], qmf_coeffs[i ]);
> > > + MAC16(*xout1, prev_samples[2*i+1], qmf_coeffs[11-i]);
> > > + }
> >
> > > + memmove(prev_samples, prev_samples + 2, 22*sizeof(prev_samples[0]));
> >
> > please find a solution that does not need to move the array by 2 samples
> > after each 2 samples.
>
> Changed to use a longer internal buffer for these samples so that memmoves
> are done much more seldom (every 1024 samples).
>
> > [...]
> > > +static int g722_decode_frame(AVCodecContext *avctx, void *data, int *data_size, AVPacket *avpkt)
> > > +{
> > > + G722Context *c = avctx->priv_data;
> > > + int16_t *out_buf = data;
> > > + const uint8_t *buf = avpkt->data;
> > > + int j, out_len = 0;
> > > + const int shift = 8 - c->bits_per_sample;
> > > + const int16_t *quantizer_table = qms[shift];
> > > +
> > > + for (j = 0; j < avpkt->size; j++) {
> > > + const int ilow = buf[j] & (0x3F >> shift);
> > > + const int rlow = av_clip(MULL(c->band[0].scale_factor, quantizer_table[ilow], 15) +
> > > + c->band[0].s_predictor, -16384, 16383);
> > > +
> > > + update_low_predictor(&c->band[0], ilow >> (2 - shift));
> > > +
> > > + if (avctx->sample_rate == 16000) {
> > > + const int ihigh = (buf[j] >> (6 - shift)) & 0x03;
> >
> > all that looks a litte obfuscated, get_bits() seems like the proper choice
>
> Changed to use get_bits
Ping
// Martin
More information about the ffmpeg-devel
mailing list