[FFmpeg-devel] [PATCH] wmapro decoder

Sat Jun 20 14:50:24 CEST 2009

Hi Michael,

thanks for the review. See my answers below.

> > +/**
> > + *@brief Get the samples per frame for this stream.
> > + *@param sample_rate output sample_rate
> > + *@param decode_flags codec compression features
> > + *@return number of output samples per frame
> > + */
> > +static int av_cold wma_get_samples_per_frame(int sample_rate,
> > +                                             unsigned int decode_flags)
> > +{
> > +
> > +    int samples_per_frame;
> > +    int tmp;
> > +
> > +    if (sample_rate <= 16000)
> > +        samples_per_frame = 512;
> > +    else if (sample_rate <= 22050)
> > +        samples_per_frame = 1024;
> > +    else if (sample_rate <= 48000)
> > +        samples_per_frame = 2048;
> > +    else if (sample_rate <= 96000)
> > +        samples_per_frame = 4096;
> > +    else
> > +        samples_per_frame = 8192;
>
> mergeable with the related code from ff_wma_init()
>

Done.

> > +
> > +    av_free(s->def_decorrelation_mat);
>
> i tend to prefer av_freep() for sake of saftey
>

Changed.

> > +        /** dump the extradata */
> > +        for (i=0 ; i<avctx->extradata_size ; i++)
> > +            av_log(avctx, AV_LOG_DEBUG, "[%x] ",avctx->extradata[i]);
> > +        av_log(avctx, AV_LOG_DEBUG, "\n");
>
> this stuff will have to be droped or made accessible via
> AVCodecContext.debug in the final version
>

I changed it to dprintf so that it is not compiled in by default.

> > +    if (!s->len_prefix) {
> > +         av_log(avctx, AV_LOG_ERROR, "no length prefix, please
> > report\n");
>
> ff_log_ask_for_sample
>

Replaced.

> >
> > +    s->max_num_subframes = 1 << ((s->decode_flags & 0x38) >> 3);
> > +    s->num_possible_block_sizes = av_log2(s->max_num_subframes) + 1;
>
> IMHO ugly, i mean 1<< and then log2
>

Changed.

> > +    if (channel_mask & 8) {
> > +        unsigned int mask = 1;
> > +        for (i=0;i<32;i++) {
> > +            if (channel_mask & mask)
> > +                ++s->lfe_channel;
> > +            if (mask & 8)
> > +                break;
> > +            mask <<= 1;
> > +        }
> > +    }
>
> looks buggy
> the loops would do 32 iterations but the tests limit it to 4
>

Fixed.

> > +        int subframe_len = s->samples_per_frame / (1 << i);
>
> samples_per_frame >> i
>

Fixed.

> > +        /** calculate subframe len bits */
> > +        if (s->lossless)
> > +            subframe_len_bits = av_log2(s->max_num_subframes - 1) + 1;
> > +        else if (s->max_num_subframes == 16) {
>
> if{
> }else
>

Fixed.

> > +            i = 0;
> > +            while (i < 4) {
>
> [...]
>
> > +                i += 2;
>
> for()
>

Changed.

> > sf++) { +                    val = get_vlc2(&s->gb, sf_vlc.table,
> > SCALEVLCBITS, +                 
> > ((FF_WMA3_HUFF_SCALE_MAXBITS+SCALEVLCBITS-1)/SCALEVLCBITS)); +           
> >         *sf = *(sf - 1) + val - 60;
> > +                }
>
> the way this line is wraped is very ugly, even not wraping at all would be
> better
> also the calculation of the stages could be moved into some define, greatly
> simplifyng the look of get_vlc*
>

Done.

> val= 45 / s->channel[c].scale_factor_step;
> for(sf= s->channel[c].scale_factors...){
>     val += get_vlc2() - 60;
>     *sf= val;
> }
>

Fixed.

> > +                for (i=0;i<s->num_bands;i++) {
> > +                    int idx;
> > +                    short skip;
> > +                    short val;
> > +                    short sign;
>
> is there any reason why they are short ?
>

No. Changed to int.

> > +
> > +                    idx = get_vlc2(&s->gb, sf_rl_vlc.table, VLCBITS,
> > +                         
> > ((FF_WMA3_HUFF_SCALE_RL_MAXBITS+VLCBITS-1)/VLCBITS)); +
> > +                    if ( !idx ) {
> >
> > +                        uint32_t code = get_bits(&s->gb,14);
> > +                        val = code >> 6;
> > +                        sign = (code & 1) - 1;
> > +                        skip = (code & 0x3f)>>1;
>
> is that faster than 3 get_bits() ?
>

Yes.

> > +/**
> > + *@brief Decorrelate and undo M/S stereo coding.
>
> decorrelation is what the encoder does
>

Fixed.

> > + *@param s codec context
> > + */
> > +static void wma_inverse_channel_transform(WMA3DecodeContext *s)
> > +{
> > +    int i;
> > +
> > +    for (i=0;i<s->num_chgroups;i++) {
> > +
> > +        if (s->chgroup[i].transform == 1) {
> > +            /** M/S stereo decoding */
> > +            int16_t* sfb_offsets = s->cur_sfb_offsets;
> > +            float* ch0 = *sfb_offsets + s->channel[0].coeffs;
> > +            float* ch1 = *sfb_offsets++ + s->channel[1].coeffs;
> > +            const char* tb = s->chgroup[i].transform_band;
> > +            const char* tb_end = tb + s->num_bands;
> > +
> > +            while (tb < tb_end) {
> > +                const float* ch0_end = s->channel[0].coeffs +
> > +                                      
> > FFMIN(*sfb_offsets,s->subframe_len); +                if (*tb++ == 1) {
> > +                    while (ch0 < ch0_end) {
> > +                        const float v1 = *ch0;
> > +                        const float v2 = *ch1;
> > +                        *ch0++ = v1 - v2;
> > +                        *ch1++ = v1 + v2;
> > +                    }
> > +                }else{
> > +                    while (ch0 < ch0_end) {
> > +                        *ch0++ *= 181.0 / 128;
> > +                        *ch1++ *= 181.0 / 128;
> > +                    }
> > +                }
> > +                ++sfb_offsets;
> > +            }
> > +        }else if (s->chgroup[i].transform) {
> > +            float data[MAX_CHANNELS];
> > +            const int num_channels = s->chgroup[i].num_channels;
> > +            float** ch_data = s->chgroup[i].channel_data;
> > +            float** ch_end = ch_data + num_channels;
> > +            const int8_t* tb = s->chgroup[i].transform_band;
> > +            int16_t* sfb;
> > +
> > +            /** multichannel decorrelation */
> > +            for (sfb = s->cur_sfb_offsets ;
> > +                sfb < s->cur_sfb_offsets + s->num_bands;sfb++) {
> > +                if (*tb++ == 1) {
> > +                    int y;
> > +                    /** multiply values with the decorrelation_matrix */
> > +                    for (y=sfb[0];y<FFMIN(sfb[1], s->subframe_len);y++)
> > { +                        const float* mat =
> > s->chgroup[i].decorrelation_matrix; +                        const float*
> > data_end= data + num_channels; +                        float* data_ptr=
> > data;
> > +                        float** ch;
> > +
> > +                        for (ch = ch_data;ch < ch_end; ch++)
> > +                           *data_ptr++ = (*ch)[y];
> > +
> > +                        for (ch = ch_data; ch < ch_end; ch++) {
> > +                            float sum = 0;
> > +                            data_ptr = data;
> > +                            while (data_ptr < data_end)
> > +                                sum += *data_ptr++ * *mat++;
> > +
> > +                            (*ch)[y] = sum;
> > +                        }
> > +                    }
> > +                }
> > +            }
> > +        }
> > +    }
> > +}
>
> isnt the if() a special case of te else ? if so this can be simplified
> and if its faster as is the special case could be limited to the inner
> loop
>

I don't understand what you mean here. Transform may either be 0, 1, 2. Of 
course both 1 and 2 are larger than 0 but they stand for the transform type 
so keeping all of them on the same level does not sound wrong to me even if 
the 0 case does nothing. Maybe it should be changed to
else if(s->chgroup[i].transform == 2)?

> > +        /** decode quantization step */
> > +        quant = get_bits(&s->gb,6);
> > +        if (quant & 0x20) {
> > +            quant |= 0xFFFFFFC0u;
> > +            sign = -1;
> > +        }
>
> get_sbits()
>

Fixed.

> > +                int end = s->cur_sfb_offsets[b+1];
> > +                int sf = s->channel[c].max_scale_factor;
> > +                float quant;
> > +                if (end > s->subframe_len)
> > +                    end = s->subframe_len;
>
> FFMIN

Fixed.

> > +
> > +                if (s->channel[c].transmit_sf)
> > +                     sf -= s->channel[c].scale_factors[b];
> > +                else
> > +                     sf -= s->channel[c].resampled_scale_factors[b];
> > +                sf *= -s->channel[c].scale_factor_step;
> > +                sf += s->quant_step + s->channel[c].quant_step_modifier;
> >
> > +                quant = pow(10.0,sf / 20.0);
>
> this can be simplified

I moved the condition out of the loop and moved quant_step to the per channel 
context.

> > +                while (start < end) {
> > +                    s->tmp[start] = s->channel[c].coeffs[start] * quant;
> > +                    ++start;
> > +                }
>
> for(start= s->cur_sfb_offsets[b]; start < end; ++start)
>

Fixed.

> > +    if (!s->packet_loss) {
> > +        /** save the rest of the data so that it can be decoded
> > +            with the next packet */
> > +        wma_save_bits(s, &gb, wma_remaining_bits(s,&gb), 0);
> > +    }
>
> and if a packet is lost the data is thrown away?
> is the data always useless or could it be of a correctly decodeable frame?
>

If packet_loss is set at this point we failed to decode the bitstream. When 
this is the case we do not really know where we are. It is possible that a 
(partial) valid frame can be found after the broken part but finding out 
where it starts is not easy as frames have variable lengths and no common 
startcode.

> you use *int8_t at some points and char here ...
> besides that i wonder why intXY_t instead of int in some cases
>

The intXY_t thing in the context came after it was more clear what the 
upper/lower bounds of the variables are. I changed the chars in the table to 
int8.

> > +static const uint32_t ff_wma3_scale_huffcodes[FF_WMA3_HUFF_SCALE_SIZE] =
> > { +    0x0E639, 0x0E6C2, 0x0E6C1, 0x0E6C0, 0x0E63F, 0x0E63E, 0x0E63D,
> > 0x0E63C, +    0x0E63B, 0x0E63A, 0x0E638, 0x0E637, 0x0E636, 0x0E635,
> > 0x0E634, 0x0E632, +    0x0E633, 0x0E620, 0x0737B, 0x0E610, 0x0E611,
> > 0x0E612, 0x0E613, 0x0E614, +    0x0E615, 0x0E616, 0x0E617, 0x0E618,
> > 0x0E619, 0x0E61A, 0x0E61B, 0x0E61C, +    0x0E61D, 0x0E61E, 0x0E61F,
> > 0x0E6C3, 0x0E621, 0x0E622, 0x0E623, 0x0E624, +    0x0E625, 0x0E626,
> > 0x0E627, 0x0E628, 0x0E629, 0x0E62A, 0x0E62B, 0x0E62C, +    0x0E62D,
> > 0x0E62E, 0x0E62F, 0x0E630, 0x0E631, 0x01CDF, 0x00E60, 0x00399, +   
> > 0x000E7, 0x0001D, 0x00000, 0x00001, 0x00001, 0x00001, 0x00002, 0x00006, +
> >    0x00002, 0x00007, 0x00006, 0x0000F, 0x00038, 0x00072, 0x0039A,
> > 0x0E6C4, +    0x0E6C5, 0x0E6C6, 0x0E6C7, 0x0E6C8, 0x0E6C9, 0x0E6CA,
> > 0x0E6CB, 0x0E6CC, +    0x0E6CD, 0x0E6CE, 0x0E6CF, 0x0E6D0, 0x0E6D1,
> > 0x0E6D2, 0x0E6D3, 0x0E6D4, +    0x0E6D5, 0x0E6D6, 0x0E6D7, 0x0E6D8,
> > 0x0E6D9, 0x0E6DA, 0x0E6DB, 0x0E6DC, +    0x0E6DD, 0x0E6DE, 0x0E6DF,
> > 0x0E6E0, 0x0E6E1, 0x0E6E2, 0x0E6E3, 0x0E6E4, +    0x0E6E5, 0x0E6E6,
> > 0x0E6E7, 0x0E6E8, 0x0E6E9, 0x0E6EA, 0x0E6EB, 0x0E6EC, +    0x0E6ED,
> > 0x0E6EE, 0x0E6EF, 0x0E6F0, 0x0E6F1, 0x0E6F2, 0x0E6F3, 0x0E6F4, +   
> > 0x0E6F5,
> > +};
>
> this one fits in 16 bits and its not the only such table
>

Fixed.
Besides that I also fixed following points from Vitors mails in the same 
thread:
- set avctx->sample_fmt
- use const matrix offset table
- get rid of wma3.h
- removed encoding X in general.texi
- removed prefix from static functions and tables

The code still outputs shorts. Float produced clicks all around the place. I'm 
not sure yet if the problem is in my code or in ffmpeg.
Float output can also be added after the code has been added to svn imho.

New patch attached.

Btw. should the code be renamed to wmaprodec.c and wmaprodata.h?
wma3 includes the voice and lossless codecs so wma3dec.c may not be the best 
name for the pro decoder.

Regards

Sascha

-------------- next part --------------
A non-text attachment was scrubbed...
Name: wmapro_try2.patch
Type: text/x-diff
Size: 98439 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090620/b11a07d7/attachment.patch>