[FFmpeg-devel] [PATCH] wmapro decoder
Michael Niedermayer
michaelni
Tue Jun 23 11:51:47 CEST 2009
sorry for the late review, but the patch was kinda big ...
On Sat, Jun 20, 2009 at 02:50:24PM +0200, Sascha Sommer wrote:
[...]
> > > + *@param s codec context
> > > + */
> > > +static void wma_inverse_channel_transform(WMA3DecodeContext *s)
> > > +{
> > > + int i;
> > > +
> > > + for (i=0;i<s->num_chgroups;i++) {
> > > +
> > > + if (s->chgroup[i].transform == 1) {
> > > + /** M/S stereo decoding */
> > > + int16_t* sfb_offsets = s->cur_sfb_offsets;
> > > + float* ch0 = *sfb_offsets + s->channel[0].coeffs;
> > > + float* ch1 = *sfb_offsets++ + s->channel[1].coeffs;
> > > + const char* tb = s->chgroup[i].transform_band;
> > > + const char* tb_end = tb + s->num_bands;
> > > +
> > > + while (tb < tb_end) {
> > > + const float* ch0_end = s->channel[0].coeffs +
> > > +
> > > FFMIN(*sfb_offsets,s->subframe_len); + if (*tb++ == 1) {
> > > + while (ch0 < ch0_end) {
> > > + const float v1 = *ch0;
> > > + const float v2 = *ch1;
> > > + *ch0++ = v1 - v2;
> > > + *ch1++ = v1 + v2;
> > > + }
> > > + }else{
> > > + while (ch0 < ch0_end) {
> > > + *ch0++ *= 181.0 / 128;
> > > + *ch1++ *= 181.0 / 128;
> > > + }
> > > + }
> > > + ++sfb_offsets;
> > > + }
> > > + }else if (s->chgroup[i].transform) {
> > > + float data[MAX_CHANNELS];
> > > + const int num_channels = s->chgroup[i].num_channels;
> > > + float** ch_data = s->chgroup[i].channel_data;
> > > + float** ch_end = ch_data + num_channels;
> > > + const int8_t* tb = s->chgroup[i].transform_band;
> > > + int16_t* sfb;
> > > +
> > > + /** multichannel decorrelation */
> > > + for (sfb = s->cur_sfb_offsets ;
> > > + sfb < s->cur_sfb_offsets + s->num_bands;sfb++) {
> > > + if (*tb++ == 1) {
> > > + int y;
> > > + /** multiply values with the decorrelation_matrix */
> > > + for (y=sfb[0];y<FFMIN(sfb[1], s->subframe_len);y++)
> > > { + const float* mat =
> > > s->chgroup[i].decorrelation_matrix; + const float*
> > > data_end= data + num_channels; + float* data_ptr=
> > > data;
> > > + float** ch;
> > > +
> > > + for (ch = ch_data;ch < ch_end; ch++)
> > > + *data_ptr++ = (*ch)[y];
> > > +
> > > + for (ch = ch_data; ch < ch_end; ch++) {
> > > + float sum = 0;
> > > + data_ptr = data;
> > > + while (data_ptr < data_end)
> > > + sum += *data_ptr++ * *mat++;
> > > +
> > > + (*ch)[y] = sum;
> > > + }
> > > + }
> > > + }
> > > + }
> > > + }
> > > + }
> > > +}
> >
> > isnt the if() a special case of te else ? if so this can be simplified
> > and if its faster as is the special case could be limited to the inner
> > loop
> >
>
> I don't understand what you mean here. Transform may either be 0, 1, 2. Of
> course both 1 and 2 are larger than 0 but they stand for the transform type
> so keeping all of them on the same level does not sound wrong to me even if
> the 0 case does nothing. Maybe it should be changed to
> else if(s->chgroup[i].transform == 2)?
I meant that "isnt M/S a special case of multichannel and cant the
multicannel code be used for it" ?
[...]
> > > + if (!s->packet_loss) {
> > > + /** save the rest of the data so that it can be decoded
> > > + with the next packet */
> > > + wma_save_bits(s, &gb, wma_remaining_bits(s,&gb), 0);
> > > + }
> >
> > and if a packet is lost the data is thrown away?
> > is the data always useless or could it be of a correctly decodeable frame?
> >
>
> If packet_loss is set at this point we failed to decode the bitstream. When
> this is the case we do not really know where we are. It is possible that a
> (partial) valid frame can be found after the broken part but finding out
> where it starts is not easy as frames have variable lengths and no common
> startcode.
ok, just thought MS maybe copied the features from mp3 togteher with the
bugs ...
[...]
> New patch attached.
>
> Btw. should the code be renamed to wmaprodec.c and wmaprodata.h?
> wma3 includes the voice and lossless codecs so wma3dec.c may not be the best
> name for the pro decoder.
whichever way you prefer ...
[...]
> +/**
> + * @file libavcodec/wma3dec.c
> + * @brief wmapro decoder implementation
> + * Wmapro is an MDCT based codec comparable to wma standard or AAC.
> + * The decoding therefore consist of the following steps:
> + * - bitstream decoding
> + * - reconstruction of per channel data
> + * - rescaling and requantization
dequantization ?
[...]
> +/** current decoder limitations */
> +#define WMAPRO_MAX_CHANNELS 8 ///< max number of handled channels
> +#define MAX_SUBFRAMES 32 ///< max number of subframes per channel
> +#define MAX_BANDS 29 ///< max number of scale factor bands
> +#define MAX_FRAMESIZE 16384 ///< maximum compressed frame size
> +
> +#define WMAPRO_BLOCK_MAX_BITS 12 ///< log2 of max block size
> +#define WMAPRO_BLOCK_MAX_SIZE (1 << WMAPRO_BLOCK_MAX_BITS) ///< maximum block size
> +#define WMAPRO_BLOCK_SIZES (WMAPRO_BLOCK_MAX_BITS - BLOCK_MIN_BITS + 1) ///< possible block sizes
some of these have WMAPRO prefix some not, why?
[...]
> +/**
> + * @brief decoder context for a single channel
> + */
> +typedef struct {
> + int16_t prev_block_len; ///< length of the previous block
> + uint8_t transmit_coefs; ///< transmit coefficients
useless comment
> + uint8_t num_subframes; ///< number of subframes
same
[...]
> +/**
> + * @brief main decoder context
> + */
> +typedef struct WMA3DecodeContext {
> + /** generic decoder variables */
> + AVCodecContext* avctx; ///< codec context for av_log
> + DSPContext dsp; ///< accelerated dsp functions
> + uint8_t frame_data[MAX_FRAMESIZE +
> + FF_INPUT_BUFFER_PADDING_SIZE];///< compressed frame data
> + MDCTContext mdct_ctx[WMAPRO_BLOCK_SIZES]; ///< MDCT context per block size
> + DECLARE_ALIGNED_16(float, tmp[WMAPRO_BLOCK_MAX_SIZE]); ///< imdct output buffer
> + float* windows[WMAPRO_BLOCK_SIZES]; ///< window per block size
size?
> + int coef_max[2]; ///< max length of vlc codes
unused
> +
> + /** frame size dependent frame information (set during initialization) */
> + uint8_t lossless; ///< lossless mode
> + uint32_t decode_flags; ///< used compression features
> + uint8_t len_prefix; ///< frame is prefixed with its length
> + uint8_t dynamic_range_compression; ///< frame contains DRC data
> + uint8_t sample_bit_depth; ///< bits per sample
i think the comment makes a better name ...
its also more consistent with the next:
> + uint16_t samples_per_frame; ///< number of samples to output
> + uint16_t log2_frame_size; ///< frame size
that comment is not strictly correct but at least its redundant
> + int8_t num_channels; ///< number of channels
also redundant
> + int8_t lfe_channel; ///< lfe channel index
> + uint8_t max_num_subframes; ///< maximum number of subframes
> + int8_t num_possible_block_sizes; ///< nb of supported block sizes
num in the variable explained as nb in the comment is really not that clear
also is it possible or supported, theres a subtile difference
> + uint16_t min_samples_per_subframe; ///< minimum samples per subframe
> + int8_t* num_sfb; ///< scale factor bands per block size
> + int16_t* sfb_offsets; ///< scale factor band offsets
if i understood the code correctly all sfb_offsets are multiplies of 4,
maybe this should be documented of true
[...]
> +/**
> + *@brief Initialize the decoder.
> + *@param avctx codec context
> + *@return 0 on success, -1 otherwise
> + */
> +static av_cold int decode_init(AVCodecContext *avctx)
> +{
> + WMA3DecodeContext *s = avctx->priv_data;
> + uint8_t *edata_ptr = avctx->extradata;
> + int16_t* sfb_offsets;
> + unsigned int channel_mask;
> + int i;
> + int log2_num_subframes;
> +
> + s->avctx = avctx;
> + dsputil_init(&s->dsp, avctx);
> +
> + avctx->sample_fmt = SAMPLE_FMT_S16;
> +
> + /** FIXME: is this really the right thing to do for 24 bits? */
> + s->sample_bit_depth = 16; // avctx->bits_per_sample;
> + if (avctx->extradata_size >= 18) {
> + s->decode_flags = AV_RL16(edata_ptr+14);
> + channel_mask = AV_RL32(edata_ptr+2);
> +// s->sample_bit_depth = AV_RL16(edata_ptr);
> +#ifdef DEBUG
> + /** dump the extradata */
> + for (i=0 ; i<avctx->extradata_size ; i++)
> + av_log(avctx, AV_LOG_DEBUG, "[%x] ",avctx->extradata[i]);
> + av_log(avctx, AV_LOG_DEBUG, "\n");
> +#endif
> +
> + } else {
> + ff_log_ask_for_sample(avctx, "Unknown extradata size\n");
> + return -1;
> + }
> +
> + /** generic init */
> + s->log2_frame_size = av_log2(avctx->block_align*8)+1;
the *8 and +1 can be merged
> +
> + /** frame info */
> + s->skip_frame = 1; /** skip first frame */
> + s->packet_loss = 1;
> + s->len_prefix = (s->decode_flags & 0x40) >> 6;
> +
> + if (!s->len_prefix) {
> + ff_log_ask_for_sample(avctx, "no length prefix\n");
> + return -1;
> + }
> +
> + /** get frame len */
> + s->samples_per_frame = 1 << ff_wma_get_frame_len_bits(avctx->sample_rate,
> + 3, s->decode_flags);
> +
> + /** init previous block len */
> + for (i=0;i<avctx->channels;i++)
> + s->channel[i].prev_block_len = s->samples_per_frame;
> +
> + /** subframe info */
> + log2_num_subframes = ((s->decode_flags & 0x38) >> 3);
> + s->max_num_subframes = 1 << log2_num_subframes;
> + s->num_possible_block_sizes = log2_num_subframes + 1;
> + s->min_samples_per_subframe = s->samples_per_frame / s->max_num_subframes;
> + s->dynamic_range_compression = (s->decode_flags & 0x80) >> 7;
the >>7 is unneeded
[...]
> + for (x=0;x < MAX_BANDS-1 && sfb_offsets[band-1] < subframe_len;x++) {
> + int offset = (subframe_len * 2 * critical_freq[x])
> + / s->avctx->sample_rate + 2;
> + offset -= offset % 4;
& ~3
> + if ( offset > sfb_offsets[band - 1] ) {
> + sfb_offsets[band] = offset;
> + ++band;
sfb_offsets[band++] = offset;
> + }
> + }
> + sfb_offsets[band - 1] = subframe_len;
> + s->num_sfb[i] = band - 1;
> + sfb_offsets += MAX_BANDS;
> + }
> +
> +
> + /** Scale factors can be shared between blocks of different size
> + as every block has a different scale factor band layout.
> + The matrix sf_offsets is needed to find the correct scale factor.
> + */
> +
> + for (i=0;i<s->num_possible_block_sizes;i++) {
> + int b;
> + for (b=0;b< s->num_sfb[i];b++) {
> + int x;
> + int offset = ((s->sfb_offsets[MAX_BANDS * i + b]
> + + s->sfb_offsets[MAX_BANDS * i + b + 1] - 1)<<i) >> 1;
> + for (x=0;x<s->num_possible_block_sizes;x++) {
> + int v = 0;
> + while (s->sfb_offsets[MAX_BANDS * x + v +1] << x < offset)
> + ++v;
> + s->sf_offsets[ i * s->num_possible_block_sizes * MAX_BANDS
> + + x * MAX_BANDS + b] = v;
> + }
> + }
> + }
> +
> + /** init MDCT, FIXME: only init needed sizes */
> + for (i = 0; i < WMAPRO_BLOCK_SIZES; i++)
> + ff_mdct_init(&s->mdct_ctx[i],
> + BLOCK_MIN_BITS+1+i, 1, 1.0 / (1<<(BLOCK_MIN_BITS+i-1)));
> +
> + /** init MDCT windows: simple sinus window */
> + for (i=0 ; i<WMAPRO_BLOCK_SIZES ; i++) {
> + const int n = 1 << (WMAPRO_BLOCK_MAX_BITS - i);
> + const int win_idx = WMAPRO_BLOCK_MAX_BITS - i - 7;
> + ff_sine_window_init(ff_sine_windows[win_idx], n);
> + s->windows[WMAPRO_BLOCK_SIZES-i-1] = ff_sine_windows[win_idx];
> + }
> +
> + /** calculate subwoofer cutoff values */
> +
> + for (i=0;i< s->num_possible_block_sizes;i++) {
the empty line after /** looks like a typo, the others dont have it ...
> + int block_size = s->samples_per_frame / (1 << i);
/ (1<<i) -> >>i
> + int cutoff = ceil(block_size * 440.0
> + / (double)s->avctx->sample_rate + 0.5);
this might be identical to: (avoiding floats ...)
(440*block_size + 3*s->avctx->sample_rate/2 - 1)/s->avctx->sample_rate
[...]
> +
> +/**
> + *@brief Decode how the data in the frame is split into subframes.
> + * Every WMA frame contains the encoded data for a fixed number of
> + * samples per channel. The data for every channel might be split
> + * into several subframes. This function will reconstruct the list of
> + * subframes for every channel.
> + *
> + * If the subframes are not evenly split, the algorithm estimates the
> + * channels with the lowest number of total samples.
> + * Afterwards, for each of these channels a bit is read from the
> + * bitstream that indicates if the channel contains a frame with the
> + * next subframe size that is going to be read from the bitstream or not.
> + * If a channel contains such a subframe, the subframe size gets added to
> + * the channel's subframe list.
> + * The algorithm repeats these steps until the frame is properly divided
> + * between the individual channels.
> + *
> + *@param s context
> + *@return 0 on success, < 0 in case of an error
> + */
> +static int decode_tilehdr(WMA3DecodeContext *s)
> +{
> + int c;
> + int missing_samples = s->num_channels * s->samples_per_frame;
> +
> + /* should never consume more than 3073 bits (256 iterations for the
> + * while loop when always the minimum amount of 128 samples is substracted
> + * from missing samples in the 8 channel case)
> + * 1 + BLOCK_MAX_SIZE * MAX_CHANNELS / BLOCK_MIN_SIZE * (MAX_CHANNELS + 4)
> + */
> +
> + /** reset tiling information */
> + for (c=0;c<s->num_channels;c++) {
> + s->channel[c].num_subframes = 0;
> + s->channel[c].channel_len = 0;
> + }
> +
> + /** handle the easy case with one constant-sized subframe per channel */
> + if (s->max_num_subframes == 1) {
> + for (c=0;c<s->num_channels;c++) {
> + s->channel[c].num_subframes = 1;
> + s->channel[c].subframe_len[0] = s->samples_per_frame;
> + s->channel[c].channel_len = 0;
redundant
> + }
> + } else { /** subframe length and number of subframes is not constant */
> + /** bits needed for the subframe length */
> + int subframe_len_bits = 0;
> + /** first bit indicates if length is zero */
> + int subframe_len_zero_bit = 0;
> + /** all channels have the same subframe layout */
> + int fixed_channel_layout;
i think the comments would be more readable at the right of the variables
the whole looks like a undifferentiated block ...
> +
> + fixed_channel_layout = get_bits1(&s->gb);
> +
> + /** calculate subframe len bits */
> + if (s->lossless) {
> + subframe_len_bits = av_log2(s->max_num_subframes - 1) + 1;
> + } else if (s->max_num_subframes == 16) {
> + subframe_len_zero_bit = 1;
> + subframe_len_bits = 3;
> + } else
> + subframe_len_bits = av_log2(av_log2(s->max_num_subframes)) + 1;
> +
> + /** loop until the frame data is split between the subframes */
> + while (missing_samples > 0) {
the missing_samples variable can be put in the else clause, instead of just
in the function
> + unsigned int channel_mask = 0;
> + int min_channel_len = s->samples_per_frame;
the init can be moved in the else below, as its not used in the if
[...]
> + /** if we have the choice get next subframe length from the
> + bitstream */
> + if (min_samples != missing_samples) {
> + int log2_subframe_len = 0;
> + /* 1 bit indicates if the subframe length is zero */
> + if (subframe_len_zero_bit) {
> + if (get_bits1(&s->gb)) {
> + log2_subframe_len =
> + get_bits(&s->gb,subframe_len_bits-1);
> + ++log2_subframe_len;
+1 can be merged in the previous statement
> + }
> + } else
> + log2_subframe_len = get_bits(&s->gb,subframe_len_bits);
> +
> + if (s->lossless) {
> + subframe_len =
> + s->samples_per_frame / s->max_num_subframes;
> + subframe_len *= log2_subframe_len + 1;
> + } else {
> + subframe_len =
> + s->samples_per_frame / (1 << log2_subframe_len);
> + }
> +
> + /** sanity check the length */
> + if (subframe_len < s->min_samples_per_subframe
> + || subframe_len > s->samples_per_frame) {
> + av_log(s->avctx, AV_LOG_ERROR,
> + "broken frame: subframe_len %i\n", subframe_len);
> + return -1;
> + }
> + } else
> + subframe_len = s->min_samples_per_subframe;
> +
> + for (c=0; c<s->num_channels;c++) {
> + WMA3ChannelCtx* chan = &s->channel[c];
> +
> + /** add subframes to the individual channels */
> + if (min_channel_len == chan->channel_len) {
> + --channels_for_cur_subframe;
> + if (!read_channel_mask ||
> + channel_mask & (1<<channels_for_cur_subframe)) {
if read_channel_mask == 0 then you could set channel_mask= -1 to simpliy this
[...]
> +/**
> + *@brief Calculate a decorrelation matrix from the bitstream parameters.
> + *@param s codec context
> + *@param chgroup channel group for which the matrix needs to be calculated
> + */
> +static void decode_decorrelation_matrix(WMA3DecodeContext* s,
> + WMA3ChannelGroup* chgroup)
> +{
> + int i;
> + int offset = 0;
> + char rotation_offset[WMAPRO_MAX_CHANNELS * WMAPRO_MAX_CHANNELS];
int8_t ? uint8_t ?
> + memset(chgroup->decorrelation_matrix,0,
> + sizeof(float) *s->num_channels * s->num_channels);
> +
> + for (i=0;i<chgroup->num_channels * (chgroup->num_channels - 1) >> 1;i++)
> + rotation_offset[i] = get_bits(&s->gb,6);
> +
> + for (i=0;i<chgroup->num_channels;i++) {
> + if (get_bits1(&s->gb)) {
> + chgroup->decorrelation_matrix[chgroup->num_channels * i + i]= 1.0;
> + } else
> + chgroup->decorrelation_matrix[chgroup->num_channels * i + i]= -1.0;
> + }
> +
> + for (i=1;i<chgroup->num_channels;i++) {
> + int x;
> + for (x=0;x<i;x++) {
> + int y;
> + float tmp1[WMAPRO_MAX_CHANNELS];
> + float tmp2[WMAPRO_MAX_CHANNELS];
> + memcpy(tmp1,
> + &chgroup->decorrelation_matrix[x * chgroup->num_channels],
> + sizeof(float) * (i + 1));
> + memcpy(tmp2,
> + &chgroup->decorrelation_matrix[i * chgroup->num_channels],
> + sizeof(float) * (i + 1));
> + for (y=0;y < i + 1 ; y++) {
> + float v1 = tmp1[y];
> + float v2 = tmp2[y];
> + int n = rotation_offset[offset + x];
> + float sinv;
> + float cosv;
> +
> + if (n<32) {
> + sinv = sin64[n];
> + cosv = sin64[32-n];
> + } else {
> + sinv = sin64[64-n];
> + cosv = -sin64[n-32];
> + }
> +
> + chgroup->decorrelation_matrix[y + x * chgroup->num_channels] =
> + (v1 * sinv) - (v2 * cosv);
> + chgroup->decorrelation_matrix[y + i * chgroup->num_channels] =
> + (v1 * cosv) + (v2 * sinv);
> + }
the memcpy and tmp1/tmp2 appear unneeded
[...]
> + if ( idx == HUFF_VEC4_SIZE - 1 ) {
> + for (i=0 ; i < 4 ; i+= 2) {
> + idx = get_vlc2(&s->gb, vec2_vlc.table, VLCBITS, VEC2MAXDEPTH);
> + if ( idx == HUFF_VEC2_SIZE - 1 ) {
> + vals[i] = get_vlc2(&s->gb, vec1_vlc.table, VLCBITS, VEC1MAXDEPTH);
> + if (vals[i] == HUFF_VEC1_SIZE - 1)
> + vals[i] += ff_wma_get_large_val(&s->gb);
> + vals[i+1] = get_vlc2(&s->gb, vec1_vlc.table, VLCBITS, VEC1MAXDEPTH);
> + if (vals[i+1] == HUFF_VEC1_SIZE - 1)
> + vals[i+1] += ff_wma_get_large_val(&s->gb);
> + } else {
> + vals[i] = (symbol_to_vec2[idx] >> 4) & 0xF;
> + vals[i+1] = symbol_to_vec2[idx] & 0xF;
> + }
> + }
> + } else {
> + vals[0] = (symbol_to_vec4[idx] >> 8) >> 4;
> + vals[1] = (symbol_to_vec4[idx] >> 8) & 0xF;
> + vals[2] = (symbol_to_vec4[idx] >> 4) & 0xF;
> + vals[3] = symbol_to_vec4[idx] & 0xF;
vertical align
> + }
> +
> + /** decode sign */
> + for (i=0;i<4;i++) {
> + if (vals[i]) {
> + int sign = get_bits1(&s->gb) - 1;
> + ci->coeffs[cur_coeff] = (vals[i]^sign) - sign;
> + num_zeros = zero_init;
> + } else {
> + /** switch to run level mode when subframe_len / 128 zeros
> + were found in a row */
> + rl_mode |= (num_zeros & rl_switchmask);
> + ++num_zeros;
> + }
> + ++cur_coeff;
> + }
i guess you tried
num_zeros=0
and rl_mode |= (++num_zeros > X)
?
[...]
> +/**
> + *@brief Apply sine window and reconstruct the output buffer.
> + *@param s codec context
> + */
> +static void window(WMA3DecodeContext *s)
> +{
> + int i;
> + for (i=0;i<s->channels_for_cur_subframe;i++) {
> + int c = s->channel_indexes_for_cur_subframe[i];
> + float* start;
> + float* window;
> + int winlen = s->channel[c].prev_block_len;
> + start = s->channel[c].coeffs - (winlen >> 1);
declaration and init can be merged
[...]
> +/**
> + *@brief Decode a single subframe (block).
> + *@param s codec context
> + *@return 0 if decoding failed, 1 on success
> + */
in ffmpeg <0 is error and the value is the AVERROR code
[...]
> + /** skip extended header if any */
> + if (get_bits1(&s->gb)) {
> + int num_fill_bits;
> + if (!(num_fill_bits = get_bits(&s->gb,2))) {
> + num_fill_bits = get_bits(&s->gb,4);
> + num_fill_bits = get_bits(&s->gb,num_fill_bits) + 1;
> + }
> +
> + if (num_fill_bits >= 0) {
> + if (get_bits_count(&s->gb) + num_fill_bits > s->num_saved_bits) {
> + av_log(s->avctx,AV_LOG_ERROR,"invalid number of fill bits\n");
> + return 0;
> + }
> +
> + skip_bits_long(&s->gb,num_fill_bits);
> + }
i think the reuse of num_fill_bits is a little obfuscating the code
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
It is dangerous to be right in matters on which the established authorities
are wrong. -- Voltaire
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090623/a360be80/attachment.pgp>
More information about the ffmpeg-devel
mailing list