[FFmpeg-devel] [PATCH] Add ATRAC3+ decoder

Sat Oct 19 00:39:18 CEST 2013

Thank you for your suggestions!
Please find an updated patch in my next email:
[PATCH] ATRAC3+ decoder, 2nd try

Below a couple of comments...

>> +/**
>> + * Generate canonical VLC table from given descriptor.
>> + *
>> + * @param[in]  cb       ptr to codebook descriptor
>> + * @param[in]  xlat     ptr to translation table or NULL
>> + * @param[out] out_vlc  ptr to vlc table to be generated
>> + */
>> +static av_cold void build_canonical_huff(const uint8_t *cb, const uint8_t *xlat,
>> +                                         VLC *out_vlc)
>> +{
>> +    int i, b;
>> +    uint16_t codes[256];
>> +    uint8_t bits[256];
>> +    unsigned code = 0;
>> +    int index = 0;
>> +    int min_len = *cb++; // get shortest codeword length
>> +    int max_len = *cb++; // get longest  codeword length
>> +
>> +    for (b = min_len; b <= max_len; b++) {
>> +        for (i = *cb++; i > 0; i--) {
>> +            bits[index]  = b;
>> +            codes[index] = code++;
>> +            index++;
> index should, for saftey be checked to be within the array.
> Its not a bug as this function is never feeded with untrusted data
> but it feels safer to me to check or assert() that index doesnt
> become too large

Added an av_assert0() - based check.

>> +        }
>> +        code <<= 1;
>> +    }
>> +
>> +    ff_init_vlc_sparse(out_vlc, max_len, index, bits, 1, 1, codes, 2, 2,
>> +                       xlat, 1, 1, 0);
>> +}
> build_canonical_huff() could be done to all tables and the resulting
> tables be put in the source. This would avoid the need for
> build_canonical_huff() and simplify the code
> also it would allow the tables to be shared between instances of a
> loaded lib as they would be static const and not generated at runtime.
>
> making that change would likely increase the size of static const
> tables by <50% though.
> If you prefer to leave it as it is to avoid these larger tables
> thats fine too

At the time being, I would prefer to leave the tables as is. Expanding 
these VLC tables is a long-term task and could be done later once the 
decoder has been merged.

> [...]
>> +static int get_num_ct_values(GetBitContext *gb, Atrac3pChanUnitCtx *ctx,
>> +                             AVCodecContext *avctx)
>> +{
>> +    int num_coded_vals;
>> +
>> +    if (get_bits1(gb)) {
>> +        num_coded_vals = get_bits(gb, 5);
>> +        if (num_coded_vals > ctx->used_quant_units) {
>> +            av_log(avctx, AV_LOG_ERROR,
>> +                   "Invalid number of code table indexes!\n");
> this could print the invalid value too in the errror message

Done.

> [...]
>
>> +                for (i = dst[sb].num_wavs - 1, first = 1; i >= 0; i--, first = 0) {
>> +                    if (first)
>> +                        iwav[i].freq_index = get_bits(gb, 10);
>> +                    else {
>> +                        nbits = av_log2(iwav[i + 1].freq_index) + 1;
>> +                        iwav[i].freq_index = get_bits(gb, nbits);
>> +                    }
>> +                }
> handling the "first" case outside the loop would avoid a check per
> loop iteration

Moved out of the loop...

> [...]
>> +static void decode_tones_phase(GetBitContext *gb, Atrac3pChanUnitCtx *ctx,
>> +                               int ch_num, int band_has_tones[])
>> +{
>> +    int sb, i;
>> +    Atrac3pWavesData *dst = ctx->channels[ch_num].tones_info;
>> +
>> +    for (sb = 0; sb < ctx->waves_info->num_tone_bands; sb++) {
>> +        if (!band_has_tones[sb])
>> +            continue;
>> +        for (i = 0; i < dst[sb].num_wavs; i++)
>> +            ctx->waves_info->waves[dst[sb].start_index + i].phase_index = get_bits(gb, 5);
> that looks a bit complex for an inner loop
> something like
>     phase_indexes = &ctx->waves_info->waves[dst[sb].start_index];
>     for(...)
>         phase_indexes[i].phase_index = get_bits(gb, 5);
>
> might be faster, same probably applies to other loops as well if they
> are speed relevant

This loop body has been simplified.

> [...]
>> +/** 3D base shape tables. The values are grouped together as follows:
>> + *  [num_start_values = 8][num_shape_tables = 16][num_seg_coeffs = 9]
>> + *  For each of the 8 start values there are 16 different shapes each
>> + *  9 coefficients long. */
>> +const int8_t ff_atrac3p_wl_shapes[8][16][9] = {
> some of the tables are only used from one file, these tables could
> be static, reducing the global namespace polution and might speed up
> linking in some cases on some platforms by a tiny amount

Most of these tables are only used from one file. Should I convert all 
of them to static?
Should I remove the prefix "ff_atrac3p_" as well?

> [...]
>> +
>> +typedef struct ATRAC3PContext {
>> +    GetBitContext gb;
>> +
>> +    DECLARE_ALIGNED(32, float, samples)[2][ATRAC3P_FRAME_SAMPLES];  ///< quantized MDCT spectrum
>> +    DECLARE_ALIGNED(32, float, mdct_buf)[2][ATRAC3P_FRAME_SAMPLES]; ///< output of the IMDCT
>> +    DECLARE_ALIGNED(32, float, time_buf)[2][ATRAC3P_FRAME_SAMPLES]; ///< output of the gain compensation
>> +    DECLARE_ALIGNED(32, float, outp_buf)[2][ATRAC3P_FRAME_SAMPLES];
>> +
>> +    AtracGCContext gainc_ctx;   ///< gain compensation context
>> +    FFTContext mdct_ctx;
>> +    FFTContext ipqf_dct_ctx;    ///< IDCT context used by IPQF
>> +
>> +    Atrac3pChanUnitCtx *ch_units;   ///< global channel units
>> +
>> +    int num_channel_blocks;     ///< number of channel blocks
>> +    uint8_t channel_blocks[5];  ///< channel configuration descriptor
>> +    uint64_t my_channel_layout; ///< current channel layout
>> +
>> +    ATRAC3PVlcTabs glob_vlcs;   ///< global VLC tables
> if they are global, they should probably be static and not in a struct
> unless iam missing something

All VLC tables have been made static. The above-mentioned stuff has been 
removed from the context.

>> +    switch (avctx->channel_layout) {
>> +    case AV_CH_LAYOUT_MONO:
>> +        ctx->num_channel_blocks = 1;
>> +        ctx->channel_blocks[0]  = CH_UNIT_MONO;
>> +        break;
>> +    case AV_CH_LAYOUT_STEREO:
>> +        ctx->num_channel_blocks = 1;
>> +        ctx->channel_blocks[0]  = CH_UNIT_STEREO;
>> +        break;
>> +    case AV_CH_LAYOUT_SURROUND:
>> +        ctx->num_channel_blocks = 2;
>> +        ctx->channel_blocks[0]  = CH_UNIT_STEREO;
>> +        ctx->channel_blocks[1]  = CH_UNIT_MONO;
>> +        break;
>> +    case AV_CH_LAYOUT_4POINT0:
>> +        ctx->num_channel_blocks = 3;
>> +        ctx->channel_blocks[0]  = CH_UNIT_STEREO;
>> +        ctx->channel_blocks[1]  = CH_UNIT_MONO;
>> +        ctx->channel_blocks[2]  = CH_UNIT_MONO;
>> +        break;
>> +    case AV_CH_LAYOUT_5POINT1_BACK:
>> +        ctx->num_channel_blocks = 4;
>> +        ctx->channel_blocks[0]  = CH_UNIT_STEREO;
>> +        ctx->channel_blocks[1]  = CH_UNIT_MONO;
>> +        ctx->channel_blocks[2]  = CH_UNIT_STEREO;
>> +        ctx->channel_blocks[3]  = CH_UNIT_MONO;
>> +        break;
>> +    case AV_CH_LAYOUT_6POINT1_BACK:
>> +        ctx->num_channel_blocks = 5;
>> +        ctx->channel_blocks[0]  = CH_UNIT_STEREO;
>> +        ctx->channel_blocks[1]  = CH_UNIT_MONO;
>> +        ctx->channel_blocks[2]  = CH_UNIT_STEREO;
>> +        ctx->channel_blocks[3]  = CH_UNIT_MONO;
>> +        ctx->channel_blocks[4]  = CH_UNIT_MONO;
>> +        break;
>> +    case AV_CH_LAYOUT_7POINT1:
>> +        ctx->num_channel_blocks = 5;
>> +        ctx->channel_blocks[0]  = CH_UNIT_STEREO;
>> +        ctx->channel_blocks[1]  = CH_UNIT_MONO;
>> +        ctx->channel_blocks[2]  = CH_UNIT_STEREO;
>> +        ctx->channel_blocks[3]  = CH_UNIT_STEREO;
>> +        ctx->channel_blocks[4]  = CH_UNIT_MONO;
>> +        break;
> could be simplified
> for example with something like this, if you like
>
> ctx->channel_blocks[0] = CH_UNIT_MONO;
> if (channels > 1)
>      ctx->channel_blocks[0] = CH_UNIT_STEREO;
> if (channels > 2)
>      ctx->channel_blocks[1] = CH_UNIT_MONO;
> if (channels > 3)
>      ctx->channel_blocks[2] = CH_UNIT_MONO;
> if (channels > 4) {
>      ctx->channel_blocks[2] = CH_UNIT_STEREO;
>      ctx->channel_blocks[3] = CH_UNIT_MONO;
> }
> if (channels > 6)
>      ctx->channel_blocks[4] = CH_UNIT_MONO;
> if (channels > 7)
>      ctx->channel_blocks[3] = CH_UNIT_STEREO;

It would just barely simplify the stuff because I need to estimate 
number of channel blocks as well.
Moreover, it makes the code harder to read IMHO. Therefore, I'd prefer 
to leave it as is...

>> +    ctx->ch_units = av_mallocz(sizeof(*ctx->ch_units) *
>> +                               ctx->num_channel_blocks);
> looks like a memleak

Good catch! Fixed...

> [...]
>> +    avpriv_float_dsp_init(&atrac3p_dsp, avctx->flags & CODEC_FLAG_BITEXACT);
> this initializes a global variable with a non global flag as input
> making the global potentially non constant

Fixed.

> [...]
>> +/* lookup table for fast modulo 23 op required for cyclic buffers of the IPQF */
>> +static int mod23_lut[26] = {
>> +    23,  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11,
>> +    12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 0
>> +};
> should be static const

Fixed.

>> +
>> +/* First half of the 384-tap IPQF filtering coefficients. */
>> +static float ipqf_coeffs1[ATRAC3P_PQF_FIR_LEN][16] = {
> should be static const

Fixed.

> [...]
>
>> +/**
>> + *  Subband synthesis filter based on the polyphase quadrature (pseudo-QMF)
>> + *  filter bank.
>> + *
>> + *  @param[in]     dct_ctx  ptr to the pre-initialized IDCT context
>> + *  @param[in,out] hist     ptr to the filter history
>> + *  @param[in]     in       input data to process
>> + *  @param[out]    out      receives processed data
>> + */
>> +void ff_atrac3p_ipqf(FFTContext *dct_ctx, Atrac3pIPQFChannelCtx *hist,
>> +                     float *in, float *out)
> in could be const

Done.

> you also might want to add yourself to MAINTAINERS

Ok, postponed to the next round...