[FFmpeg-devel] [PATCH] Add ATRAC3+ decoder

Fri Oct 11 12:41:09 CEST 2013

Hi Maxim

On Thu, Oct 10, 2013 at 09:16:37PM +0200, Maxim Polijakowski wrote:
> Hi crews,
> 
> the attached patch adds an open-source decoder for Sony's ATRAC3+
> format. There is a partial description of its internals located
> here: http://wiki.multimedia.cx/index.php?title=ATRAC3plus
> 
> Samples: http://samples.mplayerhq.hu/A-codecs/ATRAC3+/
> 
> Because the patch is huge, I expect several review rounds.
> 
> Therefore, I would like to kindly ask you to sort your review
> comments according to the following priority list:
> --> functionality improvements (code refractions resulting in faster
> or smaller code)
> --> security improvements (code refractions resulting in more secure code)
> --> cosmetic improvements (grammar, spelling etc.)

feel free to ignore any or all cosmetic review comments from me

> 
> Any help regarding fuzzy testing and pointing out insecure code
> would be very appreciated!
> 
> Thank you in advance!
> Best regards
> Maxim

>  Changelog                    |    1 
>  configure                    |    1 
>  doc/general.texi             |    1 
>  libavcodec/Makefile          |    3 
>  libavcodec/allcodecs.c       |    1 
>  libavcodec/atrac3plus.c      | 1735 ++++++++++++++++++++++++++++++++++++++
>  libavcodec/atrac3plus.h      |  183 ++++
>  libavcodec/atrac3plus_data.c | 1940 +++++++++++++++++++++++++++++++++++++++++++
>  libavcodec/atrac3plus_data.h |  144 +++
>  libavcodec/atrac3plusdec.c   |  394 ++++++++
>  libavcodec/atrac3plusdsp.c   |  658 ++++++++++++++
>  libavcodec/version.h         |    2 
>  12 files changed, 5062 insertions(+), 1 deletion(-)

[...]

> diff --git a/libavcodec/atrac3plus.c b/libavcodec/atrac3plus.c
> new file mode 100644
> index 0000000..6ceda54
> --- /dev/null
> +++ b/libavcodec/atrac3plus.c
> @@ -0,0 +1,1735 @@
> +/*
> + * ATRAC3+ compatible decoder
> + *
> + * Copyright (c) 2010-2013 Maxim Poliakovski
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +/**
> + * @file
> + * Bitstream parser for ATRAC3+ decoder.
> + */
> +
> +#include "avcodec.h"
> +#include "get_bits.h"
> +#include "atrac3plus.h"
> +#include "atrac3plus_data.h"
> +

> +/**
> + * Generate canonical VLC table from given descriptor.
> + *
> + * @param[in]  cb       ptr to codebook descriptor
> + * @param[in]  xlat     ptr to translation table or NULL
> + * @param[out] out_vlc  ptr to vlc table to be generated
> + */
> +static av_cold void build_canonical_huff(const uint8_t *cb, const uint8_t *xlat,
> +                                         VLC *out_vlc)
> +{
> +    int i, b;
> +    uint16_t codes[256];
> +    uint8_t bits[256];
> +    unsigned code = 0;
> +    int index = 0;
> +    int min_len = *cb++; // get shortest codeword length
> +    int max_len = *cb++; // get longest  codeword length
> +
> +    for (b = min_len; b <= max_len; b++) {
> +        for (i = *cb++; i > 0; i--) {
> +            bits[index]  = b;
> +            codes[index] = code++;
> +            index++;

index should, for saftey be checked to be within the array.
Its not a bug as this function is never feeded with untrusted data
but it feels safer to me to check or assert() that index doesnt
become too large

> +        }
> +        code <<= 1;
> +    }
> +
> +    ff_init_vlc_sparse(out_vlc, max_len, index, bits, 1, 1, codes, 2, 2,
> +                       xlat, 1, 1, 0);
> +}

build_canonical_huff() could be done to all tables and the resulting
tables be put in the source. This would avoid the need for
build_canonical_huff() and simplify the code
also it would allow the tables to be shared between instances of a
loaded lib as they would be static const and not generated at runtime.

making that change would likely increase the size of static const
tables by <50% though.
If you prefer to leave it as it is to avoid these larger tables
thats fine too

[...]
> +static int get_num_ct_values(GetBitContext *gb, Atrac3pChanUnitCtx *ctx,
> +                             AVCodecContext *avctx)
> +{
> +    int num_coded_vals;
> +
> +    if (get_bits1(gb)) {
> +        num_coded_vals = get_bits(gb, 5);
> +        if (num_coded_vals > ctx->used_quant_units) {
> +            av_log(avctx, AV_LOG_ERROR,
> +                   "Invalid number of code table indexes!\n");

this could print the invalid value too in the errror message

[...]
> +static void decode_tones_frequency(GetBitContext *gb, Atrac3pChanUnitCtx *ctx,
> +                                   int ch_num, int band_has_tones[],
> +                                   ATRAC3PVlcTabs *vlc_tabs)
> +{
> +    int sb, i, direction, nbits, first, pred, delta;
> +    Atrac3pWaveParam *iwav, *owav;
> +    Atrac3pWavesData *dst = ctx->channels[ch_num].tones_info;
> +    Atrac3pWavesData *ref = ctx->channels[0].tones_info;
> +
> +    if (!ch_num || !get_bits1(gb)) { /* mode 0: fixed-length coding */
> +        for (sb = 0; sb < ctx->waves_info->num_tone_bands; sb++) {
> +            if (!band_has_tones[sb] || !dst[sb].num_wavs)
> +                continue;
> +            iwav      = &ctx->waves_info->waves[dst[sb].start_index];
> +            direction = (dst[sb].num_wavs > 1) ? get_bits1(gb) : 0;
> +            if (direction) { /** packed numbers in descending order */

> +                for (i = dst[sb].num_wavs - 1, first = 1; i >= 0; i--, first = 0) {
> +                    if (first)
> +                        iwav[i].freq_index = get_bits(gb, 10);
> +                    else {
> +                        nbits = av_log2(iwav[i + 1].freq_index) + 1;
> +                        iwav[i].freq_index = get_bits(gb, nbits);
> +                    }
> +                }

handling the "first" case outside the loop would avoid a check per
loop iteration

[...]
> +static void decode_tones_phase(GetBitContext *gb, Atrac3pChanUnitCtx *ctx,
> +                               int ch_num, int band_has_tones[])
> +{
> +    int sb, i;
> +    Atrac3pWavesData *dst = ctx->channels[ch_num].tones_info;
> +
> +    for (sb = 0; sb < ctx->waves_info->num_tone_bands; sb++) {
> +        if (!band_has_tones[sb])
> +            continue;

> +        for (i = 0; i < dst[sb].num_wavs; i++)
> +            ctx->waves_info->waves[dst[sb].start_index + i].phase_index = get_bits(gb, 5);

that looks a bit complex for an inner loop
something like
   phase_indexes = &ctx->waves_info->waves[dst[sb].start_index];
   for(...)
       phase_indexes[i].phase_index = get_bits(gb, 5);

might be faster, same probably applies to other loops as well if they
are speed relevant

[...]
> +/** 3D base shape tables. The values are grouped together as follows:
> + *  [num_start_values = 8][num_shape_tables = 16][num_seg_coeffs = 9]
> + *  For each of the 8 start values there are 16 different shapes each
> + *  9 coefficients long. */
> +const int8_t ff_atrac3p_wl_shapes[8][16][9] = {

some of the tables are only used from one file, these tables could
be static, reducing the global namespace polution and might speed up
linking in some cases on some platforms by a tiny amount

[...]
> diff --git a/libavcodec/atrac3plusdec.c b/libavcodec/atrac3plusdec.c
> new file mode 100644
> index 0000000..73d6616
> --- /dev/null
> +++ b/libavcodec/atrac3plusdec.c
> @@ -0,0 +1,394 @@
> +/*
> + * ATRAC3+ compatible decoder
> + *
> + * Copyright (c) 2010-2013 Maxim Poliakovski
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +/**
> + * @file
> + * Sony ATRAC3+ compatible decoder.
> + *
> + * Container formats used to store its data:
> + * RIFF WAV (.at3) and Sony OpenMG (.oma, .aa3).
> + *
> + * Technical description of this codec can be found here:
> + * http://wiki.multimedia.cx/index.php?title=ATRAC3plus
> + *
> + * Kudos to Benjamin Larsson and Michael Karcher
> + * for their precious technical help!
> + */
> +
> +#include <stdint.h>
> +#include <string.h>
> +
> +#include "libavutil/channel_layout.h"
> +#include "avcodec.h"
> +#include "get_bits.h"
> +#include "internal.h"
> +#include "atrac.h"
> +#include "atrac3plus.h"
> +#include "atrac3plus_data.h"
> +
> +typedef struct ATRAC3PContext {
> +    GetBitContext gb;
> +
> +    DECLARE_ALIGNED(32, float, samples)[2][ATRAC3P_FRAME_SAMPLES];  ///< quantized MDCT spectrum
> +    DECLARE_ALIGNED(32, float, mdct_buf)[2][ATRAC3P_FRAME_SAMPLES]; ///< output of the IMDCT
> +    DECLARE_ALIGNED(32, float, time_buf)[2][ATRAC3P_FRAME_SAMPLES]; ///< output of the gain compensation
> +    DECLARE_ALIGNED(32, float, outp_buf)[2][ATRAC3P_FRAME_SAMPLES];
> +
> +    AtracGCContext gainc_ctx;   ///< gain compensation context
> +    FFTContext mdct_ctx;
> +    FFTContext ipqf_dct_ctx;    ///< IDCT context used by IPQF
> +
> +    Atrac3pChanUnitCtx *ch_units;   ///< global channel units
> +
> +    int num_channel_blocks;     ///< number of channel blocks
> +    uint8_t channel_blocks[5];  ///< channel configuration descriptor
> +    uint64_t my_channel_layout; ///< current channel layout
> +

> +    ATRAC3PVlcTabs glob_vlcs;   ///< global VLC tables

if they are global, they should probably be static and not in a struct
unless iam missing something

> +} ATRAC3PContext;
> +
> +static av_cold int set_channel_params(ATRAC3PContext *ctx,
> +                                      AVCodecContext *avctx)
> +{
> +    memset(ctx->channel_blocks, 0, sizeof(ctx->channel_blocks));
> +

> +    switch (avctx->channel_layout) {
> +    case AV_CH_LAYOUT_MONO:
> +        ctx->num_channel_blocks = 1;
> +        ctx->channel_blocks[0]  = CH_UNIT_MONO;
> +        break;
> +    case AV_CH_LAYOUT_STEREO:
> +        ctx->num_channel_blocks = 1;
> +        ctx->channel_blocks[0]  = CH_UNIT_STEREO;
> +        break;
> +    case AV_CH_LAYOUT_SURROUND:
> +        ctx->num_channel_blocks = 2;
> +        ctx->channel_blocks[0]  = CH_UNIT_STEREO;
> +        ctx->channel_blocks[1]  = CH_UNIT_MONO;
> +        break;
> +    case AV_CH_LAYOUT_4POINT0:
> +        ctx->num_channel_blocks = 3;
> +        ctx->channel_blocks[0]  = CH_UNIT_STEREO;
> +        ctx->channel_blocks[1]  = CH_UNIT_MONO;
> +        ctx->channel_blocks[2]  = CH_UNIT_MONO;
> +        break;
> +    case AV_CH_LAYOUT_5POINT1_BACK:
> +        ctx->num_channel_blocks = 4;
> +        ctx->channel_blocks[0]  = CH_UNIT_STEREO;
> +        ctx->channel_blocks[1]  = CH_UNIT_MONO;
> +        ctx->channel_blocks[2]  = CH_UNIT_STEREO;
> +        ctx->channel_blocks[3]  = CH_UNIT_MONO;
> +        break;
> +    case AV_CH_LAYOUT_6POINT1_BACK:
> +        ctx->num_channel_blocks = 5;
> +        ctx->channel_blocks[0]  = CH_UNIT_STEREO;
> +        ctx->channel_blocks[1]  = CH_UNIT_MONO;
> +        ctx->channel_blocks[2]  = CH_UNIT_STEREO;
> +        ctx->channel_blocks[3]  = CH_UNIT_MONO;
> +        ctx->channel_blocks[4]  = CH_UNIT_MONO;
> +        break;
> +    case AV_CH_LAYOUT_7POINT1:
> +        ctx->num_channel_blocks = 5;
> +        ctx->channel_blocks[0]  = CH_UNIT_STEREO;
> +        ctx->channel_blocks[1]  = CH_UNIT_MONO;
> +        ctx->channel_blocks[2]  = CH_UNIT_STEREO;
> +        ctx->channel_blocks[3]  = CH_UNIT_STEREO;
> +        ctx->channel_blocks[4]  = CH_UNIT_MONO;
> +        break;

could be simplified
for example with something like this, if you like

ctx->channel_blocks[0] = CH_UNIT_MONO;
if (channels > 1)
    ctx->channel_blocks[0] = CH_UNIT_STEREO;
if (channels > 2)
    ctx->channel_blocks[1] = CH_UNIT_MONO;
if (channels > 3)
    ctx->channel_blocks[2] = CH_UNIT_MONO;
if (channels > 4) {
    ctx->channel_blocks[2] = CH_UNIT_STEREO;
    ctx->channel_blocks[3] = CH_UNIT_MONO;
}
if (channels > 6)
    ctx->channel_blocks[4] = CH_UNIT_MONO;
if (channels > 7)
    ctx->channel_blocks[3] = CH_UNIT_STEREO;

> +    default:
> +        av_log(avctx, AV_LOG_ERROR,
> +               "Unsupported channel layout: %"PRIx64"!\n", avctx->channel_layout);
> +        return AVERROR_INVALIDDATA;
> +    }
> +
> +    return 0;
> +}
> +
> +static av_cold int decode_init(AVCodecContext *avctx)
> +{
> +    ATRAC3PContext *ctx = avctx->priv_data;
> +    int i, ch, ret;
> +
> +    ff_atrac3p_init_vlcs(&ctx->glob_vlcs);
> +
> +    /* initialize IPQF */
> +    ff_mdct_init(&ctx->ipqf_dct_ctx, 5, 1, 32.0 / 32768.0);
> +
> +    ff_atrac3p_init_imdct(avctx, &ctx->mdct_ctx);
> +
> +    ff_atrac_init_gain_compensation(&ctx->gainc_ctx, 6, 2);
> +
> +    ff_atrac3p_init_wave_synth();
> +
> +    if ((ret = set_channel_params(ctx, avctx)) < 0)
> +        return ret;
> +
> +    ctx->my_channel_layout = avctx->channel_layout;
> +

> +    ctx->ch_units = av_mallocz(sizeof(*ctx->ch_units) *
> +                               ctx->num_channel_blocks);

looks like a memleak

[...]
> +static AVFloatDSPContext atrac3p_dsp;
> +
> +/**
> + * ATRAC3+ uses two different MDCT windows:
> + * - The first one is just the plain sine window of size 256.
> + * - The 2nd one is the plain sine window of size 128
> + *   wrapped into zero (at the start) and one (at the end) regions.
> + *   Both regions are 32 samples long. */
> +static float mdct_wind_steep[128]; ///< second MDCT window
> +
> +av_cold void ff_atrac3p_init_imdct(AVCodecContext *avctx, FFTContext *mdct_ctx)
> +{
> +    int i;
> +

> +    avpriv_float_dsp_init(&atrac3p_dsp, avctx->flags & CODEC_FLAG_BITEXACT);

this initializes a global variable with a non global flag as input
making the global potentially non constant

[...]
> +/* lookup table for fast modulo 23 op required for cyclic buffers of the IPQF */
> +static int mod23_lut[26] = {
> +    23,  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11,
> +    12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 0
> +};

should be static const

> +
> +/* First half of the 384-tap IPQF filtering coefficients. */
> +static float ipqf_coeffs1[ATRAC3P_PQF_FIR_LEN][16] = {

should be static const

[...]

> +/**
> + *  Subband synthesis filter based on the polyphase quadrature (pseudo-QMF)
> + *  filter bank.
> + *
> + *  @param[in]     dct_ctx  ptr to the pre-initialized IDCT context
> + *  @param[in,out] hist     ptr to the filter history
> + *  @param[in]     in       input data to process
> + *  @param[out]    out      receives processed data
> + */
> +void ff_atrac3p_ipqf(FFTContext *dct_ctx, Atrac3pIPQFChannelCtx *hist,
> +                     float *in, float *out)

in could be const

you also might want to add yourself to MAINTAINERS

[...]

Thanks

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Those who are best at talking, realize last or never when they are wrong.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20131011/a09597ac/attachment.asc>