[FFmpeg-devel] [PATCH] NellyMoser audio decoder v2

Tue Oct 2 00:48:29 CEST 2007

Hi

On Mon, Oct 01, 2007 at 02:25:59PM +0200, Benjamin Larsson wrote:
> Benjamin Larsson wrote:
>> [...]
>> Hi, I had some time yesterday and looked closer at the code. While there 
>> is no doubt that the code is a TDAC implementation it doesn't seem like 
>> the code can be replaced by the mdct code in ffmpeg. So I guess that the 
>> fft replacement is what can be clean up for now. I'll sort out the table 
>> generation for the rotations and the window.

if its not IMDCT then what is it?

[...]

> +static const int ff_band_sizes_table[23] = {
> +2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 4, 4, 5, 6, 6, 7, 8, 9, 10, 12, 14, 15
> +};

fits in uint8_t

> +
> +static const float ff_nelly_signal_table[64] = {
> +0.1250000000, 0.1249623969, 0.1248494014, 0.1246612966, 0.1243980974,
> +0.1240599006, 0.1236471012, 0.1231596991, 0.1225982010, 0.1219628006,
> +0.1212539002, 0.1204719990, 0.1196174994, 0.1186909974, 0.1176929995,
> +0.1166241020, 0.1154849008, 0.1142762005, 0.1129987016, 0.1116530001,
> +0.1102401987, 0.1087609008, 0.1072160974, 0.1056066975, 0.1039336994,
> +0.1021981016, 0.1004009023, 0.0985433012, 0.0966262966, 0.0946511030,
> +0.0926188976, 0.0905309021, 0.0883883014, 0.0861926004, 0.0839449018,
> +0.0816465989, 0.0792991966, 0.0769039020, 0.0744623989, 0.0719759986,
> +0.0694463030, 0.0668746978, 0.0642627999, 0.0616123006, 0.0589246005,
> +0.0562013984, 0.0534444004, 0.0506552011, 0.0478353985, 0.0449868999,
> +0.0421111993, 0.0392102003, 0.0362856016, 0.0333391018, 0.0303725004,
> +0.0273876991, 0.0243862998, 0.0213702004, 0.0183412991, 0.0153013002,
> +0.0122520998, 0.0091955997, 0.0061335000, 0.0030677000
> +};

this is cos(i/128.0*M_PI)/8
which should be mentioned in a comment at least, also the name should be
changed to reflect what it is

[...]
> +static void antialias(float *buf, float *audio)
> +{
> +    int i, end, mid_hi, mid_lo;
> +    float a, b, c, d, e, f;
> +
> +    end = NELLY_BUF_LEN-1;
> +    mid_hi = NELLY_BUF_LEN/2;
> +    mid_lo = mid_hi-1;
> +
> +    for (i = 0; i < NELLY_BUF_LEN/4; i++) {
> +        a = buf[end-2*i];
> +        b = buf[2*i];
> +        c = buf[2*i+1];
> +        d = buf[end-2*i-1];
> +
> +        e = tcos[i];
> +        f = tsin[i];
> +
> +        audio[2*i] = b*e-a*f;
> +        audio[2*i+1] = -(a*e+b*f);
> +
> +        a = tsin[mid_lo-i];
> +        b = tcos[mid_lo-i];
> +
> +        audio[end-2*i-1] = b*d-a*c;
> +        audio[end-2*i] = -(b*c+a*d);
> +    }

i think it would be more readable if instead of efab
the tcos/tsin would be stored in variables with the names s and c or
wr and wi maybe?

[...]

> +        audio[bot] = audio[mid_up]*t[bot]+state[bot]*t[top];
> +        audio[top] = state[bot]*t[bot]-audio[mid_up]*t[top];
> +        state[bot] = -audio[mid_down];
> +
> +        audio[mid_down] = s_top*t[mid_down]+state[mid_down]*t[mid_up];
> +        audio[mid_up] = state[mid_down]*t[mid_down]-s_top*t[mid_up];
> +        state[mid_down] = -s_bot;

audio[bot] =  audio[mid_up]*t[bot]+state[bot   ]*t[top];
audio[top] =  state[bot   ]*t[bot]-audio[mid_up]*t[top];
state[bot] = -audio[mid_down];

audio[mid_down] =  s_top         *t[mid_down]+state[mid_down]*t[mid_up];
audio[mid_up  ] = state[mid_down]*t[mid_down]-s_top          *t[mid_up];
state[mid_down] = -s_bot;

[...]
> +    bitsum = sum_bits(sbuf, shift_saved, off);
> +
> +    if (bitsum != NELLY_DETAIL_BITS) {
> +        shift = 0;
> +        diff = bitsum - NELLY_DETAIL_BITS;
> +
> +        while (FFABS(diff) <= 16383) {
> +            shift++;
> +            diff *= 2;
> +        }
> +
> +        diff = (diff * NELLY_BASE_OFF) >> 15;
> +        shift = shift_saved-(NELLY_BASE_SHIFT+shift-15);
> +
> +        diff = signed_shift(diff, shift);
> +
> +        for (j = 1; j < 20; j++) {
> +            tmp = off;
> +            off += diff;
> +            last_bitsum = bitsum;
> +
> +            bitsum = sum_bits(sbuf, shift_saved, off);
> +
> +            if ((bitsum-NELLY_DETAIL_BITS) * (last_bitsum-NELLY_DETAIL_BITS) <= 0)
> +                break;
> +        }
> +
> +        if (bitsum != NELLY_DETAIL_BITS) {
> +            if (bitsum > NELLY_DETAIL_BITS) {
> +                big_off = off;
> +                off = tmp;
> +                big_bitsum=bitsum;
> +                small_bitsum=last_bitsum;
> +            } else {
> +                big_off = tmp;
> +                big_bitsum=last_bitsum;
> +                small_bitsum=bitsum;
> +            }
> +
> +            while (bitsum != NELLY_DETAIL_BITS && j <= 19) {
> +                diff = (big_off+off)>>1;
> +                bitsum = sum_bits(sbuf, shift_saved, diff);
> +                if (bitsum > NELLY_DETAIL_BITS) {
> +                    big_off=diff;
> +                    big_bitsum=bitsum;
> +                } else {
> +                    off = diff;
> +                    small_bitsum=bitsum;
> +                }
> +                j++;
> +            }

this is a binary search and i think the variables could be named slightly
better and the whole can be simplified alot unless its buggy (=not a
working binary search)
and maybe it should be moved into its own function

> +
> +            if (abs(big_bitsum-NELLY_DETAIL_BITS) >=
> +                abs(small_bitsum-NELLY_DETAIL_BITS)) {

are the abs needed? isnt the sign of these expresions always the same?

[...]
> +        bitsum = NELLY_DETAIL_BITS;
> +        while (i < NELLY_FILL_LEN) {
> +            bits[i] = 0;
> +            i++;
> +        }
> +    }

the bitsum= here makes no sense, its not used

[...]
> +        antialias(buf, aptr);
> +        ff_fft_permute(&fftc, (FFTComplex*)aptr);
> +        ff_fft_calc(&fftc, (FFTComplex*)aptr);
> +
> +         for (j = 0; j<64; j++) {
> +             ((FFTComplex*)aptr)[j].im = -((FFTComplex*)aptr)[j].im;
> +         }
> +        complex2signal(aptr);

indention looks odd

[...]
> +void nelly_util_floats2shorts(float audio[256], short shorts[256])
> +{
> +    int i;
> +
> +    for (i = 0; i < 256; i++) {
> +        if (audio[i] >= 32767.0)
> +            shorts[i] = 32767;
> +        else if (audio[i] <= -32768.0)
> +            shorts[i] = -32768;
> +        else
> +            shorts[i] = (short)audio[i];
> +    }
> +}

shouldnt this be using the optimized dsp code?

[...]
> +    while (remaining_size >= NELLY_BLOCK_LEN) {
> +        nelly_decode_block(s, buf, s->float_buf);
> +        nelly_util_floats2shorts(s->float_buf, data);
> +        data += NELLY_SAMPLE_SIZE;
> +        *data_size += NELLY_SAMPLE_SIZE;
> +        buf += NELLY_BLOCK_LEN;
> +        remaining_size -= NELLY_BLOCK_LEN;
> +    }

this looks like theres a AVParser missing to split it into packets

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If a bugfix only changes things apparently unrelated to the bug with no
further explanation, that is a good sign that the bugfix is wrong.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20071002/bd3e3e6d/attachment.pgp>