[FFmpeg-devel] [PATCH] NellyMoser audio decoder v2
Benjamin Larsson
banan
Wed Oct 10 11:00:14 CEST 2007
Here's the comments for the Nellymoser update patch.
Michael Niedermayer wrote:
> Hi
>
> On Mon, Oct 01, 2007 at 02:25:59PM +0200, Benjamin Larsson wrote:
>
>> Benjamin Larsson wrote:
>>
>>> [...]
>>> Hi, I had some time yesterday and looked closer at the code. While there
>>> is no doubt that the code is a TDAC implementation it doesn't seem like
>>> the code can be replaced by the mdct code in ffmpeg. So I guess that the
>>> fft replacement is what can be clean up for now. I'll sort out the table
>>> generation for the rotations and the window.
>>>
>
> if its not IMDCT then what is it?
>
>
> [...]
>
>
>
>> +static const int ff_band_sizes_table[23] = {
>> +2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 4, 4, 5, 6, 6, 7, 8, 9, 10, 12, 14, 15
>> +};
>>
>
> fits in uint8_t
>
>
>
Fixed.
>> +
>> +static const float ff_nelly_signal_table[64] = {
>> +0.1250000000, 0.1249623969, 0.1248494014, 0.1246612966, 0.1243980974,
>> +0.1240599006, 0.1236471012, 0.1231596991, 0.1225982010, 0.1219628006,
>> +0.1212539002, 0.1204719990, 0.1196174994, 0.1186909974, 0.1176929995,
>> +0.1166241020, 0.1154849008, 0.1142762005, 0.1129987016, 0.1116530001,
>> +0.1102401987, 0.1087609008, 0.1072160974, 0.1056066975, 0.1039336994,
>> +0.1021981016, 0.1004009023, 0.0985433012, 0.0966262966, 0.0946511030,
>> +0.0926188976, 0.0905309021, 0.0883883014, 0.0861926004, 0.0839449018,
>> +0.0816465989, 0.0792991966, 0.0769039020, 0.0744623989, 0.0719759986,
>> +0.0694463030, 0.0668746978, 0.0642627999, 0.0616123006, 0.0589246005,
>> +0.0562013984, 0.0534444004, 0.0506552011, 0.0478353985, 0.0449868999,
>> +0.0421111993, 0.0392102003, 0.0362856016, 0.0333391018, 0.0303725004,
>> +0.0273876991, 0.0243862998, 0.0213702004, 0.0183412991, 0.0153013002,
>> +0.0122520998, 0.0091955997, 0.0061335000, 0.0030677000
>> +};
>>
>
> this is cos(i/128.0*M_PI)/8
> which should be mentioned in a comment at least, also the name should be
> changed to reflect what it is
>
>
Fixed.
> [...]
>
>> +static void antialias(float *buf, float *audio)
>> +{
>> + int i, end, mid_hi, mid_lo;
>> + float a, b, c, d, e, f;
>> +
>> + end = NELLY_BUF_LEN-1;
>> + mid_hi = NELLY_BUF_LEN/2;
>> + mid_lo = mid_hi-1;
>> +
>> + for (i = 0; i < NELLY_BUF_LEN/4; i++) {
>> + a = buf[end-2*i];
>> + b = buf[2*i];
>> + c = buf[2*i+1];
>> + d = buf[end-2*i-1];
>> +
>> + e = tcos[i];
>> + f = tsin[i];
>> +
>> + audio[2*i] = b*e-a*f;
>> + audio[2*i+1] = -(a*e+b*f);
>> +
>> + a = tsin[mid_lo-i];
>> + b = tcos[mid_lo-i];
>> +
>> + audio[end-2*i-1] = b*d-a*c;
>> + audio[end-2*i] = -(b*c+a*d);
>> + }
>>
>
> i think it would be more readable if instead of efab
> the tcos/tsin would be stored in variables with the names s and c or
> wr and wi maybe?
>
> [...]
>
Rewritten (compacted).
>
>> + audio[bot] = audio[mid_up]*t[bot]+state[bot]*t[top];
>> + audio[top] = state[bot]*t[bot]-audio[mid_up]*t[top];
>> + state[bot] = -audio[mid_down];
>> +
>> + audio[mid_down] = s_top*t[mid_down]+state[mid_down]*t[mid_up];
>> + audio[mid_up] = state[mid_down]*t[mid_down]-s_top*t[mid_up];
>> + state[mid_down] = -s_bot;
>>
>
> audio[bot] = audio[mid_up]*t[bot]+state[bot ]*t[top];
> audio[top] = state[bot ]*t[bot]-audio[mid_up]*t[top];
> state[bot] = -audio[mid_down];
>
> audio[mid_down] = s_top *t[mid_down]+state[mid_down]*t[mid_up];
> audio[mid_up ] = state[mid_down]*t[mid_down]-s_top *t[mid_up];
> state[mid_down] = -s_bot;
>
>
> [...]
>
Fixed.
>> + bitsum = sum_bits(sbuf, shift_saved, off);
>> +
>> + if (bitsum != NELLY_DETAIL_BITS) {
>> + shift = 0;
>> + diff = bitsum - NELLY_DETAIL_BITS;
>> +
>> + while (FFABS(diff) <= 16383) {
>> + shift++;
>> + diff *= 2;
>> + }
>> +
>> + diff = (diff * NELLY_BASE_OFF) >> 15;
>> + shift = shift_saved-(NELLY_BASE_SHIFT+shift-15);
>> +
>> + diff = signed_shift(diff, shift);
>> +
>> + for (j = 1; j < 20; j++) {
>> + tmp = off;
>> + off += diff;
>> + last_bitsum = bitsum;
>> +
>> + bitsum = sum_bits(sbuf, shift_saved, off);
>> +
>> + if ((bitsum-NELLY_DETAIL_BITS) * (last_bitsum-NELLY_DETAIL_BITS) <= 0)
>> + break;
>> + }
>> +
>> + if (bitsum != NELLY_DETAIL_BITS) {
>> + if (bitsum > NELLY_DETAIL_BITS) {
>> + big_off = off;
>> + off = tmp;
>> + big_bitsum=bitsum;
>> + small_bitsum=last_bitsum;
>> + } else {
>> + big_off = tmp;
>> + big_bitsum=last_bitsum;
>> + small_bitsum=bitsum;
>> + }
>> +
>> + while (bitsum != NELLY_DETAIL_BITS && j <= 19) {
>> + diff = (big_off+off)>>1;
>> + bitsum = sum_bits(sbuf, shift_saved, diff);
>> + if (bitsum > NELLY_DETAIL_BITS) {
>> + big_off=diff;
>> + big_bitsum=bitsum;
>> + } else {
>> + off = diff;
>> + small_bitsum=bitsum;
>> + }
>> + j++;
>> + }
>>
>
> this is a binary search and i think the variables could be named slightly
> better and the whole can be simplified alot unless its buggy (=not a
> working binary search)
> and maybe it should be moved into its own function
>
>
>
Looked at it but I don't really understand what it does. Help cleaning
this up would be appreciated.
>> +
>> + if (abs(big_bitsum-NELLY_DETAIL_BITS) >=
>> + abs(small_bitsum-NELLY_DETAIL_BITS)) {
>>
>
> are the abs needed? isnt the sign of these expresions always the same?
>
> [...]
>
They are needed.
>> + bitsum = NELLY_DETAIL_BITS;
>> + while (i < NELLY_FILL_LEN) {
>> + bits[i] = 0;
>> + i++;
>> + }
>> + }
>>
>
> the bitsum= here makes no sense, its not used
>
>
> [...]
>
Removed.
>> + antialias(buf, aptr);
>> + ff_fft_permute(&fftc, (FFTComplex*)aptr);
>> + ff_fft_calc(&fftc, (FFTComplex*)aptr);
>> +
>> + for (j = 0; j<64; j++) {
>> + ((FFTComplex*)aptr)[j].im = -((FFTComplex*)aptr)[j].im;
>> + }
>> + complex2signal(aptr);
>>
>
> indention looks odd
>
>
> [...]
>
Removed.
>> +void nelly_util_floats2shorts(float audio[256], short shorts[256])
>> +{
>> + int i;
>> +
>> + for (i = 0; i < 256; i++) {
>> + if (audio[i] >= 32767.0)
>> + shorts[i] = 32767;
>> + else if (audio[i] <= -32768.0)
>> + shorts[i] = -32768;
>> + else
>> + shorts[i] = (short)audio[i];
>> + }
>> +}
>>
>
> shouldnt this be using the optimized dsp code?
>
> [...]
>
Fixed.
>> + while (remaining_size >= NELLY_BLOCK_LEN) {
>> + nelly_decode_block(s, buf, s->float_buf);
>> + nelly_util_floats2shorts(s->float_buf, data);
>> + data += NELLY_SAMPLE_SIZE;
>> + *data_size += NELLY_SAMPLE_SIZE;
>> + buf += NELLY_BLOCK_LEN;
>> + remaining_size -= NELLY_BLOCK_LEN;
>> + }
>>
>
> this looks like theres a AVParser missing to split it into packets
>
>
> [...]
>
Rewritten.
MvH
Benjamin Larsson
More information about the ffmpeg-devel
mailing list