[FFmpeg-devel] Nellymoser encoder

Fri Aug 29 20:55:23 CEST 2008

Friday 29 August 2008 15:54:32 Michael Niedermayer wrote:
> On Fri, Aug 29, 2008 at 03:11:59PM +0200, Bartlomiej Wolowiec wrote:
> > Friday 29 August 2008 00:02:36 Michael Niedermayer wrote:
> > > > +#define LUT_init_add -3134
> > > > +#define LUT_init_size 31355 + LUT_init_add
> > > > +static int LUT_init_table[LUT_init_size];
> > >
> > > i do not believe that the table needs to be that large
> > >
> > > > +
> > > > +#define LUT_delta_add 11725
> > > > +#define LUT_delta_size 12975 + LUT_delta_add
> > > > +static int LUT_delta_table[LUT_delta_size];
> > > > +
> > > > +#define LUT_dequantization_mul 128.0
> > > > +#define LUT_dequantization_add LUT_dequantization_mul * 2.7
> > > > +#define LUT_dequantization_size (int)(LUT_dequantization_mul * 2.5 + LUT_dequantization_add)
> > > > +#define LUT_dequantization_maxbits 6
> > > > +static int LUT_dequantization_table[LUT_dequantization_maxbits][LUT_dequantization_size];
> > >
> > > neither do i believe that this one needs to be that large
> > > besides they both can be uint8_t instead of int
> > >
> > > and the tables for fewer bits don't need to be as large as the largest
> >
> > Ok, I've tried to change the sizes of these arrays. Unfortunately, I now
> > have a problem: I don't know how to allocate memory for
> > LUT_dequantization_table simply while keeping the whole thing thread-safe.
>
> drop all the messy stuff and the problems will disappear

Ok, I cleaned it up significantly. Now it looks much better.

> > > > +
> > > > +void apply_mdct(NellyMoserEncodeContext *s, float *in, float *coefs)
> > > > +{
> > > > +    DECLARE_ALIGNED_16(float, in_buff[NELLY_SAMPLES]);
> > > > +
> > > > +    memcpy(&in_buff[0], &in[0], NELLY_SAMPLES * sizeof(float));
> > > > +    s->dsp.vector_fmul(in_buff, ff_sine_128, NELLY_BUF_LEN);
> > > > +    s->dsp.vector_fmul_reverse(in_buff + NELLY_BUF_LEN, in_buff + NELLY_BUF_LEN, ff_sine_128, NELLY_BUF_LEN);
> > > > +    ff_mdct_calc(&s->mdct_ctx, coefs, in_buff);
> > > > +}
> > >
> > > The data is copied once in encode_frame and twice here
> > > There is no need to copy the data 3 times.
> > > vector_fmul can be used with a single memcpy to get the data into any
> > > destination, and vector_fmul_reverse doesn't even need 1 memcpy, so
> > > overall a single memcpy is enough
> >
> > Hope that you meant something similar to my solution.
>
> no, you still do 2 memcpy() but now the code is really messy as well.
>
> what you should do is, for each block of samples you get from the user
> 1. apply one half of the window onto it with vector_fmul_reverse and
>    destination of some internal buffer
> 2. memcpy into the 2nd destination and apply the other half of the
>    window onto it with vector_fmul
> 3. run the mdct as appropriate on the internal buffers.

Hmm, I considered that, but I don't understand exactly what I should change...
The code copies the data twice:
a) in encode_frame - I convert int16_t to float and copy the data to s->buf. I
need to do this somewhere because vector_fmul requires float *. Additionally,
part of the data is needed for the next call of encode_frame.
b) in apply_mdct - here I think an additional part of the buffer is needed.
If I understood correctly, I have to get rid of a), but then how do I access
the old data on the next call of encode_frame, and how do I call
vector_fmul on int16_t?
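
The three steps Michael lists above could be sketched roughly as follows. This
is only an illustration, not code from the patch: WindowState, push_block and
BUF_LEN are hypothetical names, the two vector_* helpers are plain-C stand-ins
modelled on the DSPContext routines, and the block is assumed to have been
converted from int16_t to float already. Two alternating buffers keep the
"old" half available for the next call, so each block needs one memcpy.

```c
#include <assert.h>
#include <string.h>

#define BUF_LEN 128   /* hypothetical stand-in for NELLY_BUF_LEN */

/* Stand-in for dsp.vector_fmul: dst[i] *= src[i] (in place). */
static void vector_fmul(float *dst, const float *src, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] *= src[i];
}

/* Stand-in for dsp.vector_fmul_reverse: dst[i] = src0[i] * src1[len-1-i]. */
static void vector_fmul_reverse(float *dst, const float *src0,
                                const float *src1, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = src0[i] * src1[len - 1 - i];
}

/* Hypothetical state: two MDCT input buffers used alternately. */
typedef struct {
    float buf[2][2 * BUF_LEN];
    int   sel;                  /* buffer being completed by the next block */
} WindowState;

/* Feed one block of BUF_LEN float samples.  Returns the buffer that is now
 * a complete, windowed MDCT input of 2 * BUF_LEN samples. */
static const float *push_block(WindowState *w, const float *window,
                               const float *block)
{
    assert(w && window && block);
    float *cur  = w->buf[w->sel];
    float *next = w->buf[w->sel ^ 1];

    /* step 1: falling half of the window, written straight to its
     * destination (second half of the current frame), no memcpy */
    vector_fmul_reverse(cur + BUF_LEN, block, window, BUF_LEN);

    /* step 2: the single memcpy, then the rising half in place
     * (first half of the following frame) */
    memcpy(next, block, BUF_LEN * sizeof(*block));
    vector_fmul(next, window, BUF_LEN);

    /* step 3: the caller runs the MDCT on the returned buffer */
    w->sel ^= 1;
    return cur;
}
```

The returned pointer stays valid only until the next push_block call, which
reuses the first half of that buffer.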

> > > [...]
> > >
> > > > +#define find_best(val, table, LUT, LUT_add, LUT_mul, LUT_size) \
> > > > +    best_idx = \
> > > > +        LUT[av_clip (lrintf(val * LUT_mul + LUT_add), 0, LUT_size - 1)]; \
> > > > +    if (fabs(val - table[best_idx]) > fabs(val - table[best_idx + 1])) \
> > > > +        best_idx++;
> > >
> > > this can be an inline function which would be cleaner
> > >
> > >
> > > [...]
> >
> > it's possible, but I don't know how to do it, because the table may be of
> > type float, uint16_t or int16_t
>
> ok, i've indeed missed that, but if the type can be int and float then this
> is not acceptable as such because fabs() would convert to float and back
> when the input is int.
> which brings us back to the beginning: this macro isn't a good idea. the
> single use of the float one can just be written where it's used, and
> the other could either be written out as well or be an inline function.
>
>
> [...]
>
> > +/**
> > + * @file nellymoserenc.c
> > + * Nellymoser encoder
> > + * by Bartlomiej Wolowiec
> > + *
> >
> > + * Generic codec information: libavcodec/nellymoserdec.c
> > + * Log search algorithm idea: http://www1.mplayerhq.hu/ASAO/ASAO.zip
> > + * (Copyright Joseph Artsimovich and UAB "DKD")
>
> is this still in the code?

Hmm... it's too distant from that idea... ;)

> > + *
> > + * for more information about nellymoser format, visit:
> > + * http://wiki.multimedia.cx/index.php?title=Nellymoser
> > + */
> > +
> > +#include "nellymoser.h"
> > +#include "avcodec.h"
> > +#include "dsputil.h"
> > +
> > +#define BITSTREAM_WRITER_LE
> > +#include "bitstream.h"
> > +
> > +#define POW_TABLE_SIZE (1<<11)
> > +#define POW_TABLE_OFFSET 3
> > +
> > +typedef struct NellyMoserEncodeContext {
> > +    AVCodecContext  *avctx;
> > +    int             last_frame;
> > +    int             bufsel;
> > +    int             have_saved;
> > +    DSPContext      dsp;
> > +    MDCTContext     mdct_ctx;
> > +    DECLARE_ALIGNED_16(float, mdct_out[NELLY_SAMPLES]);
> > +    DECLARE_ALIGNED_16(float, buf[2][3 * NELLY_BUF_LEN]);     ///< sample buffer
> > +} NellyMoserEncodeContext;
> > +
> > +static float pow_table[POW_TABLE_SIZE];     ///< -pow(2, -i / 2048.0 - 3.0);
> > +
> >
> > +#define LUT_init_mul (1.0/256.0)
>
> please do not use floats to do a >> of integers

val has float type, but >> could also be used there. It doesn't matter,
because the result of this operation is converted to int anyway.
-- 
Bartlomiej Wolowiec
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nellymoser4.patch
Type: text/x-diff
Size: 13040 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080829/4b26ca30/attachment.patch>