[FFmpeg-devel] [PATCH] Common ACELP code & G.729 [4/7] - G.729 core

Fri Oct 17 11:33:05 CEST 2008

On Fri, Oct 17, 2008 at 03:14:27PM +0700, Vladimir Voroshilov wrote:
> 2008/10/17 Michael Niedermayer <michaelni at gmx.at>:
> 
> >> diff --git ffmpeg-r15432.orig/libavcodec/g729dec.c ffmpeg-r15432.mod/libavcodec/g729dec.c
> >> index 6bd814f..74b0665 100644
> >> --- ffmpeg-r15432.orig/libavcodec/g729dec.c
> >> +++ ffmpeg-r15432.mod/libavcodec/g729dec.c
> >
> >> @@ -30,6 +30,9 @@
> >>  #include "libavutil/avutil.h"
> >>  #include "bitstream.h"
> >>
> >> +#define FRAC_BITS 15
> >> +#include "mathops.h"
> >> +
> >>  #include "g729.h"
> >>  #include "lsp.h"
> >>  #include "acelp_math.h"
> >
> > this is unflexible and confusing, the used shift should not be file global
> >
> > [...]
> >> +static const G729_format_description formats[] =
> >> +{
> >> +  {8000, 10, 160, 13, 1018156},
> >
> >> +// Note: may not work
> >> +  {4400, 11, 176, 17, 1021547},
> >> +};
> >
> > elaborate on "may" please
> > if it does not work, it probably should be removed from the patch
> > or fixed so it does work.
> > if it does work what does the comment mean?
> > and if it has not been tested, test it.
> 
> 4400 mode is (i assume) Chinese extension of G.729. it is not covered
> by official specification.
> And since i have no samples i can't test it. This is why i put those comment.

What makes you think this format even exists?
Is there any source code for a decoder?
Or a specification?
Or is it mentioned in any manual for any hardware?
Iam just a little curious where this stuff comes from ...

> 
> According to above the best way will be just dropping it completely
> (but i whould prefer keeping format structure as is, because it will
> be necessary when G.729 will be added).
> 

> I am also not against dropping BFI code, since i feel myself having
> poor knowledge to compare it with other frame erasure algorithms.
> (But with BFI speech is understandable with 50% of input packets loss,
> afair. I can made this simple test again, if you wish).

and is it better or worse than repeating the previous packet?

> 
> >
> > [...]
> >> @@ -138,6 +531,108 @@ static inline int g729_get_parity(uint8_t value)
> >>                  14,
> >>                  ctx->subframe_size);
> >>
> >> +        memcpy(synth, ctx->syn_filter_data, 10 * sizeof(int16_t));
> >
> > there are 2 memcpies copying this around, i dont see why it would need 2.
> 
> I'll try to describe this as much as i understand it.
> 
> Well... Synthesis filter can be written as
> in[i]=sum{a[k]*out[i-k]},k=0..filter_order, i=0..subframe_size-1
> where a[k] - filter coefficients, a[0] == 1, in[i] - input, out[i] -
> output signal data
> 
> Due to optimization, a[0] is eliminated and filter (in code) is implemented as:
> buf[i]=buf[i] - sum{a[k]*buf[i-1-k]}, k=0..filter_order-1
> 
> Thus, i need a buffer of length subframe_size+(filter_order-1).
> But... the top of this buffer must contains k-1 samples from the end
> of previous filter run result.
> Brrr. I mean:
> buffer after Nth call:
> (0),..,(k-1),(k),..,[(subframe_size-k),..,(subframe_size-1)]
> 
> buffer before (N+1)th  call:
> [(0),..,(k-1)], (the rest is garbage yet)
> 
> data in brackets must be the same.

fine, so much i knew alraedy, that makes it one memcpy from the end to the
start, not 2

> 
> When synthesis filter is single filter working with this buffer, there
> is enough one memmove.
> Unfortunately there are also high-pass filter and postfilter, which
> also changes buffer.
> So, i have to save synthesis filter output data before passing them
> to below filters and restore saved
> data back before next call to synthesis filter.

IIRC (ive not rechecked the code)
there is data for 2 subframes
[0..k] frame0 frame1
by the time frame1 is done the memcpy would move the data from frame1
over frame0, i thought this would not affect subsequent steps run over
frame1?
but again ive not re read the code so i might be totally wrong.

[...]
> >
> >
> >> +
> >> +            ff_acelp_lp_synthesis_filter(
> >> +                    synth+10,
> >> +                    &lp[i][1],
> >> +                    ctx->exc  + i * ctx->subframe_size,
> >> +                    ctx->subframe_size,
> >> +                    10,
> >> +                    0,
> >> +                    0x800);
> >> +        }
> >
> >> +        /* Save data (without postfilter) for use in next subframe. */
> >> +        memcpy(ctx->syn_filter_data, synth+ctx->subframe_size, 10 * sizeof(int16_t));
> > [...]
> >> +    /* Save signal for use in next frame. */
> >
> > "signal" and "data" are not acceptable descriptions, these 2 words are just
> > random.
> >
> >
> >> +    memmove(ctx->exc_base, ctx->exc_base + 2 * ctx->subframe_size, (PITCH_DELAY_MAX+INTERPOL_LEN)*sizeof(int16_t));
> >> +}
> >> +
> >
> >> +/**
> >> + * \brief Decodes one G.729 frame (10 bytes long) into parameter vector.
> >> + * \param format used format (8k/4.4k/etc)
> >> + * \param buf 10 bytes of decoder parameters
> >> + * \param buf_size size of input buffer
> >> + * \param parm [out] decoded codec parameters
> >> + *
> >> + * \return 1 if frame erasure detected, 0 - otherwise
> >> + */
> >> +static int g729_bytes2parm(int format, const uint8_t *buf, int buf_size, G729_parameters *parm)
> >> +{
> >
> > You mix in this description the word parameter and frame completely
> > randomly, which considering this is said to convert a frame to parameters
> > just cause the reader to waste time as he will try to make sense of it until
> > he realizes that there is no sense in this and that reading the source is
> > needed.
> 
> In quoted comment i meant:
> "G729 frame (10 bytes long)" == packed frame
> "parameter vector" == structure, contains unpacked coefficients from above.
> "frame" == two (perhaps, can be more, i didn't read all Annexes yet)
> subframes == unpacked (decoded) frame
> Each subframe contains 10ms of speech in 8kHz case.

dont explain it to me, iam reading your code and ignoring your comments
anyway :)
please rather fix the docs

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The misfortune of the wise is better than the prosperity of the fool.
-- Epicurus
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20081017/a3a10da5/attachment.pgp>