[FFmpeg-devel] [PATCH] Common ACELP code & G.729 [1/7] - filters

Wed May 7 05:23:50 CEST 2008

On Wed, May 7, 2008 at 7:47 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> 
> On Wed, May 07, 2008 at 07:05:07AM +0700, Vladimir Voroshilov wrote:
>  > On Wed, May 7, 2008 at 3:42 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
>  > >
>  > > On Tue, May 06, 2008 at 11:24:38PM +0700, Vladimir Voroshilov wrote:
>  > >  > On Mon, May 5, 2008 at 3:42 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
>  > >  > > On Sun, May 04, 2008 at 01:19:10PM +0700, Vladimir Voroshilov wrote:
>  > >  > >  > On Sat, May 3, 2008 at 7:23 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
>  > >  > >  > > On Sat, May 03, 2008 at 03:24:54PM +0700, Vladimir Voroshilov wrote:
>  > >  > >  > >  > Michael Niedermayer wrote:
>  > >  > >  > >  [...]
>  > >  > >  > >  > +void ff_acelp_convolve_circ(
>  > >  > >  > >  > +        int16_t* fc_out,
>  > >  > >  > >  > +        const int16_t* fc_in,
>  > >  > >  > >  > +        const int16_t* filter,
>  > >  > >  > >  > +        int subframe_size)
>  > >  > >  > >  > +{
>  > >  > >  > >  > +    int i, k;
>  > >  > >  > >  > +
>  > >  > >  > >  > +    memset(fc_out, 0, subframe_size * sizeof(int16_t));
>  > >  > >  > >
>  > >  > >  > > > +
>  > >  > >  > >  > +    /* Since there are few pulses over all subframe (i.e. almost all
>  > >  > >  > >
>  > >  > >  > > > +       fc_in[i] are zero, in case of G.729D it is only two non-zero
>  > >  > >  > >  > +       samples of total 40), it is faster to swap two loops and process
>  > >  > >  > >  > +       non-zero samples only. This will reduce number of multiplications
>  > >  > >  > >
>  > >  > >  > > > +       from 40*40 to 2*40 for G.729D */
>  > >  > >  > >
>  > >  > >  > >  doesnt ff_acelp_fc_enchance_harmonics() increase the number of non 0
>  > >  > >  > >  elements above 2 ?
>  > >  > >  >
>  > >  > >  > Perhaps i misspelled sentence.
>  > >  > >  > I meant that using swapped loops with checking for non-zero will
>  > >  > >  > require 2*40 multiplications,
>  > >  > >
>  > >  > >  The sentance is fine.
>  > >  > >  What i meant is that ff_acelp_fc_enchance_harmonics() can increase the number
>  > >  > >  of non zero samples above 2. Or do i miss somehing that prevents this?
>  > >  >
>  > >  > This is exactly what this filter intended to do.
>  > >  > It get buffer with few pulses and smooth them over all subframe.
>  > >  > In reference code it is caller "pitch_shrp" ("pitch sharpening" i guess).
>  > >
>  > >  What iam trying to say is, if ff_acelp_fc_enchance_harmonics() increases the
>  > >  number of non zero samples above 2 then the comment is no longer correct.
>  > >  As its no longer exactly 2 for g729d
>  >
>  > I'm afraid you still don't understand what I wanted to say.
>  > 1. At each call "in" buffer for g729d contains exactly two no-zero values.
>  
>  Then please explain me the following:
>  
>         memset(fc, 0, sizeof(int16_t) * ctx->subframe_size);
>         formats[ctx->format].decode_fixed(fc, parm->fc_indexes[i], parm->pulses_signs[i]);
>  ** here fc will have exactly 2 non zero elements **
>         ff_acelp_fc_enchance_harmonics(fc, pitch_delay_int, ctx->pitch_sharp, 

damn. totally forget about this and messed up convolve_circ and enhance_harmonics.

[...]

>  ------
>  The reason why id like this clarified is because if there really are just
>  2 non zero elements in there then a simple array of 2 int could be used to
>  store them and this should be much simpler. But it does not look like there
>  are just 2 (or maybe the code is buggy or iam stupid ...)

No. You are right and I am stupid :)
Comment in convolve circ is totally wrong and will be 
removed (or changed - i'll try to check how many zeroes
can stay in buffer after enhancing).

>  > 2. During process these two pulses are smoothed over all subframe and
>  > written to "output" (!) buffer. "in" buffer remains the same.
>  > 3. When used formula from doxygen comments, filter will make 40*40
>  > multiplications (regardless of values in "in" buffer).
>  > 4. When two loops are swapped (comparing to formula in doxygen
>  > comment) and check for non-zero is added, filter will make only 2*40
>  > multiplications (since input buffer is never touched).
>  >
>  > This is why i asked for phrase correctness (i meant not english, but
>  > phrase meaning).
>  >
>  > >  > +void ff_acelp_high_pass_filter(
>  > >  > +        int16_t* out,
>  > >  > +        int16_t* hpf_z,
>  > >  > +        int* hpf_f,
>  > >  > +        const int16_t* in,
>  > >  > +        int length)
>  > >  > +{
>  > >  > +    int i;
>  > >  > +
>  > >  > +    for(i=0; i<length; i++)
>  > >  > +    {
>  > >  > +        memmove(hpf_z + 1, hpf_z, 2 * sizeof(hpf_z[0]));
>  > >  > +        hpf_z[0] = in[i];
>  > >  > +        hpf_f[0] =  MULL(hpf_f[1], 15836);                     /* (14.13) = (13.13) * (1.13) */
>  > >  > +        hpf_f[0] += MULL(hpf_f[2], -7667);                     /* (13.13) = (13.13) * (0.13) */
>  > >  > +        hpf_f[0] += 7699 * (hpf_z[0] - 2*hpf_z[1] + hpf_z[2]); /* (14.13) =  (0.13) * (14.0) */
>  > >
>  > > > +
>  > >  > +        /* Multiplication by 2 with rounding can cause short type
>  > >  > +           overflow, thus clipping is required. */
>  > >  > +
>  > >  > +        out[i] = av_clip_int16((hpf_f[0] + 0x800) >> 12);      /* (15.0) = 2 * (13.13) = (14.13) */
>  > >  > +
>  > >  > +        memmove(hpf_f + 1, hpf_f, 2 * sizeof(hpf_f[0]));
>  > >  > +    }
>  > >
>  > >
>  > >  t =  MULL(hpf_f[0], 15836);                     /* (14.13) = (13.13) * (1.13) */
>  > >    + MULL(hpf_f[1], -7667);                     /* (13.13) = (13.13) * (0.13) */
>  > >    + 7699 * (in[i] - 2*in[i-1] + in[i-2]);      /* (14.13) =  (0.13) * (14.0) */
>  > >
>  > >  out[i] = av_clip_int16((t + 0x800) >> 12);      /* (15.0) = 2 * (13.13) = (14.13) */
>  > >
>  > >  hpf_f[1]= hpf_f[0];
>  > >  hpf_f[0]= t;
>  >
>  > This will require either two samples before start in "in" buffer, copied from
>  > the previous subframe or
>  
>  yes, i thought that would be just something like
>  memcpy(in-2, in+len-2, 2*sizeof(*in))
>  
>  ?
>  
>  If so that should be simpler than it is currently
>  
Current code work as following:

context
{
  syn_filter_data
  post_filter_data
  hpf_z
  hpf_z
}

mainloop
  {
   ...
1  syn_filter(syn_filter_data+10)
2  memcpy(syn_filter_dta+10, syn_filter_data+subframe_size,10)
3  postfilter(syn_filter_data+10, postfilter_data) 
6  high_pass_filter(out_data,syn_filter_data+10,hpf_z,hpf_f)
7 }

syn_filter requires first 10 items of previous data saved just after call to it.

postfilter does not requires previous data (it uses post_filter_data as such),
but it changes contents of syn_filter data (from 10 to the end of buffer).

high_pass_filter (in your suggestion) requires previous data in buffer,
but buffer does not contains it already (were overwritten in line 2)

The problem is in fact that syn_filter requires unfiltered past data, while
high_pass_filter requires them filtered with postfilter.

Moreover, since top of buffer already contains data saved for future
syn_filter call, i cant save there
copy of hpf_z (can't do memcpy(in,hpf_z) inside filter).

>  [...]
>  
>  --
>  Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>  
>  The greatest way to live with honor in this world is to be what we pretend
>  to be. -- Socrates
>  
> -----BEGIN PGP SIGNATURE-----
>  Version: GnuPG v1.4.6 (GNU/Linux)
>  
>  iD8DBQFIIPwNYR7HhwQLD6sRAgBIAJ9DmQCpCDL2hzuSmibUosPgjvLHeACcCM3L
>  vr71RzY3iklK/lTwNI8XQYw=
>  =CLrC
>  -----END PGP SIGNATURE-----
>  

-- 
Regards,
Vladimir Voroshilov mailto:voroshil at gmail.com
JID: voroshil at gmail.com, voroshil at jabber.ru
ICQ: 95587719