[FFmpeg-devel] [PATCH] WMA Voice postfilter
Vitor Sessak
vitor1001
Thu Mar 18 21:33:30 CET 2010
Ronald S. Bultje wrote:
> Hi,
>
> see attached, please be kind.
Small problem:
> vitor at vitor-laptop:~/ffmpeg/ffmpeg.2$ svn up libavcodec/wmavoice{.c,_data.h}
> At revision 22593.
> At revision 22593.
> vitor at vitor-laptop:~/ffmpeg/ffmpeg.2$ patch -p1 < /tmp/wmavoice-apf.patch
> patching file libavcodec/wmavoice.c
> Hunk #8 FAILED at 1517.
> Hunk #9 succeeded at 1773 (offset 1 line).
> Hunk #10 succeeded at 1799 (offset 1 line).
> Hunk #11 succeeded at 1956 (offset 1 line).
> Hunk #12 succeeded at 1984 (offset 1 line).
> Hunk #13 succeeded at 2002 (offset 1 line).
> 1 out of 13 hunks FAILED -- saving rejects to file libavcodec/wmavoice.c.rej
> patching file libavcodec/wmavoice_data.h
>
> Index: ffmpeg-svn/libavcodec/wmavoice.c
> ===================================================================
> --- ffmpeg-svn.orig/libavcodec/wmavoice.c 2010-03-16 18:57:01.000000000 -0400
> +++ ffmpeg-svn/libavcodec/wmavoice.c 2010-03-18 14:16:35.000000000 -0400
> @@ -36,6 +36,8 @@
> #include "acelp_filters.h"
> #include "lsp.h"
> #include "libavutil/lzo.h"
> +#include "avfft.h"
> +#include "fft.h"
>
> #define MAX_BLOCKS 8 ///< maximum number of blocks per frame
> #define MAX_LSPS 16 ///< maximum filter order
> @@ -142,6 +144,12 @@
>
> int do_apf; ///< whether to apply the averaged
> ///< projection filter (APF)
> + int denoise_strength; ///< strength of denoising in Wiener filter
> + ///< [0-11]
> + int denoise_tilt_corr; ///< Whether to apply tilt correction to the
> + ///< Wiener filter coefficients (postfilter)
> + int dc_level; ///< Predicted amount of DC noise, based
> + ///< on which a DC removal filter is used
I would add a
/* postfilter specific */
comment to separate it from the other global values.
> +static void adaptive_gain_control(float *buf_out, const float *speech_synth,
> + int size, float alpha, float *gain_mem)
> +{
> + int i;
> + float speech_energy = 0.0, postfilter_energy = 0.0, gain_scale_factor;
> + float mem = *gain_mem;
> +
> + for (i = 0; i < size; i++) {
> + speech_energy += fabs(speech_synth[i]);
> + postfilter_energy += fabs(buf_out[i]);
fabsf() is probably faster on x86-64, since the samples are floats and
fabs() works in double precision.
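For illustration, a minimal sketch of the single-precision variant. The helper name abs_sum() is hypothetical and not part of the patch; it only shows the accumulation pattern from the loop above with fabsf() in place of fabs().

```c
#include <math.h>

/* Hypothetical helper, not part of the patch: sum of absolute values
 * of a float buffer, kept in single precision throughout. */
static float abs_sum(const float *buf, int size)
{
    float sum = 0.0f;
    int i;

    for (i = 0; i < size; i++)
        sum += fabsf(buf[i]); /* fabs() would take and return a double here */

    return sum;
}
```

In the patch, the same substitution would apply to both the speech_energy and the postfilter_energy accumulations.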
> + /* calculate the Hilbert transform of the gains, which we do (since this
> + * is a sinus input) by doing a phase shift (in theory, H(sin())=cos()).
> + * Because input is symmetric (mirror above), every im[n] is zero. */
> + ff_rdft_calc(&s->rdft, &lpcs[1]);
> + lpcs[1] = lpcs[2];
> + lpcs[2] = lpcs[0] = 0;
> + ff_rdft_calc(&s->irdft, lpcs);
I think this deserves to be in a separate function (which would also
include the mirroring); it could then be reused if we need a Hilbert
transform in another codec. Also, I think it should be possible to do it
with an FFT half as big...
> +/**
> + * Averaging projection filter, the postfilter used in WMAVoice.
> + *
> + * This uses the following steps:
> + * - A zero-synthesis filter (generate excitation from synth signal)
> + * - Kalman smoothing on excitation, based on pitch
> + * - Re-synthesized smoothened output
> + * - Iterative Wiener denoise filter
> + * - Adaptive gain filter
> + * - DC filter
> + *
> + * @param s WMAVoice decoding context
> + * @param synth Speech synthesis output (before postfilter)
> + * @param samples Output buffer for filtered samples
> + * @param size Buffer size of synth & samples
> + * @param lpcs Generated LPCs used for speech synthesis
> + * @param fcb_type Frame type (silence, hardcoded, AW-pulses or FCB-pulses)
> + * @param pitch Pitch of the input signal
> + */
> +static void postfilter(WMAVoiceContext *s, const float *synth,
> + float *samples, int size,
> + const float *lpcs, float *zero_exc_pf,
> + int fcb_type, int pitch)
size is always 80, so it's better to fix it with a define.
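As a sketch of the idea only: the macro name PF_FRAME_SIZE and the helper scale_frame() below are hypothetical, and the value 80 is taken from the comment above. With a compile-time constant, the size parameter disappears and the loop bound is known to the compiler.

```c
/* Hypothetical name; the value comes from the review comment. */
#define PF_FRAME_SIZE 80

/* Hypothetical helper: applies a gain to one fixed-size frame.
 * No size argument is needed since the length is a constant. */
static void scale_frame(float *samples, float gain)
{
    int i;

    for (i = 0; i < PF_FRAME_SIZE; i++)
        samples[i] *= gain;
}
```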
I'll take a second look at it later when I have the time.
-Vitor