[FFmpeg-devel] [PATCH] Channels correlation

Fri Oct 30 01:42:29 CET 2009

On Thu, Oct 29, 2009 at 08:02:55PM +0100, Nicolas George wrote:
> On Octidi 8 Brumaire, year CCXVIII, Reimar Döffinger wrote:
> > There is a loop over the samples around that, so you do
> > loop over samples
> >    loop over channel combinations
> > I suggested to try
> > loop over channel combinations
> >     loop over samples
> > The latter way should cause no extra cache pressure due to m, though
> > it has a higher cache pressure due to reading the samples once per
> > channel combination so it could be worse.
> > It would however be much better if a planar channel layout was used -
> > it might be worth investigating if you get better results if you split
> > the interleaved channels into one array per channel at some other place.
> 
> Ok, I did not get that. I tried inverting the loops like that, but it
> results in a significant slowdown.
> 
> > Though even if you don't want to do that, at least you could
> > if you ensure that nc1 != 0 (probably a good idea anyway) do
> > something like below, freeing the registers otherwise used for n1 and n2
> > (or just use a variable n = FFMIN(n1, n2) )
> > d1_end = d1 + nc1 * FFMIN(n1, n2);
> 
> I tried various variations on that theme, but none showed any enhancement
> compared with the current code, and some simple changes that should
> obviously speed up things ended up slowing them.
> 
> I think that I am interfering with the compiler optimizations, and that any
> benchmark results would be highly sensitive to the version of the compiler.
> 
> I quite like the current version because it makes the symmetrical role of
> the various variables (d1 and d2, n1 and n2, i and j) obvious.
> 
> > And are nc1 and nc2 allowed to differ? I guess yes, but I am not sure.
> 
> Yes. The typical use for this code would be nc1 = 6 and nc2 = 2, to know how
> surround content has been downmixed to stereo.
> 
> > No point in calculating more than one half of a self-correlation
> > (and thus symmetric) matrix.
> 
> I did that optimization, as it shows a small but visible speedup.
> 
> For some reason, computing the lower half of the matrix, on the other hand,
> causes a big slowdown. Again, I think this is too near the compiler
> optimization process to make any conclusion: I expect that depending on the
> architecture and compiler version, almost any version can be the fastest or
> the slowest.
> 
> I also changed some variables names, hopefully for the better.
[...]
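
For reference, the two loop orders compared in the quoted discussion can be
sketched roughly as below. This is only an illustration, not the code from the
patch: the function names, the float sample type and the row-major layout of m
(m[i * nc2 + j]) are assumptions. The first variant also uses the
d1_end = d1 + nc1 * FFMIN(n1, n2) idea, with FFMIN written out by hand.

```c
#include <stddef.h>

/* Hypothetical sketch: accumulate the nc1 x nc2 cross-products of two
 * interleaved buffers into m, stored row-major as m[i * nc2 + j]. */

/* Variant 1: loop over samples, then over channel combinations. */
static void corr_samples_outer(const float *d1, unsigned nc1,
                               const float *d2, unsigned nc2,
                               unsigned n1, unsigned n2, double *m)
{
    unsigned n = n1 < n2 ? n1 : n2;             /* FFMIN(n1, n2) */
    const float *d1_end = d1 + (size_t)nc1 * n; /* replaces a sample counter */
    unsigned i, j;

    for (; d1 < d1_end; d1 += nc1, d2 += nc2)
        for (i = 0; i < nc1; i++)
            for (j = 0; j < nc2; j++)
                m[i * nc2 + j] += (double)d1[i] * d2[j];
}

/* Variant 2: loop over channel combinations, then over samples.
 * Each combination re-reads the samples with a stride of nc1 / nc2. */
static void corr_channels_outer(const float *d1, unsigned nc1,
                                const float *d2, unsigned nc2,
                                unsigned n1, unsigned n2, double *m)
{
    unsigned n = n1 < n2 ? n1 : n2;
    unsigned i, j, s;

    for (i = 0; i < nc1; i++)
        for (j = 0; j < nc2; j++) {
            double acc = 0;
            for (s = 0; s < n; s++)
                acc += (double)d1[(size_t)s * nc1 + i] * d2[(size_t)s * nc2 + j];
            m[i * nc2 + j] += acc;
        }
}
```

Both variants accumulate the same sums; only the memory access pattern
differs, which is why the relative speed would have to be measured per
compiler and architecture.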

> +/**
> + * Perform a Gauß pivot on a rectangular matrix
> + *
> + * @param m the matrix
> + * @param l number of lines
> + * @param c number of columns
> + * @return  1 if success, 0 if the upper square matrix is singular
> + *
> + * l must be greater than c.
> + * When pivot returns, the upper c×c matrix is the identity and the relevant
> + * result is in the lower (l-c)×c matrix.
> + */
> +static int pivot(double *m, unsigned l, unsigned c)
> +{
> +    unsigned i, j, k;
> +    unsigned pivot_col;
> +    double coef;
> +
> +    for (i = 0; i < c; i++) {
> +        pivot_col = i;
> +        for (j = i; j < c; j++)
> +            if (fabs(m[i * c + j]) > fabs(m[i * c + pivot_col]))
> +                pivot_col = j;
> +        if (pivot_col != i)
> +            for (j = 0; j < l; j++)
> +                FFSWAP(double, m[j * c + i], m[j * c + pivot_col]);
> +        coef = m[i * c + i];
> +        if (coef == 0)
> +            return 0;

> +        for (j = 0; j < l; j++)
> +            m[j * c + i] /= coef;

coef = 1 / coef;
... *= coef
is faster
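
A minimal sketch of that change, assuming the same row-major layout as the
patch (m[j * c + i]). The helper name scale_column is hypothetical, and the
caller is assumed to have already rejected a zero pivot. Results can differ
from the division in the last bit, since x * (1 / coef) is not always rounded
the same way as x / coef.

```c
/* Hypothetical helper: scale column i of an l x c row-major matrix so that
 * the pivot m[i * c + i] becomes 1, using one division and l multiplications
 * instead of l divisions. */
static void scale_column(double *m, unsigned l, unsigned c, unsigned i)
{
    double inv = 1.0 / m[i * c + i];   /* pivot assumed non-zero */
    unsigned j;

    for (j = 0; j < l; j++)
        m[j * c + i] *= inv;
}
```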

> +        for (j = 0; j < c; j++) {
> +            if (j == i)
> +                continue;
> +            coef = m[i * c + j];
> +            for (k = 0; k < l; k++)
> +                m[k * c + j] -= coef * m[k * c + i];

k can start with a value > 0 as the first few columns should already be 0
except the main diagonal elements
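
One way to read that, sketched with a hypothetical helper: with the m[k * c + j]
indexing, k runs over lines, and once steps 0..i-1 are done the first i lines
are already identity rows, so m[k * c + i] is 0 for every k < i and those
iterations subtract nothing. The start index k0 is a parameter here only so
that both variants can be compared; it is not part of the patch.

```c
/* Hypothetical sketch of the elimination step for pivot column i.
 * k0 = 0 is the loop as posted; k0 = i skips the lines where column i is
 * already known to be zero. */
static void eliminate(double *m, unsigned l, unsigned c,
                      unsigned i, unsigned k0)
{
    unsigned j, k;

    for (j = 0; j < c; j++) {
        double coef;

        if (j == i)
            continue;
        coef = m[i * c + j];
        for (k = k0; k < l; k++)
            m[k * c + j] -= coef * m[k * c + i];
    }
}
```

Calling eliminate(m, l, c, i, i) instead of eliminate(m, l, c, i, 0) should
give the same matrix as long as the first i lines are identity rows.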

btw, isn't libavutil/lls.c/h usable?

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The educated differ from the uneducated as much as the living from the
dead. -- Aristotle 