[FFmpeg-devel] [PATCH] ALAC Encoder

Michael Niedermayer michaelni
Wed Aug 20 16:20:41 CEST 2008


On Wed, Aug 20, 2008 at 02:21:56AM -0300, Ramiro Polla wrote:
> Hi Michael,
> 
> On Mon, Aug 18, 2008 at 11:01 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> > On Mon, Aug 18, 2008 at 09:38:53AM -0300, Ramiro Polla wrote:
> >> Hi,
> >>
> >> On Sun, Aug 17, 2008 at 12:55 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> >> > On Sun, Aug 17, 2008 at 11:17:27AM -0300, Ramiro Polla wrote:
> >> >> On Sun, Aug 17, 2008 at 10:15 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> >> >> > On Sun, Aug 17, 2008 at 09:09:00AM +0530, Jai Menon wrote:
> >> >> >> Hi,
> >> >> >>
> >> >> >> On Sunday 17 Aug 2008 8:05:14 am Michael Niedermayer wrote:
> >> >> >> > On Sun, Aug 17, 2008 at 04:14:43AM +0530, Jai Menon wrote:
> >> >> > [...]
> >> >> >> > > +static void alac_stereo_decorrelation(AlacEncodeContext *s)
> >> >> >> > > +{
> >> >> >> > > +    int32_t *left = s->sample_buf[0], *right = s->sample_buf[1];
> >> >> >> > > +    int32_t tmp;
> >> >> >> > > +    int i;
> >> >> >> > > +
> >> >> >> > > +    for(i=0; i<s->avctx->frame_size; i++) {
> >> >> >> > > +        tmp = left[i];
> >> >> >> > > +        left[i] = (tmp + right[i]) >> 1;
> >> >> >> > > +        right[i] = tmp - right[i];
> >> >> >> > > +    }
> >> >> >> > >
> >> >> >> > > +    s->interlacing_leftweight = 1;
> >> >> >> > > +    s->interlacing_shift = 1;
> >> >> >> >
> >> >> >> > i do not believe this is optimal
> >> >> >> >
> >> >> >>
> >> >> >> It may not be optimal in the sense that I do not adaptively select the
> >> >> >> decorrelation scheme, but this is just the first iteration which aims at
> >> >> >> getting a basic encoder into svn. And it is better than doing no
> >> >> >> decorrelation. I did initially try out an adaptive approach but the difference
> >> >> >> in compression wasn't that great. I'm looking into how this can be done in a
> >> >> >> better manner. Till then, I was hoping we could go with this.
> >> >> >
> >> >> > see the pca.c/h i posted in a reply to ramiro a few days ago
> >> >> > it might be worth a try ...
> >> >>
> >> >> Speaking of that... I haven't finished integrating it in MLP (I'm
> >> >> working on some other stuff atm), but it seems to be what I need.
> >> >
> >> >> Could you get it cleaned up and committed like you suggested?
> >> >
> >> > done
> >>
> >> I can get it working in my tests, but not in MLP =(
> >>
> >> I take something like this (I'll name it here s[channels][samples]):
> >>             [samples]
> >> [channel 0] 0 1 2 3 4 5 6 7 8
> >> [channel 1] 0 1 2 3 4 5 6 7 8
> >> [noise   0] 0 1 2 3 4 5 6 7 8
> >> [noise   1] 0 1 2 3 4 5 6 7 8
> >>
> >> Pass it through the pca, <num_channels> samples at a time
> >> pca_add(s[0][0], s[1][0], s[2][0], s[3][0])
> >> pca_add(s[0][1], s[1][1], s[2][1], s[3][1])
> >> ...
> >> pca_add(s[0][x], s[1][x], s[2][x], s[3][x])
> >>
> >> Solve the pca and get the eigenvectors (I'll name it here as e[][])
> >>
> >> Multiply them both to a new buffer (s2[channels][samples])
> >> s2[0][0] = s[0][0] * e[0][0] + s[1][0] * e[0][1] +
> >>            s[2][0] * e[0][2] + s[3][0] * e[0][3]
> >> s2[0][1] = s[0][1] * e[0][0] + s[1][1] * e[0][1] +
> >>            s[2][1] * e[0][2] + s[3][1] * e[0][3]
> >> ...
> >> s2[0][x] = s[0][x] * e[0][0] + s[1][x] * e[0][1] +
> >>            s[2][x] * e[0][2] + s[3][x] * e[0][3]
> >>
> >> s2[1][0] = s[0][0] * e[1][0] + s[1][0] * e[1][1] +
> >>            s[2][0] * e[1][2] + s[3][0] * e[1][3]
> >> s2[1][1] = s[0][1] * e[1][0] + s[1][1] * e[1][1] +
> >>            s[2][1] * e[1][2] + s[3][1] * e[1][3]
> >> ...
> >> s2[1][x] = s[0][x] * e[1][0] + s[1][x] * e[1][1] +
> >>            s[2][x] * e[1][2] + s[3][x] * e[1][3]
> >>
> >> In this case I'd get some values in s2[0][x], and 0 for s2[i>0][x],
> >> since all channels are equal.
> >>
> >> I then multiply again by the eigenvectors (with something transposed
> >> in the middle, I forgot which), and get s[][] again.
> >>
> >> But MLP doesn't have the intermediate s2 buffer. It overwrites s[][]
> >> directly. From my stereo samples, it usually only infers s[1][] out of
> >> s[0,2,3][] (2, 3 being noise channels). So there's only one lossless
> >> matrix for channel 1. It works like:
> >> s[1][0] = s[0][0] * e[1][0] + s[1][0] * e[1][1] +
> >>           s[2][0] * e[1][2] + s[3][0] * e[1][3]
> >> s[1][1] = s[0][1] * e[1][0] + s[1][1] * e[1][1] +
> >>           s[2][1] * e[1][2] + s[3][1] * e[1][3]
> >> ...
> >> s[1][x] = s[0][x] * e[1][0] + s[1][x] * e[1][1] +
> >>           s[2][x] * e[1][2] + s[3][x] * e[1][3]
> >>
> >> If there were more matrices, it would always use the previously
> >> overwritten values.
> >>
> >> Can I still achieve this with PCA?
> >
> > yes
> >
> > there are many things that can be tried ...
> > first is to simply just decorrelate only one channel
> > and leave the others as is, this should be easy and similar to the
> > other encoder
> >
> > s[r] = s[0]e[0]/e[r] + s[1]e[1]/e[r] + ...+ s[n]e[n]/e[r]
> >
> > e here is the eigenvector with the smallest eigenvalue
> > r is chosen so that e[r] is largest
> 
> I'm still stuck on this. I couldn't even get the first one (which
> should work like the other encoder).
> 
> If I divide e[n]/e[r], I have to use different coeff values for the
> encoder and for the decoder. The matrix coeffs in the encoder and the
> decoder aren't exactly the same I suppose...

they are, just the signs change
let me give you a concrete example

3 channels, only 1 channel is changed
encoder:    ch[0] -= A*ch[1] + B*ch[2];

decoder:    ch[0] += A*ch[1] + B*ch[2];


3 channels, 2 channels are changed
encoder:    ch[0] -= A*ch[1] + B*ch[2];
            ch[1] -= C*ch[0] + D*ch[2];

decoder:    ch[1] += C*ch[0] + D*ch[2];
            ch[0] += A*ch[1] + B*ch[2];

3 channels, 3 channels are changed
encoder:    ch[0] -= A*ch[1] + B*ch[2];
            ch[1] -= C*ch[0] + D*ch[2];
            ch[2] -= E*ch[0] + F*ch[1];

decoder:    ch[2] += E*ch[0] + F*ch[1];
            ch[1] += C*ch[0] + D*ch[2];
            ch[0] += A*ch[1] + B*ch[2];
you can trivially verify by elementary school math that the decoder reverts
what the encoder did
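A minimal sketch of the 3-channel case above, for anyone who wants to run the check instead of doing it on paper. The integer values chosen for A..F and the function names are made up for illustration; they do not come from any encoder.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical integer weights standing in for A..F; any values work. */
enum { A = 3, B = -2, C = 5, D = 1, E = -4, F = 2 };

/* Encoder: channels are changed in place, in the order 0, 1, 2. */
static void encode3(int32_t ch[3])
{
    ch[0] -= A*ch[1] + B*ch[2];
    ch[1] -= C*ch[0] + D*ch[2];   /* uses the already changed ch[0] */
    ch[2] -= E*ch[0] + F*ch[1];   /* uses the already changed ch[0], ch[1] */
}

/* Decoder: same coefficients, opposite sign, steps in reverse order. */
static void decode3(int32_t ch[3])
{
    ch[2] += E*ch[0] + F*ch[1];
    ch[1] += C*ch[0] + D*ch[2];
    ch[0] += A*ch[1] + B*ch[2];
}

/* Returns 1 if decoding the encoded triple gives back the input. */
static int roundtrip_ok(int32_t a, int32_t b, int32_t c)
{
    int32_t ch[3] = { a, b, c };
    encode3(ch);
    decode3(ch);
    return ch[0] == a && ch[1] == b && ch[2] == c;
}

/* Small self check with arbitrary sample values. */
static void self_check(void)
{
    assert(roundtrip_ok(100, -7, 42));
}
```

Each decoder step sees exactly the channel values the matching encoder step saw, which is why the order has to be reversed.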


> 
> Also in your example you give only one index for each value. what
> channel does s[n] relate to,

"s[r] = s[0]e[0]/e[r] + s[1]e[1]/e[r] + ...+ s[n]e[n]/e[r]"

s[n] is a sample from the last channel, s[0] is from the first
s[r] is from the r-th
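A rough sketch of how the formula can stay lossless with integer samples, assuming the ratios e[i]/e[r] are quantized to fixed point once and the same quantized values are used on both sides. FRAC_BITS, the function names and the truncating rounding are assumptions for illustration, not the MLP bitstream format. With the encoder written as a subtraction, q[i] would be the quantized value of -e[i]/e[r] for i != r.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 14   /* hypothetical fixed-point precision */

/* Weighted sum of all channels except r, scaled back to sample range.
 * q[r] is ignored. Division truncates toward zero; the exact rounding
 * does not matter as long as encoder and decoder use the same one. */
static int32_t predict(const int32_t *ch, const int32_t *q, int n, int r)
{
    int64_t acc = 0;
    int i;

    assert(n > 0 && r >= 0 && r < n);
    for (i = 0; i < n; i++)
        if (i != r)
            acc += (int64_t)q[i] * ch[i];
    return (int32_t)(acc / (1 << FRAC_BITS));
}

/* Encoder: only channel r changes, the others are left as is. */
static void encode_one(int32_t *ch, const int32_t *q, int n, int r)
{
    ch[r] -= predict(ch, q, n, r);
}

/* Decoder: the other channels are unchanged, so predict() returns
 * the same value and the addition undoes the subtraction exactly. */
static void decode_one(int32_t *ch, const int32_t *q, int n, int r)
{
    ch[r] += predict(ch, q, n, r);
}
```

The quantization only costs some decorrelation gain. It cannot break reversibility, because the prediction never depends on ch[r] itself.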


> and what row is e[r]?

"e here is the eigenvector with the smallest eigenvalue"
that is
ff_pca(pca, eigenvector, eigenvalue);
min= eigenvalue[0];
ei= 0;
for(i=1; i<n; i++)
    if(eigenvalue[i] < min){
        min=eigenvalue[i];
        ei= i;
    }
(maybe ei is always n-1, i don't remember, i wrote pca.c 4 years ago ...)

"r is chosen so that e[r] is largest"
max= 0;
r= 0;
for(i=0; i<n; i++)
    if(fabs(eigenvector[ei + i*n]) > max){
        max= fabs(eigenvector[ei + i*n]);
        r= i;
    }

and e[x] == eigenvector[ei + x*n]

(hopefully i made no typos ...)
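Put together, the two loops and the e[x] indexing could look like the sketch below. It takes the arrays that ff_pca() filled in as plain parameters so that it compiles on its own. The function name, the coeff[] output and the absd() helper are assumptions for illustration, and the layout eigenvector[k + x*n] is taken from the description above.

```c
#include <assert.h>

/* Absolute value without pulling in libm. */
static double absd(double x)
{
    return x < 0 ? -x : x;
}

/* Picks the eigenvector with the smallest eigenvalue, picks r as its
 * largest component, writes coeff[x] = e[x]/e[r] and returns r.
 * Assumes component x of eigenvector k is eigenvector[k + x*n] and
 * that the chosen eigenvector is nonzero, so e[r] != 0. */
static int pick_decorrelation(const double *eigenvector,
                              const double *eigenvalue,
                              int n, double *coeff)
{
    int i, ei = 0, r = 0;

    assert(n > 0);

    /* eigenvector with the smallest eigenvalue */
    for (i = 1; i < n; i++)
        if (eigenvalue[i] < eigenvalue[ei])
            ei = i;

    /* component of that eigenvector with the largest magnitude */
    for (i = 1; i < n; i++)
        if (absd(eigenvector[ei + i*n]) > absd(eigenvector[ei + r*n]))
            r = i;

    /* normalized weights, coeff[r] comes out as 1 */
    for (i = 0; i < n; i++)
        coeff[i] = eigenvector[ei + i*n] / eigenvector[ei + r*n];

    return r;
}
```

Choosing r as the largest component keeps every |coeff[x]| at or below 1, which is convenient if the weights have to be stored with limited range.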

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Observe your enemies, for they first find out your faults. -- Antisthenes