[FFmpeg-devel] [PATCH] add E-AC-3 support to AC-3 decoder

Michael Niedermayer michaelni
Mon Jul 14 04:26:05 CEST 2008


On Sun, Jul 13, 2008 at 07:46:35PM -0400, Justin Ruggles wrote:
> Michael Niedermayer wrote:
> > On Sat, Jul 12, 2008 at 01:12:41PM -0400, Justin Ruggles wrote:
> >> Hi,
> >>
> >> Michael Niedermayer wrote:
> >>> On Sat, Jun 07, 2008 at 10:30:31AM -0400, Justin Ruggles wrote:
> >>>> +        for (i = 1; i < 6; i++) {
> >>>> +            tmp += ((int64_t)idct_cos_tab[blk][i-1] * (int64_t)s->pre_mantissa[i][ch][bin]) >> 23;
> >>>> +        }
> >>>> +        s->fixed_coeffs[ch][bin] = tmp >> s->dexps[ch][bin];
> >>>> +    }
> >>>> +}
> >>> there are symmetries in the idct, this brute force solution is a little
> >>> umm ...
> >> The only solution I could come up with which takes advantage of some
> >> symmetry is still brute-force for the first 3 blocks, but essentially
> >> just copies the data with a sign flip for odd index values in the second
> >> 3 blocks.  This increases the speed of the function by 40% and overall
> >> E-AC3 decoding by 7% when AHT is used.  Is that adequate?
> > 
> > The transformation matrix is:
> > 
> > 1.000000  1.366025  1.224745  1.000000  0.707107  0.366025 
> > 1.000000  1.000000  0.000000 -1.000000 -1.414214 -1.000000 
> > 1.000000  0.366025 -1.224745 -1.000000  0.707107  1.366025 
> > 1.000000 -0.366025 -1.224745  1.000000  0.707107 -1.366025 
> > 1.000000 -1.000000 -0.000000  1.000000 -1.414214  1.000000 
> > 1.000000 -1.366025  1.224745 -1.000000  0.707107 -0.366025 
> > 
> > after one pass of butterflies that's just
> > 1.000000 0.000000  1.224745  0.000000  0.707107  0.000000 
> > 1.000000 0.000000 -0.000000  0.000000 -1.414214  0.000000 
> > 1.000000 0.000000 -1.224745  0.000000  0.707107  0.000000 
> > 0.000000 0.366025  0.000000 -1.000000  0.000000  1.366025 
> > 0.000000 1.000000  0.000000 -1.000000  0.000000 -1.000000 
> > 0.000000 1.366025  0.000000  1.000000  0.000000  0.366025 
> > 
> > Even a lame brute-force implementation of that should be more than 40%
> > faster, because the coefficients are simple and repeated.
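
The factorization quoted above can be sketched in C roughly as follows. This
is only an illustration under stated assumptions: it uses doubles rather than
the patch's fixed-point arithmetic, the names idct6_brute, idct6_butterfly and
idct6_check are hypothetical and not taken from the patch, and the reference
table is simply the 6x6 matrix copied from the mail at its printed precision.
The top three rows of the second matrix become the even part, the bottom three
rows (read in reverse order) become the odd part, and one butterfly pass
combines them.

```c
#include <assert.h>

/* sqrt(2)*cos(k*pi/12), the distinct values in the quoted matrix */
#define IDCT6_C1 1.3660254037844386  /* k = 1, (sqrt(3) + 1) / 2 */
#define IDCT6_C2 1.2247448713915890  /* k = 2, sqrt(3/2)         */
#define IDCT6_C4 0.7071067811865476  /* k = 4, sqrt(1/2)         */
#define IDCT6_C5 0.3660254037844386  /* k = 5, (sqrt(3) - 1) / 2 */

/* The full matrix as printed in the mail (6 decimal places). */
static const double idct6_matrix[6][6] = {
    {1.000000,  1.366025,  1.224745,  1.000000,  0.707107,  0.366025},
    {1.000000,  1.000000,  0.000000, -1.000000, -1.414214, -1.000000},
    {1.000000,  0.366025, -1.224745, -1.000000,  0.707107,  1.366025},
    {1.000000, -0.366025, -1.224745,  1.000000,  0.707107, -1.366025},
    {1.000000, -1.000000,  0.000000,  1.000000, -1.414214,  1.000000},
    {1.000000, -1.366025,  1.224745, -1.000000,  0.707107, -0.366025},
};

/* Reference: plain matrix-vector product, one multiply per matrix entry. */
static void idct6_brute(const double x[6], double y[6])
{
    for (int n = 0; n < 6; n++) {
        double sum = 0.0;
        for (int k = 0; k < 6; k++)
            sum += idct6_matrix[n][k] * x[k];
        y[n] = sum;
    }
}

/* Factored form: 3-point even part, 3-point odd part, one butterfly pass. */
static void idct6_butterfly(const double x[6], double y[6])
{
    /* even columns (x0, x2, x4) */
    double a  = x[0] + IDCT6_C4 * x[4];
    double b  = IDCT6_C2 * x[2];
    double e0 = a + b;
    double e1 = x[0] - 2.0 * IDCT6_C4 * x[4];  /* 1.414214 = 2 * C4 */
    double e2 = a - b;

    /* odd columns (x1, x3, x5) */
    double o0 = IDCT6_C1 * x[1] + x[3] + IDCT6_C5 * x[5];
    double o1 =            x[1] - x[3] -            x[5];
    double o2 = IDCT6_C5 * x[1] - x[3] + IDCT6_C1 * x[5];

    /* row 5-n of the full matrix is row n with the odd columns negated */
    y[0] = e0 + o0;  y[5] = e0 - o0;
    y[1] = e1 + o1;  y[4] = e1 - o1;
    y[2] = e2 + o2;  y[3] = e2 - o2;
}

/* Compare both versions on the six unit vectors (enough by linearity). */
static void idct6_check(void)
{
    for (int k = 0; k < 6; k++) {
        double x[6] = {0}, ref[6], fast[6];
        x[k] = 1.0;
        idct6_brute(x, ref);
        idct6_butterfly(x, fast);
        for (int n = 0; n < 6; n++) {
            double d = fast[n] - ref[n];
            assert(d < 1e-5 && d > -1e-5);  /* table has only 6 decimals */
        }
    }
}
```

The tolerance in idct6_check is loose only because the reference table is
rounded to six decimals; the factored version itself uses full-precision
constants.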
> 
> This is my first time trying to break down a DCT.  I found some basic
> info on how to do it and made a flow chart in OpenOffice.  This is the
> forward transform.  Now I need to figure out how to do the inverse.  

It's very easy to convert these from forward to inverse ...


> I
> just wanted to post this to make sure I'm on the right path to figuring
> out a good solution.
> 
> http://justin.ruggles.googlepages.com/dct6_diagram.pdf

Your flow chart is messy :)

X0        \    / \ / \/ X0
X1   \  /  \  /   X  /\ X4
X2\/  \/    \/   / \    X2
X3/\  /\    /\
X4   /  \  /  \  \/ [more stuff here]
X5        /    \ /\

looks nicer.
I also suspect that the odd side can be done with fewer temporaries,
but anyway, none of this is that important considering the amount of time
you have apparently spent on it ...
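
As a rough illustration of the diagram and of the forward-to-inverse remark,
here is a sketch in C. It rests on assumptions not spelled out in the thread:
the 6x6 matrix M quoted earlier satisfies M^T * M = 6 * I, so the forward
transform is taken to be M^T / 6 (the normalization is a choice made here),
and the inverse is the same flow graph run backwards. The function names
are hypothetical, and doubles are used instead of the patch's fixed point.
The odd half uses the identity 1.366025 = 0.366025 + 1 to share one product,
which is one possible way to trim it and not necessarily the reduction
hinted at above.

```c
#include <assert.h>

#define DCT6_C2 1.2247448713915890  /* sqrt(3/2)         */
#define DCT6_C4 0.7071067811865476  /* sqrt(1/2)         */
#define DCT6_C5 0.3660254037844386  /* (sqrt(3) - 1) / 2 */

/* Forward: butterflies first, then the 3-point even and odd halves. */
static void dct6_forward(const double y[6], double X[6])
{
    double s0 = y[0] + y[5], d0 = y[0] - y[5];
    double s1 = y[1] + y[4], d1 = y[1] - y[4];
    double s2 = y[2] + y[3], d2 = y[2] - y[3];
    double t  = DCT6_C5 * (d0 + d2);  /* shared by X[1] and X[5] */

    X[0] = (s0 + s1 + s2) / 6.0;
    X[2] = DCT6_C2 * (s0 - s2) / 6.0;
    X[4] = DCT6_C4 * (s0 + s2 - 2.0 * s1) / 6.0;
    X[1] = (t + d0 + d1) / 6.0;
    X[3] = (d0 - d1 - d2) / 6.0;
    X[5] = (t + d2 - d1) / 6.0;
}

/* Inverse: the transposed graph, so the butterflies come last. */
static void dct6_inverse(const double X[6], double y[6])
{
    double a  = X[0] + DCT6_C4 * X[4];
    double b  = DCT6_C2 * X[2];
    double e0 = a + b;
    double e1 = X[0] - 2.0 * DCT6_C4 * X[4];
    double e2 = a - b;

    double t  = DCT6_C5 * (X[1] + X[5]);  /* shared by o0 and o2 */
    double o0 = t + X[1] + X[3];
    double o1 = X[1] - X[3] - X[5];
    double o2 = t + X[5] - X[3];

    y[0] = e0 + o0;  y[5] = e0 - o0;
    y[1] = e1 + o1;  y[4] = e1 - o1;
    y[2] = e2 + o2;  y[3] = e2 - o2;
}

/* Forward followed by inverse should reproduce the input. */
static void dct6_roundtrip_check(void)
{
    const double in[6] = {3.0, -1.0, 4.0, 1.0, -5.0, 9.0};
    double X[6], out[6];

    dct6_forward(in, X);
    dct6_inverse(X, out);
    for (int i = 0; i < 6; i++) {
        double d = out[i] - in[i];
        assert(d < 1e-9 && d > -1e-9);
    }
}
```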

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

it is not once nor twice but times without number that the same ideas make
their appearance in the world. -- Aristotle