[FFmpeg-devel] [PATCH] AAC: unroll parts of decode_spectrum_and_dequant()

Michael Niedermayer michaelni
Tue Dec 9 13:08:36 CET 2008


On Mon, Dec 08, 2008 at 08:04:10PM -0800, Jason Garrett-Glaser wrote:
> On Mon, Dec 8, 2008 at 7:58 PM, Jason Garrett-Glaser
> <darkshikari at gmail.com> wrote:
> > On Mon, Dec 8, 2008 at 7:34 PM, Alex Converse <alex.converse at gmail.com> wrote:
> >> On Mon, Dec 8, 2008 at 9:33 PM, Jason Garrett-Glaser
> >> <darkshikari at gmail.com> wrote:
> >>
> >>> On Mon, Dec 8, 2008 at 3:43 PM, Alex Converse <alex.converse at gmail.com> wrote:
> >>> > Hi,
> >>> >
> >>> > The attached patch unrolling sections of decode spectrum saves me 5.48%
> >>> > on my mpeg4-lc-256kbps stream on my core2 duo.
> >>> >
> >>> > Regards,
> >>> > Alex Converse
> >>>
> >>> If dim can only be 2 or 4, wouldn't it be better to do
> >>>
> >>> if( dim == 4 ) {
> >>> do dim 4 stuff
> >>> }
> >>> do dim 2 stuff
> >>>
> >>> The switch seems unnecessary.
> >>>
> >>
> >> Idiomatically I like the switch better but your way is faster. When I did
> >> that I also tried reverting access back to forward order and got a slight
> >> speed up. This way made the unsigned loop just like the other three, so I
> >> added that one for another benchmark-verified speed up.
> >>
> >> The net gain is a 12% decrease in cycles over the original vs 5% before.
> >
> > if (vq_ptr[2]) coef[coef_tmp_idx + 2] = 1 - 2*(int)get_bits1(gb);
> > if (vq_ptr[3]) coef[coef_tmp_idx + 3] = 1 - 2*(int)get_bits1(gb);
> >
> > Isn't that a rather unnecessary int -> float conversion?  I'd think
> > you could do much better than that considering there are only two
> > possible input values...
> >
> > Dark Shikari
> >
> 
> Simple proposal for the above:
> 
> static const float lookup[2] = {1.0, -1.0};
> if (vq_ptr[2]) coef[coef_tmp_idx + 2] = lookup[get_bits1(gb)];


something like:
if (vq_ptr[2]) ((uint32_t*)coef)[coef_tmp_idx + 2] = (get_bits1(gb)<<31) + 0x3F800000;

might be even faster
but i agree with robert that this should be a separate patch
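
For reference, the three ways of producing +1.0 / -1.0 from a single bit
discussed in this thread can be sketched side by side as below. This is only
an illustration, not code from the patch: the helper names are hypothetical,
`bit` stands in for the value returned by get_bits1(gb), and the bit-pattern
variant assumes float is a 32-bit IEEE 754 single (0x3F800000 is the encoding
of 1.0f, and bit 31 is the sign bit).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers; `bit` must be 0 or 1, as returned by get_bits1(gb). */

/* Form used in the patch: integer arithmetic, then int -> float conversion. */
static float sign_arith(unsigned bit)
{
    return 1 - 2 * (int)bit;
}

/* Table lookup: the bit indexes a two-entry constant table. */
static float sign_lookup(unsigned bit)
{
    static const float lookup[2] = { 1.0f, -1.0f };
    return lookup[bit];
}

/* Bit pattern: start from the encoding of 1.0f and move the bit into the
 * sign position. Assumes 32-bit IEEE 754 floats. */
static float sign_bits(unsigned bit)
{
    uint32_t u = ((uint32_t)bit << 31) + 0x3F800000u;
    float f;
    memcpy(&f, &u, sizeof f); /* copy the raw bits without a pointer cast */
    return f;
}
```

The sketch copies the bits with memcpy rather than storing through a
(uint32_t*) cast as in the one-liner above; that is a choice made for the
example so that it stays well-defined under strict aliasing, and compilers
typically reduce it to a plain 32-bit store. Whether any variant is actually
faster would need to be benchmarked on the target CPU.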

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I hate to see young programmers poisoned by the kind of thinking
Ulrich Drepper puts forward since it is simply too narrow -- Roman Shaposhnik