[MPlayer-dev-eng] [OT] C-code Optimization Contest

Felix Buenemann atmosfear at users.sourceforge.net
Tue Jul 15 02:24:53 CEST 2003


On Tuesday 15 July 2003 02:12, Arpi wrote:
> Hi,
>
> > if you like to have some fun, try optimizing the attached simple matrix
> > multiply and post your results.
> >
> > The Rules:
> > 1. you may only modify multiply_d.[ch] (NUM should stay 512 though)
> > 2. you may change compiler and optims in Makefile
> > 3. precision must stay the same
> > 4. to compare, you should compile the original code:
> >     make && copy matrix matrix.org && (./matrix.org >res.org)
> > 5. then later you can compare results via:
> >    make && (./matrix >res.txt) && diff -q res.org res.txt
> >
> > Have fun!
> >
> > Btw. my current result is 738% of the original speed; Arpi's results are
> > even better, as he included my tips in his already optimized code =))))
>
> ok here is my (ok, our:)) contribution:
>
> to get the best results (19.628 times faster than the original), set DIM to 516 in the .h
>
[...arpis code...]
Arpi's code seems to be very fast on the Pentium 4, but performs rather poorly
on the Pentium III (310% using icc, 279% using gcc 3.2.2). So I have attached
my fastest code so far for the P3; for it you should set DIM to 528.
(Attached multiply_d.c.good4)
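
For readers without the attachment, here is a minimal sketch of what tweaking
DIM could look like. It assumes DIM is the allocated row length (stride) while
NUM stays the logical matrix size; the real multiply_d.[ch] may define things
differently. A common reason to pad rows like this is to avoid power-of-two
strides that map many rows onto the same cache sets, but that is a guess about
this code, not something measured here. The function name and parameters are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the attachment.
 * Computes c = a * b for n x n matrices stored row-major with `dim`
 * doubles per row (dim >= n). Columns n..dim-1 of every row are padding
 * and are never read or written. */
void multiply_strided(size_t n, size_t dim,
                      const double *a, const double *b, double *c)
{
    size_t i, j, k;

    assert(dim >= n);
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            double sum = 0.0;
            /* Same summation order over k as a plain triple loop,
             * so the stride change alone does not reorder the sums. */
            for (k = 0; k < n; k++)
                sum += a[i * dim + k] * b[k * dim + j];
            c[i * dim + j] = sum;
        }
    }
}
```

With n = 512, calling it with dim = 516 or dim = 528 would correspond to the
two DIM values mentioned in this thread; which one is faster depends on the
CPU, as noted above.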

I must say this algorithm is very interesting, as the best optimization seems
to differ a lot between CPU types. Btw., if you want to try this on non-x86
CPUs, be sure to replace the rdtsc() function with a suitable equivalent.
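
One possible replacement, as a rough sketch: on a POSIX system the cycle
counter can be swapped for clock_gettime() with CLOCK_MONOTONIC. The name
read_ticks() is hypothetical, and the exact signature of rdtsc() in the
contest code is not shown in this mail, so the return type may need adjusting.
Note that this returns nanoseconds rather than CPU cycles, which is fine for
comparing runs on the same machine.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* request clock_gettime() from <time.h> */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for rdtsc() on non-x86 CPUs.
 * Returns a monotonically non-decreasing timestamp in nanoseconds,
 * or 0 if the clock could not be read. */
uint64_t read_ticks(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return 0;
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
```

Usage would be the same as with a cycle counter: take one reading before the
multiply, one after, and report the difference.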

>
> A'rpi / Astral & ESP-team
>

-- 
Best Regards,
        Atmos
____________________________________________
- MPlayer Developer - http://mplayerhq.hu/ -
____________________________________________
-------------- next part --------------
A non-text attachment was scrubbed...
Name: multiply_d.c.good4
Type: text/x-csrc
Size: 3352 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/attachments/20030715/b4b3ade6/attachment.c>