[MPlayer-dev-eng] [OT] C-code Optimiation Contest
Michael Niedermayer
michaelni at gmx.at
Tue Jul 15 10:51:09 CEST 2003
Hi
On Tuesday 15 July 2003 04:29, Felix Buenemann wrote:
> On Tuesday 15 July 2003 03:35, Michael Niedermayer wrote:
> > Hi
> >
> > On Tuesday 15 July 2003 02:12, Arpi wrote:
> > > Hi,
> > >
> > > > if you like to have some fun, try optimizing the attached simple
> > > > matrix multiply and post your results.
> > > >
> > > > The Rules:
> > > > 1. you may only modify multiply_d.[ch] (NUM should stay 512 though)
> > > > 2. you may change compiler and optims in Makefile
> > > > 3. precision must stay the same
> > > > 4. to compare, you should compile the orginal code:
> > > > make && copy matrix matrix.org && (./matrix.org >res.org)
> > > > 5. then later you can compare results via:
> > > > make && (./matrix >res.txt) && diff -q res.org res.txt
> > > >
> > > > Have fun!
> > > >
> > > > Btw. current results by me is 738% of original speed, arpis results
> > > > are even better, as he included my tips in his alredy optimized code
> > > > =))))
> > >
> > > ok here is my (ok, our:)) contribution:
> > >
> > > to get best results (19.628 faster than original) set DIM to 516 in the
> > > .h
> >
> > hmm, only ~20x faster ...
> > i get ~1000x faster (see attached) :)))
> >
> > [...]
>
> now you know why we needed you @ LinuxTag, we would have won all coding
> contests! (That means at least 4 Notebooks =))
hmm, probably they would have dissqualified us rather ...
>
> (michael's code: Cycles: 7423830 1121.179)
>
> Btw. what algorithm are you implementing or did you invent it yourself?
as arpi already said, the cheater method :)
btw, we could also check if the input matrixes match the expected and then
switch between a matrix multiply and the simple solution for the special case
or we could check if the input is just integers and then use integer math
or use SSE2
there are also some algos which can actually solve the problem with fewer
instructions (see http://www.library.cornell.edu/nr/bookcpdf/c2-11.pdf) for
example, no i dunno if it would be practically faster
--
Michael
level[i]= get_vlc(); i+=get_vlc(); (violates patent EP0266049)
median(mv[y-1][x], mv[y][x-1], mv[y+1][x+1]); (violates patent #5,905,535)
buf[i]= qp - buf[i-1]; (violates patent #?)
for more examples, see http://mplayerhq.hu/~michael/patent.html
stop it, see http://petition.eurolinux.org & http://petition.ffii.org/eubsa/en
More information about the MPlayer-dev-eng
mailing list