[MPlayer-dev-eng] [OT] C-code Optimization Contest

Michael Niedermayer michaelni at gmx.at
Tue Jul 15 13:35:02 CEST 2003


Hi

On Tuesday 15 July 2003 13:10, Michael Niedermayer wrote:
> Hi
>
> On Tuesday 15 July 2003 13:06, Arpi wrote:
> > Hi,
> >
> > > > > (michael's code: Cycles: 7423830  1121.179)
> > > > >
> > > > > Btw. what algorithm are you implementing or did you invent it
> > > > > yourself?
> > > >
> > > > as arpi already said, the cheater method :)
> > >
> > > ok, heres a serious try ...
> > > PS: set DIM=528
> >
> > btw, do you have an idea why changing DIM from 512 helps so lot?
>
> yes
it's an effect of the cache
caches are organised in cache lines (32 bytes for P3, larger for P4)
when an access misses the cache, a complete aligned cache line is read ...
the 5 least significant bits of the address select which of the 32 bytes of a
cache line is used (for P3), the middle bits select which set is used,
each set can only hold a very small number of different cache lines
(= set associativity) (it's 4 for P3 & P4 AFAIK) and there are only cachesize
/ (linesize * set_associativity) sets
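
To make the address split concrete, here is a minimal sketch in C. The struct and function names are made up for illustration, and it assumes the usual simple mapping (low bits = byte within the line, next bits = set); real CPUs may differ in details.

```c
#include <stddef.h>

/* Hypothetical cache description; the field names are illustrative. */
struct cache_geom {
    size_t size;   /* total cache size in bytes */
    size_t line;   /* cache line size in bytes */
    size_t ways;   /* set associativity */
};

/* number of sets = cachesize / (linesize * set_associativity) */
size_t num_sets(const struct cache_geom *g)
{
    return g->size / (g->line * g->ways);
}

/* low bits: which byte inside the cache line */
size_t line_offset(const struct cache_geom *g, size_t addr)
{
    return addr % g->line;
}

/* middle bits: which set the line containing addr maps to */
size_t set_index(const struct cache_geom *g, size_t addr)
{
    return (addr / g->line) % num_sets(g);
}
```

With example values { 16 * 1024, 32, 4 } (the 16k size is an assumption, not something stated above) this gives 128 sets; address 37 is byte 5 of a line in set 1, and address 4096 lands in set 0, the same set as address 0.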

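so if you have a 2-way set associative 8k cache, addresses that are a multiple
of cachesize / associativity = 4k apart map to the same set, and 3 bytes at
addresses 0, 4k, 8k can never be in the cache at the same time. accessing an
array in vertical order with stride = 2^n (like DIM=512) keeps hitting the same
few sets and will be quite slow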
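
The effect of the stride can be sketched by counting how many different sets a vertical walk touches. This uses the 2-way 8k example from above with 32-byte lines; the function name and the byte strides are illustrative only, since the element size behind DIM is not stated here.

```c
#include <stddef.h>
#include <string.h>

/* Example cache from the text: 8k, 2-way, 32-byte lines (assumed). */
enum {
    CACHE_SIZE = 8 * 1024,
    LINE_SIZE  = 32,
    WAYS       = 2,
    NUM_SETS   = CACHE_SIZE / (LINE_SIZE * WAYS)   /* 128 */
};

/*
 * Count the distinct cache sets touched when reading one byte per row
 * at byte offsets 0, stride, 2*stride, ... (a walk down one column).
 * Hypothetical helper, not code from the contest.
 */
size_t sets_touched(size_t stride, size_t rows)
{
    unsigned char seen[NUM_SETS];
    size_t count = 0;

    memset(seen, 0, sizeof seen);
    for (size_t i = 0; i < rows; i++) {
        size_t set = (i * stride / LINE_SIZE) % NUM_SETS;
        if (!seen[set]) {
            seen[set] = 1;
            count++;
        }
    }
    return count;
}
```

For 16 rows, a stride of 4096 bytes touches 1 set (room for only WAYS = 2 of the 16 lines), a stride of 512 touches 8 sets, and a stride of 528 touches 16 different sets, so every row gets its own set.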

[...]
-- 
Michael
level[i]= get_vlc(); i+=get_vlc();		(violates patent EP0266049)
median(mv[y-1][x], mv[y][x-1], mv[y+1][x+1]);	(violates patent #5,905,535)
buf[i]= qp - buf[i-1];				(violates patent #?)
for more examples, see http://mplayerhq.hu/~michael/patent.html
stop it, see http://petition.eurolinux.org & http://petition.ffii.org/eubsa/en


