CFLAGS benchmarks, part 2 (Re: [MPlayer-dev-eng] [PATCH] CFLAGS cleanups)

Dominik 'Rathann' Mierzejewski dominik at rangers.eu.org
Sun Jan 15 22:49:20 CET 2006


On Sunday, 15 January 2006 at 02:19, Dominik 'Rathann' Mierzejewski wrote:
> On Friday, 13 January 2006 at 23:51, Dominik 'Rathann' Mierzejewski wrote:
> > On Friday, 13 January 2006 at 23:41, Guillaume POIRIER wrote:
> > > Hi,
> > > 
> > > On 1/13/06, Dominik 'Rathann' Mierzejewski <dominik at rangers.eu.org> wrote:
> > > > This patch cleans up OPTFLAGS in Gui, removes hardcoded -Wall from
> > > > libaf's and libmenu's Makefiles and -g from main Makefile.
> > > > Now we can add -Wall to default CFLAGS and be sure it's used
> > > > consistently.
> > > 
> > > Good idea! There are actually some more flags that are now stupid to put
> > > by default, like -fomit-frame-pointer (useless with AMD64 AFAIK).
> > > 
> > > I'll add this to my todo list.
> > 
> > I think somebody should run some benchmarks with different CFLAGS on
> > various platforms and come up with the combination that produces the
> > fastest binaries. ;)
> > 
> > But that's a lot of work:
> > -O2 vs -O3
> 
> Ok, I did test the above on my Athlon:
> 
> for i in `seq 1 5`; do mplayer -benchmark -frames 1000 -fs -nosound \
> file.mp4 | grep BENCHMARK >> mplayer-<arch>-<opt>-benchmark.log ; done
> 
> The results (average from 5 runs) are as follows:
> $ benchavg.pl < mplayer-i386-O2-benchmark.log
> 5 results processed
> average  VC is: 20.315s
> average  VO is: 4.472s
> average SYS is: 0.293s
> average tot is: 25.079s
> $ benchavg.pl < mplayer-i386-O3-benchmark.log
> 5 results processed
> average  VC is: 19.692s
> average  VO is: 4.382s
> average SYS is: 0.280s
> average tot is: 24.353s
> $ benchavg.pl < mplayer-athlon-O2-benchmark.log
> 5 results processed
> average  VC is: 19.650s
> average  VO is: 4.493s
> average SYS is: 0.284s
> average tot is: 24.426s
> $ benchavg.pl < mplayer-athlon-O3-benchmark.log
> 5 results processed
> average  VC is: 19.451s
> average  VO is: 4.387s
> average SYS is: 0.278s
> average tot is: 24.115s
> 
> i386 binary was compiled with runtime detection and -march=i386
> -mtune=pentium4 (the default OPTFLAGS in Fedora). Yes, I know it's
> a stupid default. -O3 is 3% faster in this case.
> athlon binary was compiled with -march=athlon. -O3 is 1.3% faster.
> 
> Next up: more CFLAGS tests

Small clarification: -O3 above and below is actually "-O3 -ffast-math".

Here they are:
Compiled with -O3 -ftree-loop-linear -ftree-loop-im -ftree-loop-ivcanon -fivopts
$ benchavg.pl < mplayer-athlon-O3+-benchmark.log
5 results processed
average  VC is: 19.474s
average  VO is: 4.387s
average SYS is: 0.280s
average tot is: 24.142s

Athlon-optimized binary is actually slightly (0.1%) slower...

$ benchavg.pl < mplayer-i386-O3+-benchmark.log
5 results processed
average  VC is: 19.571s
average  VO is: 4.393s
average SYS is: 0.283s
average tot is: 24.248s

... while generic i386 binary is still a bit (0.4%) faster.
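
For reference, the percentages quoted in this thread can be re-derived from the average totals. The sketch below is only an illustration: the dictionary keys and the percent_faster() helper are names made up here, and the numbers are copied from the benchavg.pl output above.

```python
# Average total times in seconds, copied from the benchavg.pl output in this
# thread. "O3+" stands for -O3 plus the extra -ftree-loop-*/-fivopts flags.
totals = {
    "i386-O2": 25.079,
    "i386-O3": 24.353,
    "i386-O3+": 24.248,
    "athlon-O2": 24.426,
    "athlon-O3": 24.115,
    "athlon-O3+": 24.142,
}


def percent_faster(baseline, candidate):
    """Percent reduction in total time relative to baseline.

    A negative result means the candidate build is slower.
    (Hypothetical helper, not part of MPlayer or benchavg.pl.)
    """
    return (baseline - candidate) / baseline * 100.0


for arch in ("i386", "athlon"):
    o2, o3, o3_plus = (totals[f"{arch}-{opt}"] for opt in ("O2", "O3", "O3+"))
    print(
        f"{arch}: -O3 vs -O2 {percent_faster(o2, o3):+.1f}%, "
        f"extra loop flags vs -O3 {percent_faster(o3, o3_plus):+.1f}%"
    )
```

All of these differences are relative to a total of roughly 24 to 25 seconds, so the sub-1% ones amount to well under a tenth of a second.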

Conclusion for now: keep our OPTFLAGS as they are (GCC has no separate -O4
level and treats it as -O3, so they're equivalent to -O3 -ffast-math).

It remains to be seen if -ftree-vectorize does any good.
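
For anyone repeating these runs with other flags, here is a rough sketch of the kind of averaging benchavg.pl does. The script itself is not included in this thread, so this is not its actual code. It ASSUMES each run logs a line of the form "BENCHMARKs: VC: ...s VO: ...s A: ...s Sys: ...s = ...s"; adjust the pattern if your log lines look different. The function names are invented for this example.

```python
import re

# ASSUMPTION: each benchmark run leaves one line shaped like
#   BENCHMARKs: VC:  20.315s VO:   4.472s A:   0.000s Sys:   0.293s =   25.079s
# Lines that do not match (for example a percentage summary) are skipped.
LINE_RE = re.compile(
    r"BENCHMARKs:\s*VC:\s*([\d.]+)s\s+VO:\s*([\d.]+)s\s+A:\s*([\d.]+)s"
    r"\s+Sys:\s*([\d.]+)s\s+=\s*([\d.]+)s"
)


def average_benchmarks(lines):
    """Return (count, averages) over all matching lines.

    averages maps "VC", "VO", "SYS" and "tot" to mean times in seconds.
    """
    sums = {"VC": 0.0, "VO": 0.0, "SYS": 0.0, "tot": 0.0}
    count = 0
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        vc, vo, _audio, sys_time, total = map(float, match.groups())
        sums["VC"] += vc
        sums["VO"] += vo
        sums["SYS"] += sys_time
        sums["tot"] += total
        count += 1
    if count == 0:
        raise ValueError("no BENCHMARKs lines found")
    return count, {key: value / count for key, value in sums.items()}


def report(lines):
    """Format the averages in the same layout as the output quoted above."""
    count, avg = average_benchmarks(lines)
    rows = [f"{count} results processed"]
    for key in ("VC", "VO", "SYS", "tot"):
        rows.append(f"average {key:>3} is: {avg[key]:.3f}s")
    return "\n".join(rows)
```

To use it on a log file, something like print(report(open("mplayer-athlon-O3-benchmark.log"))) should do; passing sys.stdin instead mimics the "benchavg.pl < logfile" usage.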

Regards,
R.

-- 
MPlayer RPMs maintainer: http://rpm.greysector.net/mplayer/
"I am Grey. I stand between the candle and the star. We are Grey.
 We stand between the darkness ... and the light."
        -- Delenn in Grey Council in Babylon 5:"Babylon Squared"



