[MPlayer-dev-eng] Using -O4 vs. -Os
degger at fhm.edu
Wed Oct 15 09:18:05 CEST 2003
On Tue, Oct 14, 2003 at 05:14:05PM -0500, Zoltan Hidvegi wrote:
> For the discussion about using -O4 vs. -Os, I've run some benchmarks,
> on my Athlon XP Thoroughbred 2233 MHz, 194MHz fsb machine, using
> gcc-3.3.2 prerelease (debian unstable 3.3.2-0pre5). Compile options
> for the -Os compile were -Os -march=athlon-4 -mcpu=athlon-4 -pipe
> -ffast-math -fomit-frame-pointer, and the same with -O4 instead of -Os
> for the -O4 tests. Most of the time there is not much difference
> between -O4 and -Os, -O4 is usually faster, but sometimes -Os is
> slightly faster (e.g. for the gaussian scale of denoise3d filters).
> However, for hqdn3d, -Os is 5x slower, which is very strange.
First of all: there is no real -O4 in gcc; -O3 is the highest
optimisation level, and gcc treats anything above it as -O3. I hope
you ran the tests more than just once to eliminate fluctuation; if so
you should also supply the number of runs, the mean and the standard
deviation for each set of options, as well as the performance
improvement/loss in percent.
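
To make the kind of summary I am asking for concrete, here is a rough
sketch in Python. The timings are invented purely for illustration (they
are not measurements from anyone's machine), and the helper names
summarize and percent_change are my own, not part of any MPlayer tool.

```python
import statistics


def summarize(times):
    """Return (number of runs, mean, sample standard deviation) in seconds."""
    return len(times), statistics.mean(times), statistics.stdev(times)


def percent_change(baseline_mean, other_mean):
    """Run-time change of `other` relative to `baseline`; negative means faster."""
    return (other_mean - baseline_mean) / baseline_mean * 100.0


if __name__ == "__main__":
    # Hypothetical wall-clock timings in seconds, made up for this example.
    o3_times = [10.0, 10.2, 9.8, 10.1, 9.9]
    os_times = [10.5, 10.4, 10.6, 10.7, 10.3]

    for label, times in (("-O3", o3_times), ("-Os", os_times)):
        runs, mean, dev = summarize(times)
        print(f"{label}: runs={runs} mean={mean:.3f}s stddev={dev:.3f}s")

    delta = percent_change(statistics.mean(o3_times), statistics.mean(os_times))
    print(f"-Os relative to -O3: {delta:+.1f}% run time")
```

If the difference between the two means is not clearly larger than the
deviations, the result is probably just noise.
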
-O3 runs a set of more complicated optimisations which can pay off
sometimes but typically bloat the code. You should also use proper
alignment but IIRC this is automatically implied by -mcpu=<cpu>.
-Os optimises for size, which means that it's a good cache saver;
good locality, especially in caches, can dramatically boost
performance and make software fly.