[MPlayer-dev-eng] Compiling MPlayer with GCC profile-based optimisations

Adam Rice adamrice at ntlworld.com
Wed Dec 17 00:03:02 CET 2003


I spent some time investigating whether using the profile-based optimisation
included in GCC 3.3 and later would help MPlayer performance. The short answer
is no. For the longer answer, read on :-)

The idea of profile-based optimisation is to use information collected during
runtime to help the compiler produce better code. Specifically, GCC can
improve branch prediction by counting which branch is taken more often in
practice. It can also place blocks of code that are executed in sequence next
to each other, and re-order functions so that hotspots sit together, which
improves virtual memory and processor cache usage.

It is quite simple to use in practice. Compile the program with normal
optimisation arguments plus -fprofile-arcs. The compiler adds code to count
the number of times each basic block is executed. When the resulting binary is
run, it generates .da files at exit. One .da file is created for each .c
source file, with the same name and directory. If .da files from a previous
run exist, they are merged.
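
As a rough mental model of what the counters are for, here is a minimal Python
sketch. It is not the .da file format and not GCC's code; the counter names
and the counts are made up purely for illustration. It sums the counts from
two runs and derives the probability that a branch is taken.

```python
from collections import Counter

def merge_counts(*runs):
    """Sum per-branch execution counts across several profiling runs."""
    total = Counter()
    for run in runs:
        total.update(run)  # Counter.update adds counts, it does not overwrite
    return total

def branch_probability(counts, taken, not_taken):
    """Fraction of executions in which the 'taken' side was followed."""
    n = counts[taken] + counts[not_taken]
    return counts[taken] / n if n else None

# Hypothetical counts for one 'if' statement in two separate runs.
run1 = {"if_taken": 900, "if_not_taken": 100}
run2 = {"if_taken": 800, "if_not_taken": 200}

merged = merge_counts(run1, run2)
p = branch_probability(merged, "if_taken", "if_not_taken")
print(f"taken {merged['if_taken']} of {sum(merged.values())} times, p = {p:.2f}")
```

The merged totals are only as good as the runs that produced them, which is
why the profiling runs need to resemble normal use.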

In order for the data to be meaningful, it needs to be collected during
"normal" use of the program. I played a bunch of .avi files and a DVD to make
use of different codecs, seeked around and ran the GUI. One time I played a
long .avi file, then realised I'd just played it with my installed copy of
mplayer instead of the profiling version :-)

Once the .da files are created, the program is recompiled with
-fbranch-probabilities (plus the normal optimisation options) to create a
profile-optimised binary. It doesn't need to be told where to find the .da
files; it picks them up automatically. I also added the -ftracer option, which
performs tail duplication to enlarge superblocks so that later optimisation
passes have simpler control flow to work with.
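
The two compile passes described above can be sketched as a small Python
helper that only builds and prints the command lines. The file name player.c,
the output names and the -O2 base flag are assumptions for illustration. A
real MPlayer build goes through configure and make, so the flags would be
passed in through the build system rather than on a single gcc line.

```python
import shlex

# Assumption: stand-in for whatever optimisation flags the project normally uses.
BASE_FLAGS = ["-O2"]

def pgo_commands(sources, output, base_flags=BASE_FLAGS):
    """Return (instrumented_build, optimised_build) gcc command lines."""
    common = ["gcc", *base_flags]
    # Pass 1: add counting code; running this binary writes the .da files.
    instrumented = [*common, "-fprofile-arcs", *sources,
                    "-o", output + "-profiling"]
    # Pass 2: recompile using the collected counts.
    optimised = [*common, "-fbranch-probabilities", "-ftracer", *sources,
                 "-o", output]
    return instrumented, optimised

instrumented, optimised = pgo_commands(["player.c"], "player")
print(shlex.join(instrumented))
print("# ... run ./player-profiling on typical input here ...")
print(shlex.join(optimised))
```

Giving the instrumented binary a distinct name is a design choice in this
sketch: it makes it harder to collect "profile data" by accidentally running
an installed, non-instrumented copy.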

To test the performance of the profile-optimised binary, I did a benchmark run
with an mplayer compiled with the usual options, and then benchmarked the same
file with the profile-optimised binary. The results did not show a marked
improvement. For the sake of completeness, here are the before-and-after
results:

Before optimisation:

BENCHMARKs: VC:  10.388s VO:   5.640s A:   0.000s Sys:   0.751s =   16.779s
BENCHMARK%: VC: 61.9109% VO: 33.6137% A:  0.0000% Sys:  4.4754% = 100.0000%

10.700u 0.410s 0:17.22 64.5%    0+0k 0+0io 2338pf+0w

After:

BENCHMARKs: VC:  11.176s VO:   2.194s A:   0.000s Sys:   0.883s =   14.254s
BENCHMARK%: VC: 78.4087% VO: 15.3951% A:  0.0000% Sys:  6.1961% = 100.0000%

10.070u 0.330s 0:15.04 69.1%    0+0k 0+0io 2301pf+0w

Although it appears that VO performance may have improved, the numbers were
quite variable between runs, so the difference may not be significant. I made
the mistake of doing only one benchmark with the "before" version, so my
numbers are not much use. I should also have compiled the "before" version
with -ftracer, as that might have changed the results. But it is clear that
even if there is an improvement, it is marginal at best. This is not too
surprising, as the critical portions of MPlayer are already heavily optimised.
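
For anyone who wants the differences as percentages, the following Python
sketch does the arithmetic on the BENCHMARKs lines quoted above. The figures
are copied from those two single runs; the helper name percent_change is just
for illustration. Since each figure comes from one run, the output says
nothing about whether a difference is larger than the run-to-run noise.

```python
# Seconds reported by the two -benchmark runs quoted in this mail.
before = {"VC": 10.388, "VO": 5.640, "Sys": 0.751, "total": 16.779}
after = {"VC": 11.176, "VO": 2.194, "Sys": 0.883, "total": 14.254}

def percent_change(old, new):
    """Relative change from old to new, in percent (negative means faster)."""
    return (new - old) / old * 100.0

changes = {key: percent_change(before[key], after[key]) for key in before}

for key, change in changes.items():
    print(f"{key:>5}: {before[key]:7.3f}s -> {after[key]:7.3f}s ({change:+.1f}%)")
```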

These features of GCC seem very immature. A couple of times it corrupted its
.da files (maybe it's not thread-safe?), and with -fprofile-arcs mp3lib
produced errors, squeaks and pops on one file that played perfectly with both
the before and after versions. It may be worth re-examining the issue with a
future release of GCC.

It may also be worth investigating whether the optimisation options mplayer is
currently using are actually optimal. There's an interesting paper at
http://www.coyotegulch.com/acovea/ on using genetic algorithms to find optimal
compiler options. Although the technique is not directly applicable to
mplayer, some of the conclusions are insightful.

Adam Rice

-- 
Adam Rice -- adamrice at ntlworld.com -- Blackburn, Lancashire, England


