[MPlayer-dev-eng] Remove internal mp3lib forked copy

Thomas Orgis thomas-forum at orgis.org
Sun Mar 25 18:32:59 CEST 2012


On Thu, 22 Mar 2012 09:43:14 +0100,
Thomas Orgis <thomas-forum at orgis.org> wrote:

> On Fri, 16 Mar 2012 13:30:34 +0000 (UTC),
> Carl Eugen Hoyos <cehoyos at ag.or.at> wrote:
> 
> > Please provide performance results for without SSE2 (and without SSE).
> 

Well, if it is of interest, I have now run some comparisons on a Duron,
pitting plain mpg123 against the stand-alone ffmpeg program for the
task of decoding to 32 bit floating point values (ffmpeg is slower when
decoding to 16 bit):

shell$ perf stat src/mpg123 -q -s -e f32 --cpu i386 /dev/shm/convergence.mp3 > /dev/null

 Performance counter stats for 'src/mpg123 -q -s -e f32 --cpu i386 /dev/shm/convergence.mp3':

   27253.374677  task-clock-msecs         #      0.996 CPUs 
            259  context-switches         #      0.000 M/sec
              0  CPU-migrations           #      0.000 M/sec
            294  page-faults              #      0.000 M/sec
    21769623622  cycles                   #    798.786 M/sec
    26791090615  instructions             #      1.231 IPC  
     8268629165  cache-references         #    303.398 M/sec
       11587835  cache-misses             #      0.425 M/sec

   27.355488493  seconds time elapsed

shell$ perf stat ./ffmpeg -v 0 -acodec mp3float -i /dev/shm/convergence.mp3 -sample_fmt flt -f f32le - > /dev/null

 Performance counter stats for './ffmpeg -v 0 -acodec mp3float -i /dev/shm/convergence.mp3 -sample_fmt flt -f f32le -':

   29581.429380  task-clock-msecs         #      0.994 CPUs 
            300  context-switches         #      0.000 M/sec
              0  CPU-migrations           #      0.000 M/sec
            946  page-faults              #      0.000 M/sec
    23615793924  cycles                   #    798.332 M/sec
    27249074172  instructions             #      1.154 IPC  
     8039738440  cache-references         #    271.783 M/sec
       36930978  cache-misses             #      1.248 M/sec

   29.757587610  seconds time elapsed

It is rather obvious that ffmpeg does not rely on (SSE) assembly routines
here. The same holds for mpg123 when decoding to floats; when decoding
to 16 bit integers, its 3DNowExt code accomplishes the job in less than
22 seconds.
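
For anyone who wants to put the two runs above side by side, here is a
minimal sketch that derives IPC and the relative wall-clock gap. The
numbers are copied from the perf output; the dictionary layout and the
function names are only illustrative, not part of perf or of either
decoder.

```python
# Figures copied verbatim from the two perf stat runs above.
# The structure and names here are illustrative assumptions.
runs = {
    "mpg123": {
        "seconds": 27.355488493,
        "cycles": 21769623622,
        "instructions": 26791090615,
    },
    "ffmpeg": {
        "seconds": 29.757587610,
        "cycles": 23615793924,
        "instructions": 27249074172,
    },
}


def ipc(run):
    """Instructions retired per CPU cycle for one run."""
    return run["instructions"] / run["cycles"]


def extra_time_percent(slower, faster):
    """How much longer the slower run took, in percent of the faster one."""
    return (slower["seconds"] / faster["seconds"] - 1.0) * 100.0


if __name__ == "__main__":
    for name, run in runs.items():
        print(f"{name}: {run['seconds']:.2f} s elapsed, IPC {ipc(run):.3f}")
    gap = extra_time_percent(runs["ffmpeg"], runs["mpg123"])
    print(f"ffmpeg needs {gap:.1f}% more wall-clock time than mpg123")
```

This only restates a single run of each program, so it says nothing
about run-to-run variance.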

It is interesting that the 3DNowExt DCT optimizations are actually a
bit slower than the equivalent C code on this machine, so by disabling
those, I finally managed to produce a clear win for mpg123 over the old
mp3lib on my Duron test setup as well: As you can read at the end of
http://mpg123.org/beating_mp3lib_in_mplayer/ , ad_mpg123 with mpg123
trunk needs 24.7 seconds for a decoding task that takes mp3lib 25.5
seconds (not exactly the same task as for ffmpeg and mpg123 above).

The comparison against ffmp3float also turns out fine on this box when
run from within MPlayer (which gives mpg123 an apparently inevitable
extra performance hit), because mpg123 can make use of its 3DNowExt
synthesis.

I showed previously that mpg123 is faster than mp3lib on a K6-III+
(using 3DNowExt) and on Athlon 64 as well as Core 2 (using SSE).

The comparison with ffmp3float is not consistent across differing CPUs
and is a separate topic anyway.


Alrighty then,

Thomas.


PS: Before bashing the "simply crap" 3DNow / 3DNowExt code in mpg123,
consider that I was talking about the dct36 and dct64 routines here,
which are nearly identical between mpg123 and mp3lib. Also, the 3DNowExt
synth_1to1 is directly extracted from mp3lib's inline assembly.
Of course that does not mean that they could not be improved with more
care; and the synth does indeed do its job of being faster than C.