[MPlayer-dev-eng] Remove internal mp3lib forked copy
Thomas Orgis
thomas-forum at orgis.org
Sun Mar 25 18:32:59 CEST 2012
On Thu, 22 Mar 2012 09:43:14 +0100,
Thomas Orgis <thomas-forum at orgis.org> wrote:
> On Fri, 16 Mar 2012 13:30:34 +0000 (UTC),
> Carl Eugen Hoyos <cehoyos at ag.or.at> wrote:
>
> > Please provide performance results for without SSE2 (and without SSE).
>
Well, if it is of interest, I have now run some comparisons on a Duron,
pitting plain mpg123 against the stand-alone ffmpeg program for the
task of decoding to 32-bit floating point values (ffmpeg is slower when
decoding to 16 bit):
shell$ perf stat src/mpg123 -q -s -e f32 --cpu i386 /dev/shm/convergence.mp3 > /dev/null
Performance counter stats for 'src/mpg123 -q -s -e f32 --cpu i386 /dev/shm/convergence.mp3':
27253.374677 task-clock-msecs # 0.996 CPUs
259 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
294 page-faults # 0.000 M/sec
21769623622 cycles # 798.786 M/sec
26791090615 instructions # 1.231 IPC
8268629165 cache-references # 303.398 M/sec
11587835 cache-misses # 0.425 M/sec
27.355488493 seconds time elapsed
shell$ perf stat ./ffmpeg -v 0 -acodec mp3float -i /dev/shm/convergence.mp3 -sample_fmt flt -f f32le - > /dev/null
Performance counter stats for './ffmpeg -v 0 -acodec mp3float -i /dev/shm/convergence.mp3 -sample_fmt flt -f f32le -':
29581.429380 task-clock-msecs # 0.994 CPUs
300 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
946 page-faults # 0.000 M/sec
23615793924 cycles # 798.332 M/sec
27249074172 instructions # 1.154 IPC
8039738440 cache-references # 271.783 M/sec
36930978 cache-misses # 1.248 M/sec
29.757587610 seconds time elapsed
It is rather obvious that ffmpeg does not rely on (SSE) assembly routines
here. The same holds for mpg123 when decoding to floats; when decoding
to 16-bit integers, the 3DNowExt code accomplishes the job in less than
22 seconds.
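
For readers who want to compare the two runs without doing the division
by hand, here is a minimal sketch that recomputes IPC and the relative
wall-clock time from the perf stat figures quoted above. The dictionary
layout and the helper names (ipc, slowdown) are made up for this
illustration; only the numbers come from the output in this mail.

```python
# Figures copied from the two perf stat runs quoted in this mail.
# The dict layout and helper names are illustrative assumptions.
runs = {
    "mpg123 --cpu i386": {
        "elapsed_s": 27.355488493,
        "cycles": 21769623622,
        "instructions": 26791090615,
        "cache_misses": 11587835,
    },
    "ffmpeg mp3float": {
        "elapsed_s": 29.757587610,
        "cycles": 23615793924,
        "instructions": 27249074172,
        "cache_misses": 36930978,
    },
}


def ipc(run):
    """Instructions retired per CPU cycle."""
    return run["instructions"] / run["cycles"]


def slowdown(run, baseline):
    """Elapsed time of `run` relative to `baseline` (1.0 means equal)."""
    return run["elapsed_s"] / baseline["elapsed_s"]


mpg123 = runs["mpg123 --cpu i386"]
ffmpeg = runs["ffmpeg mp3float"]

for name, run in runs.items():
    print(f"{name}: {run['elapsed_s']:.2f} s, IPC {ipc(run):.3f}")

ratio = slowdown(ffmpeg, mpg123)
print(f"ffmpeg needs {100 * (ratio - 1):.1f}% more wall-clock time than mpg123")
```

The IPC values it prints should match the ones perf reports itself
(1.231 and 1.154), which is a quick check that the numbers were copied
correctly.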
It is interesting that the 3DNowExt DCT optimizations are actually a
bit slower than the equivalent C code on this machine, so by disabling
those, I finally managed to produce a clear win for mpg123 over the old
mp3lib on my Duron test setup as well: As you can read at the end of
http://mpg123.org/beating_mp3lib_in_mplayer/ , ad_mpg123 with mpg123
trunk needs 24.7 seconds for a decoding task that takes mp3lib 25.5
seconds (not exactly the same task as for ffmpeg and
mpg123 above).
The comparison against ffmp3float is also fine on this box when used
from MPlayer (which gives mpg123 an apparently inevitable extra
performance hit), because mpg123 can bring its 3DNowExt synthesis into play.
I showed previously that mpg123 is faster than mp3lib on a K6-III+
(using 3DNowExt) and on Athlon 64 as well as Core 2 (using SSE).
The comparison with ffmp3float is not consistent across differing CPUs
and is a separate topic anyway.
Alrighty then,
Thomas.
PS: Before bashing the "simply crap" 3DNow / 3DNowExt code in mpg123,
consider that I was talking here about the dct36 and dct64 routines, which
are nearly identical between mpg123 and mp3lib. Also, the 3DNowExt
synth_1to1 is directly extracted from mp3lib's inline assembly.
Of course that does not mean that they could not be improved with more
care; and the synth does indeed do its job of being faster than C.