[MPlayer-dev-eng] [PATCH] mp3 decoding performance on ARM

Siarhei Siamashka siarhei.siamashka at gmail.com
Sun Aug 20 10:52:36 CEST 2006


On Monday 14 August 2006 23:59, Michael Niedermayer wrote:

> > So I guess that replacing all the other performance critical macros with
> > optimized versions,  ffmp3 could probably get much faster and closer to
> > libmad. But that's probably outside of scope of this mailing list, maybe
> > it is better to subscribe to ffmpeg mailing list and submit any ffmpeg
> > related patches there.
>
> yes
> arm optimizations are very welcome ...

The first patch with some minor performance improvements using ARM inline
assembler is attached.

Nokia 770, ARM926EJS 250MHz cpu
128kbit constant bitrate soundtrack, length 6:13

lowest time of 3 runs for each test:
# time ./mplayer -quiet -ac ffmp3 -ao pcm:fast:file=/dev/null test.mp3

without patch, high quality (CONFIG_MPEGAUDIO_HP enabled):
real    1m 27.27s
user    1m 25.94s
sys     0m 0.77s

without patch, low quality (CONFIG_MPEGAUDIO_HP disabled):
real    1m 15.15s
user    1m 14.31s
sys     0m 0.64s

with patch, high quality (CONFIG_MPEGAUDIO_HP enabled):
real    1m 21.09s
user    1m 20.18s
sys     0m 0.60s

with patch, low quality (CONFIG_MPEGAUDIO_HP disabled):
real    1m 10.38s
user    1m 9.42s
sys     0m 0.72s
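
To put the four runs above in relative terms, the following is a small
sketch that turns the "user" times into a percentage saved. The helper
names percent_saved and print_gains are made up for this illustration;
the numbers are the user times above converted to seconds.

```c
#include <stdio.h>

/* Percentage of time saved going from `before` to `after` seconds. */
static double percent_saved(double before, double after)
{
    return 100.0 * (before - after) / before;
}

/* Prints the relative gain for both configurations, using the "user"
   times from the runs above (1m 25.94s = 85.94s, and so on). */
static void print_gains(void)
{
    printf("high quality: %.1f%%\n", percent_saved(85.94, 80.18));
    printf("low quality:  %.1f%%\n", percent_saved(74.31, 69.42));
}
```

Calling print_gains() reports a gain of roughly 6.7% with
CONFIG_MPEGAUDIO_HP enabled and roughly 6.6% with it disabled.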

I also verified that the decoded wav files are identical with and
without the patch on ARM. I assume that the result of the MULL macro
should always fit in 32 bits; is that true?
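
To illustrate why the 32-bit question matters, here is a minimal sketch
in portable C. It assumes that MULL is a fixed-point multiply of the
form (a * b) >> FRAC_BITS; the value of FRAC_BITS and the function names
mull_ref, mull_halves and mull_fits are assumptions made for this
example and are not taken from the attached patch. An assembler version
that keeps only 32 bits of the shifted product matches the 64-bit
reference only as long as the shifted product is representable in
32 bits.

```c
#include <stdint.h>

/* Assumed number of fractional bits (hypothetical value for this sketch). */
#define FRAC_BITS 23

/* Reference: full 64-bit product, shifted, then narrowed to 32 bits.
   Relies on arithmetic right shift and wrapping narrowing, as gcc does. */
static int32_t mull_ref(int32_t a, int32_t b)
{
    int64_t wide = ((int64_t)a * (int64_t)b) >> FRAC_BITS;
    return (int32_t)wide;
}

/* Same value assembled from the two 32-bit halves of the product, which
   is how a version built on a 32x32->64 multiply could combine them. */
static int32_t mull_halves(int32_t a, int32_t b)
{
    int64_t prod = (int64_t)a * (int64_t)b;
    uint32_t lo = (uint32_t)prod;
    uint32_t hi = (uint32_t)((uint64_t)prod >> 32);
    return (int32_t)((lo >> FRAC_BITS) | (hi << (32 - FRAC_BITS)));
}

/* Returns 1 when the shifted product is representable in 32 bits,
   i.e. when nothing is lost by keeping only a 32-bit result. */
static int mull_fits(int32_t a, int32_t b)
{
    int64_t wide = ((int64_t)a * (int64_t)b) >> FRAC_BITS;
    return wide >= INT32_MIN && wide <= INT32_MAX;
}
```

For example, with 1.0 and 1.5 in this format (1 << 23 and 3 << 22),
both mull_ref and mull_halves return 3 << 22, and mull_fits reports 1.
For two operands near INT32_MAX, mull_fits reports 0.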

There also seems to be a minor portability problem in this decoder:
ARM and x86 builds produce different results when decoding mp3 with
ffmp3 (this problem is not related to my patch). Both wav files sound
fine when played, though.
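
One way to quantify how far two decoded outputs differ is to compare
them sample by sample. The sketch below assumes 16-bit PCM samples that
have already been loaded into memory with the wav headers skipped; the
function name max_abs_diff is hypothetical and not part of mplayer or
ffmpeg.

```c
#include <stddef.h>
#include <stdint.h>

/* Largest absolute per-sample difference between two 16-bit PCM buffers
   of n samples each; 0 means the buffers are identical. */
static int max_abs_diff(const int16_t *a, const int16_t *b, size_t n)
{
    int worst = 0;
    for (size_t i = 0; i < n; i++) {
        int d = (int)a[i] - (int)b[i];
        if (d < 0)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

A result of 0 means the two buffers are bit-identical; a small non-zero
value would point to low-order differences rather than a decoding
failure, though this check alone does not say where they come from.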

I posted the patch to this mailing list because I noticed that there
are some other ARM users here. It would be good to have them test this
patch and confirm the results.

The next improvement could be inline asm for the MACS (it does not seem
to be used, though) and MULS macros. I have a patch for them too, and it
reduces decoding time by another 5-8 seconds, but it requires the
availability of armv5 edsp instructions (mplayer currently requires only
the armv4 architecture and can be configured to use iwmmxt on intel
xscale).
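
As a rough sketch of what such helpers could look like (this is not the
patch itself), the code below assumes that MULS is a signed 16x16->32
multiply and MACS the matching multiply-accumulate. The names muls16
and macs16 are hypothetical, and the use of __ARM_FEATURE_DSP to detect
the edsp instructions is an assumption about the compiler. The smulbb
and smlabb instructions operate on the bottom 16 bits of each operand.

```c
#include <stdint.h>

#if defined(__arm__) && defined(__ARM_FEATURE_DSP)
/* ARMv5TE DSP path: one instruction per operation. */
static inline int32_t muls16(int32_t a, int32_t b)
{
    int32_t r;
    __asm__ ("smulbb %0, %1, %2" : "=r"(r) : "r"(a), "r"(b));
    return r;
}

static inline int32_t macs16(int32_t acc, int32_t a, int32_t b)
{
    __asm__ ("smlabb %0, %1, %2, %0" : "+r"(acc) : "r"(a), "r"(b));
    return acc;
}
#else
/* Portable fallback: sign-extend the low 16 bits, then multiply. */
static inline int32_t muls16(int32_t a, int32_t b)
{
    return (int32_t)(int16_t)a * (int32_t)(int16_t)b;
}

/* The accumulate wraps on overflow, like the instruction does. */
static inline int32_t macs16(int32_t acc, int32_t a, int32_t b)
{
    int32_t prod = (int32_t)(int16_t)a * (int32_t)(int16_t)b;
    return (int32_t)((uint32_t)acc + (uint32_t)prod);
}
#endif
```

Keeping the plain C fallback next to the asm version lets the same
source build for armv4 targets, where the edsp instructions are not
available.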

For more noticeable performance gains, I guess it is better to profile
mp3 decoding with valgrind (callgrind tool), find what takes the most
time, and focus on optimizing that (hoping that the performance
bottlenecks are the same on x86 and arm). At the very least, I can check
whether the code generated by the compiler is efficient in those parts.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ffmp3-arm-tweaks1.diff
Type: text/x-diff
Size: 1648 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/attachments/20060820/046549e5/attachment.diff>

