[MPlayer-dev-eng] [PATCH] mp3 decoding performance on ARM

Michael Niedermayer michaelni at gmx.at
Sun Aug 20 21:27:40 CEST 2006


Hi

On Sun, Aug 20, 2006 at 11:52:36AM +0300, Siarhei Siamashka wrote:
> On Monday 14 August 2006 23:59, Michael Niedermayer wrote:
> 
> > > So I guess that replacing all the other performance critical macros with
> > > optimized versions,  ffmp3 could probably get much faster and closer to
> > > libmad. But that's probably outside of scope of this mailing list, maybe
> > > it is better to subscribe to ffmpeg mailing list and submit any ffmpeg
> > > related patches there.
> >
> > yes
> > ARM optimizations are very welcome ...
> 
> The first patch with some minor performance improvements using ARM inline
> assembler is attached.
> 
> Nokia 770, ARM926EJS 250MHz cpu
> 128kbit constant bitrate soundtrack, length 6:13
> 
> lowest time of 3 runs for each test:
> # time ./mplayer -quiet -ac ffmp3 -ao pcm:fast:file=/dev/null test.mp3
> 
> without patch, high quality (CONFIG_MPEGAUDIO_HP enabled):
> real    1m 27.27s
> user    1m 25.94s
> sys     0m 0.77s
> 
> without patch, low quality (CONFIG_MPEGAUDIO_HP disabled):
> real    1m 15.15s
> user    1m 14.31s
> sys     0m 0.64s
> 
> with patch, high quality (CONFIG_MPEGAUDIO_HP enabled):
> real    1m 21.09s
> user    1m 20.18s
> sys     0m 0.60s
> 
> with patch, low quality (CONFIG_MPEGAUDIO_HP disabled):
> real    1m 10.38s
> user    1m 9.42s
> sys     0m 0.72s
> 
> Also verified that decompressed wav files are identical with and
> without the patch on ARM. I assume that the result of the MULL macro
> should always fit in 32 bits, is that true?

yes i think so
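
A minimal sketch of why the 32-bit question matters, assuming FRAC_BITS == 23 and a compiler where >> on negative values is an arithmetic shift and narrowing conversions wrap (both true for gcc). The names mull_ref64, mull_split and mull_fits_int32 are hypothetical and not part of the patch. mull_split models the smull/mov/add sequence in portable C: it keeps only the low 32 bits of the shifted product, so it matches the 64-bit macro only when the result fits.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 23

static_assert(FRAC_BITS > 0 && FRAC_BITS < 32, "shift must be in 1..31");

/* Reference: the original macro, keeping the full 64-bit result. */
static inline int64_t mull_ref64(int32_t a, int32_t b)
{
    return ((int64_t)a * (int64_t)b) >> FRAC_BITS;
}

/* Portable model of the asm: split the 64-bit product into two
 * 32-bit halves, then rebuild the low 32 bits of (product >> FRAC_BITS)
 * from (lo >> FRAC_BITS) + (hi << (32 - FRAC_BITS)). */
static inline int32_t mull_split(int32_t a, int32_t b)
{
    uint64_t p  = (uint64_t)((int64_t)a * (int64_t)b);
    uint32_t lo = (uint32_t)p;
    uint32_t hi = (uint32_t)(p >> 32);
    uint32_t r  = (lo >> FRAC_BITS) + (hi << (32 - FRAC_BITS));
    return (int32_t)r;
}

/* True when the shifted product is representable as int32_t, i.e.
 * when the 32-bit variant and the 64-bit macro agree. */
static inline int mull_fits_int32(int32_t a, int32_t b)
{
    int64_t r = mull_ref64(a, b);
    return r >= INT32_MIN && r <= INT32_MAX;
}
```

For example, multiplying by 1.0 (1 << FRAC_BITS) always fits, while (1 << 30) * (1 << 30) does not, so the assumption holds only if the decoder never feeds MULL operands that large.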


> 
> Also there seems to be some minor portability problem in this decoder 
> as arm and x86  builds produce different results when decoding mp3 
> using ffmp3 (and this problem is not related to my patch). But both wav 
> files play ok if you listen to them.

how different are they? +-1 differences or something significant?
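
One way to answer this, sketched under the assumption that both outputs are 16-bit PCM of the same length and that the wav headers have already been skipped. The names pcm_diff and compare_pcm16 are hypothetical. It counts the differing samples and reports the largest absolute difference, which separates rounding noise (+-1) from a real decoding difference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Summary of how two decoded sample streams differ. */
typedef struct {
    size_t differing;    /* number of samples that are not identical */
    int    max_abs_diff; /* largest absolute per-sample difference   */
} pcm_diff;

/* Compare n samples of 16-bit PCM from two decoder outputs. */
static inline pcm_diff compare_pcm16(const int16_t *a, const int16_t *b,
                                     size_t n)
{
    pcm_diff d = { 0, 0 };

    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        /* Widen to int first so the subtraction cannot overflow. */
        int diff = abs((int)a[i] - (int)b[i]);
        if (diff != 0)
            d.differing++;
        if (diff > d.max_abs_diff)
            d.max_abs_diff = diff;
    }
    return d;
}
```

A max_abs_diff of 1 would point at rounding, anything much larger at a real portability bug.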

[...]
> 
> The next improvement can be inline asm for the MACS (it does not seem to
> be used though) and MULS macros. I have a patch for them too and it
> reduces decoding time by another 5-8 seconds, but it requires the availability
> of armv5 edsp instructions (and mplayer currently requires only the armv4
> architecture + can be configured to use iwmmx on intel xscale).
> 
> For more noticeable performance optimizations I guess it is better to
> benchmark mp3 decoding with valgrind (callgrind tool), find what takes 
> the most time and focus on optimizing it (and hope that performance
> bottlenecks are the same for x86 and arm). At least I can verify whether the code
> generated by the compiler is efficient in those parts.

well, a few ideas
* change the code so that multiplies need a shift of exactly 32 bits (not easy)
* look at the *dct functions, compare against other OSS decoders and port
  the fastest to lavc (easy)
* do the same with *synth_filter()
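
One reading of the first idea, as a sketch only: if one operand (say a table coefficient) is prescaled by 2^(32 - FRAC_BITS) ahead of time, the result is simply the high word of the 32x32->64 multiply, which smull delivers directly with no extra mov/add. The names mull_ref, prescale_coeff and mulh are hypothetical, FRAC_BITS == 23 is assumed, and >> on negative values is assumed to be arithmetic as with gcc.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 23

/* Reference: the existing fixed-point multiply. */
static inline int32_t mull_ref(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> FRAC_BITS);
}

/* Prescale a coefficient once, e.g. when a table is built.
 * The coefficient must leave (32 - FRAC_BITS) bits of headroom. */
static inline int32_t prescale_coeff(int32_t c)
{
    const int32_t limit = INT32_MAX >> (32 - FRAC_BITS);

    assert(c >= -limit && c <= limit);
    return c * (1 << (32 - FRAC_BITS));
}

/* Multiply needing a shift of exactly 32 bits: the result is the
 * high word of the 64-bit product. */
static inline int32_t mulh(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}
```

mulh(a, prescale_coeff(c)) equals mull_ref(a, c) whenever c passes the range check. The range check is the hard part: a coefficient of 1.0 (1 << FRAC_BITS) already exceeds it, so the scaling of the decoder's values would have to change, which is presumably why this is "not easy".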


[...]
> -#define MULL(a,b) (((int64_t)(a) * (int64_t)(b)) >> FRAC_BITS)
> +#if defined(ARCH_ARMV4L) && (FRAC_BITS == 15)
> +static always_inline int MULL(int a, int b)
> +{
> +    int lo, hi;
> +    asm ("smull %0, %1, %2, %3\n\t"
> +         "mov   %0, %0, lsr #15\n\t"
> +         "add   %1, %0, %1, lsl #17\n\t"
> +         : "=r"(lo), "=r"(hi)
> +         : "%r"(a), "r"(b));
> +    return hi;
> +}
> +#elif defined(ARCH_ARMV4L) && (FRAC_BITS == 23)
> +static always_inline int MULL(int a, int b)
> +{
> +    int lo, hi;
> +    asm ("smull %0, %1, %2, %3\n\t"
> +         "mov   %0, %0, lsr #23\n\t"
> +         "add   %1, %0, %1, lsl #9\n\t"
> +         : "=r"(lo), "=r"(hi)
> +         : "%r"(a), "r"(b));
> +    return hi;
> +}
> +#else
> +#define MULL(a,b) (int)(((int64_t)(a) * (int64_t)(b)) >> FRAC_BITS)
> +#endif

have you tried setting -mcpu/-march/-mtune correctly?
if your code is still faster then please merge the 2 asm variants into one,
an "i" constraint or something similar should produce a constant for the shift

also maybe try
#define MULL(a,b) (int32_t)(((int64_t)(a) * (int64_t)(b)) >> FRAC_BITS)
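
A sketch of what the merged version could look like, not a tested replacement for the patch. Assumptions: the compiler's own __arm__ macro is used instead of ARCH_ARMV4L so that the snippet compiles standalone, the outputs are marked earlyclobber ("=&r") and the "%" commutativity modifier is dropped, both as a conservative choice, and the ARM path has not been run here. The two "i" operands let gcc emit the shift amounts as immediates for any FRAC_BITS, so the separate 15 and 23 variants collapse into one.

```c
#include <assert.h>
#include <stdint.h>

#ifndef FRAC_BITS
#define FRAC_BITS 23
#endif

static_assert(FRAC_BITS > 0 && FRAC_BITS < 32, "shift must be in 1..31");

#if defined(__arm__) && !defined(__thumb__)
/* Single asm variant: %4 and %5 are compile-time constants, which
 * gcc prints as immediates (#23 and #9 for FRAC_BITS == 23). */
static inline int MULL(int a, int b)
{
    int lo, hi;
    __asm__ ("smull %0, %1, %2, %3\n\t"
             "mov   %0, %0, lsr %4\n\t"
             "add   %1, %0, %1, lsl %5\n\t"
             : "=&r"(lo), "=&r"(hi)
             : "r"(a), "r"(b),
               "i"(FRAC_BITS), "i"(32 - FRAC_BITS));
    return hi;
}
#else
/* Portable fallback with the cast to a 32-bit result. */
static inline int MULL(int a, int b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> FRAC_BITS);
}
#endif
```

Comparing the compiler output for the fallback (built with the right -mcpu) against the asm path would show whether the inline asm is still needed at all.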

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is


