[MPlayer-dev-eng] [PATCH] mp3 decoding performance on ARM
Michael Niedermayer
michaelni at gmx.at
Sun Aug 20 21:27:40 CEST 2006
Hi
On Sun, Aug 20, 2006 at 11:52:36AM +0300, Siarhei Siamashka wrote:
> On Monday 14 August 2006 23:59, Michael Niedermayer wrote:
>
> > > So I guess that replacing all the other performance critical macros with
> > > optimized versions, ffmp3 could probably get much faster and closer to
> > > libmad. But that's probably outside of scope of this mailing list, maybe
> > > it is better to subscribe to ffmpeg mailing list and submit any ffmpeg
> > > related patches there.
> >
> > yes
> > arm optimizations are very welcome ...
>
> The first patch with some minor performance improvements using ARM inline
> assembler is attached.
>
> Nokia 770, ARM926EJS 250MHz cpu
> 128kbit constant bitrate soundtrack, length 6:13
>
> lowest time of 3 runs for each test:
> # time ./mplayer -quiet -ac ffmp3 -ao pcm:fast:file=/dev/null test.mp3
>
> without patch, high quality (CONFIG_MPEGAUDIO_HP enabled):
> real 1m 27.27s
> user 1m 25.94s
> sys 0m 0.77s
>
> without patch, low quality (CONFIG_MPEGAUDIO_HP disabled):
> real 1m 15.15s
> user 1m 14.31s
> sys 0m 0.64s
>
> with patch, high quality (CONFIG_MPEGAUDIO_HP enabled):
> real 1m 21.09s
> user 1m 20.18s
> sys 0m 0.60s
>
> with patch, low quality (CONFIG_MPEGAUDIO_HP disabled):
> real 1m 10.38s
> user 1m 9.42s
> sys 0m 0.72s
>
> Also verified that decompressed wav files are identical with and
> without patch on ARM. I assume that the result of MULL macro
> should always fit 32 bits, is that true?
yes i think so
>
> Also there seems to be some minor portability problem in this decoder
> as arm and x86 builds produce different results when decoding mp3
> using ffmp3 (and this problem is not related to my patch). But both wav
> files play ok if you listen to them.
how different are they? +-1 differences or something significant?
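
One way to answer that would be to compare the two decoded files sample by
sample. A minimal sketch, assuming both outputs are 16-bit PCM already loaded
into memory with any WAV header stripped; pcm_diff and struct pcm_diff_stats
are hypothetical names, not part of MPlayer or FFmpeg:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from MPlayer/FFmpeg: summary of how two
 * decoded 16-bit PCM buffers differ. */
struct pcm_diff_stats {
    size_t  differing;  /* number of samples that are not identical */
    int32_t max_abs;    /* largest absolute per-sample difference   */
};

/* Compare n samples of a[] and b[]. The caller is assumed to have
 * stripped any WAV header and to pass buffers of equal length. */
static inline struct pcm_diff_stats pcm_diff(const int16_t *a,
                                             const int16_t *b, size_t n)
{
    struct pcm_diff_stats st = { 0, 0 };
    for (size_t i = 0; i < n; i++) {
        /* Difference of two int16_t values always fits in int32_t. */
        int32_t d = (int32_t)a[i] - (int32_t)b[i];
        if (d < 0)
            d = -d;
        if (d != 0)
            st.differing++;
        if (d > st.max_abs)
            st.max_abs = d;
    }
    return st;
}
```

A max_abs of 1 would point at rounding differences between the builds, while
large values would suggest a real portability bug.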
[...]
>
> The next improvement can be inline asm for MACS (it does not seem to
> be used though) and MULS macros. I have a patch for them too and it
> reduces decoding time by another 5-8 seconds, but it requires the availability
> of armv5 edsp instructions (and mplayer currently requires only armv4
> architecture + can be configured to use iwmmx on intel xscale).
>
> For more noticeable performance optimizations I guess it is better to
> benchmark mp3 decoding with valgrind (callgrind tool), find what takes
> the most time and focus on optimizing it (and hope that performance
> bottlenecks are the same for x86 and arm). At least I can verify if the code
> generated by the compiler is efficient or not in those parts.
well, a few ideas
* change the code so that multiplies need a right shift of exactly 32 bits (not easy)
* look at the *dct functions, compare against other OSS decoders and port
the fastest to lavc (easy)
* do the same with *synth_filter()
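
The first idea can be illustrated in portable C. When the shift is exactly
32 bits, the result is just the high word of the 64-bit product, which is what
smull already leaves in its high register, so the extra mov/add pair would no
longer be needed. A rough sketch, assuming the coefficient can be pre-scaled
by 2^(32 - FRAC_BITS) without overflowing 32 bits (that is the hard part in
the real decoder); mull_ref, MULH and prescale are illustrative names, and
FRAC_BITS is fixed at 15 only for the example:

```c
#include <stdint.h>

#define FRAC_BITS 15  /* value picked for this example only */

/* Reference: the generic MULL from the patch, written as a function.
 * Like the original macro, it relies on >> of a negative int64_t
 * being an arithmetic shift (true for gcc). */
static inline int32_t mull_ref(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> FRAC_BITS);
}

/* High 32 bits of the 64-bit product, i.e. a shift by exactly 32. */
static inline int32_t MULH(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}

/* Pre-scale a coefficient by 2^(32 - FRAC_BITS).
 * Assumption: -2^(FRAC_BITS-1) <= c < 2^(FRAC_BITS-1), so the scaled
 * value still fits in int32_t. */
static inline int32_t prescale(int32_t c)
{
    return c * (int32_t)(1 << (32 - FRAC_BITS));
}
```

With coefficient tables stored pre-scaled, MULH(a, prescale(c)) gives the same
value as mull_ref(a, c) for coefficients inside the stated range.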
[...]
> -#define MULL(a,b) (((int64_t)(a) * (int64_t)(b)) >> FRAC_BITS)
> +#if defined(ARCH_ARMV4L) && (FRAC_BITS == 15)
> +static always_inline int MULL(int a, int b)
> +{
> + int lo, hi;
> + asm ("smull %0, %1, %2, %3\n\t"
> + "mov %0, %0, lsr #15\n\t"
> + "add %1, %0, %1, lsl #17\n\t"
> + : "=r"(lo), "=r"(hi)
> + : "%r"(a), "r"(b));
> + return hi;
> +}
> +#elif defined(ARCH_ARMV4L) && (FRAC_BITS == 23)
> +static always_inline int MULL(int a, int b)
> +{
> + int lo, hi;
> + asm ("smull %0, %1, %2, %3\n\t"
> + "mov %0, %0, lsr #23\n\t"
> + "add %1, %0, %1, lsl #9\n\t"
> + : "=r"(lo), "=r"(hi)
> + : "%r"(a), "r"(b));
> + return hi;
> +}
> +#else
> +#define MULL(a,b) (int)(((int64_t)(a) * (int64_t)(b)) >> FRAC_BITS)
> +#endif
have you tried to set -mcpu/-march/-mtune correctly?
if your code is still faster then please merge the 2 asm variants,
an "i" constraint or something similar should produce a constant for the shift
also maybe try
#define MULL(a,b) (int32_t)(((int64_t)(a) * (int64_t)(b)) >> FRAC_BITS)
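
Merging the two variants might look roughly like the sketch below, with both
shift amounts passed as compile-time constants through "i" operands. It is
untested on real hardware. The "=&r" early-clobber outputs are an addition
that is not in the posted patch (assumed necessary because, as far as I know,
smull before ARMv6 requires the destination registers to differ from the
first source register), and the __arm__/__thumb__ checks merely stand in for
ARCH_ARMV4L. Other targets fall through to the plain C version.

```c
#include <stdint.h>

#ifndef FRAC_BITS
#define FRAC_BITS 15  /* 15 or 23 in the patch; fixed here for the example */
#endif

#if defined(__arm__) && !defined(__thumb__)
/* Single ARM variant: 64-bit product in hi:lo, then rebuild
 * (product >> FRAC_BITS) from the two halves. */
static inline int MULL(int a, int b)
{
    int lo, hi;
    __asm__ ("smull %0, %1, %2, %3     \n\t"
             "mov   %0, %0, lsr %4     \n\t"
             "add   %1, %0, %1, lsl %5 \n\t"
             : "=&r"(lo), "=&r"(hi)
             : "r"(a), "r"(b), "i"(FRAC_BITS), "i"(32 - FRAC_BITS));
    return hi;
}
#else
/* Portable fallback, same expression as the suggested macro. */
static inline int MULL(int a, int b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> FRAC_BITS);
}
#endif
```

Since FRAC_BITS is a preprocessor constant, one function body then covers both
the 15 and 23 bit configurations.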
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is