[FFmpeg-devel] [PATCH 0/9] DCA (DTS) decoder optimisations for ARMv6

Tue Jul 16 13:41:32 CEST 2013

Hi Christophe,

On Mon, 15 Jul 2013 19:32:48 +0100, Christophe Gisquet <christophe.gisquet at gmail.com> wrote:
> I'm not sure this may actually help, but I also did some generic
> optimizations, in addition to x86 ones here:
> https://github.com/kurosu/libav/commits/dcadsp

Thanks, that's interesting. I see you've been working on some similar
areas - for example the bit in qmf_32_subbands() that inverts the matrix
and negates half the samples. Obviously the hotspots on ARM differ from
those on x86 (for example, function call overhead isn't as high), but
the fact that I've split out that bit of code so we can have
platform-specific implementations might be useful to you.

> For instance, I wasn't able to find samples exercising dca_lfe_fir
> with resampling, and the code becomes a lot simpler on x86 if each
> case is split.

Yes, I wasn't able to find any examples either, but I also recognised
that any given stream was likely to use only one decifactor or the
other. Rather than two separate function pointers, I used a macro within
the assembly, then an if/then/else structure around two expansions of the
macro. With the good branch predictors that modern CPUs have, the
overhead of this is minimal.
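
To illustrate the structure, here is a rough C sketch of the same idea. The
real code is ARM assembly; the names expand_scaled() and EXPAND_BODY, the
factors 32 and 64, and the simplified loop body are assumptions made up for
this example and are not the actual filter.

```c
#include <stddef.h>

/* Hypothetical kernel body. FACTOR is a compile-time constant in each
 * expansion, so each copy gets a fixed inner trip count that can be
 * unrolled and scheduled independently. */
#define EXPAND_BODY(FACTOR)                                 \
    do {                                                    \
        for (size_t i = 0; i < n_in; i++)                   \
            for (size_t k = 0; k < (FACTOR); k++)           \
                out[i * (FACTOR) + k] = in[i] * coefs[k];   \
    } while (0)

/* One entry point, one branch, two expansions of the macro.
 * Any factor other than 32 is treated as 64 in this sketch. */
static void expand_scaled(float *out, const float *in, const float *coefs,
                          size_t n_in, int factor)
{
    if (factor == 32)
        EXPAND_BODY(32);
    else
        EXPAND_BODY(64);
}
```

The branch is taken the same way on every call for a given stream, so it
should predict well, and callers need only a single function pointer.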

Ben