[FFmpeg-devel] [PATCH] Heavy optimization of IFF decoder

Ronald S. Bultje rsbultje
Tue Apr 27 18:09:18 CEST 2010


Hi,

2010/4/27 Måns Rullgård <mans at mansr.com>:
> "Ronald S. Bultje" <rsbultje at gmail.com> writes:
>> On Mon, Apr 26, 2010 at 7:39 PM, Sebastian Vater
>> <cdgs.basty at googlemail.com> wrote:
>>> +    const uint32_t lut[] = {0x0000000,
>>> +                            0x1000000 << plane,
>>> +                            0x0010000 << plane,
>>> +                            0x1010000 << plane,
>>> +                            0x0000100 << plane,
>>> +                            0x1000100 << plane,
>>> +                            0x0010100 << plane,
>>> +                            0x1010100 << plane,
>>> +                            0x0000001 << plane,
>>> +                            0x1000001 << plane,
>>> +                            0x0010001 << plane,
>>> +                            0x1010001 << plane,
>>> +                            0x0000101 << plane,
>>> +                            0x1000101 << plane,
>>> +                            0x0010101 << plane,
>>> +                            0x1010101 << plane};
>>
>> I really can't imagine that a static const lut[][] isn't faster. which
>> file did you use to test this? (Is it on mphq/samples?)
>
> A static table whose values are shifted in the loop is 7% faster on ARM.

That variant is slower on x86 (~12%). However, a static 2D array is
about 3% faster than the original patch. Mans just said that's fine on
ARM as well, so you should probably implement that (don't forget that
plane is const, so do a const uint32_t *lut = table[plane] before
entering the loop, else gcc messes up).
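
A rough sketch of the 2D-table idea, in case it helps. This is not the
actual patch: the names plane_lut, LUT_ROW and decodeplane8_sketch are
made up here, and the assumption is that each input byte expands to two
32-bit words that get OR'ed into the destination. The values are the
ones from the quoted lut[], pre-shifted for each of the 8 planes.

```c
#include <assert.h>
#include <stdint.h>

/* One table row: the 16 nibble patterns from the quoted lut[],
 * pre-shifted by plane p. The u suffix keeps the shift unsigned,
 * so 0x1000000u << 7 does not overflow a signed int. */
#define LUT_ROW(p) {                                        \
    0x0000000u << (p), 0x1000000u << (p),                   \
    0x0010000u << (p), 0x1010000u << (p),                   \
    0x0000100u << (p), 0x1000100u << (p),                   \
    0x0010100u << (p), 0x1010100u << (p),                   \
    0x0000001u << (p), 0x1000001u << (p),                   \
    0x0010001u << (p), 0x1010001u << (p),                   \
    0x0000101u << (p), 0x1000101u << (p),                   \
    0x0010101u << (p), 0x1010101u << (p),                   \
}

/* Hypothetical static 2D table: [plane][nibble]. */
static const uint32_t plane_lut[8][16] = {
    LUT_ROW(0), LUT_ROW(1), LUT_ROW(2), LUT_ROW(3),
    LUT_ROW(4), LUT_ROW(5), LUT_ROW(6), LUT_ROW(7),
};

/* Hypothetical decode loop: dst must hold 2 * buf_size words. */
static void decodeplane8_sketch(uint32_t *dst, const uint8_t *buf,
                                int buf_size, int plane)
{
    assert(plane >= 0 && plane < 8);

    /* plane does not change inside the loop, so pick the row once. */
    const uint32_t *lut = plane_lut[plane];

    for (int i = 0; i < buf_size; i++) {
        uint8_t v = buf[i];
        *dst++ |= lut[v >> 4];    /* high nibble */
        *dst++ |= lut[v & 0x0F];  /* low nibble  */
    }
}
```

Calling decodeplane8_sketch() once per plane on a zeroed destination
accumulates all planes into the same words, since each call only ORs
bits in.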

Ronald


