[FFmpeg-devel] [PATCH] Heavy optimization of IFF decoder

Sebastian Vater cdgs.basty
Tue Apr 27 21:59:13 CEST 2010


Ronald S. Bultje a ?crit :
> Hi,
>
> On Tue, Apr 27, 2010 at 3:05 PM, Sebastian Vater
> <cdgs.basty at googlemail.com> wrote:
>   
>> M?ns Rullg?rd a ?crit :
>>     
>>> Sebastian Vater <cdgs.basty at googlemail.com> writes:
>>>       
>>>>  {
>>>>      GetBitContext gb;
>>>>      unsigned int i;
>>>>      const unsigned b = (buf_size * 8) + bps - 1;
>>>> +    const unsigned b32 = b & ~3;
>>>>      init_get_bits(&gb, buf, buf_size * 8);
>>>> -    for(i = 0; i < b; i++) {
>>>> +    for(i = 0; i < b32; i += 4) {
>>>> +        const uint32_t v = decodeplane8_tab[plane][get_bits(&gb, 4)];
>>>> +        AV_WN32A(dst+i, AV_RN32A(dst+i) | v);
>>>> +    }
>>>>
>>>>         
>>> I suggest using a local variable here, like this:
>>>
>>>     const uint32_t *lut = decodeplane8_tab[plane];
>>>     [...]
>>>     uint32_t v = lut[get_bits(...)];
>>>
>>> I don't trust gcc to do that on its own.
>>>
>>>       
>> Didn't change anything, i.e. still 20% slower...what now?
>>     
>
> With that, it becomes from 20% slower to 3-4% faster for me. Check
> your performance again, or increase the amount of cycles in
> decodeplane8() (as Mans suggested earlier in this thread).
>   

Well, do you know what I think now?

It's just that my GCC 4.2.4 doesn't optimize that as well as
yours...updated patch attached!

-- 

Best regards,
                   :-) Basty/CDGS (-:

-------------- next part --------------
A non-text attachment was scrubbed...
Name: iff-opt-dp8-static.patch
Type: text/x-patch
Size: 1994 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100427/d0df1524/attachment.bin>



More information about the ffmpeg-devel mailing list