[FFmpeg-devel] [PATCH] Optimization of original IFF codec

Måns Rullgård mans
Mon Apr 26 21:09:04 CEST 2010


Sebastian Vater <cdgs.basty at googlemail.com> writes:

> Sebastian Vater wrote:
>> Måns Rullgård wrote:
>>
>>> Sebastian Vater <cdgs.basty at googlemail.com> writes:
>>>
>>>> I also took a look at the disassembly output... the shift outside the
>>>> loop for the LUT init is only done once; gcc optimizes that and just
>>>> puts the precalculated shift result into the correct positions.
>>>>
>>> How many different shift positions are there?  What hardware are you
>>> benchmarking this on?
>>>
>> AMD Athlon XP 2100+.
>>
>> If you look at the source code in libavcodec/iff.c, you will notice that
>> decodeplane8 is only ever called after the plane has changed, so the
>> table has to be recalculated anyway for each call to it. ;-)
>>
>> I think that really explains it all! But if you wish to look at the
>> disassembly output of both, here it is; gcc is really clever this time.
>>
> Btw, your complaints gave me a nice idea... I could precalculate all
> these values for each plane in decode_init and then just memcpy them
> to the local stack in decodeplane8/24. What do you think of this?

Skip the memcpy and make the table static const.
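
A minimal sketch of what a static const table could look like, so that
nothing is computed in decode_init and nothing is copied to the stack.
All names here (plane8_lut_sketch, decodeplane8_sketch, the LUT_* macros)
are hypothetical and not taken from libavcodec/iff.c. The layout is an
assumption: one 4-byte entry per 4-bit nibble, leftmost pixel in the most
significant bit, stored as bytes to keep the sketch endian-neutral.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed entry layout: 4 output bytes for one input nibble n. Byte k has
 * bit p set when nibble bit (3 - k) is set, i.e. leftmost pixel first. */
#define LUT_ENTRY(p, n) {                       \
    (uint8_t)((((n) >> 3) & 1) << (p)),         \
    (uint8_t)((((n) >> 2) & 1) << (p)),         \
    (uint8_t)((((n) >> 1) & 1) << (p)),         \
    (uint8_t)((((n) >> 0) & 1) << (p)) }

#define LUT_PLANE(p) {                                              \
    LUT_ENTRY(p,  0), LUT_ENTRY(p,  1), LUT_ENTRY(p,  2), LUT_ENTRY(p,  3), \
    LUT_ENTRY(p,  4), LUT_ENTRY(p,  5), LUT_ENTRY(p,  6), LUT_ENTRY(p,  7), \
    LUT_ENTRY(p,  8), LUT_ENTRY(p,  9), LUT_ENTRY(p, 10), LUT_ENTRY(p, 11), \
    LUT_ENTRY(p, 12), LUT_ENTRY(p, 13), LUT_ENTRY(p, 14), LUT_ENTRY(p, 15) }

/* Built entirely at compile time: 8 planes * 16 entries * 4 bytes. */
static const uint8_t plane8_lut_sketch[8][16][4] = {
    LUT_PLANE(0), LUT_PLANE(1), LUT_PLANE(2), LUT_PLANE(3),
    LUT_PLANE(4), LUT_PLANE(5), LUT_PLANE(6), LUT_PLANE(7)
};

/* OR one bitplane row into 8-bit chunky pixels; dst holds 8 * src_len bytes. */
static void decodeplane8_sketch(uint8_t *dst, const uint8_t *src,
                                size_t src_len, int plane)
{
    assert(plane >= 0 && plane < 8);
    /* Select the per-plane table directly; no memcpy to the stack. */
    const uint8_t (*lut)[4] = plane8_lut_sketch[plane];

    for (size_t i = 0; i < src_len; i++) {
        const uint8_t *hi = lut[src[i] >> 4];   /* pixels 0..3 of this byte */
        const uint8_t *lo = lut[src[i] & 15];   /* pixels 4..7 of this byte */
        for (int k = 0; k < 4; k++) {
            dst[8 * i + k]     |= hi[k];
            dst[8 * i + 4 + k] |= lo[k];
        }
    }
}
```

Because the table is static const, it can live in read-only data and is
shared by every decoder instance; the per-call cost is a single pointer
selection by plane index.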

> This will yield 8 (planes)*4 (uint32_t's)*16 (sizeof (struct lut)) =
> 512 bytes of tables for decodeplane8

512 bytes is nothing to worry about.

> and 24 (planes)*4 (uint32_t's)*16 (sizeof (struct lut))*4
> (lut[0123]) = 6144 bytes.

6k isn't a lot either.  Just store it statically.
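
The two sizes quoted above can be checked at compile time. This is only
a sketch: the array shapes below are hypothetical layouts chosen to
reproduce the quoted byte counts (16 uint32_t entries per plane, and four
such tables per plane in the 24-plane case); the real struct lut may be
organized differently.

```c
#include <assert.h>
#include <stdint.h>

enum { LUT_ENTRIES = 16 };  /* assumed: one entry per 4-bit nibble */

/* Hypothetical shapes matching the byte counts discussed in this thread. */
typedef uint32_t plane8_tables_sketch[8][LUT_ENTRIES];
typedef uint32_t plane24_tables_sketch[24][4][LUT_ENTRIES];

/* 8 * 16 * 4 bytes = 512 bytes */
static_assert(sizeof(plane8_tables_sketch) == 512,
              "decodeplane8 tables should total 512 bytes");

/* 24 * 4 * 16 * 4 bytes = 6144 bytes */
static_assert(sizeof(plane24_tables_sketch) == 6144,
              "decodeplane24 tables should total 6144 bytes");
```

Both tables together come to 6656 bytes of read-only data.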

-- 
Måns Rullgård
mans at mansr.com


