[FFmpeg-devel] [PATCH] Optimization of original IFF codec
Mon Apr 26 21:09:04 CEST 2010
Sebastian Vater <cdgs.basty at googlemail.com> writes:
> Sebastian Vater wrote:
>> Måns Rullgård wrote:
>>> Sebastian Vater <cdgs.basty at googlemail.com> writes:
>>>> I also took a look at the disassembly output... the shift outside the loop
>>>> for the lut init is only done once; gcc optimizes that and just puts the
>>>> precalculated shift result into the correct positions.
>>> How many different shift positions are there? What hardware are you
>>> benchmarking this on?
>> AMD Athlon XP 2100+.
>> If you look at the source code in libavcodec/iff.c, you will notice that
>> decodeplane8 is ONLY and ONLY called after a change to plane, so that
>> has to be recalculated anyway for each call to it. ;-)
>> I think that really explains it all! But if you wish to look at the
>> disassembly output of both, here it is; gcc is really clever this time.
> Btw, your complaints gave me a nice idea... I could precalculate all
> these values for each plane in decode_init and then just memcpy them to
> the local stack in decodeplane8/24. What do you think of this?
Skip the memcpy and make the table static const.
> This will result in 8 (planes)*4 (uint32_t's)*16 (sizeof (struct lut)) =
> 512 bytes of tables for decodeplane8
512 bytes is nothing to worry about.
> and 24 (planes)*4 (uint32_t's)*16 (sizeof (struct lut))*4
> (lut) = 6144 bytes.
6k isn't a lot either. Just store it statically.
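To make the suggestion concrete, here is a minimal sketch of a table that is
built at compile time and read directly, with no decode_init step and no
memcpy. It is not the code from libavcodec/iff.c: the names plane_lut,
LUT_BIT, LUT_ENTRY, LUT_PLANE and decode_plane_sketch are made up for
illustration, and the layout (one 4-byte entry per 4-bit input value, per
plane, most significant bit first) is an assumption chosen to match the
8*4*16 = 512 byte figure above.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit b of nibble n, moved to bit position p of the output byte. */
#define LUT_BIT(n, b, p) ((uint8_t)((((n) >> (b)) & 1) << (p)))

/* One entry: the 4 bits of nibble n spread over 4 bytes, MSB first. */
#define LUT_ENTRY(n, p) \
    { LUT_BIT(n, 3, p), LUT_BIT(n, 2, p), LUT_BIT(n, 1, p), LUT_BIT(n, 0, p) }

/* All 16 nibble values for one plane. */
#define LUT_PLANE(p) {                                                  \
    LUT_ENTRY(0,  p), LUT_ENTRY(1,  p), LUT_ENTRY(2,  p), LUT_ENTRY(3,  p), \
    LUT_ENTRY(4,  p), LUT_ENTRY(5,  p), LUT_ENTRY(6,  p), LUT_ENTRY(7,  p), \
    LUT_ENTRY(8,  p), LUT_ENTRY(9,  p), LUT_ENTRY(10, p), LUT_ENTRY(11, p), \
    LUT_ENTRY(12, p), LUT_ENTRY(13, p), LUT_ENTRY(14, p), LUT_ENTRY(15, p)  \
}

/* Hypothetical table: 8 planes * 16 entries * 4 bytes, stored read-only. */
static const uint8_t plane_lut[8][16][4] = {
    LUT_PLANE(0), LUT_PLANE(1), LUT_PLANE(2), LUT_PLANE(3),
    LUT_PLANE(4), LUT_PLANE(5), LUT_PLANE(6), LUT_PLANE(7)
};

/* Compile-time size check (C11). */
_Static_assert(sizeof(plane_lut) == 512, "plane_lut should be 512 bytes");

/* OR one bitplane into dst; dst must hold 8 * src_len bytes. */
static void decode_plane_sketch(uint8_t *dst, const uint8_t *src,
                                size_t src_len, int plane)
{
    for (size_t i = 0; i < src_len; i++) {
        const uint8_t *hi = plane_lut[plane][src[i] >> 4];  /* pixels 0..3 */
        const uint8_t *lo = plane_lut[plane][src[i] & 15];  /* pixels 4..7 */
        for (int k = 0; k < 4; k++) {
            dst[8 * i + k]     |= hi[k];
            dst[8 * i + 4 + k] |= lo[k];
        }
    }
}
```

Because the table is static const, the compiler can place it in read-only
data and the decode loop indexes it directly, so nothing has to be copied or
recomputed per call.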
mans at mansr.com