[FFmpeg-devel] [PATCH] Optimization of original IFF codec
Mon Apr 26 20:45:07 CEST 2010
Sebastian Vater wrote:
> Måns Rullgård wrote:
>> Sebastian Vater <cdgs.basty at googlemail.com> writes:
>>> I also took a look at the disassembly output...the shift outside the loop
>>> for lut init is only done once, gcc optimizes that and just puts the
>>> precalculated shift-result into the correct positions.
>> How many different shift positions are there? What hardware are you
>> benchmarking this on?
> AMD Athlon XP 2100+.
> If you look at the source code in libavcodec/iff.c, you will notice that
> decodeplane8 is only ever called after a change of plane, so the LUT
> has to be recalculated anyway for each call to it. ;-)
> I think that really explains it all! But if you wish to look at the
> disassembly output of both, here it is; gcc is really clever this time.
Btw, your complaints gave me a nice idea...I could precalculate all
these values for each plane in decode_init and then just memcpy them to
the local stack in decodeplane8/24. What do you think of this?
This will result in 8 (planes)*4 (uint32_t's)*16 (sizeof (struct lut)) =
512 bytes of tables for decodeplane8 and 24 (planes)*4 (uint32_t's)*16
(sizeof (struct lut))*4 (lut) = 6144 bytes for decodeplane24.
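
To make the idea concrete, here is a rough sketch of the decodeplane8 case only. This is not the code from the patch: the names plane8_lut, init_plane8_lut and decodeplane8_sketch are made up for illustration, and I am assuming each LUT entry expands one 4-bit nibble of plane data into four output bytes, with the leftmost bit going to the first pixel.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical table: one 16-entry LUT per bitplane,
 * 8 planes * 16 entries * 4 bytes = 512 bytes. */
static uint32_t plane8_lut[8][16];

/* Would be called once, e.g. from decode_init. */
static void init_plane8_lut(void)
{
    for (int plane = 0; plane < 8; plane++) {
        for (int nibble = 0; nibble < 16; nibble++) {
            uint8_t px[4];
            /* Assumption: the most significant bit of the nibble
             * maps to the leftmost of the four pixels. */
            for (int i = 0; i < 4; i++)
                px[i] = (uint8_t)(((nibble >> (3 - i)) & 1) << plane);
            /* Building the entry from bytes keeps it endian-neutral. */
            memcpy(&plane8_lut[plane][nibble], px, sizeof(px));
        }
    }
}

/* ORs one bitplane row into dst; dst must hold buf_size * 8 bytes. */
static void decodeplane8_sketch(uint8_t *dst, const uint8_t *buf,
                                int buf_size, int plane)
{
    uint32_t lut[16];

    /* Local copy of the precalculated table for this plane. */
    memcpy(lut, plane8_lut[plane], sizeof(lut));

    for (int i = 0; i < buf_size; i++) {
        uint32_t hi, lo;
        /* memcpy avoids alignment assumptions on dst. */
        memcpy(&hi, dst, 4);
        memcpy(&lo, dst + 4, 4);
        hi |= lut[buf[i] >> 4];
        lo |= lut[buf[i] & 0x0F];
        memcpy(dst, &hi, 4);
        memcpy(dst + 4, &lo, 4);
        dst += 8;
    }
}
```

Whether the memcpy to the stack actually beats indexing the static table directly would have to be benchmarked; the sketch only shows that the per-call shift work disappears.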
:-) Basty/CDGS (-: