[FFmpeg-devel] [PATCH] Optimization of original IFF codec
Sun Apr 25 23:48:53 CEST 2010
On Sun, Apr 25, 2010 at 01:49:54PM +0200, Sebastian Vater wrote:
> Hey to all!
> I have a new patch (my first git patch ;-)) ready for optimizing the code.
> I also moved some ifs out of a critical loop (the checks for whether we
> have 8-bit or 32-bit output, and whether the data is interleaved).
> This eliminates most of the inner-loop branches and thus reduces stalls
> caused by branch misprediction, which are quite expensive.
The code checks pix_fmt and codec_tag once per row of bits, or less often.
That check is not inside the innermost loop and its outcome does not change,
so it is not mispredicted, and it is not expensive either.
> Michael Niedermayer wrote:
> > Among all these optimizations, I am wondering how much faster things become.
> > Does that inline speed the code up?
> > Does the change to unsigned?
> > You can test this easily by using the START/STOP_TIMER macros.
> I was relooking at that piece of code again and just found that the
> division is not required at all.
> I changed to unsigned because it allows the compiler to assume that it can
> replace * 8 with << 3.
No, *8 and <<3 are identical operations for signed numbers as well;
the difference is with divisions.
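
To make the signed/unsigned point concrete, here is a minimal sketch in C. All function names are hypothetical and not taken from the patch. It assumes only standard C99 semantics: signed division truncates toward zero, whereas a plain arithmetic right shift would round toward negative infinity, so the compiler needs extra fix-up code for signed / 8 but not for signed * 8. The last helper shows the round-up expression quoted below, which is loop-invariant and can be computed once before the loop.

```c
#include <stdint.h>

/* Multiplication by 8: compilers can emit a shift for signed and unsigned
 * operands alike, so switching to unsigned gains nothing here. */
static int mul8(int x) { return x * 8; }
static unsigned mul8_shift_unsigned(unsigned x) { return x << 3; }

/* Signed division truncates toward zero in C99: -9 / 8 is -1. */
static int div8_trunc(int x) { return x / 8; }

/* What a plain arithmetic shift by 3 would compute (floor division),
 * written portably: -9 becomes -2. This mismatch is why signed / 8
 * cannot be a single shift. */
static int div8_floor(int x) { return (x - (x < 0 ? 7 : 0)) / 8; }

/* Unsigned division by 8 really is a single logical shift. */
static unsigned div8_unsigned(unsigned x) { return x / 8; }

/* Round-up division of the bit count by bits-per-sample; it depends only
 * on its arguments, so it can be hoisted out of the decoding loop. */
static unsigned units_needed(unsigned buf_size, unsigned bps)
{
    return (buf_size * 8 + bps - 1) / bps;
}
```
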
> > (buf_size * 8 + bps - 1) / bps
> > could be done outside the loop
> > and the 2 loops look like they could be done as one loop
> > that loop can then be unrolled by a factor of 4, and its body, for the
> > uint8_t case, can be implemented like:
> > v= lut[get_bits(&gb, 4)];
> > AV_WN32A(dst+b, AV_RN32A(dst+b) | v);
> The thing is that type can be either uint8_t or uint32_t. It's a #define
> macro which gets the type (uint8_t or uint32_t) passed in.
> So this is not fixed yet, because I'm unsure whether those two lines can
> also be used when dst is uint32_t.
They can, and it will speed up the uint8 case significantly.
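
A rough sketch of the lookup-table idea for the uint8_t case, under stated assumptions: the function and table names are hypothetical, the table is rebuilt on every call only to keep the example short, and memcpy stands in for the AV_RN32A/AV_WN32A aligned accessors so that the sketch compiles without FFmpeg headers. Each 4-bit group of bitplane data selects a 32-bit word holding four pixel bytes, which is ORed into the destination in one operation instead of four.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build the 16-entry table for one bitplane.
 * Entry n holds 4 pixel bytes; byte i has bit `plane` set when bit (3 - i)
 * of n is set. Filling via a byte array keeps the leftmost pixel at the
 * lowest address regardless of host endianness. */
static void build_plane_lut(uint32_t lut[16], int plane)
{
    for (int nibble = 0; nibble < 16; nibble++) {
        uint8_t px[4];
        for (int i = 0; i < 4; i++)
            px[i] = (uint8_t)(((nibble >> (3 - i)) & 1) << plane);
        memcpy(&lut[nibble], px, 4);
    }
}

/* OR one row of one bitplane into chunky 8-bit pixels.
 * dst must hold 8 * buf_size bytes (one byte per pixel). */
static void decode_plane8_lut(uint8_t *dst, const uint8_t *buf,
                              int buf_size, int plane)
{
    uint32_t lut[16];
    assert(plane >= 0 && plane < 8);
    build_plane_lut(lut, plane);

    for (int i = 0; i < buf_size; i++) {
        uint32_t hi, lo;
        memcpy(&hi, dst, 4);        /* pixels 0..3 of this input byte */
        memcpy(&lo, dst + 4, 4);    /* pixels 4..7 of this input byte */
        hi |= lut[buf[i] >> 4];     /* upper nibble: leftmost 4 pixels */
        lo |= lut[buf[i] & 0x0F];   /* lower nibble: rightmost 4 pixels */
        memcpy(dst, &hi, 4);
        memcpy(dst + 4, &lo, 4);
        dst += 8;
    }
}
```

In a real decoder the table would presumably be precomputed once per plane rather than rebuilt per row, and the aligned accessors would replace the memcpy calls.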
Michael
GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Avoid a single point of failure, be that a person or equipment.