[FFmpeg-devel] Extend/optimize RGB to RGB conversions funcs into rgb2rgb.c

yann.lepetitcorps at free.fr yann.lepetitcorps at free.fr
Mon Sep 10 01:45:19 CEST 2012


Right, I have re-benchmarked it, this time with the -O9 option on GCC, and the
runtime difference between the original and new versions is relatively small:

Test original rgb24to32() func : 28 ms
Test new rgb24to32_alpha() func : 28 ms
Test original rgba32to24() func : 24 ms
Test modified rgba32to24() func : 23 ms

rgb24to32() : original=28ms modified=28ms (0ms 0.00%)

rgba32to24() : original=24ms modified=23ms (1ms 4.35%)

Note that the results fluctuate quite a bit from run to run, with differences
between -15% and +15%
(the "new" rgba32to24() generally seems faster than the "old" one, but the new
rgb24to32_alpha() is regularly slower than rgb24to32() [though it handles the
alpha parameter, whereas rgb24to32() always sets the alpha to 255])
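
To make the difference between the two 24-to-32 functions concrete, here is a
rough sketch of a fixed-alpha loop next to one that takes the alpha as a
parameter. The function names, the pixel-count signature and the channel order
(bytes 0 and 2 swapped, as in the quoted rgba32to24() lines) are assumptions
made for illustration, not the actual code from the patch or from rgb2rgb.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-alpha variant: every output pixel gets alpha = 255. */
static void rgb24to32_fixed(const uint8_t *src, uint8_t *dst, size_t pixels)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < pixels; i++) {
        dst[4 * i + 0] = src[3 * i + 2];
        dst[4 * i + 1] = src[3 * i + 1];
        dst[4 * i + 2] = src[3 * i + 0];
        dst[4 * i + 3] = 255;
    }
}

/* Hypothetical variant where the caller chooses the alpha value. */
static void rgb24to32_alpha_sketch(const uint8_t *src, uint8_t *dst,
                                   size_t pixels, uint8_t alpha)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < pixels; i++) {
        dst[4 * i + 0] = src[3 * i + 2];
        dst[4 * i + 1] = src[3 * i + 1];
        dst[4 * i + 2] = src[3 * i + 0];
        dst[4 * i + 3] = alpha;   /* the only line that differs */
    }
}

/* Returns 1 if both variants agree on a small pattern when alpha is 255. */
static int rgb24to32_variants_agree(void)
{
    enum { PIXELS = 8 };
    uint8_t src[3 * PIXELS], a[4 * PIXELS], b[4 * PIXELS];

    for (size_t i = 0; i < sizeof(src); i++)
        src[i] = (uint8_t)(i * 5 + 3);
    rgb24to32_fixed(src, a, PIXELS);
    rgb24to32_alpha_sketch(src, b, PIXELS, 255);
    return memcmp(a, b, sizeof(a)) == 0;
}
```

In this sketch the two loops differ only in whether the fourth byte is a
constant or a value passed in by the caller, so calling
rgb24to32_alpha_sketch() with alpha = 255 should give the same bytes as the
fixed version.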


See you,
Yannoo



Quoting Loren Merritt <lorenm at u.washington.edu>:

> On Mon, 10 Sep 2012, yann.lepetitcorps at free.fr wrote:
> > Quoting Reimar Döffinger <Reimar.Doeffinger at gmx.de>:
> >
> >> Though one thing I wonder is why exactly that is faster, and why your
> >> compiler can't figure out how to optimize it on its own.
> >> There is also the issue that, compared to NEON-optimizing the code,
> >> this is a rather minor optimization.
> >
> > I think it is a little faster because of this:
> >
> > -        dst[3 * i + 0] = src[4 * i + 2];
> > -        dst[3 * i + 1] = src[4 * i + 1];
> > -        dst[3 * i + 2] = src[4 * i + 0];
> >
> > +        dst[0] = psrc[2];
> > +        dst[1] = psrc[1];
> > +        dst[2] = psrc[0];
> >
> > => the copy is made with "direct" addressing, i.e. without multiplications
> > or additions inside the [] array indexing
> > (can the compiler handle the * 3 multiplication automatically, for free?)
>
> It's not that a *3 is free, but rather that the addressing mode of the
> generated instructions doesn't have to be the same as the one in the
> source code. GCC is normally capable of switching from index variables to
> pointer incrementing or vice versa, though it doesn't always choose
> optimally when to do so.
>
> --Loren Merritt
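
For reference, the two addressing styles discussed above can be written side
by side as below. This is only a sketch: the function names and the
pixel-count parameter are made up for the example, and the loop bodies are
taken from the removed and added lines of the quoted diff. Both loops should
write the same bytes, which is why a compiler is free to turn one form into
the other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Indexed form, as in the removed ("-") lines of the quoted diff. */
static void rgba32to24_indexed(const uint8_t *src, uint8_t *dst, size_t pixels)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < pixels; i++) {
        dst[3 * i + 0] = src[4 * i + 2];
        dst[3 * i + 1] = src[4 * i + 1];
        dst[3 * i + 2] = src[4 * i + 0];
    }
}

/* Pointer-increment form, as in the added ("+") lines of the quoted diff. */
static void rgba32to24_pointer(const uint8_t *src, uint8_t *dst, size_t pixels)
{
    const uint8_t *psrc = src;

    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < pixels; i++) {
        dst[0] = psrc[2];
        dst[1] = psrc[1];
        dst[2] = psrc[0];
        dst  += 3;   /* advance one 24-bit output pixel */
        psrc += 4;   /* advance one 32-bit input pixel  */
    }
}

/* Returns 1 if both forms produce identical output on a small pattern. */
static int rgba32to24_forms_agree(void)
{
    enum { PIXELS = 16 };
    uint8_t src[4 * PIXELS], a[3 * PIXELS], b[3 * PIXELS];

    for (size_t i = 0; i < sizeof(src); i++)
        src[i] = (uint8_t)(i * 7 + 1);
    rgba32to24_indexed(src, a, PIXELS);
    rgba32to24_pointer(src, b, PIXELS);
    return memcmp(a, b, sizeof(a)) == 0;
}
```

Comparing the assembly generated for the two functions (for example with
gcc -O3 -S) is one way to check whether the compiler really ends up with the
same addressing for both on a given target.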



