[FFmpeg-devel] [RFC] abs vs FFABS
Michael Niedermayer
michaelni
Sat Jan 17 17:45:21 CET 2009
On Sat, Jan 17, 2009 at 10:32:10AM -0500, Ronald S. Bultje wrote:
> Hi,
>
> On Sat, Jan 17, 2009 at 9:02 AM, Stefan Gehrer <stefan.gehrer at gmx.de> wrote:
> > currently there is a mixture of abs and FFABS in the code,
> > for example in dsputil.c 96 of the former and 36 of the latter.
> > In cavs* I use abs() throughout, so I wonder if there is
> > a preference for one of the two and then if the other one
> > should be replaced.
>
> abs() appears faster, so maybe FFABS() should be removed in favour of abs().
>
> I'm wondering if context makes a difference in which one is faster in
> particular cases? I wouldn't want to have to test every single case...
> :-).
>
> Ronald
>
> [1] 1billion abs/FFABS cycles + some checks, m = macro (FFABS()), b =
> built-in (abs()), see attached code
> i686-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build
> 5367), compiled with -O3
> mac121641:/tmp ronaldbultje$ time ./test m 1000000000
[...]
> #include <math.h>
> #include <stdlib.h>
> #include <stdio.h>
>
> #define start_test \
> int n, res, start = 0; \
> int increase = 0x1234FEDC; \
> for (n = 0; n < total; n++) {
> #define end_test \
> start += increase; \
> res = (increase & 0xFFFFFFF) << 4 | (increase & 0xF0000000) >> 28; \
> increase = res; \
> } \
> return 0
> #define ABS(x) x < 0 ? -x : x
>
> static int test_macro(int total)
> {
> start_test
> if ((res = ABS(start)) < 0)
> return 1;
> end_test;
> }
This code is flawed in more ways than it has lines.
First, the test for < 0 is just idiotic; we aren't doing that in lav* and
thus don't care how it performs.
Second, you are feeding very regular numbers into abs/ABS, so the branch
predictor will have a very high chance of predicting it correctly. There's
nothing wrong with testing this case, but from your code it seems this
wasn't what you intended; just passing -1 always has the same effect.
To quote Knuth (from memory): random functions are poor random number
generators.
And even once you fix your generator so that it no longer converges to -1
(due to increase being signed and >>28 not doing what you think it does),
it is still a very poor generator.
Third, as others have noticed, a non-broken gcc optimizes the whole thing away.
Fourth, make sure you compile with the correct cpu/arch/tune so gcc can use
things like cmov.
And here's a good random number generator that the CPU's branch prediction
won't predict. But even an LCG will perform better than your generator.
#include <stdint.h>

typedef struct KISSState{
    uint32_t z,w,jsr,jcong;
}KISSState;

/* returns the next 32-bit pseudo-random value and advances the state */
static uint32_t get_random(KISSState *s){
    s->z = (s->z>>16) + 36969*(s->z&0xFFFF);
    s->w = (s->w>>16) + 18000*(s->w&0xFFFF);
    s->jsr ^= s->jsr<<17;
    s->jsr ^= s->jsr>>13;
    s->jsr ^= s->jsr<< 5;
    s->jcong= 1234567 + 69069*s->jcong;
    return ((s->w + (s->z<<16)) ^ s->jcong) + s->jsr;
}

/* seeds all four sub-generators from a single 32-bit seed */
static void init_random(KISSState *s, uint32_t seed){
    s->z = 362436069 ^ seed;
    s->w = 521288629 ^ (987654321*seed);
    s->jsr = 123456789 + seed;
    s->jcong= 380116160 - 314159265*seed;
}
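A rough sketch of how the generator above could feed such a test, under some assumptions of mine: the FFABS definition is written from memory in the libavutil style, and sum_ffabs(), sum_abs() and the input mapping are hypothetical helpers that are not part of the original test. Each loop accumulates the results and returns the sum, so a caller that prints it keeps the compiler from discarding the loop, and the inputs have an unpredictable sign.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumption: macro in the style of libavutil's FFABS, written from memory. */
#define FFABS(a) ((a) >= 0 ? (a) : (-(a)))

typedef struct KISSState {
    uint32_t z, w, jsr, jcong;
} KISSState;

static uint32_t get_random(KISSState *s)
{
    s->z = (s->z >> 16) + 36969 * (s->z & 0xFFFF);
    s->w = (s->w >> 16) + 18000 * (s->w & 0xFFFF);
    s->jsr ^= s->jsr << 17;
    s->jsr ^= s->jsr >> 13;
    s->jsr ^= s->jsr << 5;
    s->jcong = 1234567 + 69069 * s->jcong;
    return ((s->w + (s->z << 16)) ^ s->jcong) + s->jsr;
}

static void init_random(KISSState *s, uint32_t seed)
{
    s->z     = 362436069 ^ seed;
    s->w     = 521288629 ^ (987654321 * seed);
    s->jsr   = 123456789 + seed;
    s->jcong = 380116160 - 314159265 * seed;
}

/* Hypothetical helper: map a random word to [-2^30, 2^30 - 1].
 * INT_MIN is excluded, so abs() stays well defined. */
static int random_input(KISSState *s)
{
    return (int)(get_random(s) >> 1) - 0x40000000;
}

/* Sum of |x| over count pseudo-random inputs, using the macro. */
static uint64_t sum_ffabs(uint32_t seed, int count)
{
    KISSState s;
    uint64_t sum = 0;
    init_random(&s, seed);
    for (int i = 0; i < count; i++) {
        int x = random_input(&s);
        sum += (uint64_t)FFABS(x);
    }
    return sum;
}

/* The same loop using the library abs(). */
static uint64_t sum_abs(uint32_t seed, int count)
{
    KISSState s;
    uint64_t sum = 0;
    init_random(&s, seed);
    for (int i = 0; i < count; i++) {
        int x = random_input(&s);
        sum += (uint64_t)abs(x);
    }
    return sum;
}
```

Calling sum_ffabs() and sum_abs() with the same seed should give the same sum, which doubles as a sanity check. To compare speed, time each call separately and print both sums; the cost of the generator is included in both measurements, so only the difference between them is meaningful.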
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
it is not once nor twice but times without number that the same ideas make
their appearance in the world. -- Aristotle