[FFmpeg-devel] [RFC] abs vs FFABS

Michael Niedermayer michaelni
Sat Jan 17 17:45:21 CET 2009


On Sat, Jan 17, 2009 at 10:32:10AM -0500, Ronald S. Bultje wrote:
> Hi,
> 
> On Sat, Jan 17, 2009 at 9:02 AM, Stefan Gehrer <stefan.gehrer at gmx.de> wrote:
> > currently there is a mixture of abs and FFABS in the code,
> > for example in dsputil.c 96 of the former and 36 of the latter.
> > In cavs* I use abs() throughout, so I wonder if there is
> > a preference for one of the two and then if the other one
> > should be replaced.
> 
> abs() appears faster, so maybe FFABS() should be removed in favour of abs().
> 
> I'm wondering if context makes a difference in which one is faster in
> particular cases? I wouldn't want to have to test every single case...
> :-).
> 
> Ronald
> 
> [1] 1billion abs/FFABS cycles + some checks, m = macro (FFABS()), b =
> built-in (abs()), see attached code
> i686-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build
> 5367), compiled with -O3
> mac121641:/tmp ronaldbultje$ time ./test m 1000000000
[...]
> #include <math.h>
> #include <stdlib.h>
> #include <stdio.h>
> 
> #define start_test \
>     int n, res, start = 0; \
>     int increase = 0x1234FEDC; \
>     for (n = 0; n < total; n++) {
> #define end_test \
>         start += increase; \
>         res = (increase & 0xFFFFFFF) << 4 | (increase & 0xF0000000) >> 28; \
>         increase = res; \
>     } \
>     return 0
> #define ABS(x) x < 0 ? -x : x
> 
> static int test_macro(int total)
> {
>     start_test
>         if ((res = ABS(start)) < 0)
>             return 1;
>     end_test;
> }

This code is flawed in more ways than it has lines.

First, the test for < 0 is just idiotic; we aren't doing that in lav* and
thus don't care how it performs.

Second, you are feeding very regular numbers into abs/ABS, so the branch
predictor has a very high chance of predicting the branch correctly. There's
nothing wrong with testing this case, but from your code it seems this
wasn't what you intended; always passing -1 has the same effect.
To quote Knuth (from memory): random functions are poor random number
generators.
And even once you fix your generator so it stops converging to -1 (increase
is signed and >>28 does not do what you think it does), it is still a
very poor generator.
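
To illustrate why even the repaired generator is poor, here is a minimal sketch. It assumes the update from the quoted test is rewritten with unsigned arithmetic, which makes it a plain 4-bit rotate. The names rotl4 and increment_period are made up for this example and are not part of the quoted test.

```c
#include <stdint.h>

/* Rotate left by 4 bits in unsigned arithmetic (assumed "fixed" form of the
 * update in the quoted test). */
static uint32_t rotl4(uint32_t x)
{
    return (x << 4) | (x >> 28);
}

/* Count how many updates it takes until the increment repeats. */
static int increment_period(uint32_t first)
{
    uint32_t v = first;
    int steps = 0;
    do {
        v = rotl4(v);
        steps++;
    } while (v != first);
    return steps;
}
```

For the starting value 0x1234FEDC used in the quoted test, increment_period() returns 8, since a 32-bit word rotated by 4 bits is back where it started after 8 steps. The increments therefore cycle through only 8 values, which is about as predictable as benchmark input gets.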

Third, as others have noticed, a non-broken gcc optimizes the whole thing away.
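
One common way to keep the compiler from discarding the loop is to make its result observable, for example by summing the values and returning the sum. A sketch under stated assumptions: sum_abs is a hypothetical name, the inputs come from a buffer filled at run time, and INT_MIN is assumed not to occur because abs(INT_MIN) is undefined.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Sum abs() over n inputs. Returning the sum gives every abs() call a
 * visible effect, so the loop cannot be dropped as dead code. */
static uint32_t sum_abs(const int32_t *in, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (uint32_t)abs(in[i]);
    return acc;
}
```

The caller still has to use the returned value (print it, or store it somewhere the compiler cannot ignore), otherwise the same problem comes back one level up.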

Fourth, make sure you compile with the correct cpu/arch/tune settings so gcc
can use things like cmov.
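
For reference, a sketch of the two shapes this usually takes in C. Whether the ternary form ends up as a branch or as a conditional move is up to the compiler and the target it is told to generate code for (on gcc, the -march= family of options). The branch-free variant assumes that >> on a negative int is an arithmetic shift, which C leaves implementation-defined. Both function names are made up for this example.

```c
#include <stdint.h>

/* Ternary form: may compile to a branch or to cmov, depending on target. */
static int32_t abs_ternary(int32_t x)
{
    return x < 0 ? -x : x;
}

/* Branch-free form. Assumes arithmetic right shift of negative values
 * (implementation-defined in C); mask is 0 for x >= 0 and -1 for x < 0. */
static int32_t abs_branchless(int32_t x)
{
    int32_t mask = x >> 31;
    return (x ^ mask) - mask;
}
```

Neither form is defined for INT32_MIN, the same restriction as abs().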

And here's a good random number generator that the CPU's branch prediction
won't predict. But even an LCG will perform better than your generator.

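#include <stdint.h>

/* KISS-style state: two multiply-with-carry words (z, w),
 * one xorshift word (jsr) and one LCG word (jcong). */
typedef struct KISSState{
    uint32_t z,w,jsr,jcong;
}KISSState;

/* Advance all four sub-generators and combine them into one 32-bit value. */
static uint32_t get_random(KISSState *s){
    s->z    = (s->z>>16) + 36969*(s->z&0xFFFF);
    s->w    = (s->w>>16) + 18000*(s->w&0xFFFF);
    s->jsr ^= s->jsr<<17;
    s->jsr ^= s->jsr>>13;
    s->jsr ^= s->jsr<< 5;
    s->jcong= 1234567 + 69069*s->jcong;
    return ((s->w + (s->z<<16)) ^ s->jcong) + s->jsr;
}

/* Derive the four state words from a single seed. */
static void init_random(KISSState *s, uint32_t seed){
    s->z    = 362436069 ^   seed;
    s->w    = 521288629 ^ (987654321*seed);
    s->jsr  = 123456789 +   seed;
    s->jcong= 380116160 -  314159265*seed;
}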
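
A sketch of how the generator above could feed an abs/FFABS benchmark. The helper fill_inputs is hypothetical; the generator is repeated here only so the example compiles on its own. Values are mapped into [-2^30, 2^30 - 1] so that signs are mixed and INT_MIN can never be passed to abs().

```c
#include <stddef.h>
#include <stdint.h>

typedef struct KISSState{
    uint32_t z,w,jsr,jcong;
}KISSState;

static uint32_t get_random(KISSState *s){
    s->z    = (s->z>>16) + 36969*(s->z&0xFFFF);
    s->w    = (s->w>>16) + 18000*(s->w&0xFFFF);
    s->jsr ^= s->jsr<<17;
    s->jsr ^= s->jsr>>13;
    s->jsr ^= s->jsr<< 5;
    s->jcong= 1234567 + 69069*s->jcong;
    return ((s->w + (s->z<<16)) ^ s->jcong) + s->jsr;
}

static void init_random(KISSState *s, uint32_t seed){
    s->z    = 362436069 ^   seed;
    s->w    = 521288629 ^ (987654321*seed);
    s->jsr  = 123456789 +   seed;
    s->jcong= 380116160 -  314159265*seed;
}

/* Hypothetical helper: fill buf with n pseudo-random signed values in
 * [-2^30, 2^30 - 1]. Done before timing starts, so the generator's own
 * cost stays out of the measurement. */
static void fill_inputs(int32_t *buf, size_t n, uint32_t seed)
{
    KISSState s;
    init_random(&s, seed);
    for (size_t i = 0; i < n; i++)
        buf[i] = (int32_t)(get_random(&s) >> 1) - 0x40000000;
}
```

Filling a buffer up front and timing only the loop over it keeps the comparison between abs() and FFABS() about the operation itself rather than about the generator.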

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

it is not once nor twice but times without number that the same ideas make
their appearance in the world. -- Aristotle