[FFmpeg-devel] [PATCH] force dnxhd encoder to be independent of qsort internals

Reimar Döffinger Reimar.Doeffinger
Sun Sep 20 23:39:47 CEST 2009


On Sun, Sep 20, 2009 at 11:29:38PM +0200, Michael Niedermayer wrote:
> On Sun, Sep 20, 2009 at 02:13:03PM +0200, Reimar Döffinger wrote:
> [...]
> > +static void radix_count(const RCCMPEntry *data, int size, int *buckets, int shift)
> > +{
> > +    int i;
> > +    int offset;
> > +    memset(buckets, 0, sizeof(*buckets) * NBUCKETS);
> 
> > +    for (i = 0; i < size; i++)
> > +        buckets[get_bucket(data[i].value, shift)]++;
> > +    offset = size;
> 
> maybe the following is faster
> 
> for (i = 0; i < size; i++){
>     unsigned int v= data[i].value;
>     buckets[0][v&255]++; v>>=8;
>     buckets[1][v&255]++; v>>=8;
>     buckets[2][v&255]++; v>>=8;
>     buckets[3][v    ]++;
> }
> 
> also if buckets[3][0] == size that pass can be skipped, similarly
> for the others

I tried adding all kinds of checks, including something similar to this.
The best I got was about 8% faster for this function, which accounts for about
2% of the surrounding function, which in turn is only a small part of the full
encoding process.
I may try this suggestion too at some point, though code simplicity seems
quite relevant considering how little overall difference it makes.
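
For reference, a minimal sketch of the one-pass counting plus skip check
discussed above. It is not the code from the patch: Entry is a hypothetical
stand-in for RCCMPEntry (only the value field is taken from the quoted code),
count_all() and pass_is_trivial() are made-up names, and an 8-bit radix is
assumed because of the v&255 in the suggestion. The skip test is generalized
from "buckets[3][0] == size" to "any single bucket holds all entries".

```c
#include <string.h>

#define NBUCKETS 256  /* assumption: 8 bits per radix pass */

/* Hypothetical stand-in for RCCMPEntry; only 'value' comes from the patch. */
typedef struct { int value; } Entry;

/* Build the histograms for all four key bytes in one pass over the data. */
static void count_all(const Entry *data, int size, int buckets[4][NBUCKETS])
{
    memset(buckets, 0, sizeof(int) * 4 * NBUCKETS);
    for (int i = 0; i < size; i++) {
        unsigned int v = data[i].value;
        buckets[0][v & 255]++; v >>= 8;
        buckets[1][v & 255]++; v >>= 8;
        buckets[2][v & 255]++; v >>= 8;
        buckets[3][v & 255]++;
    }
}

/* A pass cannot reorder anything if every key has the same digit,
 * i.e. the first non-empty bucket already contains all entries. */
static int pass_is_trivial(const int hist[NBUCKETS], int size)
{
    for (int i = 0; i < NBUCKETS; i++)
        if (hist[i])
            return hist[i] == size;
    return 1;  /* empty input: nothing to sort */
}
```

Whether the extra bookkeeping pays off depends on how often the upper bytes
are constant in real data, which is not measured here.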

> > +    for (i = NBUCKETS - 1; i >= 0; i--)
> > +        buckets[i] = offset -= buckets[i];
> > +    assert(!buckets[0]);
> > +}
> > +
> > +static void radix_sort_pass(RCCMPEntry *dst, const RCCMPEntry *data, int size, int pass)
> > +{
> > +    int i;
> 
> > +    int shift = pass * av_log2(NBUCKETS);
> 
> is this as fast as a compiletime constant?

I don't have exact numbers, but this is called just a few times per frame (though maybe
I should test with lower frame rates), so I considered it irrelevant.
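
For comparison, a sketch of what the compile-time variant could look like.
This is an illustration under assumptions, not the patch itself: RADIX_BITS,
Entry, the body of get_bucket() and the radix_sort() driver are hypothetical,
and keys are assumed to be non-negative so that sorting by their unsigned
bit pattern gives ascending order.

```c
#include <string.h>

#define RADIX_BITS 8                   /* assumption: 8 bits per pass */
#define NBUCKETS   (1 << RADIX_BITS)

/* Hypothetical stand-in for RCCMPEntry; only 'value' comes from the patch. */
typedef struct { int value; } Entry;

/* Hypothetical get_bucket(): extract one radix digit of the key. */
static int get_bucket(int value, int shift)
{
    return ((unsigned int)value >> shift) & (NBUCKETS - 1);
}

/* One stable counting-sort pass on the digit selected by 'pass'. */
static void radix_sort_pass(Entry *dst, const Entry *src, int size, int pass)
{
    int buckets[NBUCKETS];
    const int shift = pass * RADIX_BITS;  /* no log2 computation at run time */
    int offset = size;
    int i;

    memset(buckets, 0, sizeof(buckets));
    for (i = 0; i < size; i++)
        buckets[get_bucket(src[i].value, shift)]++;
    /* Turn counts into start offsets, as in the quoted radix_count(). */
    for (i = NBUCKETS - 1; i >= 0; i--)
        buckets[i] = offset -= buckets[i];
    for (i = 0; i < size; i++)
        dst[buckets[get_bucket(src[i].value, shift)]++] = src[i];
}

/* Four passes over 32-bit keys, ping-ponging between data and tmp,
 * so the sorted result ends up back in 'data'. */
static void radix_sort(Entry *data, Entry *tmp, int size)
{
    for (int pass = 0; pass < 4; pass += 2) {
        radix_sort_pass(tmp, data, size, pass);
        radix_sort_pass(data, tmp, size, pass + 1);
    }
}
```

Since the shift is computed once per pass either way, any speed difference
is likely to be small next to the per-element loops.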


