[FFmpeg-devel] [PATCH] Add a G.722 encoder

Michael Niedermayer michaelni
Fri Sep 17 11:53:23 CEST 2010


On Fri, Sep 17, 2010 at 09:31:15AM +0300, Martin Storsj? wrote:
> On Fri, 17 Sep 2010, Michael Niedermayer wrote:
> 
> > On Thu, Sep 16, 2010 at 06:20:25PM +0300, Martin Storsj? wrote:
> > > On Thu, 16 Sep 2010, Michael Niedermayer wrote:
> > > 
> > > > On Thu, Sep 16, 2010 at 10:50:18AM +0300, Martin Storsj? wrote:
> > > > > On Wed, 15 Sep 2010, Michael Niedermayer wrote:
> > > > > 
> > > > > > On Wed, Sep 15, 2010 at 05:42:39PM +0300, Martin Storsj? wrote:
> > > > > > > 
> > > > > > > I managed to do this, so I've got the following two versions:
> > > > > > > 
> > > > > > >     limit = limit + 1 << 10;
> > > > > > >     while (limit > low_quant[i] * state->scale_factor)
> > > > > > > 
> > > > > > > 885 dezicycles in encode_low, 1047991 runs, 585 skips
> > > > > > > 
> > > > > > > and
> > > > > > > 
> > > > > > >     limit = limit + 1 << 10;
> > > > > > >     limit = (limit + state->scale_factor - 1) / state->scale_factor;
> > > > > > >     while (limit > low_quant[i])
> > > > > > > 
> > > > > > > 988 dezicycles in encode_low, 1047886 runs, 690 skips
> > > > > > > 
> > > > > > > So the former version is a bit faster, even if it does more 
> > > > > > > multiplications.
> > > > > > > 
> > > > > > > Removing the i < 29 check with a sentinel works fine for normal samples, 
> > > > > > > but the reference test vectors fail on this if using the former version, 
> > > > > > > since the numeric range of 32 bit numbers isn't enough.
> > > > > > > 
> > > > > > > I.e., low_quant[i] * scale_factor mustn't overflow, so the the sentinel 
> > > > > > > must be INT_MAX/max_scale_factor, which is (2^31-1)/4096, and limit << 10 
> > > > > > > can be larger than this.
> > > > > > 
> > > > > > multiple sentinels should then work
> > > > > 
> > > > > Good idea, I was able to get this to work, too, in this form:
> > > > > 
> > > > >     limit = limit + 1 << 10;
> > > > >     while (limit > low_quant[i] * state->scale_factor)
> > > > >         i++;
> > > > >     i = av_clip(i, 0, 29);
> > > > > 
> > > > > When the initial version took 919 dezicycles today, it went up to 948 when 
> > > > > I changed the low_quant table to be int instead of int16_t (which is 
> > > > > needed for the sentinels to fit), dropped to 914 when I removed the i < 29 
> > > > > check from the loop, and increased to 922 when I added the av_clip at the 
> > > > > end.
> > > > > 
> > > > > If I instead of the av_clip use an if (i > 29) i = 29; I got it down to 
> > > > > 918, one dezicycle less than where I started.
> > > > 
> > > > that doesnt feel right
> > > > can you show us some statistic on how often each of the 0..29 occurs on
> > > > your test file / normal audio?
> > > 
> > > On my test file, which is a movie trailer with both speech and music, the 
> > > distribution is like this:
> > > 
> > > 66152 73807 84030 83795 77497 84602 75193 79712 72637 72394 63477 62738 
> > > 54695 51064 42842 39389 30935 27155 20366 17044 11724 9528 5710 4342 2408 
> > > 1560 745 442 165 201
> > > 
> > > And graphically: http://albin.abo.fi/~mstorsjo/low_code.pdf
> > > 
> > > So it quite heavily biased towards the lower values.
> > > 
> > > So, which one do you prefer, the one with multiple sentinels and clipping 
> > > of the value after the loop, or simply with the condition inside the loop?
> > 
> > have you tried to put a
> > 
> > if(limit > low_quant[X] * state->scale_factor)
> >     i=X+1;
> > 
> > before the loop ?
> > value of X would be somewhat left of the middle
> > 
> > and i prefer the simpler code if speed matches
> > though i smell compiler messup
> 
> Yes, this actually improved the performance quite a bit, down from 917 
> dezicycles to 855. Since the performance of the version with multiple 
> sentinels and a boundary check at the end didn't differ much, I removed 
> that change.
> 
> I also checked this on another compiler version and got similar results.
> 
> Updated version attached, the current version of the code discussed looks 
> like this:
> 
>     limit = limit + 1 << 10;
>     if (limit > low_quant[8] * state->scale_factor)
>         i = 9;

and after that:
if (limit > low_quant[i+X] * state->scale_factor)
    i+= X+1;

might be faster (or maybe not ...)

you can also try a complete binary search with these


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The bravest are surely those who have the clearest vision
of what is before them, glory and danger alike, and yet
notwithstanding go out to meet it. -- Thucydides
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100917/8a2bdbb3/attachment.pgp>



More information about the ffmpeg-devel mailing list