[FFmpeg-devel] [PATCH] Add a G.722 encoder
Martin Storsjö
martin
Thu Sep 16 09:50:18 CEST 2010
On Wed, 15 Sep 2010, Michael Niedermayer wrote:
> On Wed, Sep 15, 2010 at 05:42:39PM +0300, Martin Storsjö wrote:
> >
> > I managed to do this, so I've got the following two versions:
> >
> > limit = limit + 1 << 10;
> > while (limit > low_quant[i] * state->scale_factor)
> >
> > 885 dezicycles in encode_low, 1047991 runs, 585 skips
> >
> > and
> >
> > limit = limit + 1 << 10;
> > limit = (limit + state->scale_factor - 1) / state->scale_factor;
> > while (limit > low_quant[i])
> >
> > 988 dezicycles in encode_low, 1047886 runs, 690 skips
> >
> > So the former version is a bit faster, even if it does more
> > multiplications.
> >
> > Removing the i < 29 check with a sentinel works fine for normal samples,
> > but the reference test vectors fail on this if using the former version,
> > since the numeric range of 32 bit numbers isn't enough.
> >
> > I.e., low_quant[i] * scale_factor mustn't overflow, so the sentinel
> > must be INT_MAX/max_scale_factor, which is (2^31-1)/4096, and limit << 10
> > can be larger than this.
>
> multiple sentinels should then work
Good idea, I was able to get this to work, too, in this form:
    limit = limit + 1 << 10;
    while (limit > low_quant[i] * state->scale_factor)
        i++;
    i = av_clip(i, 0, 29);
The initial version took 919 dezicycles today. That went up to 948 when I
changed the low_quant table to int instead of int16_t (which is needed for
the sentinels to fit), dropped to 914 when I removed the i < 29 check from
the loop, and rose to 922 when I added the av_clip at the end.
If I use an if (i > 29) i = 29; instead of the av_clip, it goes down to
918, one dezicycle less than where I started.
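As a rough illustration of the loop shape discussed above, here is a minimal sketch of a sentinel-terminated threshold scan followed by a single clamp. Everything named demo_* is hypothetical: the thresholds are made up (they are not the real low_quant values), the table has only four levels, and the maximum scale factor of 4096 is taken from the quoted discussion. The sketch uses one sentinel, INT_MAX / 4096 (524287 with a 32-bit int), and simply asserts that limit stays within what that sentinel can stop. How the multiple sentinels in the patch are laid out to cover the full range is not shown in this message, so that part is deliberately left out.

```c
#include <assert.h>
#include <limits.h>

enum { DEMO_LEVELS = 4, DEMO_MAX_SCALE = 4096 };

/* Largest threshold whose product with DEMO_MAX_SCALE still fits in an int. */
#define DEMO_SENTINEL (INT_MAX / DEMO_MAX_SCALE)

/* Hypothetical thresholds (NOT the real G.722 table), followed by one
 * sentinel. The element type is int because the sentinel does not fit
 * in int16_t. */
static const int demo_quant[DEMO_LEVELS + 1] = {
    10, 20, 40, 80, DEMO_SENTINEL
};

static int demo_quant_index(int limit, int scale_factor)
{
    int i = 0;

    assert(scale_factor >= 1 && scale_factor <= DEMO_MAX_SCALE);
    /* Assumption of this sketch: the single sentinel only ends the scan
     * if limit cannot exceed sentinel * scale_factor. */
    assert(limit <= DEMO_SENTINEL * scale_factor);

    /* No "i < DEMO_LEVELS" test inside the loop; the sentinel ends it. */
    while (limit > demo_quant[i] * scale_factor)
        i++;

    /* Clamp once after the loop, in the style of if (i > 29) i = 29; */
    if (i > DEMO_LEVELS - 1)
        i = DEMO_LEVELS - 1;
    return i;
}
```

For example, demo_quant_index(11, 1) scans past the first threshold and returns 1, while demo_quant_index(1000, 1) runs onto the sentinel and is clamped back to the last real index, 3.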
> > > and maybe that can also be used in encode_high() to get rid of the branch
> >
> > I don't see how the branches in encode_high could be avoided thanks to
> > this - the only branches there map index into 0, 1, 2 or 3 depending on if
> > diff >= 0 and diff < pred.
>
> something like:
> ((diff ^ (diff>>31)) < pred) + 2*(diff>=0)
> might work, no idea about speed
Ah, yes, that does work, thanks! This made the encode_high function a bit
faster, from 1396 to 1243 dezicycles. (This value is quite a bit larger
than encode_low since it also updates the predictor.)
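To make the expression above easier to check, here is a small sketch that puts it next to a branching form. The branching function is reconstructed from what the expression computes, not copied from the patch, and the function names are hypothetical. The sketch assumes that right-shifting a negative int is an arithmetic shift (implementation-defined in C, but true on common compilers), and it writes the shift count as sizeof(diff) * CHAR_BIT - 1 instead of a literal 31.

```c
#include <assert.h>
#include <limits.h>

/* Branching reference: index 2 or 3 for diff >= 0, index 0 or 1 otherwise. */
static int high_index_branchy(int diff, int pred)
{
    if (diff >= 0)
        return diff < pred ? 3 : 2;
    return diff >= -pred ? 1 : 0;
}

/* Branch-free form of the same mapping. */
static int high_index_branchless(int diff, int pred)
{
    int mag;

    /* Assumption: >> on a negative int replicates the sign bit. */
    assert((-1 >> 1) == -1);

    /* diff >> (bits - 1) is 0 for diff >= 0 and -1 for diff < 0, so the
     * XOR leaves diff unchanged or turns it into ~diff == -diff - 1. */
    mag = diff ^ (diff >> (sizeof(diff) * CHAR_BIT - 1));

    /* For diff < 0: -diff - 1 < pred is the same test as diff >= -pred. */
    return (mag < pred) + 2 * (diff >= 0);
}

/* Returns 1 if both forms agree for every diff in the int16_t range. */
static int forms_agree(int pred)
{
    int diff;

    for (diff = -32768; diff <= 32767; diff++)
        if (high_index_branchy(diff, pred) !=
            high_index_branchless(diff, pred))
            return 0;
    return 1;
}
```

With pred = 10, a diff of -10 maps to index 1 and a diff of -11 maps to index 0 in both forms, which is the boundary where the -diff - 1 trick matters.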
// Martin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-a-G.722-encoder.patch
Type: text/x-diff
Size: 6383 bytes
Desc:
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100916/567b27e5/attachment.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Add-initial-trellis-support-in-the-G.722-encoder.patch
Type: text/x-diff
Size: 8257 bytes
Desc:
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100916/567b27e5/attachment-0001.patch>