[FFmpeg-devel] [PATCH] Faster ff_sqrt()
Michael Niedermayer
michaelni
Sun Jan 13 20:13:00 CET 2008
On Sun, Jan 13, 2008 at 05:50:52PM +0100, Vitor Sessak wrote:
> Hi,
>
> I saw in the thread "Copy-on-write pages statistics" that roqaudioenc.c
> is a big copy-on-write memory usage offender. I think that
> increasing the memory usage of FFmpeg by 16kb just for a table used in a
> game format encoder is not a good idea. So I'm sending a new, faster
> ff_sqrt(), so I can remove the table completely without too much speed
> loss.
>
> Unfortunately, even with the new ff_sqrt, the encoder is not as fast as
> before :
> (for a 3 min wav file input)
>
> Before:
> 0m0.340s
>
> After:
> 0m0.556s
>
> But it's the audio encoder of a VQ codec and it takes forever to encode the
> video. So I don't think a few ms more to encode the audio can make any
> difference...
>
> The code for ff_sqrt() was taken from public domain code at
> http://atoms.alife.co.uk/sqrt/ (Warning: original in java).
[...]
> +/**
> + * Fast square-root. Completely accurate for x < 2147483648 (i.e. 2^31)...
> + * Code from http://atoms.alife.co.uk/sqrt/
> + */
> +static inline int ff_sqrt(int x) {
> + int xn;
>
...
> + if (x >= 0x10000) {
> + if (x >= 0x1000000) {
> + if (x >= 0x10000000) {
> + if (x >= 0x40000000) {
> + xn = ff_sqrt_tab[x >> 24] << 8;
> + } else {
> + xn = ff_sqrt_tab[x >> 22] << 7;
> + }
> + } else {
> + if (x >= 0x4000000) {
> + xn = ff_sqrt_tab[x >> 20] << 6;
> + } else {
> + xn = ff_sqrt_tab[x >> 18] << 5;
> + }
> + }
>
...
> + xn = (xn + 1 + (x / xn)) >> 1;
> + xn = (xn + 1 + (x / xn)) >> 1;
> + return ((xn * xn) > x) ? --xn : xn;
> + } else {
> + if (x >= 0x100000) {
> + if (x >= 0x400000) {
> + xn = ff_sqrt_tab[x >> 16] << 4;
> + } else {
> + xn = ff_sqrt_tab[x >> 14] << 3;
> + }
> + } else {
> + if (x >= 0x40000) {
> + xn = ff_sqrt_tab[x >> 12] << 2;
> + } else {
> + xn = ff_sqrt_tab[x >> 10] << 1;
> + }
> + }
> +
> + xn = (xn + 1 + (x / xn)) >> 1;
> +
> + return ((xn * xn) > x) ? --xn : xn;
> }
> + } else {
> + if (x >= 0x100) {
> + if (x >= 0x1000) {
> + if (x >= 0x4000) {
> + xn = (ff_sqrt_tab[x >> 8]) + 1;
> + } else {
> + xn = (ff_sqrt_tab[x >> 6] >> 1) + 1;
> + }
> + } else {
> + if (x >= 0x400) {
> + xn = (ff_sqrt_tab[x >> 4] >> 2) + 1;
> + } else {
> + xn = (ff_sqrt_tab[x >> 2] >> 3) + 1;
> + }
> + }
> +
> + return ((xn * xn) > x) ? --xn : xn;
> + } else {
> + if (x >= 0) {
> + return ff_sqrt_tab[x] >> 4;
> + }
> + }
> }
> - return ret;
> +
> + return -1;
for a less idiotic and simpler variant i wrote a few years ago see
http://guru.multimedia.cx/fast-integer-square-root/
if anyone can simplify this further that would be welcome :)
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
The worst form of inequality is to try to make unequal things equal.
-- Aristotle
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080113/356e4af8/attachment.pgp>
More information about the ffmpeg-devel
mailing list