[FFmpeg-devel] [PATCHv2 2/2] lavu/lfg: switch to Ziggurat algorithm for normal random number generation (WIP)

Michael Niedermayer michael at niedermayer.cc
Sat Mar 12 17:42:00 CET 2016


On Sat, Mar 12, 2016 at 11:11:32AM -0500, Ganesh Ajjanagadde wrote:
> On Sat, Mar 12, 2016 at 11:02 AM, Michael Niedermayer
> <michael at niedermayer.cc> wrote:
> > On Sat, Mar 12, 2016 at 08:58:41AM -0500, Ganesh Ajjanagadde wrote:
> >> Code taken from the Julia project, licensed under MIT:
> >> https://github.com/JuliaLang/julia/blob/master/base/random.jl, in turn
> >> derived from: "The Ziggurat Method for generating random variables" - Marsaglia and Tsang.
> >>
> >> Paper and reference code: http://www.jstatsoft.org/v05/i08/. This is
> >> well known to be the fastest method empirically for generating normal random
> >> variables for a fixed PRNG source.
> >>
> >> Note that there are a large number of implementations with
> >> various tunings, this was one of the simpler ones and also has a friendly
> >> license.
> >>
> >> This results in ~ 3x speedup of random number generation:
> >> old:
> >>   15090 decicycles in av_bmg_get,       1 runs,      0 skips
> >>   13140 decicycles in av_bmg_get,       2 runs,      0 skips
> >>   10117 decicycles in av_bmg_get,       4 runs,      0 skips
> >>   [...]
> >>    2133 decicycles in av_bmg_get,  524268 runs,     20 skips=60.4x
> >>    2134 decicycles in av_bmg_get, 1048531 runs,     45 skips=61.3x
> >>    2135 decicycles in av_bmg_get, 2097061 runs,     91 skips=61.9x
> >>
> >> new:
> >>    7650 decicycles in av_gaussian_get,       1 runs,      0 skips
> >>    5490 decicycles in av_gaussian_get,       2 runs,      0 skips
> >>    3982 decicycles in av_gaussian_get,       4 runs,      0 skips
> >>    [...]
> >>     812 decicycles in av_gaussian_get,  524281 runs,      7 skips
> >>     813 decicycles in av_gaussian_get, 1048563 runs,     13 skips
> >>     813 decicycles in av_gaussian_get, 2097131 runs,     21 skips
> >>
> >> and accordingly a ~2% speedup in aac encoding (-march=native, Haswell, clang):
> >>
> >> old:
> >> ffmpeg -f lavfi -i anoisesrc -t 300 -y sin_new.aac  5.30s user 0.02s system 99% cpu 5.322 total
> >> new:
> >> ffmpeg -f lavfi -i anoisesrc -t 300 -y sin_new.aac  5.16s user 0.03s system 99% cpu 5.198 total
> >>
> >> Function added as av_gaussian_get with documentation, minor bumped.
> >>
> > [...]
> >
> >> +static inline double ziggurat(AVLFG *lfg)
> >> +{
> >> +    while (1) {
> >
> >> +        uint64_t r = (((uint64_t)av_lfg_get(lfg) << 32) + av_lfg_get(lfg)) & 0x000fffffffffffff;
> >
> > the order of Function evaluations is undefined here i think
> 
> And why would that matter? Aren't these just approximations to truly

reproduceability
a user has case where something fails the developer cannot reproduce
as his compiler evaluates them in a different order

also for regression tests it would be better if the values roughly are
the same between platforms

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

When the tyrant has disposed of foreign enemies by conquest or treaty, and
there is nothing more to fear from them, then he is always stirring up
some war or other, in order that the people may require a leader. -- Plato
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: Digital signature
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20160312/328daa4e/attachment.sig>


More information about the ffmpeg-devel mailing list