[FFmpeg-devel] [PATCH] avutil/eval: Use even better PRNG
Michael Niedermayer
michael at niedermayer.cc
Thu Jan 11 04:39:27 EET 2024
On Wed, Jan 10, 2024 at 11:48:33PM +0100, Stefano Sabatini wrote:
> On date Tuesday 2024-01-09 02:55:21 +0100, Michael Niedermayer wrote:
[...]
> >
> > static const AVClass eval_class = {
> > @@ -174,7 +175,7 @@ struct AVExpr {
> > } a;
> > struct AVExpr *param[3];
> > double *var;
> > - uint64_t *var_uint64;
> > + SFC64 *prng_state;
> > };
> >
> > static double etime(double v)
> > @@ -233,10 +234,15 @@ static double eval_expr(Parser *p, AVExpr *e)
> >
> > #define COMPUTE_NEXT_RANDOM() \
> > int idx = av_clip(eval_expr(p, e->param[0]), 0, VARS-1); \
> > - uint64_t r = p->var_uint64[idx] ? p->var_uint64[idx] : (isnan(p->var[idx]) ? 0 : p->var[idx]);\
> > - r = r * 1664525 + 1013904223; \
> > + SFC64 *s = p->prng_state + idx; \
> > + uint64_t r; \
> > + \
> > + if (!s->counter) { \
> > + r = isnan(p->var[idx]) ? 0 : p->var[idx]; \
>
> > + sfc64_init(s, r, r, r, 12); \
>
> for the record, why 12?
The reference has 3 init functions:
* one that uses one seed for all 3 parameters; it uses 12 rounds
* one that uses 3 separate seeds; it uses 18 rounds
* one that has "fast" in its name and does 8 rounds with one seed in all 3 parameters
I will document this better.
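
A minimal sketch of how those three seeding variants could be layered on the
sfc64_init() from the patch. The wrapper names (sfc64_seed_single,
sfc64_seed_triple, sfc64_seed_fast) are hypothetical and appear in neither the
patch nor the reference, and the SFC64 struct layout is assumed from the
fields the quoted code uses (a, b, c, counter), since its definition is elided
in this thread.

```c
#include <stdint.h>

/* Assumed layout: the struct definition is elided in the quoted patch. */
typedef struct SFC64 {
    uint64_t a, b, c, counter;
} SFC64;

/* Copied from the quoted patch. */
static inline uint64_t sfc64_get(SFC64 *s) {
    uint64_t tmp = s->a + s->b + s->counter++;
    s->a = s->b ^ (s->b >> 11);
    s->b = s->c + (s->c << 3); /* multiply by 9 */
    s->c = ((s->c << 24) | (s->c >> 40)) + tmp;
    return tmp;
}

/* Copied from the quoted patch. */
static inline void sfc64_init(SFC64 *s, uint64_t seeda, uint64_t seedb,
                              uint64_t seedc, int rounds) {
    s->a = seeda;
    s->b = seedb;
    s->c = seedc;
    s->counter = 1;
    while (rounds--)
        sfc64_get(s);
}

/* Hypothetical wrapper: one seed in all 3 parameters, 12 rounds.
 * This is the variant the patch uses in COMPUTE_NEXT_RANDOM(). */
static inline void sfc64_seed_single(SFC64 *s, uint64_t seed) {
    sfc64_init(s, seed, seed, seed, 12);
}

/* Hypothetical wrapper: 3 separate seeds, 18 rounds. */
static inline void sfc64_seed_triple(SFC64 *s, uint64_t s1, uint64_t s2,
                                     uint64_t s3) {
    sfc64_init(s, s1, s2, s3, 18);
}

/* Hypothetical wrapper: one seed in all 3 parameters, 8 rounds ("fast"). */
static inline void sfc64_seed_fast(SFC64 *s, uint64_t seed) {
    sfc64_init(s, seed, seed, seed, 8);
}
```

Because sfc64_init() sets counter to 1 before the warm-up rounds, a seeded
state never has counter == 0, which is what lets the patched
COMPUTE_NEXT_RANDOM() use !s->counter as its "not yet seeded" check on the
zero-initialized array.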
[...]
> > return e->value * (p->var[index]= d2);
> > }
> > case e_hypot:return e->value * hypot(d, d2);
> > @@ -356,7 +362,7 @@ void av_expr_free(AVExpr *e)
> > av_expr_free(e->param[1]);
> > av_expr_free(e->param[2]);
> > av_freep(&e->var);
> > - av_freep(&e->var_uint64);
> > + av_freep(&e->prng_state);
> > av_freep(&e);
> > }
> >
> > @@ -744,8 +750,8 @@ int av_expr_parse(AVExpr **expr, const char *s,
> > goto end;
> > }
> > e->var= av_mallocz(sizeof(double) *VARS);
> > - e->var_uint64= av_mallocz(sizeof(uint64_t) *VARS);
> > - if (!e->var || !e->var_uint64) {
> > + e->prng_state = av_mallocz(sizeof(*e->prng_state) *VARS);
> > + if (!e->var || !e->prng_state) {
> > ret = AVERROR(ENOMEM);
> > goto end;
> > }
> > @@ -787,7 +793,7 @@ double av_expr_eval(AVExpr *e, const double *const_values, void *opaque)
> > {
> > Parser p = { 0 };
> > p.var= e->var;
> > - p.var_uint64= e->var_uint64;
> > + p.prng_state= e->prng_state;
> >
> > p.const_values = const_values;
> > p.opaque = opaque;
> > diff --git a/libavutil/sfc64.h b/libavutil/sfc64.h
> > new file mode 100644
> > index 00000000000..25bc43abef1
> > --- /dev/null
> > +++ b/libavutil/sfc64.h
> > @@ -0,0 +1,59 @@
> > +/*
> > + * Copyright (c) 2024 Michael Niedermayer <michael-ffmpeg at niedermayer.cc>
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> > + *
>
> > + * This is a implementation of SFC64 a 64-bit PRNG by Chris Doty-Humphrey.
>
> nit: This is a implementation of SFC64, a 64-bit PRNG by Chris Doty-Humphrey.
>
> > + *
> > + * This Generator is much faster (0m1.872s) than 64bit KISS (0m3.823s) and PCG-XSH-RR-64/32 (0m2.700s)
>
> what are these benchmarks against?
A loop that computes a lot of random numbers and at the end prints their sum.
Btw, the behavior was quite different if the numbers are not summed and printed,
as the compiler can then optimize some things out, but no one would run a PRNG
and not use the values.
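
A rough sketch of the kind of loop described above. It is not the actual
benchmark program, which is not shown in this thread. The function name
sum_randoms and the SFC64 struct layout are assumptions; sfc64_get() and
sfc64_init() are copied from the quoted patch so the block is self-contained.

```c
#include <stdint.h>

/* Assumed layout: the struct definition is elided in the quoted patch. */
typedef struct SFC64 {
    uint64_t a, b, c, counter;
} SFC64;

/* Copied from the quoted patch. */
static inline uint64_t sfc64_get(SFC64 *s) {
    uint64_t tmp = s->a + s->b + s->counter++;
    s->a = s->b ^ (s->b >> 11);
    s->b = s->c + (s->c << 3); /* multiply by 9 */
    s->c = ((s->c << 24) | (s->c >> 40)) + tmp;
    return tmp;
}

/* Copied from the quoted patch. */
static inline void sfc64_init(SFC64 *s, uint64_t seeda, uint64_t seedb,
                              uint64_t seedc, int rounds) {
    s->a = seeda;
    s->b = seedb;
    s->c = seedc;
    s->counter = 1;
    while (rounds--)
        sfc64_get(s);
}

/* Hypothetical helper: draw n values and return their sum (wrapping
 * modulo 2^64). Returning the sum gives every output an observable use. */
static inline uint64_t sum_randoms(SFC64 *s, uint64_t n) {
    uint64_t sum = 0;
    for (uint64_t i = 0; i < n; i++)
        sum += sfc64_get(s);
    return sum;
}
```

Calling sum_randoms() with a large n from main(), printing the result, and
timing the whole program (for example with the shell's time builtin) keeps
every generated value in use, so the compiler has less room to optimize the
generator calls away.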
[...]
> > +static inline uint64_t sfc64_get(SFC64 *s) {
> > + uint64_t tmp = s->a + s->b + s->counter++;
> > + s->a = s->b ^ (s->b >> 11);
> > + s->b = s->c + (s->c << 3); // This is a multiply by 9
> > + s->c = ((s->c << 24) | (s->c >> 40)) + tmp;
> > + return tmp;
> > +}
> > +
> > +static inline void sfc64_init(SFC64 *s, uint64_t seeda, uint64_t seedb, uint64_t seedc, int rounds) {
> > + s->a = seeda;
> > + s->b = seedb;
> > + s->c = seedc;
> > + s->counter = 1;
> > + while (rounds--)
> > + sfc64_get(s);
> > +}
> > +
> > +#endif // AVUTIL_SFC64_H
>
> nit: probably it still makes sense to use ff/FF prefixes even if the
> header is not public (and if this is useful, probably it could be made
> public as a faster/smaller alternative to lfg).
ok
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Take away the freedom of one citizen and you will be jailed, take away
the freedom of all citizens and you will be congratulated by your peers
in Parliament.