[FFmpeg-devel] [PATCH] avutil/mathematics: correct documentation for av_gcd

Fri Oct 30 00:49:44 CET 2015

On Thu, Oct 29, 2015 at 4:51 PM, Michael Niedermayer
<michael at niedermayer.cc> wrote:
> On Wed, Oct 28, 2015 at 10:49:08PM -0400, Ganesh Ajjanagadde wrote:
>> On Wed, Oct 28, 2015 at 10:09 AM, Ganesh Ajjanagadde
>> <gajjanagadde at gmail.com> wrote:
>> > On Tue, Oct 27, 2015 at 8:18 PM, Ganesh Ajjanagadde
>> > <gajjanagadde at gmail.com> wrote:
>> >> av_gcd is now always defined regardless of input. This documents this
>> >> change in the "documented API". Two benefits (closely related):
>> >> 1. The function is robust, and there is no need to worry about INT64_MIN, etc.
>> >>
>> >> 2. Clients of av_gcd, like av_reduce, can now be made fully correct. Currently,
>> >> av_reduce can trigger undefined behavior if e.g num is INT64_MIN due to
>> >> integer overflow in the FFABS. Furthermore, this undefined behavior is
>> >> completely undocumented, and could be a fuzzer's paradise. The FFABS was needed in the past as
>> >> av_gcd was undefined for negative inputs. In order to make av_reduce
>> >> robust, it is essential to guarantee that av_gcd works for all int64_t.
>> >>
>> >> Signed-off-by: Ganesh Ajjanagadde <gajjanagadde at gmail.com>
>> >> ---
>> >>  libavutil/mathematics.h | 6 +++---
>> >>  1 file changed, 3 insertions(+), 3 deletions(-)
>> >>
>> >> diff --git a/libavutil/mathematics.h b/libavutil/mathematics.h
>> >> index ac94488..6fc2577 100644
>> >> --- a/libavutil/mathematics.h
>> >> +++ b/libavutil/mathematics.h
>> >> @@ -77,9 +77,9 @@ enum AVRounding {
>> >>  };
>> >>
>> >>  /**
>> >> - * Return the greatest common divisor of a and b.
>> >> - * If both a and b are 0 or either or both are <0 then behavior is
>> >> - * undefined.
>> >> + * Compute the greatest common divisor of a and b.
>> >> + *
>> >> + * @return gcd of a and b up to sign; if a and b are both zero returns 0
>> >>   */
>> >>  int64_t av_const av_gcd(int64_t a, int64_t b);
>> >>
>> >> --
>> >> 2.6.2
>> >>
>> >
>> > Patch dropped for now, undefined behavior is still possible with
>> > av_gcd: take a and b to be both INT64_MIN. Need to examine this a
>> > little more closely to make it robust without losing performance.
>>
>> Request to reconsider; patch making av_gcd more robust sent.
>
> I guess if the stricter API doesnt prevent any possibly optimizations
> then the change is a good idea

I can't think of any other clean, performant ways of solving our
issues with av_reduce etc that will depend on taking gcd of negative
values (we can't do an abs due to the fact that FFABS is unsafe on
INT64_MIN). Of course, one could have special cases for INT64_MIN etc
but that will reduce performance and even readability to some degree.
Hence, negative behavior of av_gcd must be defined, and moreover it
must not only return a value, but a mathematically correct gcd up to
sign. I leave the sign flexible in such cases to allow some wiggle
room for optimizations, and this partially addresses your point.

A low hanging optimization would come from changing everything to
uint64_t, as that would allow removal of the llabs (cost of two
branches). In fact, given the documentation currently, I wonder why
the signature was not uint64_t av_gcd(uint64_t a, uint64_t b). That
would have made work easier. However, I pesonally don't deem this
sufficient justification to create e.g an av_gcd2 with the above
signature.

As another remark, I doubt optimizing gcd further beyond simple
changes like that is very useful. It is not very critical, readability
is important, and attention may be devoted elsewhere. In fact,
although you did explain to me where optimizations could be useful at
an algorithmic level privately, I take this as an opportunity to
request for general comments on places that people think could benefit
from performance at a C/algorithmic level (NOT asm/simd stuff for me
personally): yet another ping for Clement.

That is also related to a potential GsOC/Outreachy idea (may have been
proposed already, definitely been remarked upon in the past): creation
of some kind of infrastructure to view performance vs time. For
instance, this could tie in with the current FATE, e.g one could have
a fate-perf target that does some good profiling, plots graphs, etc
that may be viewed at fate.ffmpeg.org. This could also be useful as a
public image boost, since everyone gets to see our optimizations at
work :). I don't know how to do this myself. At the risk of offending
some here, https://www.youtube.com/watch?v=kWnx6eOGVYo by Mans looked
pretty decent as a starting point on GNU/Linux.

However, you reminded me of another thing that must be documented: on
both a and b being nonnegative, we must return a nonnegative result.
Reasons:
1. A negative gcd in such cases is weird (though I won't go so far to
say it is incorrect mathematically).
2. More seriously, not doing so is an API breakage, since clients
before could have relied on this. FFmpeg itself definitely did rely on
this.

If above reasoning seems fine, I will push with an additional remark
on the nonnegativity for nonnegative arguments (satisfied by av_gcd in
its current form).

>
> [...]
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Why not whip the teacher when the pupil misbehaves? -- Diogenes of Sinope
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>