[FFmpeg-devel] [PATCHv4] avutil/common: add av_rint64_clip

Ganesh Ajjanagadde gajjanag at mit.edu
Sat Nov 14 03:51:51 CET 2015


On Fri, Nov 13, 2015 at 7:17 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
> Hi,
>
> On Fri, Nov 13, 2015 at 6:16 PM, Ganesh Ajjanagadde <gajjanag at mit.edu>
> wrote:
>
>> On Fri, Nov 13, 2015 at 4:52 PM, Ronald S. Bultje <rsbultje at gmail.com>
>> wrote:
>> > Hi,
>> >
>> > On Fri, Nov 13, 2015 at 4:28 PM, Ganesh Ajjanagadde <gajjanag at mit.edu>
>> > wrote:
>> >
>> >> On Fri, Nov 13, 2015 at 1:28 PM, Ronald S. Bultje <rsbultje at gmail.com>
>> >> wrote:
>> >> > Hi,
>> >> >
>> >> > On Fri, Nov 13, 2015 at 12:17 PM, Ganesh Ajjanagadde <
>> gajjanag at mit.edu>
>> >> > wrote:
>> >> >
>> >> >> On Fri, Nov 13, 2015 at 12:10 PM, Ronald S. Bultje <
>> rsbultje at gmail.com>
>> >> >> wrote:
>> >> >> > Hi Ganesh,
>> >> >> > On Nov 13, 2015 12:02 PM, "Ganesh Ajjanagadde" <
>> >> gajjanagadde at gmail.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> The rationale for this function is reflected in the documentation
>> for
>> >> >> >> it, and is copied here:
>> >> >> >>
>> >> >> >> Clip a double value into the long long amin-amax range.
>> >> >> >> This function is needed because conversion of floating point to
>> >> integers
>> >> >> > when
>> >> >> >> it does not fit in the integer's representation does not
>> necessarily
>> >> >> > saturate
>> >> >> >> correctly (usually converted to a cvttsd2si on x86) which
>> saturates
>> >> >> > numbers
>> >> >> >> > INT64_MAX to INT64_MIN. The standard marks such conversions as
>> >> >> undefined
>> >> >> >> behavior, allowing this sort of mathematically bogus conversions.
>> >> This
>> >> >> > provides
>> >> >> >> a safe alternative that is slower obviously but assures safety and
>> >> >> better
>> >> >> >> mathematical behavior.
>> >> >> >> API:
>> >> >> >> @param a value to clip
>> >> >> >> @param amin minimum value of the clip range
>> >> >> >> @param amax maximum value of the clip range
>> >> >> >> @return clipped value
>> >> >> >>
>> >> >> >> Note that a priori if one can guarantee from the calling side that
>> >> the
>> >> >> >> double is in range, it is safe to simply do an explicit/implicit
>> >> cast,
>> >> >> >> and that will be far faster. However, otherwise this function
>> should
>> >> be
>> >> >> >> used.
>> >> >> >>
>> >> >> >> avutil minor version is bumped.
>> >> >> >>
>> >> >> >> Reviewed-by: Ronald S. Bultje <rsbultje at gmail.com>
>> >> >> >> Signed-off-by: Ganesh Ajjanagadde <gajjanagadde at gmail.com>
>> >> >> >> ---
>> >> >> >>  libavutil/common.h  | 30 ++++++++++++++++++++++++++++++
>> >> >> >>  libavutil/version.h |  2 +-
>> >> >> >>  2 files changed, 31 insertions(+), 1 deletion(-)
>> >> >> >>
>> >> >> >> diff --git a/libavutil/common.h b/libavutil/common.h
>> >> >> >> index 6f0f582..f4687ab 100644
>> >> >> >> --- a/libavutil/common.h
>> >> >> >> +++ b/libavutil/common.h
>> >> >> >> @@ -298,6 +298,33 @@ static av_always_inline av_const double
>> >> >> > av_clipd_c(double a, double amin, double
>> >> >> >>      else               return a;
>> >> >> >>  }
>> >> >> >>
>> >> >> >> +/**
>> >> >> >> + * Clip and convert a double value into the long long amin-amax
>> >> range.
>> >> >> >> + * This function is needed because conversion of floating point
>> to
>> >> >> > integers when
>> >> >> >> + * it does not fit in the integer's representation does not
>> >> necessarily
>> >> >> > saturate
>> >> >> >> + * correctly (usually converted to a cvttsd2si on x86) which
>> >> saturates
>> >> >> > numbers
>> >> >> >> + * > INT64_MAX to INT64_MIN. The standard marks such conversions
>> as
>> >> >> > undefined
>> >> >> >> + * behavior, allowing this sort of mathematically bogus
>> conversions.
>> >> >> > This provides
>> >> >> >> + * a safe alternative that is slower obviously but assures safety
>> >> and
>> >> >> > better
>> >> >> >> + * mathematical behavior.
>> >> >> >> + * @param a value to clip
>> >> >> >> + * @param amin minimum value of the clip range
>> >> >> >> + * @param amax maximum value of the clip range
>> >> >> >> + * @return clipped value
>> >> >> >> + */
>> >> >> >> +static av_always_inline av_const int64_t av_rint64_clip_c(double
>> a,
>> >> >> > int64_t amin, int64_t amax)
>> >> >> >> +{
>> >> >> >> +#if defined(HAVE_AV_CONFIG_H) && defined(ASSERT_LEVEL) &&
>> >> ASSERT_LEVEL
>> >> >> >>= 2
>> >> >> >> +    if (amin > amax) abort();
>> >> >> >> +#endif
>> >> >> >> +    // INT64_MAX+1,INT64_MIN are exactly representable as IEEE
>> >> doubles
>> >> >> >> +    if (a >=  9223372036854775808.0 || llrint(a) >= amax)
>> >> >> >> +        return amax;
>> >> >> >> +    if (a <= -9223372036854775808.0 || llrint(a) <= amin)
>> >> >> >> +        return amin;
>> >> >> >
>> >> >> > Doesn't this allow negative overflows in the max check? I think you
>> >> need
>> >> >> > both overflow checks before the min/max checks. Try the next float
>> val
>> >> >> > smaller than int64_min as input with a small amax, eg 0. I bet it
>> >> >> returns 0
>> >> >> > instead of amin.
>> >> >>
>> >> >> They are needed. As you and others can clearly see, I am quite bad
>> >> >> with this stuff.
>> >> >
>> >> >
>> >> > Hm, so, getting back to my computer, I wanted to test this, and I have
>> >> this
>> >> > problem: llrint() works correctly for me for the "undefined" cases,
>> i.e.,
>> >> > it already does what you're trying to fix in av_rint64_clip_c.
>> >> >
>> >> > llrint(-10223372056756029440.000000) returns -9223372036854775808
>> >> > llrint(10223372056756029440.000000) returns 9223372036854775807
>> >> >
>> >> > So, how do I reproduce that llrint() overflows?
>> >>
>> >> The link I gave originally
>> >>
>> http://blog.frama-c.com/index.php?post/2013/10/09/Overflow-float-integer
>> >> gives an illustration. Maybe the weird behavior happens only on
>> >> 9223372036854775808.0. This happens because INT64_MAX+1 is not
>> >> representable in long long, and hence signed overflow occurs yielding
>> >> INT64_MIN (of course undefined). Here is a minimal test program:
>> >> #include <stdio.h>
>> >> #include <math.h>
>> >>
>> >> int main(void) {
>> >>     printf("%lld\n", llrint(9223372036854775808.0));
>> >>     return 0;
>> >> }
>> >>
>> >
>> > 9223372036854775807
>> >
>> > Seems apple's libc got one thing right :)
>>
>> I personally am not that charitable, looking more carefully at your
>> asm shows a cmplesd, suggesting slowdown. Here is a source reference:
>> https://opensource.apple.com/source/Libm/Libm-2026/Source/ARM/llrint.c.
>> As usual, Apple dumps many implementations of llrint and it is unclear
>> which is actually being used on OS X at the moment (see e.g other
>> https://opensource.apple.com/source/Libm/Libm-92/i386.subproj/llrint.c),
>> but I digress.
>>
>> They essentially all put special case code like the patch above. Thus
>> their function is inherently slower than the conformant GNU libm
>> implementation. A client may very well want a branch free llrint for
>> speed. Apple offers no performance choice here, forcing a fast llrint
>> to use cvt2dsi inline or equivalent. Don't know if FFmpeg is affected
>> by this slowdown.
>
>
> I think FFmpeg should consider using Apple's version as a x86
> implementation for av_rint64_clip :)

I don't agree with this: it is a far less readable implementation with
many more lines of code, and worse yet only handles the llrint aspect
and not the clipping. Regardless, belongs to a separate patch/thread.
Pushed. Thanks all for reviews.

>
> Ronald
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


More information about the ffmpeg-devel mailing list