[FFmpeg-devel] [PATCH] change av_ts_make_time_string precision

Wed Mar 13 08:03:43 EET 2024

On Tuesday, March 12, 2024 at 02:24:47 PM PDT, Marton Balint <cus at passwd.hu> wrote: 
> On Tue, 12 Mar 2024, Allan Cady via ffmpeg-devel wrote:
>> On Monday, March 11, 2024 at 12:11:45 PM PDT, Marton Balint <cus at passwd.hu> wrote:
>>> On Mon, 11 Mar 2024, Andreas Rheinhardt wrote:
>>> Allan Cady via ffmpeg-devel:
>>>> From: "Allan Cady" <allancady at yahoo.com>
>>>>
>>>> I propose changing the format to "%.6f", which will
>>>> give microsecond precision for all timestamps, regardless of
>>>> offset. Trailing zeros can be trimmed from the fraction, without
>>>> losing precision. If the length of the fixed-precision formatted
>>>> timestamp exceeds the length of the allocated buffer
>>>> (AV_TS_MAX_STRING_SIZE, currently 32, less one for the
>>>> terminating null), then we can fall back to scientific notation, though
>>>> this seems almost certain to never occur, because 32 characters would
>>>> allow a maximum timestamp value of (32 - 1 - 6 - 1) = 24 characters.
>>>> By my calculation, 10^24 seconds is about six orders of magnitude
>>>> greater than the age of the universe.
>>>>
>>>> The fix proposed here follows the following logic:
>>>>
>>>> 1) Try formatting the number of seconds using "%.6f". This will
>>>> always result in a string with six decimal digits in the fraction,
>>>> possibly including trailing zeros. (e.g. "897234.73200").
>>>>
>>>> 2) Check if that string would overflow the buffer. If it would, then
>>>> format it using scientific notation ("%.8g").
>>>>
>>>> 3) If the original fixed-point format fits, then trim any trailing
>>>> zeros and decimal point, and return that result.
>>>>
>>>> Making this change broke two fate tests, filter-metadata-scdet,
>>>> and filter-metadata-silencedetect. To correct this, I've modified
>>>> tests/ref/fate/filter-metadata-scdet and
>>>> tests/ref/fate/filter-metadata-silencedetect to match the
>>>> new output.
>>>>
>>>> Signed-off-by: Allan Cady <allancady at yahoo.com>
>>>> ---
>>>>   libavutil/timestamp.h                        | 53 +++++++++++++++++++-
>>>>   tests/ref/fate/filter-metadata-scdet        | 12 ++---
>>>>   tests/ref/fate/filter-metadata-silencedetect |  2 +-
>>>>   3 files changed, 58 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/libavutil/timestamp.h b/libavutil/timestamp.h
>>>> index 2b37781eba..2f04f9bb2b 100644
>>>> --- a/libavutil/timestamp.h
>>>> +++ b/libavutil/timestamp.h
>>>> @@ -25,6 +25,7 @@
>>>>   #define AVUTIL_TIMESTAMP_H
>>>>
>>>>   #include "avutil.h"
>>>> +#include <stdbool.h>
>>>>
>>>>   #if defined(__cplusplus) && !defined(__STDC_FORMAT_MACROS) && !defined(PRId64)
>>>>   #error missing -D__STDC_FORMAT_MACROS / #define __STDC_FORMAT_MACROS
>>>> @@ -53,6 +54,32 @@ static inline char *av_ts_make_string(char *buf, int64_t ts)
>>>>   */
>>>>   #define av_ts2str(ts) av_ts_make_string((char[AV_TS_MAX_STRING_SIZE]){0}, ts)
>>>>
>>>> +/**
>>>> + * Strip trailing zeros and decimal point from a string. Performed
>>>> + * in-place on input buffer. For local use only by av_ts_make_time_string.
>>>> + *
>>>> + * e.g.:
>>>> + * "752.378000" -> "752.378"
>>>> + *        "4.0" -> "4"
>>>> + *      "97300" -> "97300"
>>>> + */
>>>> +static inline void av_ts_strip_trailing_zeros_and_decimal_point(char *str) {
>>>> +    if (strchr(str, '.'))
>>>> +    {
>>>> +        int i;
>>>> +        for (i = strlen(str) - 1; i >= 0 && str[i] == '0'; i--) {
>>>> +            str[i] = '\0';
>>>> +        }
>>>> +
>>>> +        // Remove decimal point if it's the last character
>>>> +        if (i >= 0 && str[i] == '.') {
>>>> +            str[i] = '\0';
>>>> +        }
>>>> +
>>>> +        // String was modified in place; no need for return value.
>>>> +    }
>>>> +}
>>>> +
>>>>   /**
>>>>   * Fill the provided buffer with a string containing a timestamp time
>>>>   * representation.
>>>> @@ -65,8 +92,30 @@ static inline char *av_ts_make_string(char *buf, int64_t ts)
>>>>   static inline char *av_ts_make_time_string(char *buf, int64_t ts,
>>>>                                             const AVRational *tb)
>>>>   {
>>>> -    if (ts == AV_NOPTS_VALUE) snprintf(buf, AV_TS_MAX_STRING_SIZE, "NOPTS");
>>>> -    else                      snprintf(buf, AV_TS_MAX_STRING_SIZE, "%.6g", av_q2d(*tb) * ts);
>>>> +    if (ts == AV_NOPTS_VALUE)
>>>> +    {
>>>> +        snprintf(buf, AV_TS_MAX_STRING_SIZE, "NOPTS");
>>>> +    }
>>>> +    else
>>>> +    {
>>>> +        const int max_fraction_digits = 6;
>>>> +
>>>> +        // Convert 64-bit timestamp to double, using rational timebase
>>>> +        double seconds = av_q2d(*tb) * ts;
>>>> +
>>>> +        int length = snprintf(NULL, 0, "%.*f", max_fraction_digits, seconds);
>>>> +        if (length <= AV_TS_MAX_STRING_SIZE - 1)
>>>> +        {
>>>> +            snprintf(buf, AV_TS_MAX_STRING_SIZE, "%.*f", max_fraction_digits, seconds);
>>>> +            av_ts_strip_trailing_zeros_and_decimal_point(buf);
>>>> +        }
>>>> +        else
>>>> +        {
>>>> +            snprintf(buf, AV_TS_MAX_STRING_SIZE, "%.8g", seconds);
>>>> +        }
>>>> +
>>>> +    }
>>>> +
>>>>       return buf;
>>>>   }
>>>>
>>>
>>> 1. What makes you believe that all users want the new format and that it
>>> does not cause undesired behaviour for some (maybe a lot) of them? The
>>> number of characters written by the earlier code stayed pretty constant
>>> even when the times became big (in this case, it just printed 8 chars if
>>> ts>=0), yet your code will really make use of the whole buffer.
>>> Granted, we could tell our users that they have no right to complain
>>> about this, given that we always had a "right" to use the full buffer,
>>> but I consider this a violation of the principle of least surprise. Why
>>> don't you just change silencedetect or add another function?
>>>
>>> I suggested to change av_ts_make_time_string() because this
>>> problem affect all detection filters, not only silencedetect. So fixing
>>> it in silencedetect only is wrong.
>>>
>>> The API did not make any promises about the format, and as long as it
>>> provides less chars than AV_TS_MAX_STRING_SIZE, and the string is
>>> parseable string representation of a floating point number, it is not a
>>> break.
>>>
>>> Sure, it changes behaviour, but that is not unheard of if there is a good
>>> reason, and in this case, I believe there is. And I think it is more
>>> likely that some code is broken right now because the function
>>> loses precision or returns scientific notation for relatively small
>>> timestamps. Actually, it _was_ a suprise for me, so IMHO the element of
>>> least suprise is that precision will not fade away for reasonably small
>>> timestamps.
>>>
>>> 2. For very small timestamps (< 10^-4), the new code will print a lot of
>>> useless leading zeros (after the decimal point). In fact, it can be so
>>> many that the new code has less precision than the old code, despite
>>> using the fill buffer.
>>> 2. This is way too much code for an inline function.
>>> 3. Anyway, your placement of {} on their own lines does not match the
>>> project coding style.
>>>
>>> I am OK with any implementation which keeps at least millisecond
>>> precision for timestamps < 10^10. You are right, it should be as simple as
>>> possible.
>>>
>>
>> Milliseconds would be fine with me.
>>
>> Marton, do you have any other comments on my implementation? I have
>> from Andreas that I should remove the inlines, and move the curly
>> braces to match coding style. I also plan on removing the
>> #include <stdbool.h>, which I added at some point but is no longer
>> needed. And I would be happy to change from %.6f to %.3f.

> TBH I'd rather keep the precision as is. If you want to convert the
> function to non-inlined, then you will have to move the implementation
> from the header to an existing or new C file and unconditionally compile
> it to avutil... Maybe we should give it another go keeping it inlineable
> by simplifying it a little.

It's been years (decades) since I had to think about inlines -- and I 
don't remember there being any fixed criteria for them. Is it a
language rule that subroutines in header files have to be inlined? Or
just a policy? I really don't know the language. 

Anyway, I'm all for simple. I was fairly uncomfortable with that much 
complexity, especially in a header file. I don't really like having
that extra sub to strip trailing zeros, but I couldn't come up with a
better way to do it.

>>
>> If that sounds good, I'll submit another patch sometime tomorrow.
>>
>> The only other thing I had thought of is to wonder if I should add some
>> additional testing for the new formatting. I did a fair amount of testing
>> on my own, but it would probably be good to have at least some of that
>> in FATE. I had thought about maybe generating an audio file with just
>> a tone and several silence intervals.
> 

> One thing to notice is that you will not need to use the scientific
> representation at all, because maximum value this function prints is the
> product of an INT32 and an INT64, and that is 96 bit, which is at most 29
> chars. Adding the optional sign and the decimal point, that is still only
> 31. So we can be sure that by using %.6f, the first character of
> the decimal point is going to be present in the output.

I had done some similar calculation and came to a similar conclusion. 

Which is great,
> because that means we only have to
> - do a single snprintf("%.6f")
> - calculate last char position by subtracting 1 from the minimum of
> AV_TS_MAX_STRING_SIZE-1 and the result of the snprintf() call.
> - decrement string length while last char is '0' to remove trailing 0s
> - decrement string length while last char is non-digit to remove decimal
> point (which can be a multiple chars for some locales).
> - update last+1 char to \0.
> Ot is it still too complex to keep it inline?

I'll give your suggestion a spin tomorrow. Thanks.