[FFmpeg-devel] [PATCH] avformat/hlsenc: Fix path handling on Windows

Soft Works softworkz at hotmail.com
Sun Jan 16 00:29:03 EET 2022



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of Andreas
> Rheinhardt
> Sent: Saturday, January 15, 2022 10:45 PM
> To: ffmpeg-devel at ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH] avformat/hlsenc: Fix path handling on
> Windows
> 
> Soft Works:
> >
> >
> >> -----Original Message-----
> >> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of Andreas
> >> Rheinhardt
> >> Sent: Saturday, January 15, 2022 7:34 PM
> >> To: ffmpeg-devel at ffmpeg.org
> >> Subject: Re: [FFmpeg-devel] [PATCH] avformat/hlsenc: Fix path handling on
> >> Windows
> >>
> >> Soft Works:
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of Andreas
> >>>> Rheinhardt
> >>>> Sent: Saturday, January 15, 2022 7:40 AM
> >>>> To: ffmpeg-devel at ffmpeg.org
> >>>> Subject: Re: [FFmpeg-devel] [PATCH] avformat/hlsenc: Fix path handling on
> >>>> Windows
> >>>>
> >>>> ffmpegagent:
> >>>>> From: softworkz <softworkz at hotmail.com>
> >>>>>
> >>>>> Signed-off-by: softworkz <softworkz at hotmail.com>
> >>>>> ---
> >>>>>     avformat/hlsenc: Fix path handling on Windows
> >>>>>
> >>>>>     Handling for DOS path separators was missing
> >>>>>
> >>>>> Published-As: https://github.com/ffstaging/FFmpeg/releases/tag/pr-ffstaging-19%2Fsoftworkz%2Fsubmit_hlspath-v1
> >>>>> Fetch-It-Via: git fetch https://github.com/ffstaging/FFmpeg pr-ffstaging-19/softworkz/submit_hlspath-v1
> >>>>> Pull-Request: https://github.com/ffstaging/FFmpeg/pull/19
> >>>>>
> >>>>>  libavformat/hlsenc.c | 4 ++++
> >>>>>  1 file changed, 4 insertions(+)
> >>>>>
> >>>>> diff --git a/libavformat/hlsenc.c b/libavformat/hlsenc.c
> >>>>> index ef8973cea1..eff7f4212e 100644
> >>>>> --- a/libavformat/hlsenc.c
> >>>>> +++ b/libavformat/hlsenc.c
> >>>>> @@ -3028,6 +3028,10 @@ static int hls_init(AVFormatContext *s)
> >>>>>                  }
> >>>>>
> >>>>>                  p = strrchr(vs->m3u8_name, '/');
> >>>>> +#if HAVE_DOS_PATHS
> >>>>> +                p = FFMAX(p, strrchr(vs->m3u8_name, '\\'));
> >>>>> +#endif
> >>>>> +
> >>>>>                  if (p) {
> >>>>>                      char tmp = *(++p);
> >>>>>                      *p = '\0';
> >>>>>
> >>>>> base-commit: c936c319bd54f097cc1d75b1ee1c407d53215d71
> >>>>>
> >>>>
> >>>
> >>> Thanks for reviewing.
> >>>
> >>>> 1. You seem to be under the impression that NULL <= all other pointers.
> >>>> This is wrong. Relational operators acting on pointers are only defined
> >>>> when both point to the same object (the case of "one past the last
> >>>> element of an array" is also allowed) and are undefined behaviour
> >>>> otherwise.
> >>>
> >>> The case about NULL is interesting - I wasn't aware of that.
> >>> Is it practically relevant, i.e. is there any platform where casting
> >>> (void *)0 does not evaluate to 0 ?
> >>>
> >>
> >> "An integer constant expression with the value 0, or such an expression
> >> cast to type
> >>  void *, is called a null pointer constant." (C11, 6.3.2.3 3) (void*)0
> >> is therefore a valid null pointer constant and is also commonly used for
> >> the NULL macro. (void*)0 == 0 is always true, because the right hand
> >> side is converted to the type of the pointer (namely to a null pointer)
> >> and all null pointers compare equal. But this is irrelevant to
> >> relational comparisons, because checking for equality of pointers is not
> >> subject to these pointers pointing to the same object (or one past the
> >> last element of an array...), whereas this is so for relational
> >> operations.
> >>
> >> (If one uses unsigned for pointers, then one only needs to reserve two
> >> values that can not be used as part of an object: 0 and the max value
> >> (the latter can't be used for an object, because using a pointer one
> >> past the object is legal and has to be consistent with "<=" (and anyway
> >> said pointer must compare unequal to NULL)); if one used signed
> >> comparisons for pointers
> >
> > On initial read I had thought you were implying something like "signed
> > pointers". But maybe one should really create a computer system with
> > "symmetric addressing" where pointers are always pointing to the center
> > of a memory area (odd-number sized). I think it would be a great
> > invention for bored and under-challenged developers who love to
> > deep-dive into academic details - hihi..
> >
> >
> > (I'd bet the concepts and theory for this already exist somewhere)
> >
> >> one would have to reserve -1, 0 and the max
> >> value, the former because a one past the end array element needs to
> >> compare unequal to NULL and the latter to be consistent with <= and a
> >> potential one-past-the-end element. But this is a very small advantage.
> >> Honestly, I don't know whether compilers consistently use unsigned
> >> comparisons for pointer comparisons at all (even when restricted to
> >> compilers for systems with HAVE_DOS_PATHS). The fact that comparisons of
> >> pointers to different objects is UB means that compiler writers actually
> >> can choose what they want.)
> >
> > So one could cast both pointers to unsigned to make the relational
> > comparison valid, as long as either (1) both are pointing to the same object,
> > or (2) one or (3) both are NULL/0?
> >
> 
> Cast to unsigned? The above is not meant to require the programmer to
> add casts; it is about whether an implementation can simply use numbers
> for pointers under the hood and what it has to do to ensure that it is
> compliant with the C specs.
> Also notice that (void*)0 <= (void*)0 is UB, although (void*)0
> ==(void*)0 is defined and true. Whether NULL <= NULL is UB depends upon
> the implementation details of said macro (if it is an integer constant,
> the comparisons are valid (and true)).
> 
> > C++ defines NULL as 0 (which I had actually also assumed to be the case
> > in C), so this wouldn't be an issue there I guess?
> 
> This?
> 
> > (I mean "hypothetical" issue same as what it is in C)
> >
> > BTW, thanks for the insight. That part is really interesting.
> >
> > Now for the other one..
> >
> >> (Furthermore, it is not guaranteed by the spec that zeroing a pointer
> >> via memset (or calloc) generates a valid null pointer. E.g. the
> >> documentation of calloc has this footnote: "Note that this [the bitwise
> >> zero-initialization] need not be the same as the representation of
> >> floating-point zero or a null pointer constant." But I don't know a
> >> system where this is not so and we definitely require it to be so.)
> >>
> >>>> 2. Apart from that: Your code would potentially evaluate strrchr()
> >>>> multiple times which is bad style (given that this function is likely
> >>>> marked as pure the compiler could probably optimize the second call
> >>>> away, but this is not a given).
> >>>
> >>> It's not my code. It's code copied from avstring.c - so please blame
> >>> whoever wrote that.
> >>>
> >>
> >> I couldn't find strrchr() being evaluated multiple times unnecessarily
> >> due to a macro in avstring.c.
> >
> > I don't understand. av_basename() does this:
> >
> >     p = strrchr(path, '/');
> > #if HAVE_DOS_PATHS
> >     q = strrchr(path, '\\');
> >     d = strchr(path, ':');
> >     p = FFMAX3(p, q, d);
> > #endif
> >
> > and I do this:
> >
> >                 p = strrchr(vs->m3u8_name, '/');
> > #if HAVE_DOS_PATHS
> >                 p = FFMAX(p, strrchr(vs->m3u8_name, '\\'));
> > #endif
> >
> > So, for Windows, I'm counting 3 vs. 2...
> >
> 
> Indeed, you completely misunderstood:
> #define FFMAX(a,b) ((a) > (b) ? (a) : (b)),
> so what you intend to add will expand to
> ((p) > (strrchr(vs->m3u8_name, '\\')) ? (p) : (strrchr(vs->m3u8_name,
> '\\')))
> As you can see, there are two calls to strrchr(). A good compiler can
> optimize the second call away (especially if the function is declared as
> pure or similar), but it is better not to rely on this.
> 
> FFMAX3 uses FFMAX internally and evaluates its argument even more often
> than FFMAX; but this doesn't matter, because these are just pointers;
> there are no calls or side-effects or anything.

OK apologies, I really misunderstood. Yup - that's in fact unnecessary.

But it's still like peeing into the ocean and being afraid that the
polar caps could melt :-)

> 
> >
> >>> Regarding performance, I'm not sure whether this is relevant in any way,
> >>> given the low frequency of execution and putting it into relation to
> >>> all the other things that ffmpeg is doing usually.
> >>>
> >>
> >> The above would be a valid point if there were a tradeoff between
> >> writing the code without repeated evaluations and writing clear code.
> >> (And even then you'd be ignoring that the performance difference might
> >> be negligible for code only run very infrequently, but bloated code
> >> takes more space in the binary even when executed infrequently.) But
> >> there is no such tradeoff here.
> >
> > A HLS segment typically has 0.5 to 20 MB. Let's assume 1 MB. Even when the
> > file name/path would be re-evaluated for each segment, it would be looping
> > over something like 100 bytes _additionally_ per segment. To produce the
> > segment, that 1 MB of data would need to be "touched" or looped over multiple
> > times (effectively), even when only remuxing. Let's say 10x.
> > That leaves us with 100 iterations vs. 10 Million iterations.
> > Static ffmpeg binaries have like 60 MB. How many bytes does it add to the
> > code? 20? => makes 20 vs. 60 Million bytes
> >
> 
> To quote myself: "The above would be a valid point if there were a
> tradeoff between writing the code without repeated evaluations and
> writing clear code."

Now it makes sense ;-)


> >>> 2. The docs say it's required to copy a string before supplying it to
> >>>    those functions (as they may change the string).
> >>
> >> You are confusing av_basename() and av_dirname().
> >
> > No. av_dirname() is actually the one that is needed.
> >
> 
> One could use av_basename() to get the beginning of the basename to
> temporarily zero it. I specifically asked why you didn't use av_basename().
> 
> >>> 3. The hlsenc code changes the string temporarily and restores it
> >>>    afterwards. The same couldn't be done when using the avstring functions.
> >>>
> >>
> >> Why?
> >
> > I don't know. You need to ask the author.
> 
> I asked you why you believed that this couldn't be done with the
> avstring functions, not why the code is as it is. (I know why the code
> is as it is: To simplify appending two strings. See
> 6db81e93a95d150ec828214ba7eb6183577c748c.)
> 
> >
> >> (Actually, your code is still slightly different from av_basename():
> >           ---^^^^---
> >
> > Are you aware that I haven't written a single line of this code besides
> > the 3 lines of the patch?
> >
> 
> And we are talking about this three line patch which you have written...

The question of why those functions from avstring aren't used still needs
to be put to the author of the code.

The reason why I didn't do it is that I just needed a minimal fix without
rewriting the code, and that's what it is: a small fix.
As it is in fact only executed once on init, the unnecessary extra work
in total is many times less than outputting a single log line with
parameters.

Even when summing up the extra energy this causes on a million computers over
100 years, it will still be less than it takes to write this e-mail :-)


softworkz




