[FFmpeg-devel] [PATCH] avformat/hlsenc: Fix path handling on Windows

Soft Works softworkz at hotmail.com
Sat Jan 15 23:17:49 EET 2022



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of Andreas
> Rheinhardt
> Sent: Saturday, January 15, 2022 7:34 PM
> To: ffmpeg-devel at ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH] avformat/hlsenc: Fix path handling on
> Windows
> 
> Soft Works:
> >
> >
> >> -----Original Message-----
> >> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of Andreas
> >> Rheinhardt
> >> Sent: Saturday, January 15, 2022 7:40 AM
> >> To: ffmpeg-devel at ffmpeg.org
> >> Subject: Re: [FFmpeg-devel] [PATCH] avformat/hlsenc: Fix path handling on
> >> Windows
> >>
> >> ffmpegagent:
> >>> From: softworkz <softworkz at hotmail.com>
> >>>
> >>> Signed-off-by: softworkz <softworkz at hotmail.com>
> >>> ---
> >>>     avformat/hlsenc: Fix path handling on Windows
> >>>
> >>>     Handling for DOS path separators was missing
> >>>
> >>> Published-As: https://github.com/ffstaging/FFmpeg/releases/tag/pr-
> >> ffstaging-19%2Fsoftworkz%2Fsubmit_hlspath-v1
> >>> Fetch-It-Via: git fetch https://github.com/ffstaging/FFmpeg pr-ffstaging-
> >> 19/softworkz/submit_hlspath-v1
> >>> Pull-Request: https://github.com/ffstaging/FFmpeg/pull/19
> >>>
> >>>  libavformat/hlsenc.c | 4 ++++
> >>>  1 file changed, 4 insertions(+)
> >>>
> >>> diff --git a/libavformat/hlsenc.c b/libavformat/hlsenc.c
> >>> index ef8973cea1..eff7f4212e 100644
> >>> --- a/libavformat/hlsenc.c
> >>> +++ b/libavformat/hlsenc.c
> >>> @@ -3028,6 +3028,10 @@ static int hls_init(AVFormatContext *s)
> >>>                  }
> >>>
> >>>                  p = strrchr(vs->m3u8_name, '/');
> >>> +#if HAVE_DOS_PATHS
> >>> +                p = FFMAX(p, strrchr(vs->m3u8_name, '\\'));
> >>> +#endif
> >>> +
> >>>                  if (p) {
> >>>                      char tmp = *(++p);
> >>>                      *p = '\0';
> >>>
> >>> base-commit: c936c319bd54f097cc1d75b1ee1c407d53215d71
> >>>
> >>
> >
> > Thanks for reviewing.
> >
> >> 1. You seem to be under the impression that NULL <= all other pointers.
> >> This is wrong. Relational operators acting on pointers are only defined
> >> when both point to the same object (the case of "one past the last
> >> element of an array" is also allowed) and are undefined behaviour
> otherwise.
> >
> > The case about NULL is interesting - I wasn't aware of that.
> > Is it practically relevant, i.e. is there any platform where casting
> > (void *)0 does not evaluate to 0 ?
> >
> 
> "An integer constant expression with the value 0, or such an expression
> cast to type
>  void *, is called a null pointer constant." (C11, 6.3.2.3 3) (void*)0
> is therefore a valid null pointer constant and is also commonly used for
> the NULL macro. (void*)0 == 0 is always true, because the right hand
> side is converted to the type of the pointer (namely to a null pointer)
> and all null pointers compare equal. But this is irrelevant to
> relational comparisons, because checking for equality of pointers is not
> subject to these pointers pointing to the same object (or one past the
> last element of an array...), whereas this is so for relational operations.
> 
> (If one uses unsigned for pointers, then one only needs to reserve two
> values that can not be used as part of an object: 0 and the max value
> (the latter can't be used for an object, because using a pointer one
> past the object is legal and has to be consistent with "<=" (and anyway
> said pointer must compare unequal to NULL)); if one used signed
> comparisons for pointers

On initial read I had thought you were implying something like "signed 
pointers". But maybe one should really create a computer system with 
"symmetric addressing" where pointers are always pointing to the center 
of a memory area (odd-number sized). I think it would be a great 
invention for bored and under-challenged developers who love to 
deep-dive into academic details - hihi..


(I'd bet, concepts and theory exist somewhere anywhere)
 
> one would have to reserve -1, 0 and the max
> value, the former because a one past the end array element needs to
> compare unequal to NULL and the latter to be consistent with <= and a
> potential one-past-the-end element. But this is a very small advantage.
> Honestly, I don't know whether compilers consistently use unsigned
> comparisons for pointer comparisons at all (even when restricted to
> compilers for systems with HAVE_DOS_PATHS). The fact that comparisons of
> pointers to different objects is UB means that compiler writers actually
> can choose what they want.)

So one could cast both pointers to unsigned to make the relational
comparison valid, as long as either (1) both are pointing to the same object,
or (2) one or (3) both are NULL/0?

C++ defines NULL as 0 (which I had actually also assumed to be the case
in C), so this wouldn't be an issue there I guess?
(I mean "hypothetical" issue same as what it is in C)

BTW, thanks for the insight. That part is really interesting.

Now for the other one..

> (Furthermore, it is not guaranteed by the spec that zeroing a pointer
> via memset (or calloc) generates a valid null pointer. E.g. the
> documentation of calloc has this footnote: "Note that this [the bitwise
> zero-initialization] need not be the same as the representation of
> floating-point zero or a null pointer constant." But I don't know a
> system where this is not so and we definitely require it to be so.)
> 
> >> 2. Apart from that: Your code would potentially evaluate strrchr()
> >> multiple times which is bad style (given that this function is likely
> >> marked as pure the compiler could probably optimize the second call
> >> away, but this is not a given).
> >
> > It's not my code. It's code copied from avstring.c - so please blame
> > whoever that wrote.
> >
> 
> I couldn't find strrchr() being evaluated multiple times unnecessarily
> due to a macro in avstring.c.

I don't understand. av_basename() does this:

    p = strrchr(path, '/');
#if HAVE_DOS_PATHS
    q = strrchr(path, '\\');
    d = strchr(path, ':');
    p = FFMAX3(p, q, d);
#endif

and I do this:

                p = strrchr(vs->m3u8_name, '/');
#if HAVE_DOS_PATHS
                p = FFMAX(p, strrchr(vs->m3u8_name, '\\'));
#endif

So, for Windows, I'm counting 3 vs. 2...


> > Regarding performance, I'm not sure whether this is relevant in any way,
> > given the low frequency of execution and putting it into relation to
> > all the other things that ffmpeg is doing usually.
> >
> 
> The above would be a valid point if there were a tradeoff between
> writing the code without repeated evaluations and writing clear code.
> (And even then you'd be ignoring that the performance difference might
> be negligible for code only run very infrequently, but bloated code
> takes more space in the binary even when executed infrequently.) But
> there is no such tradeoff here.

A HLS segment typically has 0.5 to 20 MB. Let's assume 1 MB. Even when the 
file name/path would be re-evaluated for each segment, it would be looping
over something like 100 byte _additionally_ per segment. To produce the
segment, those 1 MB data would need to be "touched" or looped over multiple
times (effectively), even when only remuxing. Let's say 10x.
That leaves us with 100 iterations vs. 10 Million iterations.
Static ffmpeg binaries have like 60 MB. How many bytes does it add to the 
code? 20? => makes 20 vs. 60 Million bytes


> >> 3. The code in av_basename() is also wrong.
> >
> > ...
> >
> >> 4. Is there actually a reason why you don't use av_basename() directly
> here?
> >
> > Yes, multiple:
> >
> > 1. Different behavior - those functions are returning a "." when not found
> 
> av_basename() also has precise guarantees when this happens.

by "those", I meant av_basename and av_dirname(). I wanted to do an
absolute minimal fix that is safe without rewriting the code and without
having to do much testing. I don't even use that code, the fix was
a favor for somebody else and I've been once again stupid enough to 
submit it to the ML - thinking it might be useful for somebody.


> > 2. The docs tell it's required to copy a string before supplying it to
> >    those functions (as they may changing the string).
> 
> You are confusing av_basename() and av_dirname().

No. av_dirname() is actually the one that is needed. 

> > 3. The hlsenc code changes the string temporarily and restores it after
> >    wards. The same couldn't be done when using the avstring functions.
> >
> 
> Why? 

I don't know. You need to ask the author. 

> (Actually, your code is still slightly different from av_basename(): 
          ---^^^^---

Are you aware that I haven't written a single line of this code besides 
the 3 lines of the patch?

I know nothing about the code. I have just fixed that single thing.
To do it in a way that is conforming and acceptable, I looked around
how it's done elsewhere. avstring.c is a central part of the code base,
used virtually everywhere, so I used the same approach, assuming that
it must be the most acceptable one. I didn't even spend a second for
thinking about it, because it's pointless here anyway: Somebody will
always come with another opinion. 
But when even the project's own code is not good enough for submission,
then it will sooner or later inflate.

Kind regards,
softworkz



More information about the ffmpeg-devel mailing list