[FFmpeg-devel] [PATCH v14 4/5] libavformat: Remove MAX_PATH limit and use UTF-8 version of getenv()

Hendrik Leppkes h.leppkes at gmail.com
Mon Jun 13 21:55:10 EEST 2022


On Mon, Jun 13, 2022 at 7:47 PM Soft Works <softworkz at hotmail.com> wrote:
>
>
>
> > -----Original Message-----
> > From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of Nil Admirari
> > Sent: Monday, June 13, 2022 6:26 PM
> > To: ffmpeg-devel at ffmpeg.org
> > Subject: [FFmpeg-devel] [PATCH v14 4/5] libavformat: Remove MAX_PATH limit and use UTF-8 version of getenv()
> >
> > 1. getenv() is replaced with getenv_utf8() across libavformat.
> > 2. New versions of AviSynth+ are now called with UTF-8 filenames.
> > 3. Old versions of AviSynth are still using ANSI strings,
> >    but MAX_PATH limit on filename is removed.
> > ---
> >  libavformat/avisynth.c    | 39 +++++++++++++++++++++++++++------------
> >  libavformat/http.c        | 20 +++++++++++++-------
> >  libavformat/ipfsgateway.c | 35 +++++++++++++++++++++++------------
> >  libavformat/tls.c         | 11 +++++++++--
> >  4 files changed, 72 insertions(+), 33 deletions(-)
> >
> > diff --git a/libavformat/avisynth.c b/libavformat/avisynth.c
> > index 8ba2bdead2..a97d12b6b6 100644
> > --- a/libavformat/avisynth.c
> > +++ b/libavformat/avisynth.c
> > @@ -34,6 +34,7 @@
> >  /* Platform-specific directives. */
> >  #ifdef _WIN32
> >    #include "compat/w32dlfcn.h"
> > +  #include "libavutil/wchar_filename.h"
> >    #undef EXTERN_C
> >    #define AVISYNTH_LIB "avisynth"
> >  #else
> > @@ -56,6 +57,7 @@ typedef struct AviSynthLibrary {
> >  #define AVSC_DECLARE_FUNC(name) name ## _func name
> >      AVSC_DECLARE_FUNC(avs_bit_blt);
> >      AVSC_DECLARE_FUNC(avs_clip_get_error);
> > +    AVSC_DECLARE_FUNC(avs_check_version);
> >      AVSC_DECLARE_FUNC(avs_create_script_environment);
> >      AVSC_DECLARE_FUNC(avs_delete_script_environment);
> >      AVSC_DECLARE_FUNC(avs_get_audio);
> > @@ -137,6 +139,7 @@ static av_cold int avisynth_load_library(void)
> >
> >      LOAD_AVS_FUNC(avs_bit_blt, 0);
> >      LOAD_AVS_FUNC(avs_clip_get_error, 0);
> > +    LOAD_AVS_FUNC(avs_check_version, 0);
> >      LOAD_AVS_FUNC(avs_create_script_environment, 0);
> >      LOAD_AVS_FUNC(avs_delete_script_environment, 0);
> >      LOAD_AVS_FUNC(avs_get_audio, 0);
> > @@ -807,26 +810,38 @@ static int avisynth_create_stream(AVFormatContext *s)
> >  static int avisynth_open_file(AVFormatContext *s)
> >  {
> >      AviSynthContext *avs = s->priv_data;
> > -    AVS_Value arg, val;
> > +    AVS_Value val;
> >      int ret;
> > -#ifdef _WIN32
> > -    char filename_ansi[MAX_PATH * 4];
> > -    wchar_t filename_wc[MAX_PATH * 4];
> > -#endif
> >
> >      if (ret = avisynth_context_create(s))
> >          return ret;
> >
> > +    if (!avs_library.avs_check_version(avs->env, 7)) {
>
> I like the version check. I don't know about all the derivatives
> of AviSynth, but I assume you have checked that it's valid for
> the common ones (or at least the original non-Plus variant)?
>
> > +        AVS_Value args[] = {
> > +            avs_new_value_string(s->url),
> > +            avs_new_value_bool(1) // filename is in UTF-8
> > +        };
> > +        val = avs_library.avs_invoke(avs->env, "Import",
> > +                                     avs_new_value_array(args, 2), 0);
> > +    } else {
> > +        AVS_Value arg;
> >  #ifdef _WIN32
> > -    /* Convert UTF-8 to ANSI code page */
> > -    MultiByteToWideChar(CP_UTF8, 0, s->url, -1, filename_wc, MAX_PATH * 4);
> > -    WideCharToMultiByte(CP_THREAD_ACP, 0, filename_wc, -1, filename_ansi,
> > -                        MAX_PATH * 4, NULL, NULL);
> > -    arg = avs_new_value_string(filename_ansi);
> > +        char *filename_ansi;
> > +        /* Convert UTF-8 to ANSI code page */
> > +        if (utf8toansi(s->url, &filename_ansi)) {
>
> Two ideas came to my mind how this could be done better.
> What's actually needed here is not a string conversion, we need
> a valid and usable filename, and the function could be more
> something like "get_ansi_filename()".
>
> The first thing that this function could do is to convert the
> filename to ANSI and right back to UTF-8, then compare the
> UTF-8 result with the original UTF-8 string. When both are equal,
> we know that the conversion is safe, otherwise we know that it
> won't work.
>
> Then, we can use the win32 API GetShortPathName(), which returns
> file and directory names in 8.3 notation which (IIRC) contains
> only letters which are valid in the ANSI code page.
>

This seems unrelated to this patch, which is about removing the
MAX_PATH limit. The code previously converted UTF-8 to ANSI, and still
does so now, just without the MAX_PATH limit.
Further improvements tangential to this topic can, and should, be
applied independently, and not hold up this patch in discussion-hell
for longer than necessary.
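
For whoever picks that up as a separate patch, the round-trip check
suggested in the quoted text could be sketched roughly as below. This is
only an illustration under stated assumptions: ISO-8859-1 stands in for
the Windows ANSI code page so the example builds anywhere, '?' is used
as the replacement character, and all three function names are
hypothetical (none of them exist in FFmpeg). A real version would go
through the Win32 conversion functions instead.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the UTF-8 -> ANSI step: encode to ISO-8859-1,
 * replacing anything that does not fit (or is malformed) with '?'.
 * Returns a malloc'd string, or NULL on allocation failure. */
static char *utf8_to_latin1(const char *in)
{
    size_t len = strlen(in), i = 0, o = 0;
    char *out = malloc(len + 1); /* output is never longer than input */

    if (!out)
        return NULL;
    while (i < len) {
        unsigned char c = (unsigned char)in[i];
        if (c < 0x80) {
            out[o++] = (char)c;
            i++;
        } else if ((c & 0xE0) == 0xC0 && i + 1 < len &&
                   ((unsigned char)in[i + 1] & 0xC0) == 0x80) {
            /* two-byte sequence: U+0080..U+07FF (below that is overlong) */
            unsigned cp = ((c & 0x1Fu) << 6) | ((unsigned char)in[i + 1] & 0x3Fu);
            out[o++] = (cp >= 0x80 && cp <= 0xFF) ? (char)cp : '?';
            i += 2;
        } else {
            /* longer or malformed sequence: skip it, emit one '?' */
            i++;
            while (i < len && ((unsigned char)in[i] & 0xC0) == 0x80)
                i++;
            out[o++] = '?';
        }
    }
    out[o] = '\0';
    return out;
}

/* Hypothetical stand-in for the ANSI -> UTF-8 step. */
static char *latin1_to_utf8(const char *in)
{
    size_t len = strlen(in), o = 0;
    char *out = malloc(2 * len + 1); /* at most two bytes per input byte */

    if (!out)
        return NULL;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)in[i];
        if (c < 0x80) {
            out[o++] = (char)c;
        } else {
            out[o++] = (char)(0xC0 | (c >> 6));
            out[o++] = (char)(0x80 | (c & 0x3F));
        }
    }
    out[o] = '\0';
    return out;
}

/* Returns 1 if the name survives UTF-8 -> narrow -> UTF-8 unchanged,
 * 0 if the conversion is lossy, -1 on allocation failure. */
static int filename_survives_roundtrip(const char *utf8)
{
    char *narrow = utf8_to_latin1(utf8);
    char *back   = narrow ? latin1_to_utf8(narrow) : NULL;
    int ret      = back ? !strcmp(utf8, back) : -1;

    free(back);
    free(narrow);
    return ret;
}
```

With this stand-in, a plain ASCII name or one containing only Latin-1
characters passes the check, while a name with CJK characters is
reported as lossy, which is the case where a caller would have to fall
back to something else or fail with an error.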

- Hendrik