[MPlayer-dev-eng] [PATCH 2/4] String handling audit/cleanup take 2

Nicholas Kain njkain at gmail.com
Wed Mar 7 11:41:10 CET 2007


On 3/6/07, Reimar Döffinger <Reimar.Doeffinger at stud.uni-karlsruhe.de> wrote:
> > @@ -690,8 +690,7 @@ static int asf_http_parse_response(asf_h
> >                                 mp_msg(MSGT_NETWORK,MSGL_WARN,MSGTR_MPDEMUX_ASF_ASFHTTPParseWarnCuttedPragma,pragma,len,sizeof(features) - 1);
> >                                 len = sizeof(features) - 1;
> >                               }
> > -                             strncpy( features, pragma, len );
> > -                             features[len]='\0';
> > +                             strlcpy( features, pragma, len );
>
> wrong.

'wrong' is not a very good response for this case.  This is an instance
where strncpy() should never have been used.

The problem with many of the uses of strncpy() is that there are instances
where strncpy() is used in place of memcpy().  It obscures the intention of the
code, and makes it difficult to tell whether the possibility of not including a
terminal nul is intentional or not.

If this is 'wrong' (ie, breaks the function), then the strncpy() needs
to be changed
to be a memcpy().  I've made this change, but the function as a whole has
some very hairy (and fragile) parsing.

> >      dst = malloc(length + 1);
> > -    strncpy(dst, src, length);
> > -    dst[length] = 0;
>> +    strlcpy(dst, src, length);
>
>wrong.

Please don't strip the function references off of the diff.  It makes
it harder to find
the referenced function.  I manged to find it in this case by grepping for
'dst\[length\]', but that's hardly a very unique identifier.

Yes, the behavior is different, but if you check the actual context of
the code change
(and the single place where that function is called), it makes no difference.

The function is a length-limited version of strdup.  The length-limiting should
be moved to parse_line() and strdup() should replace this function.  A string
duplication function is not the proper place to length-limit.  That said,
parse_line() is sufficiently cryptic that I do not plan on changing it.

Regardless, I've changed the diff as suggested.

> > @@ -389,7 +388,7 @@ http_response_parse( HTTP_header_t *http
> >               mp_msg(MSGT_NETWORK,MSGL_FATAL,"Memory allocation failed\n");
> >               return -1;
> >       }
> > -     strncpy( http_hdr->reason_phrase, hdr_ptr, len );
> > +     strlcpy( http_hdr->reason_phrase, hdr_ptr, len );
> >       if( http_hdr->reason_phrase[len-1]=='\r' ) {
>
> wrong, and this condition will always be false then
>
> > @@ -453,8 +453,7 @@ cddb_parse_matches_list(HTTP_header_t *h
> >               } else {
> >                       len = ptr2-ptr+1;
> >               }
> > -             strncpy(album_title, ptr, len);
> > -             album_title[len-2]='\0';
> > +             strlcpy(album_title, ptr, len);
> >       }
> >       mp_msg(MSGT_DEMUX, MSGL_STATUS, MSGTR_MPDEMUX_CDDB_ParseOKFoundAlbumTitle, album_title);
>
> These are not equivalent, but don't ask me which one is wrong...

Yes, the existing code appeared to have an off-by-one error, but is unclear
and very hard to follow.  In this case, it seems rather likely that the
lack of equivalence itself might be a bug fix.  This is where testing would
be great, but as I said in my original post, I don't have a http-encapsulated
video stream to test against...

> > --- stream/stream_cue.c.orig  2007-03-02 08:00:52.000000000 -0500
> > +++ stream/stream_cue.c       2007-03-02 13:43:17.000000000 -0500
> > @@ -33,6 +33,7 @@
> >  #define MODE1_2048 30
> >  #define MODE2_2336 40
> >  #define UNKNOWN -1
> > +#define CUEBUFMAX 256
>
> PATH_MAX probably is more appropriate but the malloc should be used...

I didn't want to change the lengths of the existing buffers.  Lifting
an embedded
constant used in many places into a #define seemed worthwhile, however.

> > @@ -156,14 +157,10 @@ static int cue_getTrackinfo(char *Line,
> >
> >
> >
> > -/* FIXME: the string operations ( strcpy,strcat ) below depend
> > - * on the arrays to have the same size, thus we need to make
> > - * sure the sizes are in sync.
> > - */
>
> This is wrong. If they are not the same size the code still won't work
> with those changes, it just won't write beyond a buffer.
> No buffer overflow != correct code.

I don't entirely agree.  Yes, it's true that if these buffers are
exceeded, there
will be truncations.  However, in general, a truncation is far preferable to
overflowing a buffer.  There are two reasons.  First, a truncation is easily
visible.  Overflowed buffers often times introduce silent corruption that
may not be noticed and lead to hard-to-debug, non-local problems.
Second, buffer overflows are more likely to lead to possible unintended
behavior than a truncation.  Truncations can break things, but will usually
break in a predictable manner.

I'd much rather deal with debugging a truncated string, than trying to
track down stack corruption.

The function could be better, I agree, but these are string cleanup patches,
not functionality improvement patches.  The patch in its current form is
a safety improvement over the existing code even if it is not ideal.

> >             bin_filename);
> >
> >      /* now try to find it with the path of the cue file */
> > -    snprintf(s,sizeof( s ),"%s/%s",bincue_path,bin_filename);
> > +    snprintf(s,sizeof(s),"%s/%s",bincue_path,bin_filename);
>
> cosmetics
>
> > @@ -221,23 +217,22 @@ static int cue_find_bin (char *firstline
> >                 MSGTR_MPDEMUX_CUEREAD_BinFilenameTested, s);
> >
> >          /* ok try it with path */
> > -        snprintf(t, sizeof( t ), "%s/%s", bincue_path, s);
> > +        snprintf(t, sizeof(t), "%s/%s", bincue_path, s);
>
> Here, too.
>
> > -            snprintf(t, sizeof( t ), "%s/%s", bincue_path, s);
> > +            snprintf(t, sizeof(t), "%s/%s", bincue_path, s);
>
> And here.

Removed.  Those are the only three cosmetic changes in the
patch.  They only remained because they were contained within
a diff hunk that had non-cosmetic changes, and thus were
difficult to remove by editing the diff...

> > @@ -322,7 +317,7 @@ static int cue_read_cue (char *in_cue_fi
> >       *t = '\0';
> >       t = s;
> >       if (*t == '\0')
> > -       strcpy(t, "/");
> > +       strlcpy(t, "/", 2);
>
> pointless. Code could probably be simplified though.

t is a buffer that is malloc()ed to a size of 2 just above the context of
the diff.  Changing from strcpy() to strlcpy() is not going to harm anything,
doesn't change behavior, makes it a bit faster to audit the code, and
will not impact performance.

> >      if (response[3] == '-') {
> > -      strncpy(match,response,3);
> > +      strncpy(match,response,3); /* safe and intended */
> >        match[3] = ' ';
> >        match[4] = '\0';
>
> Hm.
> strlcpy(match, response, 5);
> match[3] = ' ';
> would be no worse though.
> Although the 0-termination of match is pointless anyway...

Strings should always be null terminated.  Unterminated strings are
an accident waiting to happen if the code is later changed, even if
the code properly handles the unterminated string.

The existing code is certainly ugly, however, and I'm not going to change it.

> > -  snprintf(str,127,"%d.%d.%d.%d",num[0],num[1],num[2],num[3]);
> > +  snprintf(str,sizeof(str),"%d.%d.%d.%d",num[0],num[1],num[2],num[3]);
>
> Wow, I'd say in that file there was a proponent of useless use of
> snprintf at work.
> I must say, having an integer becoming >= 10^30 is really likely... We
> might really risk an overflow once ints become 128bit, with the first
> 1024-bit CPUs...

I'm not sure if you're commenting on my patch, or on the existing code,
or if you're just flaming me...

Since it seems implied, I wouldn't write a function in that way, but I don't
see the purpose in changing it, either.

BTW, is there an instance where sprintf() is demonstrably _better_ than
snprintf()?  snprintf() at least indicates that the original author had to
keep track of string lengths when writing the original code, which is
something that is in theory necessary even when using sprintf().
snprintf() is simply more explicit for later maintainers, and forces the
original author to explicitly track string lengths, which does lead to
fewer bugs.

I don't see how blanket criticism of snprintf() is going to improve
things.  It's a standard function now and has no portability issues
on modern platforms.

If snprintf() were required for all patches in place of sprintf(), then all
of this pain (both in creating, reviewing, and fixing these patches) would
be unnecessary.  I certainly don't enjoy doing this, and I doubt you do,
either.

> > --- stream/stream_rtsp.c.orig 2007-03-02 11:24:10.000000000 -0500
> > +++ stream/stream_rtsp.c      2007-03-02 13:39:52.000000000 -0500
> > @@ -61,6 +61,7 @@ rtsp_streaming_start (stream_t *stream)
> >    char *file;
> >    int port;
> >    int redirected, temp;
> > +  size_t len;
> >
> >    if (!stream)
> >      return -1;
> > @@ -87,10 +88,10 @@ rtsp_streaming_start (stream_t *stream)
> >      if (file[0] == '/')
> >        file++;
> >
> > -    mrl = malloc (strlen (stream->streaming_ctrl->url->hostname)
> > -                  + strlen (file) + 16);
> > +    len = strlen (stream->streaming_ctrl->url->hostname) + strlen (file) + 16;
> > +    mrl = malloc (len);
> >
> > -    sprintf (mrl, "rtsp://%s:%i/%s",
> > +    snprintf (mrl, sizeof(len), "rtsp://%s:%i/%s",
> >               stream->streaming_ctrl->url->hostname, port, file);
>
> wtf?

Thanks, this is a real bug that I've corrected.  Not sure if it was introduced
when I converted sizeof to sizeof() or if it was always there. :/

> >  static int control(stream_t *s, int cmd, void *arg) {
> > --- stream/tv.h.orig  2007-03-02 07:34:11.000000000 -0500
> > +++ stream/tv.h       2007-03-02 07:34:30.000000000 -0500
> > @@ -97,7 +97,6 @@ typedef struct tv_channels_s {
> >
> >  extern tv_channels_t *tv_channel_list;
> >  extern tv_channels_t *tv_channel_current, *tv_channel_last;
> > -extern char *tv_channel_last_real;
>
> Somewhat unrelated (yes I saw the changes below that make it necessary).
>
> > @@ -101,8 +102,7 @@ url_new(const char* url) {
> >               mp_msg(MSGT_NETWORK,MSGL_FATAL,MSGTR_MemAllocFailed);
> >               goto err_out;
> >       }
> > -     strncpy(Curl->protocol, escfilename, pos1);
> > -     Curl->protocol[pos1] = '\0';
> > +     strlcpy(Curl->protocol, escfilename, pos1);
>
> wrong.
>
> > @@ -123,8 +123,7 @@ url_new(const char* url) {
> >                       mp_msg(MSGT_NETWORK,MSGL_FATAL,MSGTR_MemAllocFailed);
> >                       goto err_out;
> >               }
> > -             strncpy(Curl->username, ptr1, len);
> > -             Curl->username[len] = '\0';
> > +             strlcpy(Curl->username, ptr1, len);
>
> wrong.
>
> > @@ -136,8 +135,7 @@ url_new(const char* url) {
> >                               mp_msg(MSGT_NETWORK,MSGL_FATAL,MSGTR_MemAllocFailed);
> >                               goto err_out;
> >                       }
> > -                     strncpy( Curl->password, ptr3+1, len2);
> > -                     Curl->password[len2]='\0';
> > +                     strlcpy( Curl->password, ptr3+1, len2);
>
> wrong.
>
> > @@ -188,8 +186,7 @@ url_new(const char* url) {
> >               mp_msg(MSGT_NETWORK,MSGL_FATAL,MSGTR_MemAllocFailed);
> >               goto err_out;
> >       }
> > -     strncpy(Curl->hostname, ptr1, pos2-pos1);
> > -     Curl->hostname[pos2-pos1] = '\0';
> > +     strlcpy(Curl->hostname, ptr1, pos2-pos1);
>
> wrong.

Have you verified that it is 'wrong' or merely different?  I've
changed them to be
functionally the same as before, but if you look past just the context
of the diffs,
the parsing code there is nearly unreadable.  It's quite possible some (but not
all, from looking at it) of these should have been memcpy().

> > @@ -212,7 +209,7 @@ url_new(const char* url) {
> >                       mp_msg(MSGT_NETWORK,MSGL_FATAL,MSGTR_MemAllocFailed);
> >                       goto err_out;
> >               }
> > -             strcpy(Curl->file, "/");
> > +             strlcpy(Curl->file, "/", 2);
>
> pointless.

Same as above.  2 is the length of Curl->file, as malloc()'ed above the context
of the diff.  This change is trivial, but intentional.

> > @@ -322,7 +319,7 @@ url_escape_string(char *outbuf, const ch
> >               if(tmp && (tmp[1] == '/' || tmp[1] == ':' ||
> >                          tmp[1] == '\0')) {
> >                       i = tmp+1-inbuf;
> > -                     strncpy(outbuf,inbuf,i);
> > +                     strncpy(outbuf,inbuf,i); /* safe and intended */
> >                       outbuf += i;
> >                       tmp = NULL;
>
> Using strlcpy would work fine too though.
> But the whole url_escape_string function is a horror and lacks at least
> some proper documentation of how to use it safely.
> And that tmp = NULL; must be one level further out AFAICT.

The function should be rewritten, but the lack of documentation and
convoluted code
makes me reluctant to touch it other than to comment that it will not overflow.


More information about the MPlayer-dev-eng mailing list