[FFmpeg-devel] [PATCH] avutil: add av_memcpy() to avoid undefined behavior with NULL, NULL, 0

Hendrik Leppkes h.leppkes at gmail.com
Thu Jul 4 21:56:41 EEST 2019


On Thu, Jul 4, 2019 at 8:45 PM Michael Niedermayer
<michael at niedermayer.cc> wrote:
>
> On Wed, Jul 03, 2019 at 12:50:59PM +0200, Hendrik Leppkes wrote:
> > On Wed, Jul 3, 2019 at 10:46 AM Michael Niedermayer
> > <michael at niedermayer.cc> wrote:
> > >
> > > On Wed, Jul 03, 2019 at 09:41:41AM +0200, Reimar Döffinger wrote:
> > > >
> > > >
> > > > On 03.07.2019, at 08:29, Michael Niedermayer <michael at niedermayer.cc> wrote:
> > > >
> > > > > On Tue, Jul 02, 2019 at 08:42:43PM -0300, James Almer wrote:
> > > > >> On 7/2/2019 5:56 PM, Michael Niedermayer wrote:
> > > > >>> Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
> > > > >>> ---
> > > > >>> doc/APIchanges      |  3 +++
> > > > >>> libavutil/mem.h     | 13 +++++++++++++
> > > > >>> libavutil/version.h |  2 +-
> > > > >>> 3 files changed, 17 insertions(+), 1 deletion(-)
> > > > >>>
> > > > >>> diff --git a/doc/APIchanges b/doc/APIchanges
> > > > >>> index b5fadc2a48..65b8ed74ad 100644
> > > > >>> --- a/doc/APIchanges
> > > > >>> +++ b/doc/APIchanges
> > > > >>> @@ -15,6 +15,9 @@ libavutil:     2017-10-21
> > > > >>>
> > > > >>> API changes, most recent first:
> > > > >>>
> > > > >>> +2019-07-XX - XXXXXXXXXX - lavu 56.31.100 - mem.h
> > > > >>> +  Add av_memcpy()
> > > > >>> +
> > > > >>> 2019-06-21 - XXXXXXXXXX - lavu 56.30.100 - frame.h
> > > > >>>   Add FF_DECODE_ERROR_DECODE_SLICES
> > > > >>>
> > > > >>> diff --git a/libavutil/mem.h b/libavutil/mem.h
> > > > >>> index 5fb1a02dd9..a35230360d 100644
> > > > >>> --- a/libavutil/mem.h
> > > > >>> +++ b/libavutil/mem.h
> > > > >>> @@ -506,6 +506,19 @@ void *av_memdup(const void *p, size_t size);
> > > > >>>  */
> > > > >>> void av_memcpy_backptr(uint8_t *dst, int back, int cnt);
> > > > >>>
> > > > >>> +/**
> > > > >>> + * memcpy() implementation without a NULL pointer special case
> > > > >>> + *
> > > > >>> + * @param dst  Destination buffer
> > > > >>> + * @param src  Source buffer
> > > > >>> + * @param cnt  Number of bytes to copy; must be >= 0
> > > > >>> + */
> > > > >>> +static inline void av_memcpy(void *dst, const void *src, size_t cnt)
> > > > >>
> > > > >> How many cases are there in the codebase where cnt can be 0, and dst or
> > > > >> src NULL, without it having been checked before calling memcpy? And from
> > > > >> those, which would not be from situations where the code should have
> > > > >> instead aborted and returned ENOMEM, or EINVAL if either of them are
> > > > >> function arguments?
> > > > >
> > > > > There are around 2500 occurances of memcpy in the codebase
> > > > > To awnser your question it would be needed to review all of them and in many
> > > > > cases their surrounding code.
> > > > > So that is unlikely to be awnsered by anyone accuratly
> > > > >
> > > > > Also iam not sure i understand why you ask or why this would matter
> > > > > the suggested function allows to simplify cases where the NULL can
> > > > > occur, not where it cannot or should not. That is this is intended for
> > > > > the cases where we already have or are adding explicit checks to
> > > > > avoid the NULL case.
> > > > >
> > > > > i could rename this to av_memcpy_nullsafe which would make it clearer but
> > > > > also its more to write and read
> > > >
> > > > I admit I thought that a worthwhile idea originally,
> > > > but I have to think back to a long time ago that every function added to our "API" has a cost of people having to know about it and how to use it.
> > > > And if it's currently only 2 places that would benefit I think James is right to ask if it makes sense.
> > > > Of course another question might be if it might make sense to replace all memcpy uses with it.
> > > > I mean, isn't it naturally expected behaviour that the pointers would be ignored if the copy amount is 0? There might be a lot of code assuming that we do not know about...
> > >
> > > in addition to the 2 there are these related commits found by very dumb git log greps
> > > In further addition there would be cases that deal with src == dst, something we
> > > could add a check for in av_memcpy() too
> > >
> >
> > Personally I really don't like hiding too much magic in a function
> > like this. It can easily lead to brittle code, someone may think the
> > pointer is always valid since its memcpy'ed to, and uses it
> > afterwards, and there you have a disaster.
>
> Why would someone think that a pointer to an array of length 0 is valid ?
> malloc(0) for example does not gurantee that
>

Why would someone memcpy a array of length 0?
Because they didn't realize that it may be zero.

You seem to view this as intentional code with the full knowledge that
it may be zero, but I would rather argue that most people just didn't
consider this special case, instead of willingly running into
errorneous usage of memcpy.

- Hendrik


More information about the ffmpeg-devel mailing list