[FFmpeg-devel] [PATCH] avutil: add av_memcpy() to avoid undefined behavior with NULL, NULL, 0

Hendrik Leppkes h.leppkes at gmail.com
Wed Jul 3 13:50:59 EEST 2019


On Wed, Jul 3, 2019 at 10:46 AM Michael Niedermayer
<michael at niedermayer.cc> wrote:
>
> On Wed, Jul 03, 2019 at 09:41:41AM +0200, Reimar Döffinger wrote:
> >
> >
> > On 03.07.2019, at 08:29, Michael Niedermayer <michael at niedermayer.cc> wrote:
> >
> > > On Tue, Jul 02, 2019 at 08:42:43PM -0300, James Almer wrote:
> > >> On 7/2/2019 5:56 PM, Michael Niedermayer wrote:
> > >>> Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
> > >>> ---
> > >>> doc/APIchanges      |  3 +++
> > >>> libavutil/mem.h     | 13 +++++++++++++
> > >>> libavutil/version.h |  2 +-
> > >>> 3 files changed, 17 insertions(+), 1 deletion(-)
> > >>>
> > >>> diff --git a/doc/APIchanges b/doc/APIchanges
> > >>> index b5fadc2a48..65b8ed74ad 100644
> > >>> --- a/doc/APIchanges
> > >>> +++ b/doc/APIchanges
> > >>> @@ -15,6 +15,9 @@ libavutil:     2017-10-21
> > >>>
> > >>> API changes, most recent first:
> > >>>
> > >>> +2019-07-XX - XXXXXXXXXX - lavu 56.31.100 - mem.h
> > >>> +  Add av_memcpy()
> > >>> +
> > >>> 2019-06-21 - XXXXXXXXXX - lavu 56.30.100 - frame.h
> > >>>   Add FF_DECODE_ERROR_DECODE_SLICES
> > >>>
> > >>> diff --git a/libavutil/mem.h b/libavutil/mem.h
> > >>> index 5fb1a02dd9..a35230360d 100644
> > >>> --- a/libavutil/mem.h
> > >>> +++ b/libavutil/mem.h
> > >>> @@ -506,6 +506,19 @@ void *av_memdup(const void *p, size_t size);
> > >>>  */
> > >>> void av_memcpy_backptr(uint8_t *dst, int back, int cnt);
> > >>>
> > >>> +/**
> > >>> + * memcpy() implementation without a NULL pointer special case
> > >>> + *
> > >>> + * @param dst  Destination buffer
> > >>> + * @param src  Source buffer
> > >>> + * @param cnt  Number of bytes to copy; must be >= 0
> > >>> + */
> > >>> +static inline void av_memcpy(void *dst, const void *src, size_t cnt)
> > >>
> > >> How many cases are there in the codebase where cnt can be 0, and dst or
> > >> src NULL, without it having been checked before calling memcpy? And from
> > >> those, which would not be from situations where the code should have
> > >> instead aborted and returned ENOMEM, or EINVAL if either of them are
> > >> function arguments?
> > >
> > > There are around 2500 occurances of memcpy in the codebase
> > > To awnser your question it would be needed to review all of them and in many
> > > cases their surrounding code.
> > > So that is unlikely to be awnsered by anyone accuratly
> > >
> > > Also iam not sure i understand why you ask or why this would matter
> > > the suggested function allows to simplify cases where the NULL can
> > > occur, not where it cannot or should not. That is this is intended for
> > > the cases where we already have or are adding explicit checks to
> > > avoid the NULL case.
> > >
> > > i could rename this to av_memcpy_nullsafe which would make it clearer but
> > > also its more to write and read
> >
> > I admit I thought that a worthwhile idea originally,
> > but I have to think back to a long time ago that every function added to our "API" has a cost of people having to know about it and how to use it.
> > And if it's currently only 2 places that would benefit I think James is right to ask if it makes sense.
> > Of course another question might be if it might make sense to replace all memcpy uses with it.
> > I mean, isn't it naturally expected behaviour that the pointers would be ignored if the copy amount is 0? There might be a lot of code assuming that we do not know about...
>
> in addition to the 2 there are these related commits found by very dumb git log greps
> In further addition there would be cases that deal with src == dst, something we
> could add a check for in av_memcpy() too
>

Personally I really don't like hiding too much magic in a function
like this. It can easily lead to brittle code, someone may think the
pointer is always valid since its memcpy'ed to, and uses it
afterwards, and there you have a disaster.
Either the function name should make perfectly clear what it does, or
preferably it should just not exist and code should just validate its
stuff.

- Hendrik


More information about the ffmpeg-devel mailing list