[FFmpeg-devel] [PATCH v3 1/5] libavutil: some VAAPI infrastructure

Tue Jan 19 20:47:49 CET 2016

On Tue, 19 Jan 2016 13:19:36 +0000
Mark Thompson <sw at jkqxz.net> wrote:

> ...
> >> +
> >> +static void vaapi_codec_release_surface(void *opaque, uint8_t *data)
> >> +{
> >> +    AVVAAPISurface *surface = opaque;
> >> +
> >> +    av_assert0(surface->refcount > 0);
> >> +    --surface->refcount;
> >> +}  
> > 
> > Maybe that's just me, but I think vaapi surfaces shouldn't break what
> > is guaranteed by AVFrame:
> > 
> > - unreffing them is thread-safe
> > - unreffing the last AVFrame will also unref whatever manages the
> >   surfaces (or alternatively, if the vaapi state is not refcounted, kill
> >   the process if there are still AVFrames referencing the vaapi state,
> >   as these AVFrames would essentially have dangling pointers)
> > 
> > That's a bit more complex, but IMO worth it.  
> 
> Unref being thread safe is a bit annoying because of the locking - if the lock were recursive then it could work, but that pushes me into pthreadland to do it.  Is that ok?

There could be a second lock which is not user-visible, and which
protects the surface pool only.

> 
> I'm not sure what you mean by the second point.  What do you want to also go when the last AVFrame gets unreferenced?

Imagine someone holding an AVFrame referencing the surface in the pool
at a time when the surface pool should be destroyed.

In my own vaapi code, I keep the pool alive until all frames are
unreferenced, even if the decoder is destroyed before. Helps with
making format changes simpler.

> 
> I know the AVFrame setup is all a bit dubious because I still don't really "get" how AVFrames are meant to be used.  I will think about it further.

Some explanation: AVFrames are refcounted via AVFrame.buf[]. Each
AVBufferRef is a refcounted memory region, and the AVFrame.data[]
pointers are supposed to point into one of the buffers. (All planes can
point into memory covered by AVFrame.buf[0], but it's also possible to
have each plane its own buffer.) If you create a reference to an
AVFrame with e.g. av_frame_ref(), new AVBufferRefs are created, which
increases the refcount of the internal buffer. Calling av_frame_unref()
deletes the AVBufferRefs and decrease the AVBuffer refcount. If the
AVBuffer's refcount reaches 0, the user-provided free callback is
invoked. (If there's no free callback, av_free() is called on the
original buffer data pointer.)

With hwaccels, the situation is a bit more confusing. There are no data
pointers, only an ID casted to a pointer. So it's harder to reason what
exactly AVBufferRef.data should be. But everything else can work in the
same way.

If the AVFrame is used in a different thread than the decoder, then it
can easily happen that the free callback is invoked in a different
thread.

> >> +
> >> +static int vaapi_get_surface(AVVAAPIPipelineContext *ctx,
> >> +                             AVVAAPISurfaceConfig *config,
> >> +                             AVVAAPISurface *surfaces, AVFrame *frame)
> >> +{
> >> +    AVVAAPISurface *surface;
> >> +    int i;
> >> +
> >> +    for(i = 0; i < config->count; i++) {
> >> +        if(surfaces[i].refcount == 0)
> >> +            break;
> >> +    }
> >> +    if(i >= config->count) {
> >> +        av_log(ctx, AV_LOG_ERROR, "Failed to allocate surface "
> >> +               "(%d in use).\n", config->count);
> >> +        return AVERROR(ENOMEM);
> >> +    }
> >> +    surface = &surfaces[i];
> >> +
> >> +    ++surface->refcount;
> >> +    frame->data[3] = (uint8_t*)(uintptr_t)surface->id;
> >> +    frame->buf[0] = av_buffer_create((uint8_t*)surface, 0,
> >> +                                     &vaapi_codec_release_surface,
> >> +                                     surface, AV_BUFFER_FLAG_READONLY);
> >> +    if(!frame->buf[0]) {
> >> +        av_log(ctx, AV_LOG_ERROR, "Failed to allocate dummy buffer "
> >> +               "for surface %#x.\n", surface->id);
> >> +        return AVERROR(ENOMEM);
> >> +    }
> >> +
> >> +    frame->format = AV_PIX_FMT_VAAPI;
> >> +    frame->width  = config->width;
> >> +    frame->height = config->height;
> >> +
> >> +    return 0;
> >> +}
> >> +
> >> +int ff_vaapi_get_input_surface(AVVAAPIPipelineContext *ctx, AVFrame *frame)
> >> +{
> >> +    return vaapi_get_surface(ctx, ctx->input, ctx->input_surfaces, frame);
> >> +}
> >> +
> >> +int ff_vaapi_get_output_surface(AVVAAPIPipelineContext *ctx, AVFrame *frame)
> >> +{
> >> +    return vaapi_get_surface(ctx, ctx->output, ctx->output_surfaces, frame);
> >> +}  
> > 
> > I wonder if vaapi_get_surface() could instead be some sort of
> > independent frame pool? (But maybe the way you did is already elegant
> > wrt. how vaapi works?)  
> 
> The only trickiness here is that you have to declare all of the surfaces you are going to use for output when you create the pipeline context.
> 
> So yes, probably, if I understood AVFrames better...  Will consider along with the previous point.

Side note: it might also be possible to recreate the decoder and
recreate all surfaces if the number of required output surfaces turns
out higher than anticipated during playback.

> ...
> >> +    case AV_PIX_FMT_YUV420P:
> >> +        av_assert0(image->format.fourcc == VA_FOURCC_YV12);
> >> +        av_image_copy_plane(f->data[0], f->linesize[0],
> >> +                            data + image->offsets[0], image->pitches[0],
> >> +                            f->width, f->height);
> >> +        // Um, apparently these are not the same way round...  
> > 
> > While we're at it, there is also VA_FOURCC_IYUV, which is YV12 with
> > swapped chroma planes. Don't know if it's still in use anywhere.  
> 
> Which is presumably what this supposedly-YV12 format actually is, hence the unexpected swap (yay Intel).  I'll just add that because it clarifies what's going on.

Well, there are just different conventions in use. Actually, I think I
came to the conclusion that FFmpeg's convention is the less usual one.
The YV12 FourCC, which you can find in Microsoft stuff, for some reason
in X11's Xv, and libva, all use the same convention.

> >> +        av_image_copy_plane(f->data[2], f->linesize[2],
> >> +                            data + image->offsets[1], image->pitches[1],
> >> +                            f->width / 2, f->height / 2);
> >> +        av_image_copy_plane(f->data[1], f->linesize[1],
> >> +                            data + image->offsets[2], image->pitches[2],
> >> +                            f->width / 2, f->height / 2);
> >> +        break;
> >> +
> >> +    case AV_PIX_FMT_NV12:
> >> +        av_assert0(image->format.fourcc == VA_FOURCC_NV12);
> >> +        av_image_copy_plane(f->data[0], f->linesize[0],
> >> +                            data + image->offsets[0], image->pitches[0],
> >> +                            f->width, f->height);
> >> +        av_image_copy_plane(f->data[1], f->linesize[1],
> >> +                            data + image->offsets[1], image->pitches[1],
> >> +                            f->width, f->height / 2);
> >> +        break;
> >> +
> >> +    default:
> >> +        return AVERROR(EINVAL);
> >> +    }
> >> +
> >> +    return 0;
> >> +}  
> > 
> > It might be better to setup a temporary AVFrame to point to the
> > VAImage's data, and then use av_frame_copy().  
> 
> Yeah, ok.  I was thinking cleverer things might be able to happen here, but I guess that really that should be the responsibility of the user.
> 
> (Doing/undoing the interleave to convert YV12 <-> NV12 in this step was the obvious one I was thinking of.)

Such trivial reshuffling could also be handled by a generic copy
function.

> >> diff --git a/libavutil/vaapi.h b/libavutil/vaapi.h
> >> new file mode 100644
> >> index 0000000..3bbae6e
> >> --- /dev/null
> >> +++ b/libavutil/vaapi.h
> >> @@ -0,0 +1,123 @@
> >> +/*
> >> + * VAAPI helper functions.
> >> + *
> >> + * Copyright (C) 2016 Mark Thompson <mrt at jkqxz.net>
> >> + *
> >> + * This file is part of FFmpeg.
> >> + *
> >> + * FFmpeg is free software; you can redistribute it and/or
> >> + * modify it under the terms of the GNU Lesser General Public
> >> + * License as published by the Free Software Foundation; either
> >> + * version 2.1 of the License, or (at your option) any later version.
> >> + *
> >> + * FFmpeg is distributed in the hope that it will be useful,
> >> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> >> + * Lesser General Public License for more details.
> >> + *
> >> + * You should have received a copy of the GNU Lesser General Public
> >> + * License along with FFmpeg; if not, write to the Free Software
> >> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> >> + */
> >> +
> >> +#ifndef LIBAVUTIL_VAAPI_H_
> >> +#define LIBAVUTIL_VAAPI_H_
> >> +
> >> +#include <va/va.h>
> >> +
> >> +#include "pixfmt.h"
> >> +#include "frame.h"
> >> +
> >> +
> >> +typedef struct AVVAAPIHardwareContext {
> >> +    VADisplay display;
> >> +
> >> +    VAConfigID  decoder_pipeline_config_id;
> >> +    VAContextID decoder_pipeline_context_id;
> >> +
> >> +    void (*lock)(void);
> >> +    void (*unlock)(void);
> >> +} AVVAAPIHardwareContext;
> >> +
> >> +void av_vaapi_lock_hardware_context(AVVAAPIHardwareContext *ctx);
> >> +void av_vaapi_unlock_hardware_context(AVVAAPIHardwareContext *ctx);
> >> +
> >> +
> >> +
> >> +// Everything below this point does not constitute a public API.  
> > 
> > Then why is it not in a private header?  
> 
> Right, I didn't think of that at all.
> 
> It kindof wants to be a public API (if users want to do anything interesting with VAAPI beyond a vanilla encode or decode, they want these functions or will reimplement them), but I wasn't confident
> enough that it has the right form to fix forever to actually declare it as such initially.
> 

Ah, wait, I think I'm seeing a problem here. Usually libavutil and
libavcodec are supposed to be independent libraries. So libavcodec
can't use libavutil internal APIs, at least not APIs which require
function calls to be resolved over a linker. And then again, there are
exceptions to this, which use the avpriv_ prefix.

Whenever this problem occurs, we have no clear idea how it should be
handled (and also, many of us dislike this "half-separation", where
it's neither possible to share private things, nor are they truly
separated).

Maybe it's just better to make this a public API.