[FFmpeg-devel] [PATCHv3] Added QSV based VPP filter

wm4 nfxjfg at gmail.com
Thu Nov 12 15:34:20 CET 2015


On Thu, 12 Nov 2015 11:33:28 +0100
"Sven Dueking" <sven at nablet.com> wrote:

> From a4de3cfda2f99a2e1f1e471d198ef39971c03798 Mon Sep 17 00:00:00 2001
> From: Sven Dueking <sven at nablet.com>
> Date: Thu, 12 Nov 2015 08:33:50 +0000
> Subject: [PATCH] added QSV VPP filter
> 
> ---
>  configure                |   1 +
>  doc/filters.texi         | 169 +++++++++++
>  libavfilter/Makefile     |   1 +
>  libavfilter/allfilters.c |   1 +
>  libavfilter/vf_qsv_vpp.c | 734 +++++++++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 906 insertions(+)
>  create mode 100644 libavfilter/vf_qsv_vpp.c
> 
> diff --git a/configure b/configure
> index d5e76de..811be83 100755
> --- a/configure
> +++ b/configure
> @@ -2846,6 +2846,7 @@ super2xsai_filter_deps="gpl"
>  tinterlace_filter_deps="gpl"
>  vidstabdetect_filter_deps="libvidstab"
>  vidstabtransform_filter_deps="libvidstab"
> +vppqsv_filter_deps="libmfx"
>  pixfmts_super2xsai_test_deps="super2xsai_filter"
>  tinterlace_merge_test_deps="tinterlace_filter"
>  tinterlace_pad_test_deps="tinterlace_filter"
> diff --git a/doc/filters.texi b/doc/filters.texi
> index 471ec3f..e90d998 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -11581,6 +11581,175 @@ vignette='PI/4+random(1)*PI/50':eval=frame
>  
>  @end itemize
>  
> +
> + at section vppqsv
> +
> +The VPP library is part of the Intel® Media SDK
> +and exposes the media acceleration capabilities of Intel® platforms.
> +
> +Specifically, this library performs the following conversions:
> +
> + at itemize
> + at item
> + at emph{Color Conversion}: the process of changing one type of color-encoded signal into another.
> +VPP supports several input color formats and NV12 as the output format.
> +
> + at verbatim
> +Format Type                     |
> +-------------------------------------------------------------------------
> +Input (uncompressed)            | YV12, NV12, YUY2, RGB4 (RGB 32-bit)
> +-------------------------------------------------------------------------
> +Output (uncompressed)           | NV12
> + at end verbatim
> +
> + at item
> + at emph{Scaling}: the process of resizing an image.
> +An image size can be changed in several ways;
> +VPP uses a separable 8-tap poly-phase scaling filter with adaptive filter ringing suppression.
> +
> + at item
> + at emph{Deinterlacing}: the process of converting interlaced video into a non-interlaced form.
> +VPP uses a motion adaptive deinterlacing algorithm.
> +
> + at verbatim
> +Input Frame Rate (FPS) |                  Output Frame Rate (FPS)
> +   Interlaced          |                       Progressive
> +                       |  23.976  |  25  |  29.97  |  30  |  50  |  59.94  |  60
> +-------------------------------------------------------------------------------------
> +      29.97            | Inverse  |      |         |      |      |         |
> +                       | Telecine |      |    x    |      |      |         |
> +-------------------------------------------------------------------------------------
> +       50              |          |   x  |         |      |   x  |         |
> +-------------------------------------------------------------------------------------
> +      59.94            |          |      |    x    |      |      |    x    |
> +-------------------------------------------------------------------------------------
> +       60              |          |      |         |   x  |      |         |    x
> +
> +x indicates a supported conversion
> + at end verbatim
> +
> + at item
> + at emph{Denoising}: the process of removing noise from a video signal.
> +VPP uses a motion detection based temporal (adaptive to noise
> +history) and spatial (edge/texture preserving) denoiser.
> +The spatial denoiser is applied to each frame individually,
> +while the temporal denoiser reduces noise between frames.
> +
> + at item
> + at emph{Frame Rate Conversion}: the process of converting the frame rate of a video stream.
> +VPP supports frame drop/repetition as well as frame interpolation,
> +which synthesizes new frames when the frame rate is increased.
> +
> + at item
> + at emph{Detail Enhancement}: the process of enhancing edge contrast to improve sharpness.
> +VPP uses an adaptive detail enhancement algorithm to increase edge sharpness.
> +
> + at end itemize
> +
> +The filter accepts the following options:
> +
> + at table @option
> +
> + at item deinterlace
> +Sets the deinterlacing mode.
> +
> +It accepts the following values:
> + at table @option
> + at item 0
> +No deinterlacing (default).
> + at item 1
> +BOB deinterlacing.
> + at item 2
> +Advanced deinterlacing.
> +
> + at emph{Note}: Advanced deinterlacing uses reference frames and has better quality;
> +BOB is faster than advanced deinterlacing.
> +Thus the user can choose between speed and quality.
> + at end table
> +
> + at item denoise
> +Enables the denoise algorithm.
> +
> + at table @option
> + at item A value of 0-100 (inclusive) indicates the level of noise to remove. Default value is 0 (disabled).
> + at end table
> +
> + at item detail
> +Enables the detail enhancement algorithm.
> +
> + at table @option
> + at item A value of 0-100 (inclusive) indicates the level of detail to be enhanced. Default value is 0 (disabled).
> + at end table
> +
> + at item w
> + at item width
> +
> + at table @option
> + at item Output video width (range [32, 4096]).
> + at end table
> +
> + at item h
> + at item height
> +
> + at table @option
> + at item Output video height (range [32, 2304]).
> + at end table
> +
> + at item dpic
> +Sets the output picture structure.
> +
> +It accepts the following values:
> + at table @option
> + at item 0
> +Interlaced top field first.
> + at item 1
> +Progressive (default).
> + at item 2
> +Interlaced bottom field first.
> + at end table
> +
> + at item fps
> +Sets the output frame rate.
> +
> +It accepts the following values:
> + at table @option
> + at item 0
> +The input frame rate is used.
> + at item else
> +The given value is used.
> + at end table
> +
> + at item async_depth
> +
> + at table @option
> +
> + at item Specifies how many asynchronous operations VPP performs before the results are explicitly synchronized
> +(default value is 4).
> +
> + at end table
> +
> + at emph{Warning}: The number of allocated frame surfaces depends on async_depth and max_b_frames.
> +A wrong configuration could result in lost frames, since VPP works asynchronously and may request multiple surfaces.
> +
> + at end table
> +
> +Limitations and known issues:
> +
> + at itemize
> + at item
> + at emph{Maximum supported resolution}: 4096x2304
> + at item
> + at emph{Minimum supported resolution}: 32x32
> + at item
> + at emph{Frame size}: frame width must be
> +a multiple of 16, frame height must be a multiple of 16 for progressive frames and a multiple
> +of 32 otherwise.
> +
> + at emph{NOTE}: VPP checks feature availability on any given machine at
> +runtime. Availability of features depends on hardware capabilities as well as the driver version.
> +
> + at end itemize
> +
>  @section vstack
>  Stack input videos vertically.
>  
> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index 1f4abeb..8684c22 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -243,6 +243,7 @@ OBJS-$(CONFIG_VFLIP_FILTER)                  += vf_vflip.o
>  OBJS-$(CONFIG_VIDSTABDETECT_FILTER)          += vidstabutils.o vf_vidstabdetect.o
>  OBJS-$(CONFIG_VIDSTABTRANSFORM_FILTER)       += vidstabutils.o vf_vidstabtransform.o
>  OBJS-$(CONFIG_VIGNETTE_FILTER)               += vf_vignette.o
> +OBJS-$(CONFIG_VPPQSV_FILTER)                 += vf_qsv_vpp.o
>  OBJS-$(CONFIG_VSTACK_FILTER)                 += vf_stack.o framesync.o
>  OBJS-$(CONFIG_W3FDIF_FILTER)                 += vf_w3fdif.o
>  OBJS-$(CONFIG_WAVEFORM_FILTER)               += vf_waveform.o
> diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> index 63b8fdb..fa295e7 100644
> --- a/libavfilter/allfilters.c
> +++ b/libavfilter/allfilters.c
> @@ -264,6 +264,7 @@ void avfilter_register_all(void)
>      REGISTER_FILTER(VIDSTABDETECT,  vidstabdetect,  vf);
>      REGISTER_FILTER(VIDSTABTRANSFORM, vidstabtransform, vf);
>      REGISTER_FILTER(VIGNETTE,       vignette,       vf);
> +    REGISTER_FILTER(VPPQSV,         vppqsv,         vf);
>      REGISTER_FILTER(VSTACK,         vstack,         vf);
>      REGISTER_FILTER(W3FDIF,         w3fdif,         vf);
>      REGISTER_FILTER(WAVEFORM,       waveform,       vf);
> diff --git a/libavfilter/vf_qsv_vpp.c b/libavfilter/vf_qsv_vpp.c
> new file mode 100644
> index 0000000..6064ae1
> --- /dev/null
> +++ b/libavfilter/vf_qsv_vpp.c
> @@ -0,0 +1,734 @@
> +/*
> + * Intel MediaSDK Quick Sync Video VPP filter
> + *
> + * copyright (c) 2015 Sven Dueking
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include <mfx/mfxvideo.h>
> +#include <mfx/mfxplugin.h>
> +
> +#include "avfilter.h"
> +#include "internal.h"
> +#include "formats.h"
> +
> +#include "libavutil/avstring.h"
> +#include "libavutil/error.h"
> +
> +#include "libavutil/common.h"
> +#include "libavutil/mem.h"
> +#include "libavutil/log.h"
> +#include "libavutil/time.h"
> +#include "libavutil/imgutils.h"
> +
> +#include "libavutil/opt.h"
> +#include "libavutil/pixfmt.h"
> +#include "libavutil/time.h"
> +#include "libavcodec/avcodec.h"
> +#include "libavcodec/qsv_internal.h"
> +
> +// number of video enhancement filters (denoise, procamp, detail, video_analysis, image stab)
> +#define ENH_FILTERS_COUNT           5
> +
> +typedef struct {
> +    const AVClass *class;
> +
> +    AVFilterContext *ctx;
> +
> +    mfxSession session;
> +    QSVSession internal_qs;
> +
> +    AVRational framerate;                           // target framerate
> +
> +    mfxFrameSurface1 **in_surface;
> +    mfxFrameSurface1 **out_surface;
> +
> +    mfxFrameAllocRequest req[2];                    // [0] - in, [1] - out
> +
> +    int num_surfaces_in;                            // input surfaces
> +    int num_surfaces_out;                           // output surfaces
> +
> +    unsigned char * surface_buffers_out;            // output surface buffer
> +
> +    char *load_plugins;
> +
> +    mfxVideoParam* pVppParam;
> +
> +    int cur_out_idx;
> +
> +    /* VPP extension */
> +    mfxExtBuffer* pExtBuf[1+ENH_FILTERS_COUNT];
> +    mfxExtVppAuxData extVPPAuxData;
> +
> +    /* Video Enhancement Algorithms */
> +    mfxExtVPPDeinterlacing deinterlace_conf;
> +    mfxExtVPPFrameRateConversion frc_conf;
> +    mfxExtVPPDenoise denoise_conf;
> +    mfxExtVPPDetail detail_conf;
> +
> +    int out_width;
> +    int out_height;
> +
> +    int height_align;
> +
> +    int dpic;                                       // destination picture structure
> +                                                    // -1 = unknown
> +                                                    // 0 = interlaced top field first
> +                                                    // 1 = progressive
> +                                                    // 2 = interlaced bottom field first
> +
> +    int deinterlace;                                // deinterlace mode : 0=off, 1=bob, 2=advanced
> +    int denoise;                                    // enable denoise algorithm. Level is the optional value from the interval [0; 100]
> +    int detail;                                     // enable detail enhancement algorithm.
> +                                                    // level is the optional value from the interval [0; 100]
> +    int async_depth;                                // async depth used by the encoder
> +
> +    int frame_number;
> +
> +    int use_frc;                                    // use framerate conversion
> +
> +    int num_of_out_surfaces;
> +
> +} VPPContext;
> +
> +#define OFFSET(x) offsetof(VPPContext, x)
> +#define FLAGS AV_OPT_FLAG_VIDEO_PARAM|AV_OPT_FLAG_FILTERING_PARAM
> +
> +static const AVOption vpp_options[] = {
> +    { "w",           "Output video width",                                     OFFSET(out_width),    AV_OPT_TYPE_INT, {.i64=0}, 0, 4096, .flags = FLAGS },
> +    { "width",       "Output video width",                                     OFFSET(out_width),    AV_OPT_TYPE_INT, {.i64=0}, 0, 4096, .flags = FLAGS },
> +    { "h",           "Output video height",                                    OFFSET(out_height),   AV_OPT_TYPE_INT, {.i64=0}, 0, 2304, .flags = FLAGS },
> +    { "height",      "Output video height",                                    OFFSET(out_height),   AV_OPT_TYPE_INT, {.i64=0}, 0, 2304, .flags = FLAGS },
> +    { "dpic",        "dest pic struct: 0=tff, 1=progressive [default], 2=bff", OFFSET(dpic),         AV_OPT_TYPE_INT, {.i64 = 1 }, 0, 2, .flags = FLAGS },
> +    { "fps",         "A string describing desired output framerate",           OFFSET(framerate),    AV_OPT_TYPE_VIDEO_RATE, { .str = "0" }, .flags = FLAGS },
> +    { "async_depth", "Maximum processing parallelism [default = 4]",           OFFSET(async_depth),  AV_OPT_TYPE_INT, { .i64 = ASYNC_DEPTH_DEFAULT }, 0, INT_MAX, .flags = FLAGS },
> +    { "deinterlace", "deinterlace mode: 0=off, 1=bob, 2=advanced",             OFFSET(deinterlace),  AV_OPT_TYPE_INT, {.i64=0}, 0, MFX_DEINTERLACING_ADVANCED, .flags = FLAGS },
> +    { "denoise",     "denoise level [0, 100]",                                 OFFSET(denoise),      AV_OPT_TYPE_INT, {.i64=0}, 0, 100, .flags = FLAGS },
> +    { "detail",      "detail enhancement level [0, 100]",                      OFFSET(detail),       AV_OPT_TYPE_INT, {.i64=0}, 0, 100, .flags = FLAGS },
> +    { NULL }
> +};
> +
> +AVFILTER_DEFINE_CLASS(vpp);
> +
> +static int get_bpp(unsigned int fourcc)
> +{
> +    switch (fourcc) {
> +    case MFX_FOURCC_YUY2:
> +        return 16;
> +    case MFX_FOURCC_RGB4:
> +        return 32;
> +    }
> +    return 12;
> +}
> +
> +static int option_id_to_mfx_pic_struct(int id)
> +{
> +    switch (id) {
> +    case 0:
> +        return MFX_PICSTRUCT_FIELD_TFF;
> +    case 1:
> +        return MFX_PICSTRUCT_PROGRESSIVE;
> +    case 2:
> +        return MFX_PICSTRUCT_FIELD_BFF;
> +    }
> +    return MFX_PICSTRUCT_UNKNOWN;
> +}
> +
> +static int get_chroma_fourcc(unsigned int fourcc)
> +{
> +    switch (fourcc) {
> +    case MFX_FOURCC_YUY2:
> +        return MFX_CHROMAFORMAT_YUV422;
> +    case MFX_FOURCC_RGB4:
> +        return MFX_CHROMAFORMAT_YUV444;
> +    }
> +    return MFX_CHROMAFORMAT_YUV420;
> +}
> +
> +static int avframe_id_to_mfx_pic_struct(AVFrame * pic)
> +{
> +    if (!pic->interlaced_frame)
> +        return MFX_PICSTRUCT_PROGRESSIVE;
> +
> +    if (pic->top_field_first == 1)
> +        return MFX_PICSTRUCT_FIELD_TFF;
> +
> +    return MFX_PICSTRUCT_FIELD_BFF;
> +}
> +
> +static int query_formats(AVFilterContext *ctx)
> +{
> +    static const enum AVPixelFormat pix_fmts[] = {
> +        AV_PIX_FMT_YUV420P,
> +        AV_PIX_FMT_NV12,
> +        AV_PIX_FMT_YUV422P,
> +        AV_PIX_FMT_RGB32,
> +        AV_PIX_FMT_NONE
> +    };
> +
> +    return ff_set_common_formats(ctx, ff_make_format_list(pix_fmts));
> +}
> +
> +static av_cold int init_vpp_param(AVFilterLink *inlink , AVFrame * pic)
> +{
> +    AVFilterContext *ctx = inlink->dst;
> +    VPPContext *vpp= ctx->priv;
> +
> +    // input data
> +    if (inlink->format == AV_PIX_FMT_YUV420P)
> +        vpp->pVppParam->vpp.In.FourCC = MFX_FOURCC_YV12;
> +    else if (inlink->format == AV_PIX_FMT_YUV422P)
> +        vpp->pVppParam->vpp.In.FourCC = MFX_FOURCC_YUY2;
> +    else if (inlink->format == AV_PIX_FMT_NV12)
> +        vpp->pVppParam->vpp.In.FourCC = MFX_FOURCC_NV12;
> +    else
> +        vpp->pVppParam->vpp.In.FourCC = MFX_FOURCC_RGB4;

I think it would be better to make the last case explicit too (instead
of assuming that the format is AV_PIX_FMT_RGB32) - it will make it
slightly easier to add new formats. (Same for get_chroma_fourcc and
get_bpp.)

Also, why is get_chroma_fourcc a separate function, while mapping the
pixfmt is not?
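For illustration, an explicit mapping with a default error case could look like this sketch. The enum values here are local stand-ins, not the real constants from FFmpeg's pixfmt.h and libmfx's mfxvideo.h:

```c
#include <assert.h>

/* Local stand-ins for AVPixelFormat and MFX_FOURCC_* values; the real
 * filter would use the constants from pixfmt.h and mfxvideo.h. */
enum pixfmt { FMT_YUV420P, FMT_YUV422P, FMT_NV12, FMT_RGB32, FMT_GRAY8 };
enum fourcc { FOURCC_YV12, FOURCC_YUY2, FOURCC_NV12, FOURCC_RGB4 };

/* Explicit per-format mapping: unsupported formats return -1 so the
 * caller can error out instead of silently treating them as RGB4. */
static int pixfmt_to_fourcc(enum pixfmt fmt)
{
    switch (fmt) {
    case FMT_YUV420P: return FOURCC_YV12;
    case FMT_YUV422P: return FOURCC_YUY2;
    case FMT_NV12:    return FOURCC_NV12;
    case FMT_RGB32:   return FOURCC_RGB4;
    default:          return -1;
    }
}
```

With that shape, adding a new format means adding one case, and a format that slipped through query_formats fails loudly instead of being processed as RGB4.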

> +
> +    vpp->pVppParam->vpp.In.ChromaFormat = get_chroma_fourcc(vpp->pVppParam->vpp.In.FourCC);
> +    vpp->pVppParam->vpp.In.CropX = 0;
> +    vpp->pVppParam->vpp.In.CropY = 0;
> +    vpp->pVppParam->vpp.In.CropW = inlink->w;
> +    vpp->pVppParam->vpp.In.CropH = inlink->h;
> +    vpp->pVppParam->vpp.In.PicStruct = avframe_id_to_mfx_pic_struct(pic);
> +    vpp->pVppParam->vpp.In.FrameRateExtN = inlink->frame_rate.num;
> +    vpp->pVppParam->vpp.In.FrameRateExtD = inlink->frame_rate.den;
> +    vpp->pVppParam->vpp.In.BitDepthLuma   = 8;
> +    vpp->pVppParam->vpp.In.BitDepthChroma = 8;
> +
> +    // width must be a multiple of 16
> +    // height must be a multiple of 16 in case of frame picture and a multiple of 32 in case of field picture
> +    vpp->height_align = MFX_PICSTRUCT_PROGRESSIVE == vpp->pVppParam->vpp.In.PicStruct ? 16 : 32;
> +    vpp->pVppParam->vpp.In.Width = FFALIGN(inlink->w, 16);
> +    vpp->pVppParam->vpp.In.Height =  FFALIGN(inlink->h, vpp->height_align);
> +
> +    // output data
> +    vpp->pVppParam->vpp.Out.FourCC = MFX_FOURCC_NV12;
> +    vpp->pVppParam->vpp.Out.ChromaFormat = MFX_CHROMAFORMAT_YUV420;
> +    vpp->pVppParam->vpp.Out.CropX = 0;
> +    vpp->pVppParam->vpp.Out.CropY = 0;
> +    vpp->pVppParam->vpp.Out.CropW = !vpp->out_width ? inlink->w : vpp->out_width;
> +    vpp->pVppParam->vpp.Out.CropH = !vpp->out_height ? inlink->h : vpp->out_height;

Looking at other parts of the patch, I get the impression out_width==0
does not work at all. It seems it must always be set by the options.
But if that is so, checking it here doesn't make sense, and it should
error out on init instead. Is this an inconsistency or just forgotten?
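If the options really are mandatory, the init-time check could be as simple as this sketch (limits taken from the documented 32x32 to 4096x2304 range; the function name is hypothetical):

```c
#include <assert.h>

/* Reject out-of-range output sizes at init time instead of silently
 * substituting the input size later on.  A value of 0 is rejected
 * too, making the "must be set" requirement explicit. */
static int validate_out_size(int w, int h)
{
    if (w < 32 || w > 4096 || h < 32 || h > 2304)
        return -1;   /* the real filter would return AVERROR(EINVAL) */
    return 0;
}
```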

> +    vpp->pVppParam->vpp.Out.PicStruct = option_id_to_mfx_pic_struct(vpp->dpic);
> +    vpp->pVppParam->vpp.Out.FrameRateExtN = vpp->framerate.num;
> +    vpp->pVppParam->vpp.Out.FrameRateExtD = vpp->framerate.den;
> +    vpp->pVppParam->vpp.Out.BitDepthLuma   = 8;
> +    vpp->pVppParam->vpp.Out.BitDepthChroma = 8;
> +
> +    if ((vpp->pVppParam->vpp.In.FrameRateExtN / vpp->pVppParam->vpp.In.FrameRateExtD) !=
> +        (vpp->pVppParam->vpp.Out.FrameRateExtN / vpp->pVppParam->vpp.Out.FrameRateExtD))
> +       vpp->use_frc = 1;
> +    else
> +       vpp->use_frc = 0;
> +
> +    // width must be a multiple of 16
> +    // height must be a multiple of 16 in case of frame picture and a multiple of 32 in case of field picture
> +    vpp->pVppParam->vpp.Out.Width = FFALIGN(vpp->pVppParam->vpp.Out.CropW, 16);
> +    vpp->pVppParam->vpp.Out.Height =
> +        (MFX_PICSTRUCT_PROGRESSIVE == vpp->pVppParam->vpp.Out.PicStruct) ?
> +        FFALIGN(vpp->pVppParam->vpp.Out.CropH, 16) :
> +        FFALIGN(vpp->pVppParam->vpp.Out.CropH, 32);

This alignment business is duplicated in multiple places. Maybe move it
to a function?
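The repeated computation could collapse into one helper along these lines (a sketch: FFALIGN is FFmpeg's real macro, reproduced here, while the picture-structure constant is a stand-in value):

```c
#include <assert.h>

/* Stand-in for MFX_PICSTRUCT_PROGRESSIVE from mfxvideo.h. */
#define PICSTRUCT_PROGRESSIVE 0x01
/* FFmpeg's FFALIGN from libavutil/macros.h, reproduced for the sketch. */
#define FFALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* libmfx surface rule: height is padded to 16 for progressive frames
 * and to 32 for field (interlaced) pictures; width is always 16. */
static int qsv_align_height(int h, int pic_struct)
{
    return FFALIGN(h, pic_struct == PICSTRUCT_PROGRESSIVE ? 16 : 32);
}
```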

> +
> +    vpp->pVppParam->IOPattern =
> +        MFX_IOPATTERN_IN_SYSTEM_MEMORY | MFX_IOPATTERN_OUT_SYSTEM_MEMORY;
> +
> +    av_log(ctx, AV_LOG_INFO, "In %dx%d %d fps\t Out %dx%d %d fps\n",
> +           vpp->pVppParam->vpp.In.Width,
> +           vpp->pVppParam->vpp.In.Height,
> +           vpp->pVppParam->vpp.In.FrameRateExtN / vpp->pVppParam->vpp.In.FrameRateExtD,
> +           vpp->pVppParam->vpp.Out.Width,
> +           vpp->pVppParam->vpp.Out.Height,
> +           vpp->pVppParam->vpp.Out.FrameRateExtN / vpp->pVppParam->vpp.Out.FrameRateExtD);
> +
> +    if (vpp->use_frc == 1)
> +        av_log(ctx, AV_LOG_INFO, "Framerate conversion enabled\n");
> +
> +    return 0;
> +}
> +
> +static av_cold int init_surface(AVFilterLink *inlink)
> +{
> +    AVFilterContext *ctx = inlink->dst;
> +    VPPContext *vpp= ctx->priv;
> +
> +    int i,j;
> +    unsigned int width = 0;
> +    unsigned int height = 0;
> +    unsigned int surface_size = 0;
> +
> +    vpp->num_surfaces_in  = FFMAX(vpp->req[0].NumFrameSuggested, vpp->async_depth + 4);
> +    vpp->num_surfaces_out = FFMAX(vpp->req[1].NumFrameSuggested, vpp->num_of_out_surfaces);
> +
> +    width = FFALIGN(vpp->pVppParam->vpp.In.Width, 32);
> +    height = FFALIGN(vpp->pVppParam->vpp.In.Height, 32);
> +    surface_size = width * height * get_bpp(vpp->pVppParam->vpp.In.FourCC) / 8;
> +
> +    vpp->in_surface = av_mallocz(sizeof(char*) * vpp->num_surfaces_in);
> +
> +    if (!vpp->in_surface) {
> +        vpp->num_surfaces_out = 0;
> +        vpp->num_surfaces_in = 0;
> +        return AVERROR(ENOMEM);
> +    }
> +
> +    for (i = 0; i < vpp->num_surfaces_in; i++) {
> +       vpp->in_surface[i] = av_mallocz(sizeof(mfxFrameSurface1));
> +
> +        if (!vpp->in_surface[i]) {
> +            for (j = 0; j < i; j++) {
> +                av_freep(&vpp->in_surface[j]);
> +            }
> +
> +            vpp->num_surfaces_out = 0;
> +            vpp->num_surfaces_in = 0;
> +            return AVERROR(ENOMEM);
> +        }
> +
> +        vpp->in_surface[i]->Info = vpp->pVppParam->vpp.In;
> +    }
> +
> +    width = FFALIGN(vpp->pVppParam->vpp.Out.Width, 32);
> +    height = FFALIGN(vpp->pVppParam->vpp.Out.Height, 32);

Aren't Out.Width/Height already aligned? Isn't it a problem that
Height is aligned differently (to 16 above, to 32 here)?

> +    surface_size = width * height * 12 / 8;
> +    vpp->surface_buffers_out = av_mallocz(surface_size * vpp->num_surfaces_out);
> +
> +    if (!vpp->surface_buffers_out) {
> +        vpp->num_surfaces_out = 0;
> +        return AVERROR(ENOMEM);
> +    }
> +
> +    vpp->out_surface = av_mallocz(sizeof(char*) * vpp->num_surfaces_out);
> +
> +    if (!vpp->out_surface) {
> +        vpp->num_surfaces_out = 0;
> +        return AVERROR(ENOMEM);
> +    }
> +
> +    for (i = 0; i < vpp->num_surfaces_out; i++) {
> +        vpp->out_surface[i] = av_mallocz(sizeof(mfxFrameSurface1));
> +
> +        if (!vpp->out_surface[i]) {
> +            for (j = 0; j < i; j++) {
> +                if (vpp->out_surface[j])
> +                    av_freep(&vpp->out_surface[j]);
> +            }
> +
> +            vpp->num_surfaces_out = 0;
> +            return AVERROR(ENOMEM);
> +        }
> +
> +        vpp->out_surface[i]->Info = vpp->pVppParam->vpp.Out;
> +        vpp->out_surface[i]->Data.Y = &vpp->surface_buffers_out[surface_size * i];
> +        vpp->out_surface[i]->Data.U = vpp->out_surface[i]->Data.Y + width * height;
> +        vpp->out_surface[i]->Data.V = vpp->out_surface[i]->Data.U + 1;
> +        vpp->out_surface[i]->Data.PitchLow = vpp->pVppParam->vpp.Out.Width;
> +    }
> +
> +    return 0;
> +}

This whole function could benefit from a single error exit point with
the same cleanup code for all cases. (You know, the "goto fail;" idiom.)

Also, I don't know if config can be called multiple times on the same
filter instance, but if it can, the old surfaces need to be
deallocated first.
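The "goto fail" shape suggested above might look like this sketch (a hypothetical helper using plain C allocation rather than the av_* wrappers, just to show the single exit point):

```c
#include <assert.h>
#include <stdlib.h>

/* One error exit point cleans up everything allocated so far,
 * whichever allocation failed. */
static int alloc_surface_pool(void ***pool_out, int count, size_t elem_size)
{
    void **pool = calloc(count, sizeof(*pool));
    int i;

    if (!pool)
        return -1;

    for (i = 0; i < count; i++) {
        pool[i] = calloc(1, elem_size);
        if (!pool[i])
            goto fail;
    }

    *pool_out = pool;
    return 0;

fail:
    while (i-- > 0)      /* frees pool[i-1] down to pool[0] */
        free(pool[i]);
    free(pool);
    *pool_out = NULL;
    return -1;
}
```

The nested "free everything up to j" loops in init_surface then disappear, and the num_surfaces_* resets happen in exactly one place.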

> +
> +static av_cold int config_vpp(AVFilterLink *inlink, AVFrame * pic)
> +{
> +    AVFilterContext *ctx = inlink->dst;
> +    VPPContext *vpp= ctx->priv;
> +
> +    int ret = 0;
> +
> +    mfxVideoParam mfxParamsVideo;
> +
> +    memset(&mfxParamsVideo, 0, sizeof(mfxVideoParam));
> +    vpp->pVppParam = &mfxParamsVideo;

Storing a pointer to the stack? Are you sure about this?
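One safe alternative, sketched with stand-in types: embed the mfxVideoParam in the filter context so it shares the context's lifetime, and have the config step fill it in place rather than pointing at a local variable:

```c
#include <assert.h>
#include <string.h>

/* Stand-in for mfxVideoParam, to keep the sketch self-contained. */
typedef struct { unsigned NumExtParam; } StubVideoParam;

/* Owned storage inside the context instead of a pointer to a stack
 * variable that becomes dangling when config_vpp() returns. */
typedef struct {
    StubVideoParam vpp_param;
} StubVPPContext;

static void stub_config(StubVPPContext *ctx)
{
    memset(&ctx->vpp_param, 0, sizeof(ctx->vpp_param));
    ctx->vpp_param.NumExtParam = 1;   /* still valid after return */
}
```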

> +
> +    init_vpp_param(inlink, pic);
> +
> +    vpp->pVppParam->NumExtParam = 0;
> +
> +    vpp->pVppParam->ExtParam = (mfxExtBuffer**)vpp->pExtBuf;
> +
> +    if (vpp->deinterlace) {
> +        memset(&vpp->deinterlace_conf, 0, sizeof(mfxExtVPPDeinterlacing));
> +        vpp->deinterlace_conf.Header.BufferId = MFX_EXTBUFF_VPP_DEINTERLACING;
> +        vpp->deinterlace_conf.Header.BufferSz = sizeof(mfxExtVPPDeinterlacing);
> +        vpp->deinterlace_conf.Mode            = vpp->deinterlace == 1 ? MFX_DEINTERLACING_BOB : MFX_DEINTERLACING_ADVANCED;
> +
> +        vpp->pVppParam->ExtParam[vpp->pVppParam->NumExtParam++] = (mfxExtBuffer*)&(vpp->deinterlace_conf);
> +    }
> +
> +    if (vpp->denoise) {
> +        memset(&vpp->denoise_conf, 0, sizeof(mfxExtVPPDenoise));
> +        vpp->denoise_conf.Header.BufferId = MFX_EXTBUFF_VPP_DENOISE;
> +        vpp->denoise_conf.Header.BufferSz = sizeof(mfxExtVPPDenoise);
> +        vpp->denoise_conf.DenoiseFactor   = vpp->denoise;
> +
> +        vpp->pVppParam->ExtParam[vpp->pVppParam->NumExtParam++] = (mfxExtBuffer*)&(vpp->denoise_conf);
> +    }
> +
> +    if (vpp->detail) {
> +        memset(&vpp->detail_conf, 0, sizeof(mfxExtVPPDetail));
> +        vpp->detail_conf.Header.BufferId  = MFX_EXTBUFF_VPP_DETAIL;
> +        vpp->detail_conf.Header.BufferSz  = sizeof(mfxExtVPPDetail);
> +        vpp->detail_conf.DetailFactor     = vpp->detail;
> +
> +        vpp->pVppParam->ExtParam[vpp->pVppParam->NumExtParam++] = (mfxExtBuffer*)&(vpp->detail_conf);
> +    }
> +
> +    if (vpp->use_frc || !vpp->pVppParam->NumExtParam) {
> +        memset(&vpp->frc_conf, 0, sizeof(mfxExtVPPFrameRateConversion));
> +        vpp->frc_conf.Header.BufferId = MFX_EXTBUFF_VPP_FRAME_RATE_CONVERSION;
> +        vpp->frc_conf.Header.BufferSz = sizeof(mfxExtVPPFrameRateConversion);
> +        vpp->frc_conf.Algorithm       = MFX_FRCALGM_PRESERVE_TIMESTAMP; // make optional
> +
> +        vpp->pVppParam->ExtParam[vpp->pVppParam->NumExtParam++] = (mfxExtBuffer*)&(vpp->frc_conf);
> +    }
> +
> +    ret = MFXVideoVPP_Query(vpp->session, vpp->pVppParam, vpp->pVppParam);
> +
> +    if (ret >= MFX_ERR_NONE) {
> +        av_log(ctx, AV_LOG_INFO, "MFXVideoVPP_Query returned %d \n", ret);

Should be verbose log level?

> +    } else {
> +        av_log(ctx, AV_LOG_ERROR, "Error %d querying the VPP parameters\n", ret);
> +        return ff_qsv_error(ret);
> +    }
> +
> +    memset(&vpp->req, 0, sizeof(mfxFrameAllocRequest) * 2);
> +    ret = MFXVideoVPP_QueryIOSurf(vpp->session, vpp->pVppParam, &vpp->req[0]);
> +
> +    if (ret < 0) {
> +        av_log(ctx, AV_LOG_ERROR, "Error querying the VPP IO surface\n");
> +        return ff_qsv_error(ret);
> +    }
> +
> +    ret = init_surface(inlink);
> +
> +    if (ret < 0) {
> +        av_log(ctx, AV_LOG_ERROR, "Error %d initialize VPP surfaces\n", ret);
> +        return ret;
> +    }
> +
> +    ret = MFXVideoVPP_Init(vpp->session, vpp->pVppParam);
> +
> +    if (MFX_WRN_PARTIAL_ACCELERATION == ret) {

I'm not sure if we have a policy against yoda-conditions, but if we
don't, I wish we did.

> +        av_log(ctx, AV_LOG_WARNING, "VPP will work with partial HW acceleration\n");
> +    } else if (ret < 0) {
> +        av_log(ctx, AV_LOG_ERROR, "Error initializing VPP\n");
> +        return ff_qsv_error(ret);
> +    }
> +
> +    return 0;
> +}
> +
> +static av_cold void free_surface(AVFilterLink *inlink)
> +{
> +    AVFilterContext *ctx = inlink->dst;
> +    VPPContext *vpp= ctx->priv;
> +    unsigned int i;
> +
> +    for (i = 0; i < vpp->num_surfaces_in; i++)
> +            av_freep(&vpp->in_surface[i]);
> +
> +    if (vpp->in_surface)
> +        av_freep(&vpp->in_surface);
> +
> +    for (i = 0; i < vpp->num_surfaces_out; i++)
> +         av_freep(&vpp->out_surface[i]);
> +
> +    av_freep(&vpp->out_surface);
> +
> +    av_free(vpp->surface_buffers_out);
> +
> +    vpp->num_surfaces_in  = 0;
> +    vpp->num_surfaces_out = 0;
> +}
> +
> +static av_cold int config_input(AVFilterLink *inlink)
> +{
> +    AVFilterLink *outlink = inlink->dst->outputs[0];
> +    AVFilterContext *ctx = inlink->dst;
> +    VPPContext *vpp= ctx->priv;
> +
> +    vpp->out_width = FFALIGN(vpp->out_width, 16);
> +    vpp->out_height = (vpp->dpic == 2) ?
> +        FFALIGN(vpp->out_height, 16) :
> +        FFALIGN(vpp->out_height, 32); // force 32 if unknown
> +
> +    outlink->w = vpp->out_width;
> +    outlink->h = vpp->out_height;
> +
> +    if (vpp->framerate.den && vpp->framerate.num) {
> +        outlink->frame_rate    = vpp->framerate;
> +        outlink->time_base     = av_inv_q(vpp->framerate);
> +        outlink->time_base.den = outlink->time_base.den * 1000;
> +    } else {
> +        vpp->framerate         = inlink->frame_rate;
> +        outlink->frame_rate    = vpp->framerate;
> +        outlink->time_base     = av_inv_q(vpp->framerate);
> +        outlink->time_base.den = outlink->time_base.den * 1000;
> +    }
> +
> +    outlink->format = AV_PIX_FMT_NV12;
> +
> +    return 0;
> +}
> +
> +static int get_free_surface_index_in(AVFilterContext *ctx, mfxFrameSurface1 ** surface_pool, int pool_size)
> +{
> +    if (surface_pool) {
> +        for (mfxU16 i = 0; i < pool_size; i++) {

Why the use of the mfxU16 type?

> +            if (!surface_pool[i]->Data.Locked)
> +                return i;
> +        }
> +    }
> +
> +    av_log(ctx, AV_LOG_ERROR, "Error getting a free surface, pool size is %d\n", pool_size);
> +    return MFX_ERR_NOT_FOUND;
> +}
> +
> +static int get_free_surface_index_out(AVFilterContext *ctx, mfxFrameSurface1 ** surface_pool, int pool_size)
> +{
> +    VPPContext *vpp = ctx->priv;
> +
> +    if (surface_pool) {
> +        for (mfxU16 i = 0; i < pool_size; i++) {
> +            if (!surface_pool[i]->Data.Locked) {
> +                if (i == vpp->cur_out_idx) {
> +                   if (vpp->cur_out_idx == pool_size - 1)
> +                       vpp->cur_out_idx = 0;
> +                   else
> +                       vpp->cur_out_idx ++;
> +
> +                   return i;
> +                }
> +            }
> +        }
> +    }
> +
> +    av_log(ctx, AV_LOG_ERROR, "Error getting a free surface, pool size is %d\n", pool_size);
> +    return MFX_ERR_NOT_FOUND;
> +}
> +
> +static int filter_frame(AVFilterLink *inlink, AVFrame *picref)
> +{
> +    AVFilterContext *ctx = inlink->dst;
> +    VPPContext *vpp = ctx->priv;
> +
> +    mfxSyncPoint sync = NULL;
> +    int ret = 0;
> +    int filter_frame_ret = 0;
> +
> +    AVFilterLink *outlink = inlink->dst->outputs[0];
> +
> +    AVFrame *out;
> +    AVFrame *in;
> +
> +    int in_idx = 0;
> +    int out_idx = 0;
> +
> +    if (!vpp->frame_number) {
> +        ret = config_vpp(inlink, picref);
> +
> +        if (ret < 0)
> +            return AVERROR(ENOMEM);
> +    }
> +
> +    /* make a copy if the input is not padded as libmfx requires */
> +    if (picref->height & (vpp->height_align - 1) ||
> +        picref->linesize[0] & 15) {
> +            int aligned_w = FFALIGN(picref->width, 16);
> +            int aligned_h = FFALIGN(picref->height, vpp->height_align);
> +
> +            in = ff_get_video_buffer(inlink, aligned_w, aligned_h);
> +
> +            if (!in) {
> +                return AVERROR(ENOMEM);
> +            }
> +
> +            ret = av_frame_copy(in, picref);
> +
> +            av_frame_free(&picref);
> +
> +            if (ret < 0) {
> +                av_frame_free(&in);
> +                return ret;
> +            }
> +    } else
> +        in = picref;
> +
> +    do {
> +        in_idx = get_free_surface_index_in(ctx, vpp->in_surface, vpp->num_surfaces_in);
> +        out_idx = get_free_surface_index_out(ctx, vpp->out_surface, vpp->num_surfaces_out);
> +
> +        if (in_idx == MFX_ERR_NOT_FOUND || out_idx == MFX_ERR_NOT_FOUND) {
> +            av_frame_free(&in);
> +            return AVERROR(ENOMEM);
> +        }
> +
> +        out = ff_get_video_buffer(outlink,
> +                                  vpp->out_surface[out_idx]->Info.Width,
> +                                  vpp->out_surface[out_idx]->Info.Height);
> +        if (!out) {
> +            av_frame_free(&in);
> +            return AVERROR(ENOMEM);
> +        }
> +
> +        av_frame_copy_props(out, in);
> +
> +        // init in surface
> +        if (inlink->format == AV_PIX_FMT_NV12) {
> +            vpp->in_surface[in_idx]->Data.Y = in->data[0];
> +            vpp->in_surface[in_idx]->Data.VU = in->data[1];
> +            vpp->in_surface[in_idx]->Data.PitchLow = in->linesize[0];
> +        } else if (inlink->format == AV_PIX_FMT_YUV420P) {
> +            vpp->in_surface[in_idx]->Data.Y = in->data[0];
> +            vpp->in_surface[in_idx]->Data.U = in->data[1];
> +            vpp->in_surface[in_idx]->Data.V = in->data[2];
> +            vpp->in_surface[in_idx]->Data.PitchLow = in->linesize[0];
> +        } else if (inlink->format == AV_PIX_FMT_YUYV422) {
> +            vpp->in_surface[in_idx]->Data.Y = in->data[0];
> +            vpp->in_surface[in_idx]->Data.U = in->data[0] + 1;
> +            vpp->in_surface[in_idx]->Data.V = in->data[0] + 3;
> +            vpp->in_surface[in_idx]->Data.PitchLow = in->linesize[0];
> +        } else if (inlink->format == AV_PIX_FMT_ARGB) {
> +            vpp->in_surface[in_idx]->Data.B = in->data[0];
> +            vpp->in_surface[in_idx]->Data.G = in->data[0] + 1;
> +            vpp->in_surface[in_idx]->Data.R = in->data[0] + 2;
> +            vpp->in_surface[in_idx]->Data.A = in->data[0] + 3;
> +            vpp->in_surface[in_idx]->Data.PitchLow = in->linesize[0];
> +        }

Maybe this could be in a separate function, together with a comment
explaining why it's needed. (I believe the reason is that libmfx
requires the surface data pointers to be set up this way.)
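A rough sketch of what I mean, with simplified stand-in structs so it is
self-contained; the real code would of course use mfxFrameSurface1 and
AVFrame, and `map_frame_to_surface` is a made-up name:

```c
/* Sketch: factor the per-format surface setup out of filter_frame().
 * The structs below are simplified stand-ins for mfxFrameData and
 * AVFrame; only the fields the mapping touches are kept. */
#include <assert.h>
#include <stdint.h>

enum pix_fmt { FMT_NV12, FMT_YUV420P };

struct surface_data {            /* stand-in for mfxFrameData */
    uint8_t *Y, *U, *V, *VU;
    uint16_t PitchLow;
};

struct frame {                   /* stand-in for AVFrame */
    uint8_t *data[3];
    int linesize[3];
};

/* libmfx wants the surface's Data pointers to reference the frame
 * planes directly; return 0 on success, -1 for unsupported formats. */
static int map_frame_to_surface(struct surface_data *d,
                                const struct frame *f, enum pix_fmt fmt)
{
    switch (fmt) {
    case FMT_NV12:
        d->Y  = f->data[0];
        d->VU = f->data[1];
        break;
    case FMT_YUV420P:
        d->Y = f->data[0];
        d->U = f->data[1];
        d->V = f->data[2];
        break;
    default:
        return -1;
    }
    d->PitchLow = (uint16_t)f->linesize[0];
    return 0;
}
```

That would also give the comment a natural place to live, and keep the
format dispatch out of the main loop.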

> +
> +        do {
> +            ret = MFXVideoVPP_RunFrameVPPAsync(vpp->session, vpp->in_surface[in_idx], vpp->out_surface[out_idx], NULL, &sync);
> +
> +            if (ret == MFX_WRN_DEVICE_BUSY) {
> +                av_usleep(500);
> +                continue;
> +            }
> +
> +            break;
> +        } while (1);
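Side note: this loop spins forever if the device never stops reporting
MFX_WRN_DEVICE_BUSY. It probably wants an upper bound. A sketch of what
I mean (the retry count and `run_once` callback are made up; the real
call is MFXVideoVPP_RunFrameVPPAsync() with av_usleep() between tries):

```c
#include <assert.h>

#define MFX_WRN_DEVICE_BUSY    2     /* values as in mfxdefs.h */
#define MFX_ERR_DEVICE_FAILED (-17)
#define MAX_BUSY_RETRIES       500   /* made-up bound */

/* Retry a busy-returning call a bounded number of times instead of
 * looping forever; give up with a device error once the bound is hit. */
static int run_with_retry(int (*run_once)(void *), void *opaque)
{
    for (int tries = 0; tries < MAX_BUSY_RETRIES; tries++) {
        int ret = run_once(opaque);
        if (ret != MFX_WRN_DEVICE_BUSY)
            return ret;
        /* av_usleep(500) between attempts in the real filter */
    }
    return MFX_ERR_DEVICE_FAILED;
}
```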
> +
> +
> +        if (ret < 0 && ret != MFX_ERR_MORE_SURFACE) {
> +            if (ret == MFX_ERR_MORE_DATA) {
> +                av_frame_free(&in);
> +                av_frame_free(&out);
> +                return 0;
> +            }
> +            av_log(ctx, AV_LOG_ERROR, "RunFrameVPPAsync %d\n", ret);
> +            av_frame_free(&in);
> +            av_frame_free(&out);
> +            return ff_qsv_error(ret);
> +        }
> +
> +        if (ret == MFX_WRN_INCOMPATIBLE_VIDEO_PARAM) {
> +            av_log(ctx, AV_LOG_WARNING,
> +                   "RunFrameVPPAsync returned 'incompatible param' code\n");
> +        }
> +
> +        MFXVideoCORE_SyncOperation(vpp->session, sync, 60000);

What's the 60000?
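If I read it right, that is the SyncOperation wait timeout, which is
specified in milliseconds; a named constant would make that obvious
(constant name made up):

```c
#include <assert.h>

/* Wait timeout passed to MFXVideoCORE_SyncOperation(), in milliseconds
 * (assumption: the bare 60000 is meant as a 60-second timeout). */
#define QSV_SYNC_WAIT_MS 60000
```

The return value of SyncOperation should also be checked while you're at
it.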

> +
> +        out->interlaced_frame = !vpp->dpic || vpp->dpic == 2;
> +        out->top_field_first  = !vpp->dpic;
> +
> +        out->data[0]     = vpp->out_surface[out_idx]->Data.Y;
> +        out->data[1]     = vpp->out_surface[out_idx]->Data.VU;
> +        out->linesize[0] = vpp->out_surface[out_idx]->Info.Width;
> +        out->linesize[1] = vpp->out_surface[out_idx]->Info.Width;

You can't just set the data pointers on a refcounted AVFrame to a
completely different allocation. This breaks refcounting completely.
Also, a refcounted AVFrame has to remain valid even if the filter gets
destroyed, so I guess you can only output not-refcounted AVFrames,
which probably will result in a copy sooner or later.

I'd say this is a pretty critical issue.
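The safe alternative is to copy the surface planes into the buffer that
ff_get_video_buffer() already allocated for `out`, instead of repointing
out->data[]. A self-contained sketch of the row-by-row copy (the
`copy_plane` helper is a made-up name; surface pitch and AVFrame
linesize may differ, which is why a plain memcpy of the whole plane
doesn't work):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy one plane row by row, honoring differing source pitch and
 * destination linesize; bytewidth is the number of payload bytes per
 * row, excluding padding. */
static void copy_plane(uint8_t *dst, int dst_linesize,
                       const uint8_t *src, int src_pitch,
                       int bytewidth, int height)
{
    for (int y = 0; y < height; y++) {
        memcpy(dst, src, bytewidth);
        dst += dst_linesize;
        src += src_pitch;
    }
}
```

That costs a copy per frame, but keeps the refcounting semantics of
AVFrame intact.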

> +
> +        out->pts = ((av_rescale_q(0, inlink->time_base,
> +                                  outlink->time_base) + vpp->frame_number) * 1000);

How does using this function with 0 as first argument make sense? Isn't
the result always 0 then?
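For reference, the conventional way to carry a timestamp across filter
links is

    out->pts = av_rescale_q(in->pts, inlink->time_base, outlink->time_base);

since av_rescale_q(a, bq, cq) computes a * bq / cq in rationals, so with
a == 0 the result is always 0. A simplified stand-in showing the
arithmetic (the real function additionally guards against overflow and
supports rounding modes):

```c
#include <assert.h>
#include <stdint.h>

struct rational { int num, den; };  /* stand-in for AVRational */

/* Simplified av_rescale_q(): a * bq / cq, no overflow protection. */
static int64_t rescale_q(int64_t a, struct rational bq, struct rational cq)
{
    return a * bq.num * cq.den / ((int64_t)bq.den * cq.num);
}
```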

> +
> +        filter_frame_ret = ff_filter_frame(inlink->dst->outputs[0], out);
> +
> +        if (filter_frame_ret < 0) {
> +            av_log(ctx, AV_LOG_ERROR, "ff_filter_frame %d\n", filter_frame_ret);
> +            av_frame_free(&in);
> +            av_frame_free(&out);
> +            return filter_frame_ret;
> +        }
> +
> +        vpp->frame_number++;
> +
> +    } while (ret == MFX_ERR_MORE_SURFACE);
> +
> +    av_frame_free(&in);
> +
> +    return 0;
> +}
> +
> +static av_cold int vpp_init(AVFilterContext *ctx)
> +{
> +    VPPContext *vpp= ctx->priv;
> +
> +    AVCodecContext *avctx = (AVCodecContext *)ctx;

Excuse me, what??? AVFilterContext and AVCodecContext are completely
unrelated structs; casting one to the other is invalid, and anything
accessed through this pointer is undefined behavior.

> +
> +    if (!vpp->session) {
> +        int ret = ff_qsv_init_internal_session(avctx, &vpp->internal_qs,
> +                                               vpp->load_plugins);
> +
> +        if (ret < 0)
> +            return ret;
> +
> +        vpp->session = vpp->internal_qs.session;
> +    }
> +
> +    vpp->frame_number = 0;
> +    vpp->cur_out_idx = 0;
> +
> +    vpp->num_of_out_surfaces = 1;
> +
> +    vpp->in_surface = NULL;
> +    vpp->out_surface = NULL;
> +    vpp->surface_buffers_out = NULL;
> +
> +    return 0;
> +}
> +
> +static av_cold void vpp_uninit(AVFilterContext *ctx)
> +{
> +    VPPContext *vpp= ctx->priv;
> +
> +    free_surface(ctx->inputs[0]);
> +
> +    ff_qsv_close_internal_session(&vpp->internal_qs);
> +}
> +
> +static const AVFilterPad vpp_inputs[] = {
> +    {
> +        .name          = "default",
> +        .type          = AVMEDIA_TYPE_VIDEO,
> +        .config_props  = config_input,
> +        .filter_frame  = filter_frame,
> +    },
> +    { NULL }
> +};
> +
> +static const AVFilterPad vpp_outputs[] = {
> +    {
> +        .name          = "default",
> +        .type          = AVMEDIA_TYPE_VIDEO,
> +    },
> +    { NULL }
> +};
> +
> +AVFilter ff_vf_vppqsv = {
> +    .name          = "vppqsv",
> +    .description   = NULL_IF_CONFIG_SMALL("Quick Sync Video VPP."),
> +    .priv_size     = sizeof(VPPContext),
> +    .query_formats = query_formats,
> +    .init          = vpp_init,
> +    .uninit        = vpp_uninit,
> +    .inputs        = vpp_inputs,
> +    .outputs       = vpp_outputs,
> +    .priv_class    = &vpp_class,
> +};


