[FFmpeg-devel] [PATCH 1/2] avfilter: add scale_d3d11 filter

Fri May 30 11:01:10 EEST 2025

Hi,

Thanks for your review and input. Understood.

Instead of making the changes here, I'm trying to create a hw_frames_ctx with the required configuration and pass it when opening the decoder, but I'm hitting an exception at av_buffer_unref(&dst->hw_frames_ctx) in update_context_from_thread. The call stack is included below. Could you help me understand this error? I appreciate your time.

I'll share the updated patch for the scale_d3d11 filter after resolving these errors.


Call stack:

Unhandled exception at 0x00007FFAE1644FCC (ntdll.dll) in HandBrakeCLI.exe: Unknown __fastfail() status code: 0x0000000000000023.

[Inline Frame] buffer_replace(AVBufferRef * * dst, AVBufferRef * * src) Line 121 (ffmpeg\libavutil\buffer.c:121)
av_buffer_unref(AVBufferRef * * buf) Line 144 (ffmpeg\libavutil\buffer.c:144)
update_context_from_thread(AVCodecContext * dst, const AVCodecContext * src, int for_user) Line 399 (ffmpeg\libavcodec\pthread_frame.c:399)
[Inline Frame] submit_packet(PerThreadContext * p, AVCodecContext * user_avctx, AVPacket * in_pkt) Line 540 (ffmpeg\libavcodec\pthread_frame.c:540)
ff_thread_receive_frame(AVCodecContext * avctx, AVFrame * frame) Line 585 (ffmpeg\libavcodec\pthread_frame.c:585)
decode_receive_frame_internal(AVCodecContext * avctx, AVFrame * frame) Line 663 (ffmpeg\libavcodec\decode.c:663)
avcodec_send_packet(AVCodecContext * avctx, const AVPacket * avpkt) Line 753 (ffmpeg\libavcodec\decode.c:753)
decodeFrame(hb_work_private_s * pv, packet_info_t * packet_info) Line 1768 (HandBrake\libhb\decavcodec.c:1768)
decodePacket(hb_work_object_s * w) Line 2184 (HandBrake\libhb\decavcodec.c:2184)
decavcodecvWork(hb_work_object_s * w, hb_buffer_s * * buf_in, hb_buffer_s * * buf_out) Line 2353 (HandBrake\libhb\decavcodec.c:2353)
hb_work_loop(void * _w) Line 2350 (HandBrake\libhb\work.c:2350)


On 23-05-2025 00:12, softworkz . wrote:




-----Original Message-----
From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of Dash Santosh Sathyanarayanan
Sent: Thursday, 22 May 2025 19:58
To: ffmpeg-devel at ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH 1/2] avfilter: add scale_d3d11 filter

On 22-05-2025 20:55, Timo Rothenpieler wrote:


On 22/05/2025 15:20, Dash Santosh Sathyanarayanan wrote:


This commit introduces a new hardware-accelerated video filter, scale_d3d11,
which performs scaling and format conversion using Direct3D 11. The filter
enables efficient GPU-based scaling and pixel format conversion (p010 to nv12),
reducing CPU overhead and latency in video pipelines.
---
  Changelog                     |   1 +
  libavcodec/decode.c           |   2 +-
  libavcodec/dxva2.c            |   3 +
  libavfilter/Makefile          |   1 +
  libavfilter/allfilters.c      |   1 +
  libavfilter/vf_scale_d3d11.c  | 480 ++++++++++++++++++++++++++++++++++
  libavutil/hwcontext_d3d11va.c |  40 ++-
  7 files changed, 514 insertions(+), 14 deletions(-)
  create mode 100644 libavfilter/vf_scale_d3d11.c

diff --git a/Changelog b/Changelog
index 4217449438..68610a63d0 100644
--- a/Changelog
+++ b/Changelog
@@ -18,6 +18,7 @@ version <next>:
  - APV encoding support through a libopenapv wrapper
  - VVC decoder supports all content of SCC (Screen Content Coding):
    IBC (Inter Block Copy), Palette Mode and ACT (Adaptive Color Transform
+- vf_scale_d3d11 filter


Bit of a nit, this could at least say "Added".


Oops, sorry. My bad.


      version 7.1:
diff --git a/libavcodec/decode.c b/libavcodec/decode.c
index c2b2dd6e3b..a796ae7930 100644
--- a/libavcodec/decode.c
+++ b/libavcodec/decode.c
@@ -1079,7 +1079,7 @@ int ff_decode_get_hw_frames_ctx(AVCodecContext *avctx,
      if (frames_ctx->initial_pool_size) {
          // We guarantee 4 base work surfaces. The function above guarantees 1
          // (the absolute minimum), so add the missing count.
-        frames_ctx->initial_pool_size += 3;
+        frames_ctx->initial_pool_size += 33;


This seems a bit extreme, and can potentially drastically increase
VRAM usage of anything using d3d11va.


In a full hardware pipeline, when all surfaces are in use, we hit pool
exhaustion and the 'static pool size exceeded' error occurs. Hence the
change. The memory footprint increased by about 100 MB with this change.


Hi,

this is not the right place for this change as it affects all hw accelerations.
Also, it sets it unconditionally, even when no filter is in play.
Since there is no direct relation between the decoder and the filter,
it's hard for the decoder to "know" whether its output will be connected
to a filter. The current way to handle this is to specify the
extra_hw_frames parameter on the command line in cases where more frames
are needed.

Note that there's an extra_hw_frames decoder parameter and an extra_hw_frames
filter parameter. Here's an example where it is used:
https://trac.ffmpeg.org/wiki/Hardware/AMF
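
To illustrate the point about opting in rather than raising the pool size for everyone, here is a minimal, self-contained sketch. It is not libavcodec code: PoolRequest, pool_size and BASE_WORK_SURFACES are hypothetical names used only for illustration. The only figures taken from this thread are the existing "+= 3" from the quoted hunk and the idea that extra frames are added only when the user asks for them.

```c
/* Hypothetical stand-in for the pool-related inputs discussed above. */
typedef struct PoolRequest {
    int codec_min_surfaces; /* surfaces the codec itself requires */
    int extra_hw_frames;    /* user opt-in; <= 0 is treated as "not set" */
} PoolRequest;

/* Mirrors the existing "+= 3" in the quoted hunk (assumption: unchanged). */
enum { BASE_WORK_SURFACES = 3 };

/* Compute a fixed pool size: the base guarantee always applies,
 * extra surfaces are added only when explicitly requested. */
static int pool_size(const PoolRequest *r)
{
    int n = r->codec_min_surfaces + BASE_WORK_SURFACES;

    if (r->extra_hw_frames > 0)
        n += r->extra_hw_frames;

    return n;
}
```

With this shape, a pipeline without any filter keeps the small default, and only a pipeline that requests extra frames pays the additional VRAM cost.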





        ret = av_hwframe_ctx_init(avctx->hw_frames_ctx);
diff --git a/libavcodec/dxva2.c b/libavcodec/dxva2.c
index 22ecd5acaf..37dab6cd68 100644
--- a/libavcodec/dxva2.c
+++ b/libavcodec/dxva2.c
@@ -647,6 +647,9 @@ int ff_dxva2_common_frame_params(AVCodecContext *avctx,
          AVD3D11VAFramesContext *frames_hwctx = frames_ctx->hwctx;
            frames_hwctx->BindFlags |= D3D11_BIND_DECODER;
+        if (frames_ctx->sw_format == AV_PIX_FMT_NV12) {
+            frames_hwctx->BindFlags |= D3D11_BIND_VIDEO_ENCODER;
+        }


This change also seems a bit random here. Using NV12 does not
automatically mean you'll encode with it.


The encoder requires D3D11_BIND_VIDEO_ENCODER to be set on the input
surface when the D3D11 surface is sent directly to the MF encoder.
Currently the MF encoder supports only 8-bit (NV12). Hence the change. If the
input is 10-bit (P010), scale_d3d11 can be configured to output 8-bit NV12
frames.


The filter should be designed universally rather than expecting to be connected
to something specific at its output. Whether or not the bind_encoder bindflag
should be set needs to be determined during negotiation of the output
connection.

In case you wonder what else could come after this filter:

- another instance of this filter
- a hwmap filter to derive to a QSV context (followed by QSV filters or a QSV encoder)
- a hwmap filter to other contexts (maybe AMF)
- the Nvidia NVENC encoders are said to be able to take D3D11 frames as input directly
  (without hwmap), I've never tried that, but if it works, it might also work to
  connect them to this filter (given that the D3D11 context is an Nvidia GPU)

I don't mean to say that you should get all this working - just to illustrate
the scope of possibilities.






Did not look at the rest yet.


+        return AVERROR_EXTERNAL;
+    }
+
+    ///< Set up output frame
+    ret = av_frame_copy_props(out, in);
+    if (ret < 0) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to copy frame properties\n");
+        videoContext->lpVtbl->Release(videoContext);
+        inputView->lpVtbl->Release(inputView);
+        av_frame_free(&in);
+        av_frame_free(&out);
+        return ret;
+    }
+
+    out->data[0] = (uint8_t *)output_texture;
+    out->data[1] = (uint8_t *)(intptr_t)0;
+    out->width = s->width;
+    out->height = s->height;
+    out->format = AV_PIX_FMT_D3D11;
+
+    ///< Clean up resources
+    inputView->lpVtbl->Release(inputView);
+    videoContext->lpVtbl->Release(videoContext);
+    if (s->outputView) {
+        s->outputView->lpVtbl->Release(s->outputView);
+        s->outputView = NULL;
+    }
+    av_frame_free(&in);
+
+    ///< Forward the frame
+    return ff_filter_frame(outlink, out);
+}
+
+static int scale_d3d11_config_props(AVFilterLink *outlink)
+{
+    AVFilterContext *ctx = outlink->src;
+    ScaleD3D11Context *s = ctx->priv;
+    AVFilterLink *inlink = ctx->inputs[0];
+    FilterLink *inl = ff_filter_link(inlink);
+    FilterLink *outl = ff_filter_link(outlink);
+    int ret;
+
+    ///< Clean up any previous resources
+    release_d3d11_resources(s);
+
+    ///< Evaluate output dimensions
+    ret = ff_scale_eval_dimensions(s, s->w_expr, s->h_expr, inlink, outlink, &s->width, &s->height);
+    if (ret < 0) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to evaluate dimensions\n");
+        return ret;
+    }
+
+    outlink->w = s->width;
+    outlink->h = s->height;
+
+    ///< Validate input hw_frames_ctx
+    if (!inl->hw_frames_ctx) {
+        av_log(ctx, AV_LOG_ERROR, "No hw_frames_ctx available on input link\n");
+        return AVERROR(EINVAL);
+    }
+
+    ///< Propagate hw_frames_ctx to output
+    outl->hw_frames_ctx = av_buffer_ref(inl->hw_frames_ctx);
+    if (!outl->hw_frames_ctx) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to propagate hw_frames_ctx to output\n");
+        return AVERROR(ENOMEM);
+    }
+
+    ///< Initialize filter's hardware device context
+    if (!s->hw_device_ctx) {
+        AVHWFramesContext *in_frames_ctx = (AVHWFramesContext *)inl->hw_frames_ctx->data;
+        s->hw_device_ctx = av_buffer_ref(in_frames_ctx->device_ref);
+        if (!s->hw_device_ctx) {
+            av_log(ctx, AV_LOG_ERROR, "Failed to initialize filter hardware device context\n");
+            return AVERROR(ENOMEM);
+        }
+    }
+
+    ///< Get D3D11 device and context (but don't initialize processor yet - done in filter_frame)
+    AVHWDeviceContext *hwctx = (AVHWDeviceContext *)s->hw_device_ctx->data;
+    AVD3D11VADeviceContext *d3d11_hwctx = (AVD3D11VADeviceContext *)hwctx->hwctx;
+
+    s->device = d3d11_hwctx->device;
+    s->context = d3d11_hwctx->device_context;
+
+    if (!s->device || !s->context) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to get valid D3D11 device or context\n");
+        return AVERROR(EINVAL);
+    }
+
+    ///< Create new hardware frames context for output
+    AVHWFramesContext *in_frames_ctx = (AVHWFramesContext *)inl->hw_frames_ctx->data;
+    s->hw_frames_ctx_out = av_hwframe_ctx_alloc(s->hw_device_ctx);
+    if (!s->hw_frames_ctx_out)
+        return AVERROR(ENOMEM);
+
+    enum AVPixelFormat sw_format;
+    switch (s->output_format_opt) {
+        case OUTPUT_NV12:
+            sw_format = AV_PIX_FMT_NV12;
+            break;
+        case OUTPUT_P010:
+            sw_format = AV_PIX_FMT_P010;
+            break;
+        default:
+            return AVERROR(EINVAL);
+    }
+
+    AVHWFramesContext *frames_ctx = (AVHWFramesContext *)s->hw_frames_ctx_out->data;
+    frames_ctx->format = AV_PIX_FMT_D3D11;
+    frames_ctx->sw_format = sw_format;
+    frames_ctx->width = s->width;
+    frames_ctx->height = s->height;
+    frames_ctx->initial_pool_size = 30; ///< Adjust pool size as needed


This should have a lower default and use the extra_hw_frames option that all
filters have to add to it.




+
+    AVD3D11VAFramesContext *frames_hwctx = frames_ctx->hwctx;
+    frames_hwctx->MiscFlags = 0;
+    frames_hwctx->BindFlags = D3D11_BIND_RENDER_TARGET | D3D11_BIND_VIDEO_ENCODER;
+
+    ret = av_hwframe_ctx_init(s->hw_frames_ctx_out);
+    if (ret < 0) {
+        av_buffer_unref(&s->hw_frames_ctx_out);
+        return ret;
+    }
+
+    outl->hw_frames_ctx = av_buffer_ref(s->hw_frames_ctx_out);
+    if (!outl->hw_frames_ctx)
+        return AVERROR(ENOMEM);
+
+    av_log(ctx, AV_LOG_VERBOSE, "D3D11 scale config: %dx%d -> %dx%d\n",
+           inlink->w, inlink->h, outlink->w, outlink->h);
+    return 0;
+}
+
+static av_cold void scale_d3d11_uninit(AVFilterContext *ctx) {
+    ScaleD3D11Context *s = ctx->priv;
+
+    ///< Release D3D11 resources
+    release_d3d11_resources(s);
+
+    ///< Free the hardware device context reference
+    av_buffer_unref(&s->hw_frames_ctx_out);
+    av_buffer_unref(&s->hw_device_ctx);
+
+    ///< Free option strings
+    av_freep(&s->w_expr);
+    av_freep(&s->h_expr);
+}
+
+static const AVFilterPad scale_d3d11_inputs[] = {
+    {
+        .name         = "default",
+        .type         = AVMEDIA_TYPE_VIDEO,
+        .filter_frame = scale_d3d11_filter_frame,
+    },
+};
+
+static const AVFilterPad scale_d3d11_outputs[] = {
+    {
+        .name         = "default",
+        .type         = AVMEDIA_TYPE_VIDEO,
+        .config_props = scale_d3d11_config_props,
+    },
+};
+
+#define OFFSET(x) offsetof(ScaleD3D11Context, x)
+#define FLAGS (AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM)
+
+static const AVOption scale_d3d11_options[] = {
+    { "width",  "Output video width",  OFFSET(w_expr), AV_OPT_TYPE_STRING, {.str = "iw"}, .flags = FLAGS },
+    { "height", "Output video height", OFFSET(h_expr), AV_OPT_TYPE_STRING, {.str = "ih"}, .flags = FLAGS },
+    { "output_fmt", "Output format", OFFSET(output_format_opt), AV_OPT_TYPE_INT, {.i64 = OUTPUT_NV12}, 0, OUTPUT_P010, FLAGS, "fmt" },
+    { "nv12", "NV12 format", 0, AV_OPT_TYPE_CONST, {.i64 = OUTPUT_NV12}, 0, 0, FLAGS, "fmt" },
+    { "p010", "P010 format", 0, AV_OPT_TYPE_CONST, {.i64 = OUTPUT_P010}, 0, 0, FLAGS, "fmt" },
+    { NULL }


There's a specific option type for that: AV_OPT_TYPE_PIXEL_FMT





+};
+
+AVFILTER_DEFINE_CLASS(scale_d3d11);
+
+const FFFilter ff_vf_scale_d3d11 = {
+    .p.name           = "scale_d3d11",
+    .p.description    = NULL_IF_CONFIG_SMALL("Scale video using Direct3D11"),
+    .priv_size        = sizeof(ScaleD3D11Context),
+    .p.priv_class     = &scale_d3d11_class,
+    .init             = scale_d3d11_init,
+    .uninit           = scale_d3d11_uninit,
+    FILTER_INPUTS(scale_d3d11_inputs),
+    FILTER_OUTPUTS(scale_d3d11_outputs),
+    FILTER_SINGLE_PIXFMT(AV_PIX_FMT_D3D11),
+    .p.flags          = AVFILTER_FLAG_HWDEVICE,
+    .flags_internal   = FF_FILTER_FLAG_HWFRAME_AWARE,
+};
\ No newline at end of file
diff --git a/libavutil/hwcontext_d3d11va.c b/libavutil/hwcontext_d3d11va.c
index 1a047ce57b..36694896e4 100644
--- a/libavutil/hwcontext_d3d11va.c
+++ b/libavutil/hwcontext_d3d11va.c
@@ -82,6 +82,8 @@ typedef struct D3D11VAFramesContext {
        int nb_surfaces;
      int nb_surfaces_used;
+    int retries;
+    int max_retries;
        DXGI_FORMAT format;
  @@ -258,7 +260,9 @@ static AVBufferRef *d3d11va_pool_alloc(void *opaque, size_t size)
      ID3D11Texture2D_GetDesc(hwctx->texture, &texDesc);
        if (s->nb_surfaces_used >= texDesc.ArraySize) {
-        av_log(ctx, AV_LOG_ERROR, "Static surface pool size exceeded.\n");
+        if (s->retries >= s->max_retries) {
+            av_log(ctx, AV_LOG_ERROR, "Static surface pool size exceeded.\n");
+        }
          return NULL;
      }
  @@ -339,20 +343,30 @@ static int d3d11va_frames_init(AVHWFramesContext *ctx)
  static int d3d11va_get_buffer(AVHWFramesContext *ctx, AVFrame *frame)
  {
      AVD3D11FrameDescriptor *desc;
+    D3D11VAFramesContext       *s = ctx->hwctx;
+    s->retries = 0;
+    s->max_retries = 50;
+
+    while (s->retries < s->max_retries) {
+
+        frame->buf[0] = av_buffer_pool_get(ctx->pool);
+        if (frame->buf[0]) {
+            desc = (AVD3D11FrameDescriptor *)frame->buf[0]->data;
+
+            frame->data[0] = (uint8_t *)desc->texture;
+            frame->data[1] = (uint8_t *)desc->index;
+            frame->format  = AV_PIX_FMT_D3D11;
+            frame->width   = ctx->width;
+            frame->height  = ctx->height;
+
+            return 0;
+        }
  -    frame->buf[0] = av_buffer_pool_get(ctx->pool);
-    if (!frame->buf[0])
-        return AVERROR(ENOMEM);
-
-    desc = (AVD3D11FrameDescriptor *)frame->buf[0]->data;
-
-    frame->data[0] = (uint8_t *)desc->texture;
-    frame->data[1] = (uint8_t *)desc->index;
-    frame->format  = AV_PIX_FMT_D3D11;
-    frame->width   = ctx->width;
-    frame->height  = ctx->height;
+        av_usleep(1000);
+        s->retries++;
+    }
  -    return 0;
+    return AVERROR(ENOMEM);
  }
    static int d3d11va_transfer_get_formats(AVHWFramesContext *ctx,


I'm afraid this loop is not right. Please take a look at how other
filters are handling the case when no hardware frame is available.
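
As a rough illustration of an alternative to sleeping and retrying inside the allocator, the sketch below makes a single attempt and hands the failure straight back to the caller. This is an assumption about the general pattern, not a copy of any FFmpeg filter: SurfacePool, pool_get, pool_release and get_buffer_once are hypothetical names, and -ENOMEM stands in for AVERROR(ENOMEM).

```c
#include <errno.h>

#define POOL_SIZE 4 /* hypothetical fixed pool size */

/* Hypothetical fixed-size surface pool: one in-use flag per slot. */
typedef struct SurfacePool {
    int in_use[POOL_SIZE];
} SurfacePool;

/* Return a free slot index, or -1 if the pool is exhausted. Never waits. */
static int pool_get(SurfacePool *p)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Mark a slot as free again, e.g. once a downstream consumer is done. */
static void pool_release(SurfacePool *p, int idx)
{
    if (idx >= 0 && idx < POOL_SIZE)
        p->in_use[idx] = 0;
}

/* Single attempt: on exhaustion, report the error to the caller
 * instead of blocking the allocating thread. */
static int get_buffer_once(SurfacePool *p, int *out_idx)
{
    int idx = pool_get(p);

    if (idx < 0)
        return -ENOMEM;

    *out_idx = idx;
    return 0;
}
```

The caller then decides what to do with the error (fail the frame, or size the pool appropriately up front), which keeps the allocator itself free of timing assumptions such as a retry count or sleep interval.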

Thanks
sw
