[FFmpeg-devel] [PATCH 1/3] lavu: add a Vulkan hwcontext
Rostislav Pehlivanov
atomnuker at gmail.com
Tue Apr 3 03:42:36 EEST 2018
On 2 April 2018 at 21:24, Mark Thompson <sw at jkqxz.net> wrote:
> On 30/03/18 04:14, Rostislav Pehlivanov wrote:
> > This commit adds a Vulkan hwcontext, currently capable of mapping DRM and
> > VAAPI frames but additional functionality can be added later to support
> > importing of D3D11 surfaces as well as exporting to various other APIs.
>
> Assuming you haven't done this already, I think it would be a good idea to
> at least see it interoperating with one of the Windows APIs (D3D11 is
> probably most interesting) in case there is some unforeseen problem there
> (doesn't have to be in committable form if you don't want to check
> everything fully).
>
> > This context requires the newest stable version of the Vulkan API,
> > and once the new extension for DRM surfaces makes it in will also require
> > it (in order to properly and fully import them).
>
> What is the dependency for this extension? Presumably you need something
> in the headers - will that be present in any official headers after some
> version (including Windows, say), or does it need some platform dependency
> as well?
>
Unlike OpenCL, there's a single loader library, so all defines are defined
on all platforms. I just have the ifdef to do testing until its accepted.
>
> > It makes use of every part of the Vulkan spec in order to ensure fastest
> > possible uploading, downloading and mapping of frames. On AMD, it will
> > also make use of mapping host memory frames in order to upload
> > very efficiently and with minimal CPU to hardware.
> >
> > To be useful for non-RGB images an implementation with the YUV images
> > extension is needed. All current implementations support that with the
> > exception of AMD, though support is coming soon for Mesa.
>
> I may have asked this before, but can you explain what is actually gained
> by using the combined images rather than separate planes? It seems
> unfortunate to be requiring it when it isn't present on all implementations.
>
> > This is RFC-only, its not meant for merging as is. In particular, I'd
> > like to know whether the exposed hwcontext API is fine.
> >
> > Signed-off-by: Rostislav Pehlivanov <atomnuker at gmail.com>
> > ---
> > configure | 7 +-
> > doc/APIchanges | 3 +
> > libavutil/Makefile | 3 +
> > libavutil/hwcontext.c | 4 +
> > libavutil/hwcontext.h | 1 +
> > libavutil/hwcontext_internal.h | 1 +
> > libavutil/hwcontext_vulkan.c | 1952 ++++++++++++++++++++++++++++++
> ++++++++++
> > libavutil/hwcontext_vulkan.h | 116 +++
> > libavutil/pixdesc.c | 4 +
> > libavutil/pixfmt.h | 4 +
> > libavutil/version.h | 2 +-
> > 11 files changed, 2095 insertions(+), 2 deletions(-)
> > create mode 100644 libavutil/hwcontext_vulkan.c
> > create mode 100644 libavutil/hwcontext_vulkan.h
> >
> > diff --git a/configure b/configure
> > index 99570a1415..2213f0452d 100755
> > --- a/configure
> > +++ b/configure
> > @@ -297,6 +297,7 @@ External library support:
> > --enable-opengl enable OpenGL rendering [no]
> > --enable-openssl enable openssl, needed for https support
> > if gnutls or libtls is not used [no]
> > + --enable-vulkan enable Vulkan code [no]
> > --disable-sndio disable sndio support [autodetect]
> > --disable-schannel disable SChannel SSP, needed for TLS support
> on
> > Windows if openssl and gnutls are not used
> [autodetect]
> > @@ -1761,6 +1762,7 @@ HWACCEL_LIBRARY_LIST="
> > mmal
> > omx
> > opencl
> > + vulkan
> > "
> >
> > DOCUMENT_LIST="
> > @@ -3442,7 +3444,7 @@ avformat_deps="avcodec avutil"
> > avformat_suggest="libm network zlib"
> > avresample_deps="avutil"
> > avresample_suggest="libm"
> > -avutil_suggest="clock_gettime ffnvcodec libm libdrm libmfx opencl
> user32 vaapi videotoolbox corefoundation corevideo coremedia wincrypt"
> > +avutil_suggest="clock_gettime ffnvcodec libm libdrm libmfx opencl
> vulkan user32 vaapi videotoolbox corefoundation corevideo coremedia
> wincrypt"
> > postproc_deps="avutil gpl"
> > postproc_suggest="libm"
> > swresample_deps="avutil"
> > @@ -6314,6 +6316,9 @@ enabled vdpau &&
> >
> > enabled crystalhd && check_lib crystalhd "stdint.h
> libcrystalhd/libcrystalhd_if.h" DtsCrystalHDVersion -lcrystalhd
> >
> > +enabled vulkan &&
> > + check_lib vulkan "vulkan/vulkan.h" vkCreateInstance -lvulkan
> > +
> > if enabled x86; then
> > case $target_os in
> > mingw32*|mingw64*|win32|win64|linux|cygwin*)
> > diff --git a/doc/APIchanges b/doc/APIchanges
> > index 83c7a40a55..cd38dee916 100644
> > --- a/doc/APIchanges
> > +++ b/doc/APIchanges
> > @@ -15,6 +15,9 @@ libavutil: 2017-10-21
> >
> > API changes, most recent first:
> >
> > +2018-03-xx - xxxxxxxxxx - lavu 56.13.100 - hwcontext.h
> > + Add AV_HWDEVICE_TYPE_VULKAN and implementation.
> > +
> > 2018-03-xx - xxxxxxx - lavc 58.16.100 - avcodec.h
> > Add FF_SUB_CHARENC_MODE_IGNORE.
> >
> > diff --git a/libavutil/Makefile b/libavutil/Makefile
> > index a63ba523c9..aa641d78ed 100644
> > --- a/libavutil/Makefile
> > +++ b/libavutil/Makefile
> > @@ -42,6 +42,7 @@ HEADERS = adler32.h
> \
> > hwcontext_vaapi.h
> \
> > hwcontext_videotoolbox.h
> \
> > hwcontext_vdpau.h
> \
> > + hwcontext_vulkan.h
> \
> > imgutils.h
> \
> > intfloat.h
> \
> > intreadwrite.h
> \
> > @@ -168,6 +169,7 @@ OBJS-$(CONFIG_VAAPI) +=
> hwcontext_vaapi.o
> > OBJS-$(CONFIG_VIDEOTOOLBOX) += hwcontext_videotoolbox.o
> > OBJS-$(CONFIG_VDPAU) += hwcontext_vdpau.o
> > OBJS-$(CONFIG_MEDIACODEC) += hwcontext_mediacodec.o
> > +OBJS-$(CONFIG_VULKAN) += hwcontext_vulkan.o
> >
> > OBJS += $(COMPAT_OBJS:%=../compat/%)
> >
> > @@ -183,6 +185,7 @@ SKIPHEADERS-$(CONFIG_OPENCL) +=
> hwcontext_opencl.h
> > SKIPHEADERS-$(CONFIG_VAAPI) += hwcontext_vaapi.h
> > SKIPHEADERS-$(CONFIG_VIDEOTOOLBOX) += hwcontext_videotoolbox.h
> > SKIPHEADERS-$(CONFIG_VDPAU) += hwcontext_vdpau.h
> > +SKIPHEADERS-$(CONFIG_VULKAN) += hwcontext_vulkan.h
> >
> > TESTPROGS = adler32
> \
> > aes
> \
> > diff --git a/libavutil/hwcontext.c b/libavutil/hwcontext.c
> > index 70c556ecac..9993279da1 100644
> > --- a/libavutil/hwcontext.c
> > +++ b/libavutil/hwcontext.c
> > @@ -58,6 +58,9 @@ static const HWContextType * const hw_table[] = {
> > #endif
> > #if CONFIG_MEDIACODEC
> > &ff_hwcontext_type_mediacodec,
> > +#endif
> > +#if CONFIG_VULKAN
> > + &ff_hwcontext_type_vulkan,
> > #endif
> > NULL,
> > };
> > @@ -73,6 +76,7 @@ static const char *const hw_type_names[] = {
> > [AV_HWDEVICE_TYPE_VDPAU] = "vdpau",
> > [AV_HWDEVICE_TYPE_VIDEOTOOLBOX] = "videotoolbox",
> > [AV_HWDEVICE_TYPE_MEDIACODEC] = "mediacodec",
> > + [AV_HWDEVICE_TYPE_VULKAN] = "vulkan",
> > };
> >
> > enum AVHWDeviceType av_hwdevice_find_type_by_name(const char *name)
> > diff --git a/libavutil/hwcontext.h b/libavutil/hwcontext.h
> > index f5a4b62387..f874af9f8f 100644
> > --- a/libavutil/hwcontext.h
> > +++ b/libavutil/hwcontext.h
> > @@ -36,6 +36,7 @@ enum AVHWDeviceType {
> > AV_HWDEVICE_TYPE_DRM,
> > AV_HWDEVICE_TYPE_OPENCL,
> > AV_HWDEVICE_TYPE_MEDIACODEC,
> > + AV_HWDEVICE_TYPE_VULKAN,
> > };
> >
> > typedef struct AVHWDeviceInternal AVHWDeviceInternal;
> > diff --git a/libavutil/hwcontext_internal.h b/libavutil/hwcontext_
> internal.h
> > index 332062ddaa..c0ea7a7e3e 100644
> > --- a/libavutil/hwcontext_internal.h
> > +++ b/libavutil/hwcontext_internal.h
> > @@ -167,5 +167,6 @@ extern const HWContextType ff_hwcontext_type_vaapi;
> > extern const HWContextType ff_hwcontext_type_vdpau;
> > extern const HWContextType ff_hwcontext_type_videotoolbox;
> > extern const HWContextType ff_hwcontext_type_mediacodec;
> > +extern const HWContextType ff_hwcontext_type_vulkan;
> >
> > #endif /* AVUTIL_HWCONTEXT_INTERNAL_H */
> > diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
> > new file mode 100644
> > index 0000000000..93d0892f25
> > --- /dev/null
> > +++ b/libavutil/hwcontext_vulkan.c
> > @@ -0,0 +1,1952 @@
> > +/*
> > + * Vulkan hwcontext
> > + * Copyright (c) 2018 Rostislav Pehlivanov <atomnuker at gmail.com>
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301 USA
> > + */
> > +
> > +#include "config.h"
> > +#include "pixdesc.h"
> > +#include "avstring.h"
> > +#include "hwcontext.h"
> > +#include "hwcontext_internal.h"
> > +#include "hwcontext_vulkan.h"
> > +
> > +#if CONFIG_LIBDRM
> > +#include <xf86drm.h>
> > +#include <drm_fourcc.h>
> > +#include "hwcontext_drm.h"
> > +#if CONFIG_VAAPI
> > +#include <va/va_drmcommon.h>
> > +#include "hwcontext_vaapi.h"
> > +#endif
> > +#endif
> > +
> > +typedef struct VulkanDevicePriv {
> > + /* Properties */
> > + VkPhysicalDeviceProperties props;
> > + VkPhysicalDeviceMemoryProperties mprops;
> > +
> > + /* Debug callback */
> > + VkDebugUtilsMessengerEXT debug_ctx;
> > +
> > + /* Image uploading */
> > + AVVkExecContext exec;
> > +
> > + /* Extensions */
> > + uint64_t extensions;
> > +
> > + /* Settings */
> > + int use_linear_images;
> > +} VulkanDevicePriv;
> > +
> > +#define HAS_DRM_MODIFIERS 0
>
> Should this be coming from configure?
>
This was for testing. Still not sure whether to hide it behind ifdef always
- it's required for proper DMA mapping and will be available on all
platforms.
>
> > +
> > +#define DEFAULT_USAGE_FLAGS (VK_IMAGE_USAGE_SAMPLED_BIT |
> \
> > + VK_IMAGE_USAGE_STORAGE_BIT |
> \
> > + VK_IMAGE_USAGE_TRANSFER_SRC_BIT |
> \
> > + VK_IMAGE_USAGE_TRANSFER_DST_BIT)
> > +
> > +#define ADD_VAL_TO_LIST(list, count, val)
> \
> > + do {
> \
> > + list = av_realloc_array(list, sizeof(*list), ++count);
> \
> > + if (!list) {
> \
> > + err = AVERROR(ENOMEM);
> \
> > + goto end;
> \
> > + }
> \
> > + list[count - 1] = val;
> \
> > + } while(0)
> > +
> > +static const VkFormat av_vk_format_map[AV_PIX_FMT_NB] = {
>
> Not "av_" for local tables.
Fixed.
>
> > + /* Gray */
> > + [AV_PIX_FMT_GRAY8] = VK_FORMAT_R8_UNORM,
> > + [AV_PIX_FMT_GRAY10] = VK_FORMAT_R10X6_UNORM_PACK16,
> > + [AV_PIX_FMT_GRAY12] = VK_FORMAT_R12X4_UNORM_PACK16,
> > + [AV_PIX_FMT_GRAY16] = VK_FORMAT_R16_UNORM,
> > +
> > + /* Interleaved */
> > + [AV_PIX_FMT_NV12] = VK_FORMAT_G8_B8R8_2PLANE_420_UNORM,
> > + [AV_PIX_FMT_P010] = VK_FORMAT_G10X6_B10X6R10X6_
> 2PLANE_420_UNORM_3PACK16,
> > + [AV_PIX_FMT_P016] = VK_FORMAT_G16_B16R16_2PLANE_420_UNORM,
> > + [AV_PIX_FMT_NV16] = VK_FORMAT_G16_B16R16_2PLANE_422_UNORM,
> > + [AV_PIX_FMT_UYVY422] = VK_FORMAT_B16G16R16G16_422_UNORM,
> > + [AV_PIX_FMT_YVYU422] = VK_FORMAT_G16B16G16R16_422_UNORM,
> > +
> > + /* 420 */
> > + [AV_PIX_FMT_YUV420P] = VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM,
> > + [AV_PIX_FMT_YUV420P10] = VK_FORMAT_G10X6_B10X6_R10X6_
> 3PLANE_420_UNORM_3PACK16,
> > + [AV_PIX_FMT_YUV420P12] = VK_FORMAT_G12X4_B12X4_R12X4_
> 3PLANE_420_UNORM_3PACK16,
> > + [AV_PIX_FMT_YUV420P16] = VK_FORMAT_G16_B16_R16_3PLANE_420_UNORM,
> > +
> > + /* 422 */
> > + [AV_PIX_FMT_YUV422P] = VK_FORMAT_G8_B8_R8_3PLANE_422_UNORM,
> > + [AV_PIX_FMT_YUV422P10] = VK_FORMAT_G10X6_B10X6_R10X6_
> 3PLANE_422_UNORM_3PACK16,
> > + [AV_PIX_FMT_YUV422P12] = VK_FORMAT_G12X4_B12X4_R12X4_
> 3PLANE_422_UNORM_3PACK16,
> > + [AV_PIX_FMT_YUV422P16] = VK_FORMAT_G16_B16_R16_3PLANE_422_UNORM,
> > +
> > + /* 444 */
> > + [AV_PIX_FMT_YUV444P] = VK_FORMAT_G8_B8_R8_3PLANE_444_UNORM,
> > + [AV_PIX_FMT_YUV444P10] = VK_FORMAT_G10X6_B10X6_R10X6_
> 3PLANE_444_UNORM_3PACK16,
> > + [AV_PIX_FMT_YUV444P12] = VK_FORMAT_G12X4_B12X4_R12X4_
> 3PLANE_444_UNORM_3PACK16,
> > + [AV_PIX_FMT_YUV444P16] = VK_FORMAT_G16_B16_R16_3PLANE_444_UNORM,
> > +
> > + /* RGB */
> > + [AV_PIX_FMT_ABGR] = VK_FORMAT_A8B8G8R8_UNORM_PACK32,
> > + [AV_PIX_FMT_BGRA] = VK_FORMAT_B8G8R8A8_UNORM,
> > + [AV_PIX_FMT_RGBA] = VK_FORMAT_R8G8B8A8_UNORM,
> > + [AV_PIX_FMT_RGB24] = VK_FORMAT_R8G8B8_UNORM,
> > + [AV_PIX_FMT_BGR24] = VK_FORMAT_B8G8R8_UNORM,
> > + [AV_PIX_FMT_RGB48] = VK_FORMAT_R16G16B16_UNORM,
> > + [AV_PIX_FMT_RGBA64] = VK_FORMAT_R16G16B16A16_UNORM,
> > + [AV_PIX_FMT_RGB565] = VK_FORMAT_R5G6B5_UNORM_PACK16,
> > + [AV_PIX_FMT_BGR565] = VK_FORMAT_B5G6R5_UNORM_PACK16,
> > + [AV_PIX_FMT_BGR0] = VK_FORMAT_B8G8R8A8_UNORM,
> > + [AV_PIX_FMT_0BGR] = VK_FORMAT_A8B8G8R8_UNORM_PACK32,
> > + [AV_PIX_FMT_RGB0] = VK_FORMAT_R8G8B8A8_UNORM,
>
> Why are some of the 4-byte RGB[A0] formats marked PACK32 and others aren't?
>
Not a clue.
>
> Has there been any thought of planar RGB[A] or YUVA images? 3 is
> hard-coded in a few places below as the limit on the number of planes,
> which might not be so good for them.
>
Nope, neither. I'll make all the arrays in the code size 4 though, just in
case.
>
> > +};
> > +
> > +enum VulkanExtensions {
> > + EXT_DEDICATED_ALLOC = 1LL << 0, /*
> VK_KHR_dedicated_allocation */
> > + EXT_IMAGE_FORMAT_LIST = 1LL << 1, /* VK_KHR_image_format_list
> */
> > + EXT_EXTERNAL_MEMORY = 1LL << 2, /* VK_KHR_external_memory */
> > + EXT_EXTERNAL_HOST_MEMORY = 1LL << 3, /*
> VK_EXT_external_memory_host */
> > + EXT_EXTERNAL_FD_MEMORY = 1LL << 4, /*
> VK_KHR_external_memory_fd */
> > + EXT_EXTERNAL_DMABUF_MEMORY = 1LL << 5, /*
> VK_EXT_external_memory_dma_buf */
> > + EXT_DRM_MODIFIER_FLAGS = 1LL << 6, /* VK_EXT_image_drm_format_modifier
> */
> > + EXT_YUV_IMAGES = 1LL << 7, /* VK_KHR_sampler_ycbcr_conversion
> */
> > +
> > + EXT_REQUIRED = 1LL << 63,
> > +};
> > +
> > +typedef struct VulkanOptExtension {
> > + const char *name;
> > + uint64_t flag;
> > +} VulkanOptExtension;
> > +
> > +VulkanOptExtension optional_instance_exts[] = {
> > + { VK_KHR_EXTERNAL_MEMORY_CAPABILITIES_EXTENSION_NAME,
> EXT_EXTERNAL_MEMORY, },
> > + { VK_KHR_GET_PHYSICAL_DEVICE_PROPERTIES_2_EXTENSION_NAME,
> EXT_REQUIRED },
> > +};
> > +
> > +VulkanOptExtension optional_device_exts[] = {
> > + { VK_KHR_MAINTENANCE1_EXTENSION_NAME,
> EXT_REQUIRED },
> > + { VK_KHR_MAINTENANCE2_EXTENSION_NAME,
> EXT_REQUIRED },
> > + { VK_KHR_GET_MEMORY_REQUIREMENTS_2_EXTENSION_NAME,
> EXT_REQUIRED },
> > + { VK_KHR_BIND_MEMORY_2_EXTENSION_NAME,
> EXT_REQUIRED },
> > + { VK_KHR_DESCRIPTOR_UPDATE_TEMPLATE_EXTENSION_NAME,
> EXT_REQUIRED },
> > +
> > + { VK_KHR_DEDICATED_ALLOCATION_EXTENSION_NAME,
> EXT_DEDICATED_ALLOC, },
> > + { VK_KHR_IMAGE_FORMAT_LIST_EXTENSION_NAME,
> EXT_IMAGE_FORMAT_LIST, },
> > + { VK_KHR_EXTERNAL_MEMORY_EXTENSION_NAME,
> EXT_EXTERNAL_MEMORY, },
> > + { VK_EXT_EXTERNAL_MEMORY_HOST_EXTENSION_NAME,
> EXT_EXTERNAL_HOST_MEMORY, },
> > + { VK_KHR_EXTERNAL_MEMORY_FD_EXTENSION_NAME,
> EXT_EXTERNAL_FD_MEMORY, },
> > + { VK_EXT_EXTERNAL_MEMORY_DMA_BUF_EXTENSION_NAME,
> EXT_EXTERNAL_DMABUF_MEMORY, },
> > + { VK_KHR_SAMPLER_YCBCR_CONVERSION_EXTENSION_NAME,
> EXT_YUV_IMAGES },
> > +#if HAS_DRM_MODIFIERS
> > + { VK_EXT_IMAGE_DRM_FORMAT_MODIFIER_EXTENSION_NAME,
> EXT_DRM_MODIFIER_FLAGS, },
> > +#else
> > + { VK_EXT_EXTERNAL_MEMORY_DMA_BUF_EXTENSION_NAME,
> EXT_DRM_MODIFIER_FLAGS, },
> > +#endif
> > +};
> > +
> > +VkFormat av_vkfmt_from_pixfmt(enum AVPixelFormat p)
> > +{
> > + if ((p >= 0 && p < AV_PIX_FMT_NB) && av_vk_format_map[p])
> > + return av_vk_format_map[p];
> > + return VK_FORMAT_UNDEFINED;
> > +}
>
> I realise there is already a function like this in videotoolbox, but I
> don't think it's sensible to add separate external symbols like this for
> every API. Can we resurrect wm4's patch from a while ago which added
> hwcontext functions to do this mapping?
>
Sure, can you/wm4 link me?
>
> > +
> > +static int vkfmt_is_supported(AVVulkanDeviceContext *hwctx, enum
> AVPixelFormat p,
> > + int linear)
> > +{
> > + VkFormatFeatureFlags flags;
> > + VkFormatProperties2 prop = {
> > + .sType = VK_STRUCTURE_TYPE_FORMAT_PROPERTIES_2,
> > + };
> > + VkFormat fmt = av_vkfmt_from_pixfmt(p);
> > +
> > + if (fmt == VK_FORMAT_UNDEFINED)
> > + return 0;
> > +
> > + vkGetPhysicalDeviceFormatProperties2(hwctx->phys_dev, fmt, &prop);
> > + flags = linear ? prop.formatProperties.linearTilingFeatures :
> > + prop.formatProperties.optimalTilingFeatures;
> > +
> > + return !!(flags & (VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT |
> > + VK_FORMAT_FEATURE_STORAGE_IMAGE_BIT |
> > + VK_FORMAT_FEATURE_TRANSFER_SRC_BIT |
> > + VK_FORMAT_FEATURE_TRANSFER_DST_BIT));
> > +}
> > +
> > +/* Converts return values to strings */
> > +const char *av_vk_ret2str(VkResult res)
> > +{
> > +#define CASE(VAL) case VAL: return #VAL
> > + switch (res) {
> > + CASE(VK_SUCCESS);
> > + CASE(VK_NOT_READY);
> > + CASE(VK_TIMEOUT);
> > + CASE(VK_EVENT_SET);
> > + CASE(VK_EVENT_RESET);
> > + CASE(VK_INCOMPLETE);
> > + CASE(VK_ERROR_OUT_OF_HOST_MEMORY);
> > + CASE(VK_ERROR_OUT_OF_DEVICE_MEMORY);
> > + CASE(VK_ERROR_INITIALIZATION_FAILED);
> > + CASE(VK_ERROR_DEVICE_LOST);
> > + CASE(VK_ERROR_MEMORY_MAP_FAILED);
> > + CASE(VK_ERROR_LAYER_NOT_PRESENT);
> > + CASE(VK_ERROR_EXTENSION_NOT_PRESENT);
> > + CASE(VK_ERROR_FEATURE_NOT_PRESENT);
> > + CASE(VK_ERROR_INCOMPATIBLE_DRIVER);
> > + CASE(VK_ERROR_TOO_MANY_OBJECTS);
> > + CASE(VK_ERROR_FORMAT_NOT_SUPPORTED);
> > + CASE(VK_ERROR_FRAGMENTED_POOL);
> > + CASE(VK_ERROR_SURFACE_LOST_KHR);
> > + CASE(VK_ERROR_NATIVE_WINDOW_IN_USE_KHR);
> > + CASE(VK_SUBOPTIMAL_KHR);
> > + CASE(VK_ERROR_OUT_OF_DATE_KHR);
> > + CASE(VK_ERROR_INCOMPATIBLE_DISPLAY_KHR);
> > + CASE(VK_ERROR_VALIDATION_FAILED_EXT);
> > + CASE(VK_ERROR_INVALID_SHADER_NV);
> > + CASE(VK_ERROR_OUT_OF_POOL_MEMORY);
> > + CASE(VK_ERROR_INVALID_EXTERNAL_HANDLE);
> > + CASE(VK_ERROR_NOT_PERMITTED_EXT);
> > + default: return "Unknown error";
> > + }
> > +#undef CASE
> > +}
>
> Likewise this one. libavutil is not a generic Vulkan support layer - this
> shouldn't be in the user-visible API.
>
Yeah, that's fine, its not too bad to have it duplicated.
>
> > +
> > +static VkBool32 vk_dbg_callback(VkDebugUtilsMessageSeverityFlagBitsEXT
> severity,
> > + VkDebugUtilsMessageTypeFlagsEXT
> messageType,
> > + const VkDebugUtilsMessengerCallbackDataEXT
> *data,
> > + void *priv)
> > +{
> > + int i, l;
> > + AVHWDeviceContext *ctx = priv;
> > +
> > + switch (severity) {
> > + case VK_DEBUG_UTILS_MESSAGE_SEVERITY_VERBOSE_BIT_EXT: l =
> AV_LOG_VERBOSE; break;
> > + case VK_DEBUG_UTILS_MESSAGE_SEVERITY_INFO_BIT_EXT: l =
> AV_LOG_INFO; break;
> > + case VK_DEBUG_UTILS_MESSAGE_SEVERITY_WARNING_BIT_EXT: l =
> AV_LOG_WARNING; break;
> > + case VK_DEBUG_UTILS_MESSAGE_SEVERITY_ERROR_BIT_EXT: l =
> AV_LOG_ERROR; break;
> > + default: l =
> AV_LOG_DEBUG; break;
>
> This being a bitmask feels like it implies they can be |ed together.
>
They can, but only when you register the callback. This function receives a
single bit indicating the severity.
>
> > + };
> > +
> > + av_log(ctx, l, "%s\n", data->pMessage);
> > + for (i = 0; i < data->cmdBufLabelCount; i++)
> > + av_log(ctx, l, "\t%i: %s\n", i, data->pCmdBufLabels[i].
> pLabelName);
> > +
> > + return 0;
> > +}
> > +
> > +static int check_extensions(AVHWDeviceContext *ctx, int dev,
> > + const char * const **dst, uint32_t *num,
> int debug)
> > +{
> > + const char *tstr;
> > + const char **extension_names = NULL;
> > + VulkanDevicePriv *p = ctx->internal->priv;
> > + AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > + int i, j, err = 0, found, extensions_found = 0;
> > +
> > + const char *mod;
> > + int optional_exts_num;
> > + uint32_t sup_ext_count;
> > + VkExtensionProperties *sup_ext;
> > + VulkanOptExtension *optional_exts;
> > +
> > + if (!dev) {
> > + mod = "instance";
> > + optional_exts = optional_instance_exts;
> > + optional_exts_num = FF_ARRAY_ELEMS(optional_instance_exts);
> > + vkEnumerateInstanceExtensionProperties(NULL, &sup_ext_count,
> NULL);
> > + sup_ext = av_malloc_array(sup_ext_count,
> sizeof(VkExtensionProperties));
> > + if (!sup_ext)
> > + return AVERROR(ENOMEM);
> > + vkEnumerateInstanceExtensionProperties(NULL, &sup_ext_count,
> sup_ext);
> > + } else {
> > + mod = "device";
> > + optional_exts = optional_device_exts;
> > + optional_exts_num = FF_ARRAY_ELEMS(optional_device_exts);
> > + vkEnumerateDeviceExtensionProperties(hwctx->phys_dev, NULL,
> > + &sup_ext_count, NULL);
> > + sup_ext = av_malloc_array(sup_ext_count,
> sizeof(VkExtensionProperties));
> > + if (!sup_ext)
> > + return AVERROR(ENOMEM);
> > + vkEnumerateDeviceExtensionProperties(hwctx->phys_dev, NULL,
> > + &sup_ext_count, sup_ext);
> > + }
> > +
> > + for (i = 0; i < optional_exts_num; i++) {
> > + int req = optional_exts[i].flag & EXT_REQUIRED;
> > + tstr = optional_exts[i].name;
> > +
> > + found = 0;
> > + for (j = 0; j < sup_ext_count; j++) {
> > + if (!strcmp(tstr, sup_ext[j].extensionName)) {
> > + found = 1;
> > + break;
> > + }
> > + }
> > + if (!found) {
> > + int lvl = req ? AV_LOG_ERROR : AV_LOG_INFO;
> > + av_log(ctx, lvl, "Extension \"%s\" not found!\n", tstr);
> > + if (req) {
> > + err = AVERROR(EINVAL);
> > + goto end;
> > + }
> > + continue;
> > + }
> > + if (!req)
> > + p->extensions |= optional_exts[i].flag;
> > +
> > + av_log(ctx, AV_LOG_INFO, "Using %s extension \"%s\"\n", mod,
> tstr);
> > +
> > + ADD_VAL_TO_LIST(extension_names, extensions_found, tstr);
> > + }
> > +
> > + if (debug && !dev) {
> > + tstr = VK_EXT_DEBUG_UTILS_EXTENSION_NAME;
> > + found = 0;
> > + for (j = 0; j < sup_ext_count; j++) {
> > + if (!strcmp(tstr, sup_ext[j].extensionName)) {
> > + found = 1;
> > + break;
> > + }
> > + }
> > + if (found) {
> > + ADD_VAL_TO_LIST(extension_names, extensions_found, tstr);
> > + } else {
> > + av_log(ctx, AV_LOG_ERROR, "Debug extension \"%s\" not
> found!\n",
> > + tstr);
> > + err = AVERROR(EINVAL);
> > + goto end;
> > + }
> > + }
> > +
> > + *dst = extension_names;
> > + *num = extensions_found;
> > +
> > +end:
> > + av_free(sup_ext);
> > + return err;
> > +}
> > +
> > +/* Creates a VkInstance */
> > +static int create_instance(AVHWDeviceContext *ctx, AVDictionary *opts)
> > +{
> > + int err = 0;
> > + VkResult ret;
> > + VulkanDevicePriv *p = ctx->internal->priv;
> > + AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > + AVDictionaryEntry *debug_opt = av_dict_get(opts, "debug", NULL, 0);
> > + const int debug_mode = debug_opt && strtol(debug_opt->value, NULL,
> 10);
> > + VkApplicationInfo application_info = {
> > + .sType = VK_STRUCTURE_TYPE_APPLICATION_INFO,
> > + .pEngineName = "libavutil",
> > + .apiVersion = VK_API_VERSION_1_0,
> > + .engineVersion = VK_MAKE_VERSION(LIBAVUTIL_VERSION_MAJOR,
> > + LIBAVUTIL_VERSION_MINOR,
> > + LIBAVUTIL_VERSION_MICRO),
> > + };
> > + VkInstanceCreateInfo inst_props = {
> > + .sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO,
> > + .pApplicationInfo = &application_info,
> > + };
> > +
> > + /* Check for present/missing extensions */
> > + err = check_extensions(ctx, 0, &inst_props.ppEnabledExtensionNames,
> > + &inst_props.enabledExtensionCount,
> debug_mode);
> > + if (err < 0)
> > + return err;
> > +
> > + if (debug_mode) {
> > + static const char *layers[] = { "VK_LAYER_LUNARG_standard_validation"
> };
> > + inst_props.ppEnabledLayerNames = layers;
> > + inst_props.enabledLayerCount = FF_ARRAY_ELEMS(layers);
> > + }
> > +
> > + /* Try to create the instance */
> > + ret = vkCreateInstance(&inst_props, NULL, &hwctx->inst);
> > +
> > + /* Free used memory */
> > + av_free((void *)inst_props.ppEnabledExtensionNames);
> > +
> > + /* Check for errors */
> > + if (ret != VK_SUCCESS) {
> > + av_log(ctx, AV_LOG_ERROR, "Instance creation failure: %s\n",
> > + av_vk_ret2str(ret));
> > + return AVERROR_EXTERNAL;
> > + }
> > +
> > + if (debug_mode) {
> > + VkDebugUtilsMessengerCreateInfoEXT dbg = {
> > + .sType = VK_STRUCTURE_TYPE_DEBUG_UTILS_
> MESSENGER_CREATE_INFO_EXT,
> > + .messageSeverity = VK_DEBUG_UTILS_MESSAGE_SEVERITY_VERBOSE_BIT_EXT
> |
> > + VK_DEBUG_UTILS_MESSAGE_SEVERITY_INFO_BIT_EXT
> |
> > + VK_DEBUG_UTILS_MESSAGE_SEVERITY_WARNING_BIT_EXT
> |
> > + VK_DEBUG_UTILS_MESSAGE_
> SEVERITY_ERROR_BIT_EXT,
> > + .messageType = VK_DEBUG_UTILS_MESSAGE_TYPE_GENERAL_BIT_EXT
> |
> > + VK_DEBUG_UTILS_MESSAGE_TYPE_VALIDATION_BIT_EXT
> |
> > + VK_DEBUG_UTILS_MESSAGE_TYPE_
> PERFORMANCE_BIT_EXT,
> > + .pfnUserCallback = vk_dbg_callback,
> > + .pUserData = ctx,
> > + };
> > + VK_LOAD_PFN(hwctx->inst, vkCreateDebugUtilsMessengerEXT);
> > +
> > + pfn_vkCreateDebugUtilsMessengerEXT(hwctx->inst, &dbg,
> > + NULL, &p->debug_ctx);
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +typedef struct VulkanDeviceSelection {
> > + const char *name; /* Will use this first unless NULL */
> > + uint32_t pci_device; /* Will use this second unless 0x0 */
> > + uint32_t vendor_id; /* Last resort to find something deterministic
> */
> > + int index; /* Finally fall back to index */
> > +} VulkanDeviceSelection;
> > +
> > +/* Finds a device */
> > +static int find_device(AVHWDeviceContext *ctx, VulkanDeviceSelection
> *select)
> > +{
> > + uint32_t num;
> > + VkResult ret;
> > + int i, err = 0;
> > + VkPhysicalDevice *devices = NULL;
> > + VkPhysicalDeviceProperties *prop = NULL;
> > + VkPhysicalDevice choice = VK_NULL_HANDLE;
> > + AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > + static const char *dev_types[] = {
> > + [VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU] = "integrated",
> > + [VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU] = "discrete",
> > + [VK_PHYSICAL_DEVICE_TYPE_VIRTUAL_GPU] = "virtual",
> > + [VK_PHYSICAL_DEVICE_TYPE_CPU] = "software",
> > + [VK_PHYSICAL_DEVICE_TYPE_OTHER] = "unknown",
> > + };
> > +
> > + ret = vkEnumeratePhysicalDevices(hwctx->inst, &num, NULL);
> > + if (ret != VK_SUCCESS || !num) {
> > + av_log(ctx, AV_LOG_ERROR, "No devices found: %s!\n",
> av_vk_ret2str(ret));
> > + return AVERROR_EXTERNAL;
> > + }
> > +
> > + devices = av_malloc_array(num, sizeof(VkPhysicalDevice));
> > + if (!devices)
> > + return AVERROR(ENOMEM);
> > +
> > + ret = vkEnumeratePhysicalDevices(hwctx->inst, &num, devices);
> > + if (ret != VK_SUCCESS) {
> > + av_log(ctx, AV_LOG_ERROR, "Failed enumerating devices: %s\n",
> > + av_vk_ret2str(ret));
> > + err = AVERROR_EXTERNAL;
> > + goto end;
> > + }
> > +
> > + prop = av_malloc_array(num, sizeof(VkPhysicalDeviceProperties));
> > + if (!prop) {
> > + err = AVERROR(ENOMEM);
> > + goto end;
> > + }
> > +
> > + av_log(ctx, AV_LOG_INFO, "GPU listing:\n");
> > + for (i = 0; i < num; i++) {
> > + vkGetPhysicalDeviceProperties(devices[i], &prop[i]);
> > + av_log(ctx, AV_LOG_INFO, " %d: %s (%s) (0x%x)\n", i,
> prop[i].deviceName,
> > + dev_types[prop[i].deviceType], prop[i].deviceID);
> > + }
> > +
> > + if (select->name) {
> > + av_log(ctx, AV_LOG_INFO, "Requested device: %s\n",
> select->name);
> > + for (i = 0; i < num; i++) {
> > + if (strcmp(select->name, prop[i].deviceName) == 0) {
> > + choice = devices[i];
> > + goto end;
> > + }
> > + }
> > + av_log(ctx, AV_LOG_ERROR, "Unable to find device \"%s\"!\n",
> > + select->name);
> > + err = AVERROR_UNKNOWN;
> > + goto end;
> > + } else if (select->pci_device) {
> > + av_log(ctx, AV_LOG_INFO, "Requested device: 0x%x\n",
> select->pci_device);
> > + for (i = 0; i < num; i++) {
> > + if (select->pci_device == prop[i].deviceID) {
> > + choice = devices[i];
> > + goto end;
> > + }
> > + }
> > + av_log(ctx, AV_LOG_ERROR, "Unable to find device with PCI ID
> 0x%x!\n",
> > + select->pci_device);
> > + err = AVERROR(EINVAL);
> > + goto end;
> > + } else if (select->vendor_id) {
> > + av_log(ctx, AV_LOG_INFO, "Requested vendor: 0x%x\n",
> select->vendor_id);
> > + for (i = 0; i < num; i++) {
> > + if (select->vendor_id == prop[i].vendorID) {
> > + choice = devices[i];
> > + goto end;
> > + }
> > + }
> > + av_log(ctx, AV_LOG_ERROR, "Unable to find device with Vendor ID
> 0x%x!\n",
> > + select->vendor_id);
> > + err = AVERROR_UNKNOWN;
> > + goto end;
> > + } else {
> > + if (select->index < num) {
> > + choice = devices[select->index];
> > + goto end;
> > + }
> > + av_log(ctx, AV_LOG_ERROR, "Unable to find device with index
> %i!\n",
> > + select->index);
> > + err = AVERROR_UNKNOWN;
> > + goto end;
> > + }
> > +
> > +end:
> > + av_free(devices);
> > + av_free(prop);
> > + hwctx->phys_dev = choice;
> > +
> > + return err;
> > +}
> > +
> > +static int search_queue_families(AVHWDeviceContext *ctx,
> VkDeviceCreateInfo *cd)
> > +{
> > + uint32_t i, num;
> > + VkQueueFamilyProperties *qs = NULL;
> > + AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > + int graph_index = -1, comp_index = -1, tx_index = -1;
> > + VkDeviceQueueCreateInfo *pc = (VkDeviceQueueCreateInfo
> *)cd->pQueueCreateInfos;
> > +
> > + /* First get the number of queue families */
> > + vkGetPhysicalDeviceQueueFamilyProperties(hwctx->phys_dev, &num,
> NULL);
> > + if (!num) {
> > + av_log(ctx, AV_LOG_ERROR, "Failed to get queues!\n");
> > + return AVERROR_EXTERNAL;
> > + }
> > +
> > + /* Then allocate memory */
> > + qs = av_malloc_array(num, sizeof(VkQueueFamilyProperties));
> > + if (!qs)
> > + return AVERROR(ENOMEM);
> > +
> > + /* Finally retrieve the queue families */
> > + vkGetPhysicalDeviceQueueFamilyProperties(hwctx->phys_dev, &num,
> qs);
> > +
> > +#define SEARCH_FLAGS(expr, out)
> \
> > + for (i = 0; i < num; i++) {
> \
> > + const VkQueueFlagBits flags = qs[i].queueFlags;
> \
> > + if (expr) {
> \
> > + out = i;
> \
> > + break;
> \
> > + }
> \
> > + }
> > +
> > + if (!hwctx->queue_family_index)
> > + SEARCH_FLAGS(flags & VK_QUEUE_GRAPHICS_BIT, graph_index)
> > +
> > + if (!hwctx->queue_family_comp_index)
> > + SEARCH_FLAGS((flags & VK_QUEUE_COMPUTE_BIT) && (i !=
> graph_index),
> > + comp_index)
> > +
> > + if (!hwctx->queue_family_tx_index)
> > + SEARCH_FLAGS((flags & VK_QUEUE_TRANSFER_BIT) && (i !=
> graph_index) &&
> > + (i != comp_index), tx_index)
> > +
> > +#undef SEARCH_FLAGS
> > +#define QF_FLAGS(flags)
> \
> > + ((flags) & VK_QUEUE_GRAPHICS_BIT ) ? "(graphics) " : "",
> \
> > + ((flags) & VK_QUEUE_COMPUTE_BIT ) ? "(compute) " : "",
> \
> > + ((flags) & VK_QUEUE_TRANSFER_BIT ) ? "(transfer) " : "",
> \
> > + ((flags) & VK_QUEUE_SPARSE_BINDING_BIT) ? "(sparse) " : ""
> > +
> > + av_log(ctx, AV_LOG_INFO, "Using queue family %i for graphics, "
> > + "flags: %s%s%s%s\n", graph_index, QF_FLAGS(qs[graph_index].
> queueFlags));
> > +
> > + hwctx->queue_family_index = graph_index;
> > + hwctx->queue_family_tx_index = graph_index;
> > + hwctx->queue_family_comp_index = graph_index;
> > +
> > + pc[cd->queueCreateInfoCount++].queueFamilyIndex = graph_index;
> > +
> > + if (comp_index != -1) {
> > + av_log(ctx, AV_LOG_INFO, "Using queue family %i for compute, "
> > + "flags: %s%s%s%s\n", comp_index, QF_FLAGS(qs[comp_index].
> queueFlags));
> > + hwctx->queue_family_tx_index = comp_index;
> > + hwctx->queue_family_comp_index = comp_index;
> > + pc[cd->queueCreateInfoCount++].queueFamilyIndex = comp_index;
> > + }
> > +
> > + if (tx_index != -1) {
> > + av_log(ctx, AV_LOG_INFO, "Using queue family %i for transfers, "
> > + "flags: %s%s%s%s\n", tx_index, QF_FLAGS(qs[tx_index].
> queueFlags));
> > + hwctx->queue_family_tx_index = tx_index;
> > + pc[cd->queueCreateInfoCount++].queueFamilyIndex = tx_index;
> > + }
> > +
> > +#undef PRINT_QF_FLAGS
> > +
> > + av_free(qs);
> > +
> > + return 0;
> > +}
> > +
> > +int av_vk_create_exec_ctx(AVHWDeviceContext *ctx, AVVkExecContext *e,
> int queue)
>
> I don't think this function should be in the public API. It doesn't do
> anything the user can't do from the information they have in the public
> hwcontext information.
>
> In this file it's used to create a transfer queue for internally-created
> devices, but for externally created ones need to supply their own somehow?
> I think following how OpenCL deals with command queues might be sensible,
> where it will create one internally in the external device case if it isn't
> supplied.
>
> It's also used in libavfilter, but I don't think the function is long
> enough to find it objectionable to have a similar version there as well.
>
Yeah, that's fine too I guess. I'll keep those functions internal and
duplicate them in lavfi/vulkan.c, they're not too big.
>
> > +{
> > + VkResult ret;
> > + AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > +
> > + VkCommandPoolCreateInfo cqueue_create = {
> > + .sType = VK_STRUCTURE_TYPE_COMMAND_
> POOL_CREATE_INFO,
> > + .flags = VK_COMMAND_POOL_CREATE_RESET_
> COMMAND_BUFFER_BIT,
> > + .queueFamilyIndex = queue,
> > + };
> > + VkCommandBufferAllocateInfo cbuf_create = {
> > + .sType = VK_STRUCTURE_TYPE_COMMAND_
> BUFFER_ALLOCATE_INFO,
> > + .level = VK_COMMAND_BUFFER_LEVEL_PRIMARY,
> > + .commandBufferCount = 1,
> > + };
> > + VkFenceCreateInfo fence_spawn = { VK_STRUCTURE_TYPE_FENCE_CREATE_INFO
> };
> > +
> > + ret = vkCreateCommandPool(hwctx->act_dev, &cqueue_create, NULL,
> &e->pool);
> > + if (ret != VK_SUCCESS) {
> > + av_log(ctx, AV_LOG_ERROR, "Command pool creation failure: %s\n",
> > + av_vk_ret2str(ret));
> > + return 1;
> > + }
> > +
> > + cbuf_create.commandPool = e->pool;
> > +
> > + ret = vkAllocateCommandBuffers(hwctx->act_dev, &cbuf_create,
> &e->buf);
> > + if (ret != VK_SUCCESS) {
> > + av_log(ctx, AV_LOG_ERROR, "Command buffer alloc failure: %s\n",
> > + av_vk_ret2str(ret));
> > + return 1;
> > + }
> > +
> > + ret = vkCreateFence(hwctx->act_dev, &fence_spawn, NULL, &e->fence);
> > + if (ret != VK_SUCCESS) {
> > + av_log(ctx, AV_LOG_ERROR, "Failed to create frame fence: %s\n",
> > + av_vk_ret2str(ret));
> > + return 1;
> > + }
> > +
> > + vkGetDeviceQueue(hwctx->act_dev, queue, 0, &e->queue);
> > +
> > + return 0;
> > +}
> > +
> > +void av_vk_free_exec_ctx(AVHWDeviceContext *ctx, AVVkExecContext *e)
> > +{
> > + AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > +
> > + if (!e)
> > + return;
> > +
> > + if (e->fence != VK_NULL_HANDLE)
> > + vkDestroyFence(hwctx->act_dev, e->fence, NULL);
> > + vkFreeCommandBuffers(hwctx->act_dev, e->pool, 1, &e->buf);
> > + vkDestroyCommandPool(hwctx->act_dev, e->pool, NULL);
> > +}
> > +
> > +static void vulkan_device_free(AVHWDeviceContext *ctx)
> > +{
> > + VulkanDevicePriv *p = ctx->internal->priv;
> > + AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > +
> > + if (p)
> > + av_vk_free_exec_ctx(ctx, &p->exec);
> > +
> > + vkDestroyDevice(hwctx->act_dev, NULL);
> > +
> > + if (p && p->debug_ctx != VK_NULL_HANDLE) {
> > + VK_LOAD_PFN(hwctx->inst, vkDestroyDebugUtilsMessengerEXT);
> > + pfn_vkDestroyDebugUtilsMessengerEXT(hwctx->inst, p->debug_ctx,
> NULL);
> > + }
> > +
> > + vkDestroyInstance(hwctx->inst, NULL);
> > +}
> > +
> > +static int vulkan_device_create_internal(AVHWDeviceContext *ctx,
> > + VulkanDeviceSelection
> *dev_select,
> > + AVDictionary *opts, int flags)
> > +{
> > + int err = 0;
> > + VkResult ret;
> > + AVDictionaryEntry *use_linear_images_opt;
> > + AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > + VkDeviceQueueCreateInfo queue_create_info[3] = {
> > + { .sType = VK_STRUCTURE_TYPE_DEVICE_
> QUEUE_CREATE_INFO,
> > + .pQueuePriorities = (float []){ 1.0f },
> > + .queueCount = 1, },
> > + { .sType = VK_STRUCTURE_TYPE_DEVICE_
> QUEUE_CREATE_INFO,
> > + .pQueuePriorities = (float []){ 1.0f },
> > + .queueCount = 1, },
> > + { .sType = VK_STRUCTURE_TYPE_DEVICE_
> QUEUE_CREATE_INFO,
> > + .pQueuePriorities = (float []){ 1.0f },
> > + .queueCount = 1, },
> > + };
> > +
> > + VkDeviceCreateInfo dev_info = {
> > + .sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
> > + .pQueueCreateInfos = queue_create_info,
> > + .queueCreateInfoCount = 0,
> > + };
> > +
> > + VulkanDevicePriv *p = av_mallocz(sizeof(*p));
> > + if (!p) {
> > + err = AVERROR(ENOMEM);
> > + goto fail;
> > + }
> > +
> > + ctx->internal->priv = p;
> > + ctx->free = vulkan_device_free;
> > +
> > + /* Create an instance if not given one */
> > + if (!hwctx->inst && (err = create_instance(ctx, opts)))
> > + goto fail;
> > +
> > + /* Find a device (if not given one) */
> > + if (!hwctx->phys_dev && (err = find_device(ctx, dev_select)))
> > + goto fail;
> > +
> > + vkGetPhysicalDeviceProperties(hwctx->phys_dev, &p->props);
> > + av_log(ctx, AV_LOG_INFO, "Using device: %s\n", p->props.deviceName);
> > + av_log(ctx, AV_LOG_INFO, "Alignments:\n");
> > + av_log(ctx, AV_LOG_INFO, " optimalBufferCopyOffsetAlignment:
> %li\n",
> > + p->props.limits.optimalBufferCopyOffsetAlignment);
> > + av_log(ctx, AV_LOG_INFO, " optimalBufferCopyRowPitchAlignment:
> %li\n",
> > + p->props.limits.optimalBufferCopyRowPitchAlignment);
> > + av_log(ctx, AV_LOG_INFO, " minMemoryMapAlignment:
> %li\n",
> > + p->props.limits.minMemoryMapAlignment);
> > +
> > + /* Search queue family */
> > + if ((err = search_queue_families(ctx, &dev_info)))
> > + goto fail;
> > +
> > + if (!hwctx->act_dev) {
> > + err = check_extensions(ctx, 1, &dev_info.
> ppEnabledExtensionNames,
> > + &dev_info.enabledExtensionCount, 0);
> > + if (err)
> > + goto fail;
> > +
> > + ret = vkCreateDevice(hwctx->phys_dev, &dev_info, NULL,
> &hwctx->act_dev);
> > + if (ret != VK_SUCCESS) {
> > + av_log(ctx, AV_LOG_ERROR, "Device creation failure: %s\n",
> > + av_vk_ret2str(ret));
> > + err = AVERROR_EXTERNAL;
> > + goto fail;
> > + }
> > +
> > + av_free((void *)dev_info.ppEnabledExtensionNames);
> > + }
> > +
> > + /* Get device capabilities */
> > + vkGetPhysicalDeviceMemoryProperties(hwctx->phys_dev, &p->mprops);
> > +
> > + err = av_vk_create_exec_ctx(ctx, &p->exec,
> hwctx->queue_family_tx_index);
> > + if (err)
> > + goto fail;
> > +
> > + /* Tiled images setting, use them by default */
> > + use_linear_images_opt = av_dict_get(opts, "linear_images", NULL, 0);
> > + if (use_linear_images_opt)
> > + p->use_linear_images = strtol(use_linear_images_opt->value,
> NULL, 10);
> > +
> > + return 0;
> > +
> > +fail:
> > + av_freep(&ctx->internal->priv);
> > + return err;
> > +}
> > +
> > +static int vulkan_device_create(AVHWDeviceContext *ctx, const char
> *device,
> > + AVDictionary *opts, int flags)
> > +{
> > + VulkanDeviceSelection dev_select = { 0 };
> > + if (device && device[0]) {
> > + if (av_isdigit(device[0]))
> > + dev_select.index = strtol(device, NULL, 10);
> > + else
> > + dev_select.name = device;
> > + }
> > +
> > + return vulkan_device_create_internal(ctx, &dev_select, opts,
> flags);
> > +}
> > +
> > +static int vulkan_device_derive(AVHWDeviceContext *ctx,
> > + AVHWDeviceContext *src_ctx, int flags)
> > +{
> > + VulkanDeviceSelection dev_select = { 0 };
> > +
> > + switch(src_ctx->type) {
> > +#if CONFIG_LIBDRM
> > +#if CONFIG_VAAPI
> > + case AV_HWDEVICE_TYPE_VAAPI: {
> > + AVVAAPIDeviceContext *src_hwctx = src_ctx->hwctx;
> > + const char *vendor = vaQueryVendorString(src_hwctx->display);
> > + if (!vendor) {
> > + av_log(ctx, AV_LOG_ERROR, "Unable to get device info from
> vaapi!\n");
> > + return AVERROR_EXTERNAL;
> > + }
> > +
> > + if (strstr(vendor, "Intel"))
> > + dev_select.vendor_id = 0x8086;
> > + if (strstr(vendor, "AMD"))
> > + dev_select.vendor_id = 0x1002;
>
> That's rather horrible :(
>
> Is it possible to make the derivation the other way around? (Given a
> Vulkan device, extract the DRM device and then create the VAAPI device from
> it.) The use might be slightly more clumsy on the command line, but the
> derivation code does do the right thing for giving you the original device
> when you map back from a derived one.
>
Nope, Vulkan gives you a vendor ID, PCI ID and something it calls uint8_t
pipelineCacheUUID[VK_UUID_SIZE];, "an array of size VK_UUID_SIZE,
containing 8-bit values that represent a universally unique identifier for
the device.". Never heard of such an identifier.
> Alternatively, maybe we could add a function to libva to extract a DRM fd
> from it - that could be implemented in libva alone, it wouldn't need
> support from the drivers.
>
Sure, that would make it easier (and more accurate). I don't mind having a
dependency on very new libva versions.
> > +
> > + return vulkan_device_create_internal(ctx, &dev_select, NULL,
> flags);
> > + }
> > +#endif
> > + case AV_HWDEVICE_TYPE_DRM: {
> > + AVDRMDeviceContext *src_hwctx = src_ctx->hwctx;
> > +
> > + drmDevice *drm_dev_info;
> > + int err = drmGetDevice(src_hwctx->fd, &drm_dev_info);
> > + if (err) {
> > + av_log(ctx, AV_LOG_ERROR, "Unable to get device info from
> drm fd!\n");
> > + return AVERROR_EXTERNAL;
> > + }
> > +
> > + dev_select.pci_device = drm_dev_info->deviceinfo.pci->
> device_id;
> > +
> > + return vulkan_device_create_internal(ctx, &dev_select, NULL,
> flags);
>
> I assume that means there isn't a way to pass in a DRM fd directly?
>
Nope, you have nothing to pass it to when you create an instance or a
device.
>
> I don't think the deviceinfo is actually sufficient - if you have two
> identical graphics cards they will appear identical. Can you use the bus
> info instead, since that will necessarily be unique?
>
> Also, only desktop platforms are going to be PCI, so please check for that
> as well. Mobile devices will probably use some sort of undefined platform
> bus, but at least in that case they are likely to only have one sensible
> Vulkan device.
>
Don't have one with Vulkan support. I plan to deal with that issue once
someone reports it.
>
> > + }
> > +#endif
> > + default:
> > + return AVERROR(ENOSYS);
> > + }
> > +}
> > +
> > +static int vulkan_frames_get_constraints(AVHWDeviceContext *ctx,
> > + const void *hwconfig,
> > + AVHWFramesConstraints
> *constraints)
> > +{
> > + int count = 0;
> > + enum AVPixelFormat i;
> > + AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > + VulkanDevicePriv *p = ctx->internal->priv;
> > +
> > + for (i = 0; i < AV_PIX_FMT_NB; i++)
> > + count += vkfmt_is_supported(hwctx, i, p->use_linear_images);
> > +
> > + constraints->valid_sw_formats = av_malloc_array(count + 1,
> > + sizeof(enum
> AVPixelFormat));
> > + if (!constraints->valid_sw_formats)
> > + return AVERROR(ENOMEM);
> > +
> > + count = 0;
> > + for (i = 0; i < AV_PIX_FMT_NB; i++)
> > + if (vkfmt_is_supported(hwctx, i, p->use_linear_images))
> > + constraints->valid_sw_formats[count++] = i;
> > + constraints->valid_sw_formats[count++] = AV_PIX_FMT_NONE;
> > +
> > + constraints->min_width = 0;
> > + constraints->min_height = 0;
> > + constraints->max_width = p->props.limits.maxImageDimension2D;
> > + constraints->max_height = p->props.limits.maxImageDimension2D;
> > +
> > + constraints->valid_hw_formats = av_malloc_array(2, sizeof(enum
> AVPixelFormat));
> > + if (!constraints->valid_hw_formats)
> > + return AVERROR(ENOMEM);
> > +
> > + constraints->valid_hw_formats[0] = AV_PIX_FMT_VULKAN;
> > + constraints->valid_hw_formats[1] = AV_PIX_FMT_NONE;
> > +
> > + return 0;
> > +}
> > +
> > +int av_vk_alloc_mem(AVHWDeviceContext *ctx, VkMemoryRequirements *req,
> > + VkMemoryPropertyFlagBits req_flags, void
> *alloc_extension,
> > + VkMemoryPropertyFlagBits *mem_flags, VkDeviceMemory
> *mem)
>
> I don't think this should be a public API. Maybe just specialise this
> inside the uses here, and add something similar to the users in libavfilter?
>
I think this ought to be public API. There's nothing to specialize - the
function exposes everything you could want when allocating memory and
prevents needing to expose any device properties in the context.
> > +{
> > + VkResult ret;
> > + int index = -1;
> > + VulkanDevicePriv *p = ctx->internal->priv;
> > + AVVulkanDeviceContext *dev_hwctx = ctx->hwctx;
> > + VkPhysicalDeviceMemoryProperties prop = p->mprops;
> > + VkMemoryAllocateInfo alloc_info = {
> > + .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
> > + .pNext = alloc_extension,
> > + };
> > +
> > + /* Align if we need to */
> > + if (req_flags & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT)
> > + req->size = FFALIGN(req->size, p->props.limits.
> minMemoryMapAlignment);
> > +
> > + alloc_info.allocationSize = req->size;
> > +
> > + /* The vulkan spec requires memory types to be sorted in the
> "optimal"
> > + * order, so the first matching type we find will be the
> best/fastest one */
> > + for (int i = 0; i < prop.memoryTypeCount; i++) {
> > + /* The memory type must be supported by the requirements
> (bitfield) */
> > + if (!(req->memoryTypeBits & (1 << i)))
> > + continue;
> > +
> > + /* The memory type flags must include our properties */
> > + if ((prop.memoryTypes[i].propertyFlags & req_flags) !=
> req_flags)
> > + continue;
> > +
> > + /* Found a suitable memory type */
> > + index = i;
> > + break;
> > + }
> > +
> > + if (index < 0) {
> > + av_log(ctx, AV_LOG_ERROR, "No memory type found for flags
> 0x%x\n",
> > + req_flags);
> > + return AVERROR(EINVAL);
> > + }
> > +
> > + alloc_info.memoryTypeIndex = index;
> > +
> > + ret = vkAllocateMemory(dev_hwctx->act_dev, &alloc_info, NULL, mem);
> > + if (ret != VK_SUCCESS) {
> > + av_log(ctx, AV_LOG_ERROR, "Failed to allocate memory: %s\n",
> > + av_vk_ret2str(ret));
> > + return AVERROR(ENOMEM);
> > + }
> > +
> > + *mem_flags |= prop.memoryTypes[index].propertyFlags;
> > +
> > + return 0;
> > +}
> > +
> > +static void vulkan_frame_free(void *opaque, uint8_t *data)
> > +{
> > + int i;
> > + AVVkFrame *f = (AVVkFrame *)data;
> > + AVVulkanDeviceContext *hwctx = opaque;
> > +
> > + if (!f)
> > + return;
> > +
> > + vkDestroyImage(hwctx->act_dev, f->img, NULL);
> > + for (i = 0; i < f->mem_count; i++)
> > + vkFreeMemory(hwctx->act_dev, f->mem[i], NULL);
> > +
> > + av_free(f);
> > +}
> > +
> > +static int create_frame(AVHWFramesContext *hwfc, AVVkFrame **frame,
> > + VkImageTiling tiling, VkImageUsageFlagBits
> usage,
> > + int disjoint, void *create_pnext, void
> *alloc_pnext[3])
> > +{
> > + int i, err;
> > + VkResult ret;
> > + AVHWDeviceContext *ctx = hwfc->device_ctx;
> > + enum AVPixelFormat format = hwfc->sw_format;
> > + VkFormat img_fmt = av_vkfmt_from_pixfmt(format);
> > +
> > + /* Allocated */
> > + AVVkFrame *f = NULL;
> > +
> > + /* Contexts */
> > + AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > + VulkanDevicePriv *p = ctx->internal->priv;
> > +
> > + /* For some reason some extensions need function loading */
> > + VK_LOAD_PFN(hwctx->inst, vkBindImageMemory2KHR);
> > + VK_LOAD_PFN(hwctx->inst, vkGetImageMemoryRequirements2KHR);
> > +
> > + /* Image properties */
> > + VkFormat possible_fmts[2];
> > + VkImageFormatListCreateInfoKHR img_fmt_list = {
> > + .sType = VK_STRUCTURE_TYPE_IMAGE_
> FORMAT_LIST_CREATE_INFO_KHR,
> > + .pNext = create_pnext,
> > + .pViewFormats = possible_fmts,
> > + .viewFormatCount = 1,
> > + };
> > + VkBindImagePlaneMemoryInfo bind_plane_info[3] = {
> > + { .sType = VK_STRUCTURE_TYPE_BIND_IMAGE_PLANE_MEMORY_INFO,
> > + .planeAspect = VK_IMAGE_ASPECT_PLANE_0_BIT },
> > + { .sType = VK_STRUCTURE_TYPE_BIND_IMAGE_PLANE_MEMORY_INFO,
> > + .planeAspect = VK_IMAGE_ASPECT_PLANE_1_BIT },
> > + { .sType = VK_STRUCTURE_TYPE_BIND_IMAGE_PLANE_MEMORY_INFO,
> > + .planeAspect = VK_IMAGE_ASPECT_PLANE_2_BIT },
> > + };
> > + VkBindImageMemoryInfo bind_info[3] = {
> > + { .sType = VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_INFO, },
> > + { .sType = VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_INFO, },
> > + { .sType = VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_INFO, },
> > + };
> > + VkImageCreateInfo image_create_info = {
> > + .sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
> > + .pNext = create_pnext,
> > + .imageType = VK_IMAGE_TYPE_2D,
> > + .format = img_fmt,
> > + .extent.width = hwfc->width,
> > + .extent.height = hwfc->height,
> > + .extent.depth = 1,
> > + .mipLevels = 1,
> > + .arrayLayers = 1,
>
> Is there any use for image arrays here? Will it be ok if the user
> supplies one? Do any of the interop extensions give you them, such that we
> will need to handle them?
>
Image arrays? As in, multiple images to represent one image (aka multiple
planes?). Of course not, this is a modern API and the developers are quite
clear they'll stay away from that mess. If any interop needs separate
planes, its fine, because you just allocate each plane separately and bind
it to the same images (aka disjoint images).
>
> > + .flags = VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT |
> > + VK_IMAGE_CREATE_EXTENDED_USAGE_BIT |
> > + (disjoint ? VK_IMAGE_CREATE_DISJOINT_BIT : 0),
> > + .tiling = tiling,
> > + .initialLayout = tiling == VK_IMAGE_TILING_LINEAR ?
> > + VK_IMAGE_LAYOUT_PREINITIALIZED :
> > + VK_IMAGE_LAYOUT_UNDEFINED,
> > + .usage = usage,
> > + .sharingMode = VK_SHARING_MODE_EXCLUSIVE,
> > + .samples = VK_SAMPLE_COUNT_1_BIT,
> > + };
> > +
> > + if (img_fmt == VK_FORMAT_UNDEFINED) {
> > + av_log(ctx, AV_LOG_ERROR, "Unsupported image format!\n");
> > + return AVERROR(EINVAL);
> > + }
> > +
> > + f = av_mallocz(sizeof(AVVkFrame));
>
> sizeof(*f)
>
Fixed.
>
> > + if (!f) {
> > + av_log(ctx, AV_LOG_ERROR, "Unable to allocate memory for
> AVVkFrame!\n");
> > + err = AVERROR(ENOMEM);
> > + goto fail;
> > + }
> > +
> > + /* Needed */
> > + f->flags = 0;
> > + f->mem_count = disjoint ? av_pix_fmt_count_planes(format) : 1;
> > + f->tiling = image_create_info.tiling;
> > + f->layout = image_create_info.initialLayout;
> > + f->access = 0;
> > +
> > + possible_fmts[0] = image_create_info.format;
> > + /* Mark the formats that a VkImageView can be made of if supported
> */
> > + if (p->extensions & EXT_IMAGE_FORMAT_LIST) {
> > + const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(format);
> > + const int b = desc->comp[0].step;
> > + possible_fmts[1] = b > 1 ? VK_FORMAT_R16_UNORM :
> VK_FORMAT_R8_UNORM;
>
> I'm not entirely sure what this is doing, but it looks suspicious - YUYV
> has step > 1.
>
Replaced with
const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pixfmt);
const int high = desc->comp[0].depth > 8;
possible_fmts[1] = high ? VK_FORMAT_R16_UNORM : VK_FORMAT_R8_UNORM;
Its a driver hint not actually used yet by any driver (maybe the
proprietary AMD one, not sure). The spec says anything you don't specify
here would be otherwise slightly slower.
>
> > + img_fmt_list.viewFormatCount++;
> > + image_create_info.pNext = &img_fmt_list;
> > + }
> > +
> > + /* Create the image */
> > + ret = vkCreateImage(hwctx->act_dev, &image_create_info, NULL,
> &f->img);
> > + if (ret != VK_SUCCESS) {
> > + av_log(ctx, AV_LOG_ERROR, "Image creation failure: %s\n",
> > + av_vk_ret2str(ret));
> > + err = AVERROR(EINVAL);
> > + goto fail;
> > + }
> > +
> > + for (i = 0; i < f->mem_count; i++) {
> > + int use_ded_mem;
> > + VkImagePlaneMemoryRequirementsInfo plane_req = {
> > + .sType = VK_STRUCTURE_TYPE_IMAGE_PLANE_
> MEMORY_REQUIREMENTS_INFO,
> > + .planeAspect = i == 0 ? VK_IMAGE_ASPECT_PLANE_0_BIT :
> > + i == 1 ? VK_IMAGE_ASPECT_PLANE_1_BIT :
> > + VK_IMAGE_ASPECT_PLANE_2_BIT,
> > + };
> > + VkImageMemoryRequirementsInfo2 req_desc = {
> > + .sType = VK_STRUCTURE_TYPE_IMAGE_
> MEMORY_REQUIREMENTS_INFO_2,
> > + .pNext = disjoint ? &plane_req : NULL,
> > + .image = f->img,
> > + };
> > + VkMemoryDedicatedAllocateInfo ded_alloc = {
> > + .sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO,
> > + .pNext = alloc_pnext[i],
> > + };
> > + VkMemoryDedicatedRequirements ded_req = {
> > + .sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_REQUIREMENTS,
> > + };
> > + VkMemoryRequirements2 req = {
> > + .sType = VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2,
> > + .pNext = (p->extensions & EXT_DEDICATED_ALLOC) ? &ded_req :
> NULL,
> > + };
> > +
> > + pfn_vkGetImageMemoryRequirements2KHR(hwctx->act_dev,
> &req_desc, &req);
> > +
> > + /* In case the implementation prefers/requires dedicated
> allocation */
> > + use_ded_mem = ded_req.prefersDedicatedAllocation |
> > + ded_req.requiresDedicatedAllocation;
> > + if (use_ded_mem)
> > + ded_alloc.image = f->img;
> > +
> > + /* Allocate host visible memory */
> > + if ((err = av_vk_alloc_mem(ctx, &req.memoryRequirements,
> > + tiling == VK_IMAGE_TILING_LINEAR ?
> > + VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
> :
> > + VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT,
> > + use_ded_mem ? &ded_alloc :
> alloc_pnext[i],
> > + &f->flags, &f->mem[i])))
> > + goto fail;
> > +
> > + bind_info[i].pNext = disjoint ? &bind_plane_info[i] : NULL;
> > + bind_info[i].image = f->img;
> > + bind_info[i].memory = f->mem[i];
> > + }
> > +
> > + /* Bind the allocated memory to the image */
> > + ret = pfn_vkBindImageMemory2KHR(hwctx->act_dev, f->mem_count,
> bind_info);
> > + if (ret != VK_SUCCESS) {
> > + av_log(ctx, AV_LOG_ERROR, "Failed to bind memory: %s\n",
> > + av_vk_ret2str(ret));
> > + err = AVERROR_EXTERNAL;
> > + goto fail;
> > + }
> > +
> > + *frame = f;
> > + return 0;
> > +
> > +fail:
> > + vulkan_frame_free(hwctx, (uint8_t *)f);
> > + return err;
> > +}
> > +
> > +static AVBufferRef *vulkan_pool_alloc(void *opaque, int size)
> > +{
> > + int err;
> > + AVVkFrame *f;
> > + AVBufferRef *avbuf = NULL;
> > + AVHWFramesContext *hwfc = opaque;
> > + VulkanDevicePriv *p = hwfc->device_ctx->internal->priv;
> > + AVVulkanFramesContext *hwctx = hwfc->hwctx;
> > +
> > + err = create_frame(hwfc, &f,
> > + p->use_linear_images ? VK_IMAGE_TILING_LINEAR :
> > + VK_IMAGE_TILING_OPTIMAL,
> > + hwctx->usage ? hwctx->usage :
> DEFAULT_USAGE_FLAGS,
> > + hwctx->disjoint, hwctx->create_pnext,
> > + (void *[3]) { hwctx->alloc_pnext[0],
> > + hwctx->alloc_pnext[1],
> > + hwctx->alloc_pnext[2], });
> > + if (err)
> > + return NULL;
> > +
> > + avbuf = av_buffer_create((uint8_t *)f, sizeof(AVVkFrame),
> > + vulkan_frame_free,
> hwfc->device_ctx->hwctx, 0);
> > + if (!avbuf) {
> > + vulkan_frame_free(hwfc, (uint8_t *)f);
> > + return NULL;
> > + }
> > +
> > + return avbuf;
> > +}
> > +
> > +static int vulkan_frames_init(AVHWFramesContext *hwfc)
> > +{
> > + if (hwfc->pool)
> > + return 0;
> > +
> > + hwfc->internal->pool_internal = av_buffer_pool_init2(sizeof(
> AVVkFrame),
> > + hwfc,
> vulkan_pool_alloc,
> > + NULL);
> > + if (!hwfc->internal->pool_internal)
> > + return AVERROR(ENOMEM);
> > +
> > + return 0;
> > +}
> > +
> > +static int vulkan_get_buffer(AVHWFramesContext *hwfc, AVFrame *frame)
> > +{
> > + frame->buf[0] = av_buffer_pool_get(hwfc->pool);
> > + if (!frame->buf[0])
> > + return AVERROR(ENOMEM);
> > +
> > + frame->data[0] = (uint8_t *)frame->buf[0]->data;
>
> Unnecessary cast.
>
Fixed.
>
> > + frame->format = AV_PIX_FMT_VULKAN;
> > + frame->width = hwfc->width;
> > + frame->height = hwfc->height;
> > +
> > + return 0;
> > +}
> > +
> > +static int vulkan_transfer_get_formats(AVHWFramesContext *hwfc,
> > + enum AVHWFrameTransferDirection
> dir,
> > + enum AVPixelFormat **formats)
> > +{
> > + int pref_avail = 0, count = 0;
> > + enum AVPixelFormat i, *pix_fmts = NULL;
> > + VulkanDevicePriv *p = hwfc->device_ctx->internal->priv;
> > + AVVulkanDeviceContext *hwctx = hwfc->device_ctx->hwctx;
> > +
> > + for (i = 0; i < AV_PIX_FMT_NB; i++) {
> > + const int avail = vkfmt_is_supported(hwctx, i,
> p->use_linear_images);
> > + pref_avail |= (i == hwfc->sw_format) && avail;
> > + count += avail;
> > + }
> > +
> > + pix_fmts = av_malloc((count + 1) * sizeof(*pix_fmts));
> > + if (!pix_fmts)
> > + return AVERROR(ENOMEM);
> > +
> > + count = 0;
> > + if (pref_avail)
> > + pix_fmts[count++] = hwfc->sw_format;
> > + for (i = 0; i < AV_PIX_FMT_NB; i++) {
> > + if (vkfmt_is_supported(hwctx, i, p->use_linear_images) &&
> > + (i != hwfc->sw_format))
> > + pix_fmts[count++] = i;
> > + }
> > + pix_fmts[count++] = AV_PIX_FMT_NONE;
> > +
> > + *formats = pix_fmts;
> > +
> > + return 0;
> > +}
>
> You can do arbitrary conversions in transfer here? How does the RGB <->
> YUV work?
>
No, you can't. RGB <- YUV (not a typo) works by using a sampler. I was just
not sure what the function is supposed to do. I'll rewrite it now that I
know what its doing (was wondering why there were 2 functions which did the
same thing, but this one's meant to get the sw_format from the context and
determine to what you can download the frame to).
> > +
> > +typedef struct VulkanMapping {
> > + AVVkFrame *frame;
> > + int flags;
> > +} VulkanMapping;
> > +
> > +static void vulkan_unmap_frame(AVHWFramesContext *hwfc,
> HWMapDescriptor *hwmap)
> > +{
> > + int i;
> > + VulkanMapping *map = hwmap->priv;
> > + AVVulkanDeviceContext *hwctx = hwfc->device_ctx->hwctx;
> > + VkMappedMemoryRange flush_ranges[3] = {
> > + { .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE,
> > + .memory = map->frame->mem[0],
> > + .size = VK_WHOLE_SIZE, },
> > + { .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE,
> > + .memory = map->frame->mem[1],
> > + .size = VK_WHOLE_SIZE, },
> > + { .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE,
> > + .memory = map->frame->mem[2],
> > + .size = VK_WHOLE_SIZE, },
> > + };
> > +
> > + /* Check if buffer needs flushing */
> > + if ((map->flags & AV_HWFRAME_MAP_WRITE) &&
> > + !(map->frame->flags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)) {
> > + VkResult ret;
> > + ret = vkFlushMappedMemoryRanges(hwctx->act_dev,
> map->frame->mem_count,
> > + flush_ranges);
> > + if (ret != VK_SUCCESS) {
> > + av_log(hwfc, AV_LOG_ERROR, "Failed to flush memory: %s\n",
> > + av_vk_ret2str(ret));
> > + }
> > + }
> > +
> > + for (i = 0; i < map->frame->mem_count; i++)
> > + vkUnmapMemory(hwctx->act_dev, map->frame->mem[i]);
> > +
> > + av_free(map);
> > +}
> > +
> > +static int vulkan_map_frame(AVHWFramesContext *hwfc, AVFrame *dst,
> > + const AVFrame *src, int flags)
> > +{
> > + void *mem[3];
> > + int i, err;
> > + VkResult ret;
> > + AVVkFrame *f = (AVVkFrame *)src->data[0];
> > + AVVulkanDeviceContext *hwctx = hwfc->device_ctx->hwctx;
> > + const int planes = av_pix_fmt_count_planes(hwfc->sw_format);
> > + VkMappedMemoryRange map_mem_ranges[3] = {
> > + { .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE,
> > + .size = VK_WHOLE_SIZE,
> > + .memory = f->mem[0], },
> > + { .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE,
> > + .size = VK_WHOLE_SIZE,
> > + .memory = f->mem[1], },
> > + { .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE,
> > + .size = VK_WHOLE_SIZE,
> > + .memory = f->mem[2], },
> > + };
> > +
> > + VulkanMapping *map = av_mallocz(sizeof(VulkanMapping));
> > + if (!map)
> > + return AVERROR(EINVAL);
> > +
> > + if (src->format != AV_PIX_FMT_VULKAN) {
> > + av_log(hwfc, AV_LOG_ERROR, "Cannot map from pixel format %s!\n",
> > + av_get_pix_fmt_name(src->format));
> > + err = AVERROR(EINVAL);
> > + goto fail;
> > + }
> > +
> > + if (!(f->flags & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT)) {
> > + av_log(hwfc, AV_LOG_ERROR, "Unable to map frame, not host
> visible!\n");
> > + err = AVERROR(EINVAL);
> > + goto fail;
> > + }
> > +
> > + dst->width = src->width;
> > + dst->height = src->height;
> > +
> > + for (i = 0; i < f->mem_count; i++) {
> > + ret = vkMapMemory(hwctx->act_dev, f->mem[i], 0,
> > + VK_WHOLE_SIZE, 0, &mem[i]);
>
> What properties do we expect the mapped memory to have here? If you map
> something across a slow bus, is it actually the remote copy (and therefore
> you want to use magic memcpy or similar on it), or will it make a local
> copy for you to use? You might be able to implement AV_HWFRAME_MAP_DIRECT
> here too.
>
Mapping images directly on Intel is quite a bit faster than copying to
buffers and then copying to userspace. This is quite low level so it won't
copy them for you if you map them.
Mapping directly is possible as long as the image is linear. How should I
implement that? Add it in vulkan_map_from()?
> > + if (ret != VK_SUCCESS) {
> > + av_log(hwfc, AV_LOG_ERROR, "Failed to map image memory:
> %s\n",
> > + av_vk_ret2str(ret));
> > + err = AVERROR_EXTERNAL;
> > + goto fail;
> > + }
> > + if (f->mem_count == 1)
> > + mem[1] = mem[2] = mem[0];
> > + }
> > +
> > + /* Check if the memory contents matter */
> > + if ((flags & AV_HWFRAME_MAP_READ) &&
>
> Also check for not AV_HWFRAME_MAP_OVERWRITE, since we don't care about the
> previous contents if that is set.
>
Done, it looks like this now:
if (((flags & AV_HWFRAME_MAP_READ) || !(flags &
AV_HWFRAME_MAP_OVERWRITE)) &&
!(f->flags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)) {
>
> > + !(f->flags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)) {
> > + ret = vkInvalidateMappedMemoryRanges(hwctx->act_dev,
> f->mem_count,
> > + map_mem_ranges);
> > + if (ret != VK_SUCCESS) {
> > + av_log(hwfc, AV_LOG_ERROR, "Failed to invalidate memory:
> %s\n",
> > + av_vk_ret2str(ret));
> > + err = AVERROR_EXTERNAL;
> > + goto fail;
> > + }
> > + }
> > +
> > + for (i = 0; i < planes; i++) {
> > + VkImageSubresource sub = {
> > + .aspectMask = planes < 2 ? VK_IMAGE_ASPECT_COLOR_BIT :
> > + i == 0 ? VK_IMAGE_ASPECT_PLANE_0_BIT :
> > + i == 1 ? VK_IMAGE_ASPECT_PLANE_1_BIT :
> > + VK_IMAGE_ASPECT_PLANE_2_BIT,
> > + };
> > + VkSubresourceLayout layout;
> > + vkGetImageSubresourceLayout(hwctx->act_dev, f->img, &sub,
> &layout);
> > + dst->data[i] = ((uint8_t *)mem[i]) + layout.offset;
> > + dst->linesize[i] = layout.rowPitch;
> > + }
> > +
> > + map->frame = f;
> > + map->flags = flags;
> > +
> > + err = ff_hwframe_map_create(src->hw_frames_ctx, dst, src,
> > + &vulkan_unmap_frame, map);
> > + if (err < 0)
> > + goto fail;
> > +
> > + return 0;
> > +
> > +fail:
>
> I think you need some unmaps here.
>
Fixed.
>
> > + av_free(map);
> > + return err;
> > +}
> > +
> > +#if CONFIG_LIBDRM
> > +static void vulkan_unmap_from(AVHWFramesContext *hwfc, HWMapDescriptor
> *hwmap)
> > +{
> > + int i;
> > + VulkanMapping *map = hwmap->priv;
> > + AVVulkanDeviceContext *hwctx = hwfc->device_ctx->hwctx;
> > +
> > + vkDestroyImage(hwctx->act_dev, map->frame->img, NULL);
> > + for (i = 0; i < map->frame->mem_count; i++)
> > + vkFreeMemory(hwctx->act_dev, map->frame->mem[i], NULL);
> > +
> > + av_freep(&map->frame);
> > +}
> > +
> > +static int vulkan_map_from_drm(AVHWFramesContext *hwfc, AVFrame *dst,
> > + const AVFrame *src, int flags)
> > +{
> > + int i, err = 0;
> > + VulkanMapping *map = NULL;
> > +
> > + /* Source frame */
> > + AVDRMFrameDescriptor *desc = (AVDRMFrameDescriptor *)src->data[0];
> > +
> > + /* Destination frame */
> > + AVVkFrame *f = NULL;
> > +#if HAS_DRM_MODIFIERS
> > + uint64_t modifier_buf[3];
> > + VkImageDrmFormatModifierListCreateInfoEXT drm_mod = {
> > + .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_DRM_FORMAT_
> MODIFIER_INFO_EXT,
> > + };
> > +#endif
> > + VkExternalMemoryImageCreateInfo ext_info = {
> > + .sType = VK_STRUCTURE_TYPE_EXTERNAL_
> MEMORY_IMAGE_CREATE_INFO,
> > +#if HAS_DRM_MODIFIERS
> > + .pNext = &drm_mod,
> > +#endif
> > + .handleTypes = VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT,
> > + };
> > + VkImportMemoryFdInfoKHR import_desc[3];
> > +
> > + if ((desc->nb_objects > 1) &&
> > + (desc->nb_objects != av_pix_fmt_count_planes(hwfc->format))) {
> > + av_log(hwfc, AV_LOG_ERROR, "Number of DRM objects doesn't match
> "
> > + "plane count!\n");
> > + return AVERROR(EINVAL);
> > + }
> > +
> > + for (i = 0; i < desc->nb_objects; i++) {
> > + import_desc[i].sType = VK_STRUCTURE_TYPE_IMPORT_
> MEMORY_FD_INFO_KHR;
> > + import_desc[i].pNext = NULL;
> > + import_desc[i].handleType = ext_info.handleTypes;
> > + import_desc[i].fd = desc->objects[i].fd;
> > +#if HAS_DRM_MODIFIERS
> > + modifier_buf[i] = desc->objects[i].format_modifier;
> > +#endif
> > + }
> > +#if HAS_DRM_MODIFIERS
> > + drm_mod.pDrmFormatModifiers = modifier_buf;
> > + drm_mod.drmFormatModifierCount = desc->nb_objects;
> > +#endif
> > +
> > + err = create_frame(hwfc, &f,
> > +#if HAS_DRM_MODIFIERS
> > + VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT,
> > +#else
> > + desc->objects[0].format_modifier ==
> DRM_FORMAT_MOD_LINEAR ?
> > + VK_IMAGE_TILING_LINEAR : VK_IMAGE_TILING_OPTIMAL,
> > +#endif
> > + DEFAULT_USAGE_FLAGS, desc->nb_objects > 1,
> &ext_info,
> > + (void *[3]) { &import_desc[0],
> > + &import_desc[1],
> > + &import_desc[2], });
>
> Um, yeah. This could be made saner now that we have
> DRM_FORMAT_MOD_INVALID to use.
>
I've added a check above. Don't worry, the ifdeffery should disappear in a
few weeks once it gets merged.
> > + if (err < 0)
> > + goto fail;
> > +
> > + /* The unmapping function will free this */
> > + dst->data[0] = (uint8_t *)f;
> > + dst->width = src->width;
> > + dst->height = src->height;
> > +
> > + map = av_mallocz(sizeof(VulkanMapping));
> > + if (!map)
> > + return AVERROR(ENOMEM);
>
> goto fail;
>
Fix'd.
>
> > +
> > + map->frame = f;
> > + map->flags = flags;
> > +
> > + err = ff_hwframe_map_create(dst->hw_frames_ctx, dst, src,
> > + &vulkan_unmap_from, map);
> > + if (err < 0)
> > + goto fail;
> > +
> > + av_log(hwfc, AV_LOG_DEBUG, "Mapped DRM object to Vulkan!\n");
> > +
> > + return 0;
> > +
> > +fail:
> > + vulkan_frame_free(f, NULL);
> > + av_free(map);
> > + return err;
> > +}
> > +
> > +#if CONFIG_VAAPI
> > +static int vulkan_map_from_vaapi(AVHWFramesContext *dst_fc,
> > + AVFrame *dst, const AVFrame *src,
> > + int flags)
> > +{
> > + int err;
> > + HWMapDescriptor *hwmap;
> > + AVFrame *tmp = av_frame_alloc();
> > + if (!tmp)
> > + return AVERROR(ENOMEM);
> > +
> > + tmp->format = AV_PIX_FMT_DRM_PRIME;
> > +
> > + err = av_hwframe_map(tmp, src, flags);
> > + if (err < 0)
> > + goto fail;
> > +
> > + err = vulkan_map_from_drm(dst_fc, dst, tmp, flags);
> > + if (err < 0)
> > + goto fail;
> > +
> > + /* Adjust the map descriptor so that unmap works correctly. */
> > + hwmap = (HWMapDescriptor*)dst->buf[0]->data;
> > + av_frame_unref(hwmap->source);
> > + err = av_frame_ref(hwmap->source, src);
>
> Huh, I recognise this hack. I guess it should probably be made into a
> proper function so it's less naughty (ff_hwframe_map_remove_middle() or
> something).
>
I guess so, its the way to deal with intermediary mappings.
> > +
> > +fail:
> > + av_frame_free(&tmp);
> > + return err;
> > +}
> > +#endif
> > +#endif
> > +
> > +static int vulkan_map_to(AVHWFramesContext *hwfc, AVFrame *dst,
> > + const AVFrame *src, int flags)
> > +{
> > + VulkanDevicePriv *p = hwfc->device_ctx->internal->priv;
> > + int drm_map = p->extensions & (EXT_EXTERNAL_FD_MEMORY |
> > + EXT_EXTERNAL_DMABUF_MEMORY |
> > + EXT_DRM_MODIFIER_FLAGS);
>
> This gets used as boolean below so it's checking whether you have any one
> of those extensions. I don't think that's what you want.
>
Right, this should be p->extensions & EXT_DRM_MODIFIER_FLAGS, fixed.
> > +
> > + if (!(p->extensions & EXT_EXTERNAL_MEMORY)) {
> > + av_log(hwfc, AV_LOG_ERROR, "Cannot import any external memory, "
> > + "VK_KHR_external_memory is
> unsupported!\n");
> > + return AVERROR(ENOSYS);
> > + }
> > +
> > + switch (src->format) {
> > +#if CONFIG_LIBDRM
> > +#if CONFIG_VAAPI
> > + case AV_PIX_FMT_VAAPI:
> > + if (drm_map)
> > + return vulkan_map_from_vaapi(hwfc, dst, src, flags);
> > +#endif
> > + case AV_PIX_FMT_DRM_PRIME:
> > + if (drm_map)
> > + return vulkan_map_from_drm(hwfc, dst, src, flags);
> > +#endif
> > + default:
> > + return AVERROR(ENOSYS);
> > + }
> > +}
> > +
> > +static int vulkan_map_from(AVHWFramesContext *hwfc, AVFrame *dst,
> > + const AVFrame *src, int flags)
> > +{
> > + int err;
> > +
> > + if ((dst->format != AV_PIX_FMT_NONE && !av_vkfmt_from_pixfmt(dst->format)))
> {
> > + av_log(hwfc, AV_LOG_ERROR, "Unsupported destination pixel
> format!\n");
> > + return AVERROR(EINVAL);
> > + }
> > +
> > + err = vulkan_map_frame(hwfc, dst, src, flags);
> > + if (err)
> > + return err;
> > +
> > + err = av_frame_copy_props(dst, src);
> > + if (err)
> > + return err;
> > +
> > + return 0;
> > +}
> > +
> > +int av_vk_create_buf(AVHWDeviceContext *ctx, AVVkBuffer *buf, size_t
> size,
> > + VkBufferUsageFlags usage, VkMemoryPropertyFlagBits
> flags,
> > + void *create_pnext, void *alloc_pnext)
>
> I don't think a function like this should be exposed to the user here.
> Maybe we could make some sort of hwcontext buffer pool API for creating
> auxiliary buffers if it were needed, but I'm not entirely clear what the
> user would want it for here. Similarly the other buffer functions below.
>
Yeah, I guess. I can specialize it here by removing the malloc when
mapping/unmapping.
> > +{
> > + int err;
> > + VkResult ret;
> > + VkMemoryRequirements req;
> > + AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > +
> > + VkBufferCreateInfo buf_spawn = {
> > + .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
> > + .pNext = create_pnext,
> > + .usage = usage,
> > + .sharingMode = VK_SHARING_MODE_EXCLUSIVE,
> > + .size = size, /* Gets FFALIGNED during alloc if host
> visible
> > + but should be ok */
> > + };
> > +
> > + ret = vkCreateBuffer(hwctx->act_dev, &buf_spawn, NULL, &buf->buf);
> > + if (ret != VK_SUCCESS) {
> > + av_log(ctx, AV_LOG_ERROR, "Failed to create buffer: %s\n",
> > + av_vk_ret2str(ret));
> > + return AVERROR_EXTERNAL;
> > + }
> > +
> > + vkGetBufferMemoryRequirements(hwctx->act_dev, buf->buf, &req);
> > +
> > + err = av_vk_alloc_mem(ctx, &req, flags, alloc_pnext,
> > + &buf->flags, &buf->mem);
> > + if (err)
> > + return err;
> > +
> > + ret = vkBindBufferMemory(hwctx->act_dev, buf->buf, buf->mem, 0);
> > + if (ret != VK_SUCCESS) {
> > + av_log(ctx, AV_LOG_ERROR, "Failed to bind memory to buffer:
> %s\n",
> > + av_vk_ret2str(ret));
> > + return AVERROR_EXTERNAL;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +void av_vk_free_buf(AVHWDeviceContext *ctx, AVVkBuffer *buf)
> > +{
> > + AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > + if (!buf)
> > + return;
> > +
> > + vkDestroyBuffer(hwctx->act_dev, buf->buf, NULL);
> > + vkFreeMemory(hwctx->act_dev, buf->mem, NULL);
> > +}
> > +
> > +int av_vk_map_buffers(AVHWDeviceContext *ctx, AVVkBuffer *buf, uint8_t
> *mem[],
> > + int nb_buffers, int invalidate)
> > +{
> > + VkResult ret;
> > + int i, err = 0;
> > + AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > + VkMappedMemoryRange *invalidate_ctx = NULL;
> > + int invalidate_count = 0;
> > +
> > + for (i = 0; i < nb_buffers; i++) {
> > + ret = vkMapMemory(hwctx->act_dev, buf[i].mem, 0,
> > + VK_WHOLE_SIZE, 0, (void **)&mem[i]);
> > + if (ret != VK_SUCCESS) {
> > + av_log(ctx, AV_LOG_ERROR, "Failed to map buffer memory:
> %s\n",
> > + av_vk_ret2str(ret));
> > + err = AVERROR_EXTERNAL;
> > + goto end;
> > + }
> > + }
> > +
> > + if (!invalidate)
> > + goto end;
> > +
> > + for (i = 0; i < nb_buffers; i++) {
> > + const VkMappedMemoryRange ival_buf = {
> > + .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE,
> > + .memory = buf[i].mem,
> > + .size = VK_WHOLE_SIZE,
> > + };
> > + if (buf[i].flags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)
> > + continue;
> > + ADD_VAL_TO_LIST(invalidate_ctx, invalidate_count, ival_buf);
> > + }
> > +
> > + if (invalidate_count) {
> > + ret = vkInvalidateMappedMemoryRanges(hwctx->act_dev,
> > + invalidate_count,
> > + invalidate_ctx);
> > + if (ret != VK_SUCCESS) {
> > + av_log(ctx, AV_LOG_ERROR, "Failed to invalidate memory:
> %s\n",
> > + av_vk_ret2str(ret));
> > + err = AVERROR_EXTERNAL;
> > + goto end;
> > + }
> > + }
> > +
> > +end:
> > + av_free(invalidate_ctx);
> > + return err;
> > +}
> > +
> > +int av_vk_unmap_buffers(AVHWDeviceContext *ctx, AVVkBuffer *buf, int
> nb_buffers,
> > + int flush)
> > +{
> > + VkResult ret;
> > + int i, err = 0;
> > + AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > + VkMappedMemoryRange *flush_ctx = NULL;
> > + int flush_count = 0;
> > +
> > + if (flush) {
> > + for (i = 0; i < nb_buffers; i++) {
> > + const VkMappedMemoryRange ival_buf = {
> > + .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE,
> > + .memory = buf[i].mem,
> > + .size = VK_WHOLE_SIZE,
> > + };
> > + if (buf[i].flags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)
> > + continue;
> > + ADD_VAL_TO_LIST(flush_ctx, flush_count, ival_buf);
> > + }
> > + }
> > +
> > + if (flush_count) {
> > + ret = vkFlushMappedMemoryRanges(hwctx->act_dev, flush_count,
> flush_ctx);
> > + if (ret != VK_SUCCESS) {
> > + av_log(ctx, AV_LOG_ERROR, "Failed to flush memory: %s\n",
> > + av_vk_ret2str(ret));
> > + err = AVERROR_EXTERNAL; /* We still want to try to unmap
> them */
> > + }
> > + }
> > +
> > + for (i = 0; i < nb_buffers; i++)
> > + vkUnmapMemory(hwctx->act_dev, buf[i].mem);
> > +
> > +end:
> > + av_free(flush_ctx);
> > + return err;
> > +}
> > +
> > +static int transfer_image_buf(AVHWDeviceContext *ctx, AVVkFrame *frame,
> > + AVVkBuffer *buffer, const int *stride,
> int w, int h,
> > + enum AVPixelFormat pix_fmt, int to_buf)
> > +{
> > + int i;
> > + VkResult ret;
> > + AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > + VulkanDevicePriv *s = ctx->internal->priv;
> > +
> > + const int planes = av_pix_fmt_count_planes(pix_fmt);
> > + const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pix_fmt);
> > +
> > + VkCommandBufferBeginInfo cmd_start = {
> > + .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO,
> > + .flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT,
> > + };
> > +
> > + VkSubmitInfo s_info = {
> > + .sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
> > + .commandBufferCount = 1,
> > + .pCommandBuffers = &s->exec.buf,
> > + };
> > +
> > + vkBeginCommandBuffer(s->exec.buf, &cmd_start);
> > +
> > + { /* Change the image layout to something more optimal for
> transfers */
> > + VkImageMemoryBarrier bar = {
> > + .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
> > + .srcAccessMask = 0,
> > + .dstAccessMask = to_buf ? VK_ACCESS_TRANSFER_READ_BIT :
> > + VK_ACCESS_TRANSFER_WRITE_BIT,
> > + .oldLayout = frame->layout,
> > + .newLayout = to_buf ? VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL
> :
> > + VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
> > + .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
> > + .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
> > + .image = frame->img,
> > + .subresourceRange.levelCount = 1,
> > + .subresourceRange.layerCount = 1,
> > + };
> > +
> > + if (planes == 1) {
> > + bar.subresourceRange.aspectMask =
> VK_IMAGE_ASPECT_COLOR_BIT;
> > + } else {
> > + bar.subresourceRange.aspectMask =
> VK_IMAGE_ASPECT_PLANE_0_BIT;
> > + bar.subresourceRange.aspectMask |=
> VK_IMAGE_ASPECT_PLANE_1_BIT;
> > + if (planes > 2)
> > + bar.subresourceRange.aspectMask |=
> VK_IMAGE_ASPECT_PLANE_2_BIT;
> > + }
> > +
> > + vkCmdPipelineBarrier(s->exec.buf,
> VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,
> > + VK_PIPELINE_STAGE_TRANSFER_BIT,
> > + 0, 0, NULL, 0, NULL, 1, &bar);
> > +
> > + /* Update to the new layout */
> > + frame->layout = bar.newLayout;
> > + frame->access = bar.dstAccessMask;
> > + }
> > +
> > + /* Schedule a copy for each plane */
> > + for (i = 0; i < planes; i++) {
> > + VkImageSubresourceLayers sub = {
> > + .aspectMask = planes < 2 ? VK_IMAGE_ASPECT_COLOR_BIT :
> > + i == 0 ? VK_IMAGE_ASPECT_PLANE_0_BIT :
> > + i == 1 ? VK_IMAGE_ASPECT_PLANE_1_BIT :
> > + VK_IMAGE_ASPECT_PLANE_2_BIT,
> > + .layerCount = 1,
> > + };
> > + const int p_w = i > 0 ? AV_CEIL_RSHIFT(w, desc->log2_chroma_w)
> : w;
> > + const int p_h = i > 0 ? AV_CEIL_RSHIFT(h, desc->log2_chroma_h)
> : h;
> > + VkBufferImageCopy buf_reg = {
> > + .bufferOffset = 0,
> > + /* Buffer stride isn't in bytes, it's in samples, the
> implementation
> > + * uses the image's VkFormat to know how many bytes per
> sample
> > + * the buffer has. So we have to convert by dividing.
> Stupid. */
> > + .bufferRowLength = stride[i] / desc->comp[i].step,
>
> And does the right thing for YUYV?
>
Apparently yes, uploading and downloading uyvy422 works.
>
> > + .bufferImageHeight = p_h,
> > + .imageSubresource = sub,
> > + .imageOffset = { 0 },
> > + .imageExtent = { p_w, p_h, 1, },
> > + };
> > + if (to_buf)
> > + vkCmdCopyImageToBuffer(s->exec.buf, frame->img,
> frame->layout,
> > + buffer[i].buf, 1, &buf_reg);
> > + else
> > + vkCmdCopyBufferToImage(s->exec.buf, buffer[i].buf,
> frame->img,
> > + frame->layout, 1, &buf_reg);
> > + }
> > +
> > + vkEndCommandBuffer(s->exec.buf);
> > +
> > + ret = vkQueueSubmit(s->exec.queue, 1, &s_info, s->exec.fence);
> > + if (ret != VK_SUCCESS) {
> > + av_log(ctx, AV_LOG_ERROR, "Unable to submit command buffer:
> %s\n",
> > + av_vk_ret2str(ret));
> > + return AVERROR_EXTERNAL;
> > + } else {
> > + vkWaitForFences(hwctx->act_dev, 1, &s->exec.fence, VK_TRUE,
> UINT64_MAX);
> > + vkResetFences(hwctx->act_dev, 1, &s->exec.fence);
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static int vulkan_transfer_data_to(AVHWFramesContext *hwfc, AVFrame
> *dst,
> > + const AVFrame *src)
> > +{
> > + int i, err = 0;
> > + AVFrame *map = NULL;
> > + AVVkBuffer buf[3] = { { 0 } };
> > + AVVkFrame *f = (AVVkFrame *)dst->data[0];
> > + AVHWDeviceContext *dev_ctx = hwfc->device_ctx;
> > + VulkanDevicePriv *p = dev_ctx->internal->priv;
> > + const int planes = av_pix_fmt_count_planes(src->format);
> > + const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(src->format);
> > + int map_host = p->extensions & EXT_EXTERNAL_HOST_MEMORY;
> > +
> > + if ((src->format != AV_PIX_FMT_NONE && !av_vkfmt_from_pixfmt(src->format)))
> {
> > + av_log(hwfc, AV_LOG_ERROR, "Unsupported source pixel
> format!\n");
> > + return AVERROR(EINVAL);
> > + }
> > +
> > + if (src->width > hwfc->width || src->height > hwfc->height)
> > + return AVERROR(EINVAL);
> > +
> > + /* Path one - image is host visible and linear */
> > + if (f->tiling & VK_IMAGE_TILING_LINEAR &&
> > + f->flags & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT) {
> > + map = av_frame_alloc();
> > + if (!map)
> > + return AVERROR(ENOMEM);
> > + map->format = src->format;
> > +
> > + err = vulkan_map_frame(hwfc, map, dst, AV_HWFRAME_MAP_WRITE);
> > + if (err)
> > + goto end;
> > +
> > + err = av_frame_copy(map, src);
> > + goto end;
> > + }
> > +
> > + /* Path three - we can import _host_ memory and bind it to a buffer
> */
> > + for (i = 0; i < planes; i++) {
> > + int h = src->height;
> > + int p_height = i > 0 ? AV_CEIL_RSHIFT(h, desc->log2_chroma_h) :
> h;
> > + size_t size = p_height*src->linesize[i];
> > + VkImportMemoryHostPointerInfoEXT import_desc = {
> > + .sType = VK_STRUCTURE_TYPE_IMPORT_
> MEMORY_HOST_POINTER_INFO_EXT,
> > + .handleType = VK_EXTERNAL_MEMORY_HANDLE_
> TYPE_HOST_ALLOCATION_BIT_EXT,
> > + .pHostPointer = src->data[i],
> > + };
> > + err = av_vk_create_buf(dev_ctx, &buf[i], size,
> > + VK_BUFFER_USAGE_TRANSFER_SRC_BIT,
> > + VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT,
> NULL,
> > + map_host ? &import_desc : NULL);
> > + if (err)
> > + goto end;
> > + }
> > +
> > + /* Path two - we can't import host memory so we have to do 2 copies
> */
> > + if (!map_host) {
> > + uint8_t *mem[3];
> > + if ((err = av_vk_map_buffers(dev_ctx, buf, mem, planes, 0)))
> > + goto end;
> > +
> > + for (i = 0; i < planes; i++) {
> > + int h = src->height;
> > + int p_height = i > 0 ? AV_CEIL_RSHIFT(h,
> desc->log2_chroma_h) : h;
> > + memcpy(mem[i], src->data[i], p_height*src->linesize[i]);
> > + }
> > +
> > + if ((err = av_vk_unmap_buffers(dev_ctx, buf, planes, 1)))
> > + goto end;
> > + }
>
> Any particular reason why the paths are labelled one, three, two?
>
> Which devices run each of these cases?
>
Path three is really path 2 + a copy.
It depends on the device and whether the image is linear.
If the image is linear, path 1 is used and the image mapped and downloaded
directly.
If not, path 2 is taken. A temporary buffer is created and the image is
copied to the buffer.
If the device supports host mapped memory, the buffer was mapped to a chunk
of RAM and downloaded to it. If the device doesn't support host mapped
memory, the buffer was allocated normally on the GPU and is mapped and
copied to RAM by us.
> > + /* Copy buffer to image */
> > + transfer_image_buf(dev_ctx, f, buf, src->linesize,
> > + src->width, src->height, src->format, 0);
> > +
> > +end:
> > + av_frame_free(&map);
> > + for (i = 0; i < planes; i++)
> > + av_vk_free_buf(dev_ctx, &buf[i]);
> > +
> > + return err;
> > +}
> > +
> > +static int vulkan_transfer_data_from(AVHWFramesContext *hwfc, AVFrame
> *dst,
> > + const AVFrame *src)
> > +{
> > + int i, err = 0;
> > + AVFrame *map = NULL;
> > + AVVkBuffer buf[3] = { { 0 } };
> > + AVVkFrame *f = (AVVkFrame *)src->data[0];
> > + AVHWDeviceContext *dev_ctx = hwfc->device_ctx;
> > + VulkanDevicePriv *p = dev_ctx->internal->priv;
> > + const int planes = av_pix_fmt_count_planes(dst->format);
> > + const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(dst->format);
> > + const int map_host = p->extensions & EXT_EXTERNAL_HOST_MEMORY;
> > +
> > + if (dst->width > hwfc->width || dst->height > hwfc->height)
> > + return AVERROR(EINVAL);
> > +
> > + /* Path one - image is host visible and linear */
> > + if (f->tiling & VK_IMAGE_TILING_LINEAR &&
> > + f->flags & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT) {
> > + map = av_frame_alloc();
> > + if (!map)
> > + return AVERROR(ENOMEM);
> > + map->format = dst->format;
> > +
> > + err = vulkan_map_frame(hwfc, map, src, AV_HWFRAME_MAP_READ);
> > + if (err)
> > + goto end;
> > +
> > + err = av_frame_copy(dst, map);
> > + goto end;
> > + }
> > +
> > + /* Path two */
> > + for (i = 0; i < planes; i++) {
> > + VkImportMemoryHostPointerInfoEXT import_desc = {
> > + .sType = VK_STRUCTURE_TYPE_IMPORT_
> MEMORY_HOST_POINTER_INFO_EXT,
> > + .handleType = VK_EXTERNAL_MEMORY_HANDLE_
> TYPE_HOST_ALLOCATION_BIT_EXT,
> > + .pHostPointer = dst->data[i],
> > + };
> > + int h = dst->height;
> > + int p_height = i > 0 ? AV_CEIL_RSHIFT(h, desc->log2_chroma_h) :
> h;
> > + err = av_vk_create_buf(dev_ctx, &buf[i], p_height *
> dst->linesize[i],
> > + VK_BUFFER_USAGE_TRANSFER_DST_BIT,
> > + VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT,
> NULL,
> > + map_host ? &import_desc : NULL);
> > + }
> > +
> > + /* Copy image to buffer */
> > + transfer_image_buf(dev_ctx, f, buf, dst->linesize,
> > + dst->width, dst->height, dst->format, 1);
> > +
> > + if (!map_host) {
> > + uint8_t *mem[3];
> > + av_vk_map_buffers(dev_ctx, buf, mem, planes, 1);
> > +
> > + for (i = 0; i < planes; i++) {
> > + int h = dst->height;
> > + int p_height = i > 0 ? AV_CEIL_RSHIFT(h,
> desc->log2_chroma_h) : h;
> > + memcpy(dst->data[i], mem[i], p_height * dst->linesize[i]);
> > + }
> > +
> > + av_vk_unmap_buffers(dev_ctx, buf, planes, 0);
> > + }
> > +
> > +end:
> > + av_frame_free(&map);
> > + for (i = 0; i < planes; i++)
> > + av_vk_free_buf(dev_ctx, &buf[i]);
> > +
> > + return err;
> > +}
> > +
> > +const HWContextType ff_hwcontext_type_vulkan = {
> > + .type = AV_HWDEVICE_TYPE_VULKAN,
> > + .name = "Vulkan",
> > +
> > + .device_hwctx_size = sizeof(AVVulkanDeviceContext),
> > + .device_priv_size = sizeof(VulkanDevicePriv),
> > + .frames_hwctx_size = sizeof(AVVulkanFramesContext),
> > +
> > + .device_create = &vulkan_device_create,
> > + .device_derive = &vulkan_device_derive,
>
> The lack of device_init is suspicious. Do you not need any setup on
> external devices? You probably want to at least check that they support
> the required extensions.
>
Forgot that users could do this. Will fix.
>
> > +
> > + .frames_get_constraints = &vulkan_frames_get_constraints,
> > + .frames_init = vulkan_frames_init,
> > + .frames_get_buffer = vulkan_get_buffer,
> > +
> > + .transfer_get_formats = vulkan_transfer_get_formats,
> > + .transfer_data_to = vulkan_transfer_data_to,
> > + .transfer_data_from = vulkan_transfer_data_from,
> > +
> > + .map_to = vulkan_map_to,
> > + .map_from = vulkan_map_from,
> > +
> > + .pix_fmts = (const enum AVPixelFormat[]) {
> > + AV_PIX_FMT_VULKAN,
> > + AV_PIX_FMT_NONE
> > + },
> > +};
> > diff --git a/libavutil/hwcontext_vulkan.h b/libavutil/hwcontext_vulkan.h
> > new file mode 100644
> > index 0000000000..db2204d590
> > --- /dev/null
> > +++ b/libavutil/hwcontext_vulkan.h
> > @@ -0,0 +1,116 @@
> > +/*
> > + * Vulkan hwcontext
> > + * Copyright (c) 2018 Rostislav Pehlivanov <atomnuker at gmail.com>
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301 USA
> > + */
> > +
> > +#ifndef AVUTIL_HWCONTEXT_VULKAN_H
> > +#define AVUTIL_HWCONTEXT_VULKAN_H
> > +
> > +#include <vulkan/vulkan.h>
> > +
> > +/**
> > + * @file
> > + * API-specific header for AV_HWDEVICE_TYPE_VULKAN.
> > + *
> > + * For user-allocated pools, AVHWFramesContext.pool must return
> AVBufferRefs
> > + * with the data pointer set to an AVVkFrame.
> > + */
> > +
> > +/* Convenience function to load extension functions */
> > +#define VK_LOAD_PFN(inst, name) PFN_##name pfn_##name = (PFN_##name)
> \
> > +
> vkGetInstanceProcAddr(inst, #name)
>
> This shouldn't be in the user-visible API.
>
Its not API, its a macro. But I guess I'll remove it.
> > +
> > +/* Main Vulkan context, allocated as AVHWDeviceContext.hwctx.
> > + * All of these can be set before init to change what the context uses
> */
> > +typedef struct AVVulkanDeviceContext {
> > + VkInstance inst; /* Instance */
> > + VkPhysicalDevice phys_dev; /* Physical device */
> > + VkDevice act_dev; /* Activated physical device */
> > + int queue_family_index; /* Queue family index for graphics */
> > + int queue_family_tx_index; /* Queue family index for transfer ops
> only */
> > + int queue_family_comp_index; /* Queue family index for compute ops
> */
> > +} AVVulkanDeviceContext;
>
> Should all be doxygen comments, and below.
>
Will fix.
> > +
> > +/* Allocated as AVHWFramesContext.hwctx, used to set pool-specific
> options */
> > +typedef struct AVVulkanFramesContext {
> > + VkImageTiling tiling; /* Controls the tiling of output frames
> */
> > + VkImageUsageFlagBits usage; /* Defines the usage of output frames */
> > + int disjoint; /* Set to 1 to allocate all planes
> separately */
> > + void *create_pnext; /* Extension data for creation */
> > + void *alloc_pnext[3]; /* Extension data for allocation */
> > +} AVVulkanFramesContext;
> > +
> > +/*
> > + * Frame structure, the VkFormat of the image will always match
> > + * the pool's sw_format.
> > + */
> > +typedef struct AVVkFrame {
> > + VkImage img;
> > + VkImageTiling tiling;
> > + int mem_count; /* Always 1 for non-disjoint images, #planes for
> disjoint */
> > + VkDeviceMemory mem[3];
> > + VkMemoryPropertyFlagBits flags; /* OR'd flags for all memory
> allocated */
> > +
> > + /* Updated after every barrier */
> > + VkAccessFlagBits access;
> > + VkImageLayout layout;
> > +} AVVkFrame;
>
> As noted above, I'm very dubious about including anything after this point
> in ther user API.
>
> > +/* Converts Vulkan return values to strings */
> > +const char *av_vk_ret2str(VkResult res);
> > +
> > +/* Converts AVPixelFormat to VkFormat, returns VK_FORMAT_UNDEFINED if
> unsupported
> > + * by the hwcontext */
> > +VkFormat av_vkfmt_from_pixfmt(enum AVPixelFormat p);
>
This will be gone if wm4's api gets in.
> > +
> > +/* Allocates memory, sets flags based on the memory type it finds for
> them.
> > + * Will align size to the minimum map alignment requirement in case
> req_flags
> > + * has VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT set */
> > +int av_vk_alloc_mem(AVHWDeviceContext *ctx, VkMemoryRequirements *req,
> > + VkMemoryPropertyFlagBits req_flags, void
> *alloc_extension,
> > + VkMemoryPropertyFlagBits *mem_flags, VkDeviceMemory
> *mem);
>
I think this should stay.
> > +
> > +/* Buffer helpers */
> > +typedef struct AVVkBuffer {
> > + VkBuffer buf;
> > + VkDeviceMemory mem;
> > + VkMemoryPropertyFlagBits flags;
> > +} AVVkBuffer;
> > +
> > +int av_vk_create_buf(AVHWDeviceContext *ctx, AVVkBuffer *buf, size_t
> size,
> > + VkBufferUsageFlags usage, VkMemoryPropertyFlagBits
> flags,
> > + void *create_pnext, void *alloc_pnext);
> > +int av_vk_map_buffers(AVHWDeviceContext *ctx, AVVkBuffer *buf, uint8_t
> *mem[],
> > + int nb_buffers, int invalidate);
> > +int av_vk_unmap_buffers(AVHWDeviceContext *ctx, AVVkBuffer *buf, int
> nb_buffers,
> > + int flush);
> > +void av_vk_free_buf(AVHWDeviceContext *ctx, AVVkBuffer *buf);
> > +
> > +/* Creates the needed contexts to run commands */
> > +typedef struct AVVkExecContext {
> > + VkCommandPool pool;
> > + VkCommandBuffer buf;
> > + VkQueue queue;
> > + VkFence fence;
> > +} AVVkExecContext;
> > +
> > +int av_vk_create_exec_ctx(AVHWDeviceContext *ctx, AVVkExecContext *e,
> int queue);
> > +void av_vk_free_exec_ctx(AVHWDeviceContext *ctx, AVVkExecContext *e);
>
These will be moved.
> > +
> > +#endif /* AVUTIL_HWCONTEXT_VULKAN_H */
> > diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c
> > index e77d5f44b9..ef4933389c 100644
> > --- a/libavutil/pixdesc.c
> > +++ b/libavutil/pixdesc.c
> > @@ -1652,6 +1652,10 @@ static const AVPixFmtDescriptor
> av_pix_fmt_descriptors[AV_PIX_FMT_NB] = {
> > .name = "videotoolbox_vld",
> > .flags = AV_PIX_FMT_FLAG_HWACCEL,
> > },
> > + [AV_PIX_FMT_VULKAN] = {
> > + .name = "vulkan",
> > + .flags = AV_PIX_FMT_FLAG_HWACCEL,
> > + },
> > [AV_PIX_FMT_GBRP] = {
> > .name = "gbrp",
> > .nb_components = 3,
> > diff --git a/libavutil/pixfmt.h b/libavutil/pixfmt.h
> > index e184a56672..a149aa67d9 100644
> > --- a/libavutil/pixfmt.h
> > +++ b/libavutil/pixfmt.h
> > @@ -330,6 +330,10 @@ enum AVPixelFormat {
> > */
> > AV_PIX_FMT_OPENCL,
> >
> > + /* Vulkan hardware images,
> > + * data[0] contain an AVVkFrame */
> > + AV_PIX_FMT_VULKAN,
> > +
> > AV_PIX_FMT_NB ///< number of pixel formats, DO NOT USE THIS
> if you want to link with shared libav* because the number of formats might
> differ between versions
> > };
> >
> > diff --git a/libavutil/version.h b/libavutil/version.h
> > index d3dd2df544..296c24bcd4 100644
> > --- a/libavutil/version.h
> > +++ b/libavutil/version.h
> > @@ -79,7 +79,7 @@
> > */
> >
> > #define LIBAVUTIL_VERSION_MAJOR 56
> > -#define LIBAVUTIL_VERSION_MINOR 12
> > +#define LIBAVUTIL_VERSION_MINOR 13
> > #define LIBAVUTIL_VERSION_MICRO 100
> >
> > #define LIBAVUTIL_VERSION_INT AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR,
> \
> >
>
> Do you have a user program (other than ffmpeg.c) which uses any of this,
> ideally with an external Vulkan context it's also doing other stuff in?
> (mpv?)
>
No? Should I have?
> (I didn't read this carefully looking for memory issues, nor have I tested
> it since some previous version: I can do that if you want.)
>
Not yet, after I resubmit.
> Thanks,
>
> - Mark
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
More information about the ffmpeg-devel
mailing list