[FFmpeg-devel] [PATCH] avcodec/qsvenc: make QSV encoder encode VAAPI and D3D11 frames directly

Soft Works softworkz at hotmail.com
Sat Jun 11 02:54:40 EEST 2022



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> Xiang, Haihao
> Sent: Thursday, June 9, 2022 8:48 AM
> To: ffmpeg-devel at ffmpeg.org
> Cc: Wu, Tong1 <tong1.wu at intel.com>; Chen, Wenbin
> <wenbin.chen at intel.com>
> Subject: Re: [FFmpeg-devel] [PATCH] avcodec/qsvenc: make QSV encoder
> encode VAAPI and D3D11 frames directly
> 
> On Wed, 2022-06-08 at 11:13 +0000, Soft Works wrote:
> > > -----Original Message-----
> > > From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> Xiang,
> > > Haihao
> > > Sent: Wednesday, June 8, 2022 10:42 AM
> > > To: ffmpeg-devel at ffmpeg.org
> > > Cc: Wu, Tong1 <tong1.wu at intel.com>; Chen, Wenbin
> <wenbin.chen at intel.com>
> > > Subject: Re: [FFmpeg-devel] [PATCH] avcodec/qsvenc: make QSV
> encoder encode
> > > VAAPI and D3D11 frames directly
> > >
> > > On Wed, 2022-06-08 at 05:08 +0000, Soft Works wrote:
> > > > > -----Original Message-----
> > > > > From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On
> Behalf Of Tong
> > > > > Wu
> > > > > Sent: Tuesday, June 7, 2022 11:22 AM
> > > > > To: ffmpeg-devel at ffmpeg.org
> > > > > Cc: Tong Wu <tong1.wu at intel.com>; Wenbin Chen
> <wenbin.chen at intel.com>
> > > > > Subject: [FFmpeg-devel] [PATCH] avcodec/qsvenc: make QSV
> encoder encode
> > > > > VAAPI
> > > > > and D3D11 frames directly
> >
> > [..]
> >
> > > > > 2.35.1.windows.2
> > > >
> > > > Hi,
> > > >
> > > > thanks for submitting this patch. Though, I'm afraid, but this
> > > >
> > > > - fundamentally contradicts the logic of ffmpeg's handling of
> hw
> > >
> > > acceleration,
> > > >   hw device and hw frames contexts
> > > > - adds code to an encoder, doing things an encoder is not
> supposed to do-
> > >
> > > qsv
> > > > encoders and decoders have their own context => QSV
> > >
> > > nvdec and nvenc have CUDA but nvenc can also support D3D11va, it
> sounds make
> > > sense for me to support D3D11va/vaapi in qsvenc too as
> d3d11va/vaapi are
> > > used
> > > internally in MediaSDK.
> >
> > Can you please post a command line showing nvenc working with input
> > from a D3D11VA decoder and without using any
> hwmap/hwupload/hwdownload
> > filters?
> >
> 
> According to the code below, nvenc may accept d3d11 frames directly,
> 
> https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/nvenc.c#L46-
> L72
> 
> so the command below should work
> 
> $> ffmpeg -y -hwaccel_output_format d3d11 -hwaccel d3d11va -i
> input.mp4 -c:v
> hevc_nvenc out.mp4

Right, it does work. Thanks for the command; I had tried it like that
before, but in a "wrong" branch.

Now I've taken a deeper look into this and into NVENC's ability to
encode plain D3D11 frames directly. There are quite a few differences
between NVENC and QSVENC.


HW Frames Contexts
------------------

QSVENC

MSDK cannot work with VAAPI, D3D9 or D3D11 frames directly.
An application is always required to wrap such frames in mfxFrameSurface1
structures and to manage a collection of those surface descriptions.
This abstraction allows coding against the MSDK API independently of the
underlying technology.
The technical representation of this in ffmpeg is the QSVFramesContext.
When the input consists of plain VAAPI or D3D11 frames (a hw frames
context), it is required to derive a new QSVFramesContext from the input
hw frames context (e.g. via the hwmap filter), where "deriving" means
setting up a new QSVFramesContext which performs the required wrapping
(or "mapping", as ffmpeg calls it).
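To illustrate, this is roughly what such a derivation looks like when done
explicitly on the command line (a sketch; the input/output file names are
placeholders and a VAAPI-capable Linux system is assumed):

```shell
# Decode via VAAPI, keep the frames in GPU memory, then let hwmap derive
# a QSV device/frames context from the VAAPI one so that h264_qsv can
# consume the frames without a download/upload round-trip.
ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -i input.mp4 \
       -vf hwmap=derive_device=qsv,format=qsv \
       -c:v h264_qsv out.mp4
```

The hwmap filter here creates the derived QSVFramesContext; no frame data
is copied, the QSV surfaces just wrap the underlying VAAPI surfaces.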

I think the way this logic is reflected in ffmpeg is very well thought
out and provides a high degree of flexibility.


NVENC

The situation is very different here. Nvidia provides platform independence
not by wrapping platform-specific GPU frame types, but through its own
custom type: CUDA memory/frames. This is what decoders output, what filters
use for input/output, and what encoders take as input.

What I don't know is whether it would be possible to map D3D11 frames to
CUDA frames and vice versa. If it is, that would IMO be the preferable
way to deal with different hw frame types.
At least it isn't implemented at this time; the only frames
derivation/mapping currently implemented for CUDA is from and to Vulkan.
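For completeness, that Vulkan path would look something like the following
(a sketch I haven't verified on hardware; it assumes an ffmpeg build with
Vulkan support, a Vulkan-capable Nvidia driver, and placeholder file names):

```shell
# Decode via CUVID/NVDEC keeping frames as CUDA memory, map them to
# Vulkan frames via hwmap, then download to system memory for a
# software encoder - the only mapping route CUDA frames have today.
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 \
       -vf hwmap=derive_device=vulkan,format=vulkan,hwdownload,format=nv12 \
       -c:v libx264 out.mp4
```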

Hence I can't say whether the NVENC implementation that takes D3D11 frames
directly was done out of laziness or whether it was the only possible way.
If D3D11 frames can't be mapped to CUDA frames, and only the NVENC
encoders are able to process D3D11 frames, then it would of course have
been the only option.

But in any case, it's not the same situation as with QSVENC, because
NVENC can take D3D11 frames as input directly, without wrapping/mapping
them first.

----------------

There are more differences, but I don't want to drive it too far.

The bottom line is:

- NVENC can take a D3D11 frames context as input directly
- QSVENC can't - the input first needs to be mapped to a QSVFramesContext


Concluding opinion:

An encoder should not include (duplicated) code for creating a derived
frames context.
The same goal (= getting the command lines that Haihao posted working)
could be achieved by auto-inserting a hwmap filter in those cases, which
would probably take only a few lines of code.

We don't have a precedent for auto-inserting a hwmap filter, but we do
auto-insert filters in many other cases, so that seems to me at least an
option worth thinking about.
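For illustration, the explicit equivalent of what such an auto-inserted
hwmap would do in the D3D11 case from Tong's patch might look like this
(a sketch; file names are placeholders and I haven't verified this exact
line on Windows):

```shell
# Decode via D3D11VA, keep frames in GPU memory, then explicitly map
# them into a derived QSV frames context for the QSV encoder.
ffmpeg -hwaccel d3d11va -hwaccel_output_format d3d11 -i input.mp4 \
       -vf hwmap=derive_device=qsv,format=qsv \
       -c:v hevc_qsv out.mp4
```

Auto-insertion would simply spare the user from having to spell out the
-vf hwmap=... part.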

I'm curious about other opinions...

Thanks,
softworkz


