[FFmpeg-devel] [PATCH] lavc/qsvdec: allow qsv decoders to use initialized device

Soft Works softworkz at hotmail.com
Wed Sep 2 11:41:02 EEST 2020



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> Rogozhkin, Dmitry V
> Sent: Wednesday, September 2, 2020 8:45 AM
> To: ffmpeg-devel at ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH] lavc/qsvdec: allow qsv decoders to use
> initialized device
> 
> On Wed, 2020-09-02 at 04:32 +0000, Soft Works wrote:
> > > -----Original Message-----
> > > From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> > > Soft Works
> > > Sent: Wednesday, September 2, 2020 6:13 AM
> > > To: FFmpeg development discussions and patches <ffmpeg-
> > > devel at ffmpeg.org>
> > > Subject: Re: [FFmpeg-devel] [PATCH] lavc/qsvdec: allow qsv decoders
> > > to use initialized device
> > >
> > > >
> > > > > -----Original Message-----
> > > > > From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On
> Behalf
> > > > > Of Dmitry Rogozhkin
> > > > > Sent: Wednesday, September 2, 2020 4:44 AM
> > > > > To: ffmpeg-devel at ffmpeg.org
> > > > > Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin at intel.com>
> > > > > Subject: [FFmpeg-devel] [PATCH] lavc/qsvdec: allow qsv decoders
> > > > > to use initialized device
> > > > >
> > > > > qsv decoders did not allow to use devices explicitly initialized
> > > > > on the command line and actually were using default device. This
> > > > > starts to cause confusion with intel discrete GPUs since in this
> > > > > case decoder might run on default integrated GPU device
> > > > > (/dev/dri/renderD128) and encoder on the device specified on the
> > > > > command line
> > > >
> > > > (/dev/dri/renderD129).
> > > > >
> > > > > Example:
> > > > > ffmpeg -init_hw_device vaapi=va:/dev/dri/renderD129
> > > > > -init_hw_device qsv=hw at va \
> > > > >   -c:v h264_qsv -i input.h264 -c:v hevc_qsv -y output.h264
> > > I apologize, I picked the wrong thing. The qsv_device parameter is
> > > what allows setting the device for a QSV decoder:
> > >
> > > ffmpeg -qsv_device /dev/dri/renderD128 -c:v:0 h264_qsv -hwaccel:v:0
> > > qsv -i INPUT ....
> > >
> > > Kind regards,
> > > softworkz
> >
> > Here's the commit where the parameter had been added:
> >
> > https://github.com/FFmpeg/FFmpeg/commit/1a79b8f8d2b5d26c60c237d6e585873238e46914
> 
> I am aware of this option.
> 
> > -qsv_device /dev/dri/renderD129
> By itself this doesn't work. Both decoder and encoder will run on
> /dev/dri/renderD128 instead.
> 
> > -hwaccel qsv -qsv_device /dev/dri/renderD129
> Adding -hwaccel helps. This works. However, to me this is non-intuitive:
> why should qsv_device be used instead of hwaccel_device, while ffmpeg
> help gives a different hint:
>     -hwaccel hwaccel name  use HW accelerated decoding
>     -hwaccel_device devicename  select a device for HW acceleration
> From this perspective, Haihao's patch which is currently on the mailing
> list makes sense to me; it just simplifies things.

In case of QSV, the meaning of hwaccel_device is already defined:

device selects a value in 'MFX_IMPL_*'. Allowed values are:
auto, sw, hw, auto_any, hw_any, hw2, hw3, hw4

From my understanding, that's the reason why the qsv_device parameter
was introduced.
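
For illustration (an untested sketch; file names are placeholders), selecting the
second hardware device through those MFX_IMPL values would look roughly like:

ffmpeg -hwaccel qsv -hwaccel_device hw2 -c:v h264_qsv -i input.h264 -c:v hevc_qsv -y output.h265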

> 
> Unfortunately ffmpeg device selection options are quite confusing. 

That's true without doubt. Sadly, there's not much consistency among
all hw accelerations in general. 

> This will work. But returning back to transcoding case: there are 2 significantly
> different command lines which should be used for transcoding and encoding
> to make things run on /dev/dri/renderD129.
> This is inconvenient to handle… And additionally there is also
> -filter_hw_device which also contributes to the complication. Also there are
> special marks in documentation for the qsv “Unlike most other values, this
> option does not enable accelerated decoding (that is used automatically
> whenever a qsv decoder is selected), but accelerated transcoding, without
> copying the frames into the system memory. For it to work, both the
> decoder and the encoder must support QSV acceleration and no filters must
> be used.”
> One missing thing seems to be documentation on the scope of
> -init_hw_device option applicability. This seems to be a global option, but in
> the example from the commit message encoder actually takes device from it,
> but decoder just ignores it and goes with default device. Why? This does not
> seem to be right.
> 
> Can someone, please, shine the light on how all these device selection
> options were supposed to work?

From my understanding, this has evolved somewhat like this:

Initially, it was only possible to use a D3D or VAAPI context that was created
internally by the MSDK, and the MSDK's mechanism for selecting a device is its set
of enum values (auto, sw, hw, auto_any, hw_any, hw2, hw3, hw4).

For that reason, those values were made the possible options for device selection
in case of the QSV hwaccel. Probably, the initial implementation of the QSV hwaccel
didn't do much more than connect the device context between encoder and decoder
(probably by joining the encoder session to the decoder session).

I think that the documentation text you quoted was added at around that time:

> Unlike most other values, this
> option does not enable accelerated decoding (that is used automatically
> whenever a qsv decoder is selected), but accelerated transcoding, without
> copying the frames into the system memory
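
As a rough illustration of that original path (a sketch with placeholder file names,
relying entirely on the device context that MSDK creates internally), such a pure
QSV transcode would look like:

ffmpeg -hwaccel qsv -c:v h264_qsv -i input.h264 -c:v hevc_qsv -y output.h265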

Later, there were requirements that couldn't be implemented by using the D3D
or VAAPI context that MSDK creates internally. At least these two:

- Interoperability with DXVA decoders on Windows
  (requires the D3D context to be created externally)
- VPP processing (IIRC it requires or required custom surface allocation, and that
  isn't possible with a hw context created by MSDK internally)

To manually create a D3D context on Windows, you need to know the right adapter
number of the GPU. But in earlier versions of the MSDK dispatcher code, there
was no direct correlation between the hw, hw2, hw3 and hw4 values and the
D3D9 adapter numbers, because the dispatcher was counting Intel devices only
(this was changed later, but it's still not exactly reliable).

Similarly for VAAPI: it wasn't reliably possible to deduce the DRI node path from
the MFX_IMPL enum value.

As such, there had to be a way to specify a D3D9 adapter id on Windows or a
DRI node on Linux. The hwaccel_device param already had its semantics, and it
wasn't possible to merge the two sets of values, because - guess what - sometimes
it was and even still is required to specify hw, hw2, etc. in addition to the D3D
adapter or DRI node (I'd need to look up the cases).

All that led to the introduction of the 'qsv_device' parameter, and it should explain
why it had to be (and still has to be) a separate parameter.
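
Tying it together (again only a sketch with placeholder paths): on Linux, pinning both
decoder and encoder to a specific render node then looks like

ffmpeg -hwaccel qsv -qsv_device /dev/dri/renderD129 -c:v h264_qsv -i input.h264 -c:v hevc_qsv -y output.h265

and, for the cases mentioned above, hwaccel_device can still carry an MFX_IMPL
value such as hw2 in addition when needed.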

That's all ugly and awful and I wouldn't want to defend it.

But an approach to make this any better should be well thought out and should not 
break anything.

Kind regards,
softworkz

PS: Corrections welcome!

