[FFmpeg-devel] [PATCH] lavc/qsvdec: allow qsv decoders to use initialized device

Rogozhkin, Dmitry V dmitry.v.rogozhkin at intel.com
Wed Sep 2 17:36:30 EEST 2020


On Wed, 2020-09-02 at 14:21 +0000, Rogozhkin, Dmitry V wrote:
> On Wed, 2020-09-02 at 08:41 +0000, Soft Works wrote:
> > > -----Original Message-----
> > > From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> > > Rogozhkin, Dmitry V
> > > Sent: Wednesday, September 2, 2020 8:45 AM
> > > To: ffmpeg-devel at ffmpeg.org
> > > Subject: Re: [FFmpeg-devel] [PATCH] lavc/qsvdec: allow qsv
> > > decoders
> > > to use
> > > initialized device
> > > 
> > > On Wed, 2020-09-02 at 04:32 +0000, Soft Works wrote:
> > > > > -----Original Message-----
> > > > > From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On
> > > > > Behalf
> > > > > Of
> > > > > Soft Works
> > > > > Sent: Wednesday, September 2, 2020 6:13 AM
> > > > > To: FFmpeg development discussions and patches <ffmpeg-
> > > > > devel at ffmpeg.org>
> > > > > Subject: Re: [FFmpeg-devel] [PATCH] lavc/qsvdec: allow qsv
> > > > > decoders
> > > > > to use initialized device
> > > > > 
> > > > > > 
> > > > > > > -----Original Message-----
> > > > > > > From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On
> > > > > > > Behalf Of Dmitry Rogozhkin
> > > > > > > Sent: Wednesday, September 2, 2020 4:44 AM
> > > > > > > To: ffmpeg-devel at ffmpeg.org
> > > > > > > Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin at intel.com>
> > > > > > > Subject: [FFmpeg-devel] [PATCH] lavc/qsvdec: allow qsv
> > > > > > > decoders
> > > > > > > to use initialized device
> > > > > > > 
> > > > > > > qsv decoders did not allow using devices explicitly
> > > > > > > initialized on the command line and were actually using
> > > > > > > the default device. This starts to cause confusion with
> > > > > > > Intel discrete GPUs, since in this case the decoder might
> > > > > > > run on the default integrated GPU device
> > > > > > > (/dev/dri/renderD128) and the encoder on the device
> > > > > > > specified on the command line (/dev/dri/renderD129).
> > > > > > > 
> > > > > > > Example:
> > > > > > > ffmpeg -init_hw_device vaapi=va:/dev/dri/renderD129
> > > > > > > -init_hw_device qsv=hw@va \
> > > > > > >   -c:v h264_qsv -i input.h264 -c:v hevc_qsv -y
> > > > > > > output.h264
> > > > > 
> > > > > I apologize, I picked the wrong thing. The qsv_device
> > > > > parameter
> > > > > is
> > > > > what allows setting the device for a QSV decoder:
> > > > > 
> > > > > ffmpeg -qsv_device /dev/dri/renderD128 -c:v:0 h264_qsv
> > > > > -hwaccel:v:0
> > > > > qsv -i INPUT ....
> > > > > 
> > > > > Kind regards,
> > > > > softworkz
> > > > 
> > > > Here's the commit where the parameter had been added:
> > > > 
> > > 
> > > > https://github.com/FFmpeg/FFmpeg/commit/1a79b8f8d2b5d26c60c237d6e585873238e46914
> > > 
> > > I am aware of this option.
> > > 
> > > > -qsv_device /dev/dri/renderD129
> > > 
> > > By itself this doesn’t work. Both decoder and encoder will run
> > > on /dev/dri/renderD128 instead.
> > > 
> > > > -hwaccel qsv -qsv_device /dev/dri/renderD129
> > > 
> > > Adding -hwaccel helps. This works. However, to me this is
> > > non-intuitive: why should qsv_device be used instead of
> > > hwaccel_device, while ffmpeg help gives a different hint:
> > >     -hwaccel hwaccel name  use HW accelerated decoding
> > >     -hwaccel_device devicename  select a device for HW
> > > acceleration
> > > From this perspective Haihao’s patch which is currently on the
> > > mailing list makes sense to me - it just simplifies things.
> > 
> > In case of QSV, the meaning of hwaccel_device is already defined:
> > 
> 
> Adding new values does not break functionality. I believe the
> definition of hwaccel_device on Linux for qsv can be extended w/ drm
> device specifiers for everyone's convenience. As I noted in the
> other thread, msdk on Linux does not distinguish hw, hw2, hw3 -
> these are all treated the same, and the library requires external
> device setting via SetHandle.

Small suggestion: let's move the discussion around the -qsv_device and
-hwaccel_device options entirely to the "ffmpeg_qsv: use
-hwaccel_device to specify a device for VAAPI backend" thread and
return focus to the original patch, which is not about -qsv_device but
about this command line:

ffmpeg -init_hw_device vaapi=va:/dev/dri/renderD129 -init_hw_device \
  qsv=hw@va -c:v h264_qsv -i input.h264 -c:v hevc_qsv -y output.h264

From what I see it uses valid device specifications, and still, with
the original ffmpeg implementation, it results in:
- the decoder being run on /dev/dri/renderD128
- the encoder being run on /dev/dri/renderD129

In general, per my taste, I would try to make the following device
specification work with QSV on Linux across all command lines:

-init_hw_device vaapi=va:/dev/dri/renderD129 -init_hw_device qsv=hw@va \
-filter_hw_device hw
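As an illustration, a full transcode pipeline using this single device
specification could look like the sketch below (the filenames and the
scale_qsv filter stage are hypothetical, added only to show that the
filter also binds to the same device):

```shell
# Sketch: decoder, filter and encoder all bound to /dev/dri/renderD129.
# Filenames and the scale_qsv stage are illustrative only.
ffmpeg -init_hw_device vaapi=va:/dev/dri/renderD129 \
       -init_hw_device qsv=hw@va -filter_hw_device hw \
       -c:v h264_qsv -i input.h264 \
       -vf 'scale_qsv=w=1280:h=720' \
       -c:v hevc_qsv -y output.h265
```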

My primary goal is to make it workable for all pipelines. Hence the
current patch.
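For comparison, the workaround discussed above does work today, but it
is non-obvious (filenames illustrative):

```shell
# Current workaround: -qsv_device only takes effect together with
# -hwaccel qsv; both decoder and encoder then use renderD129.
ffmpeg -hwaccel qsv -qsv_device /dev/dri/renderD129 \
       -c:v h264_qsv -i input.h264 -c:v hevc_qsv -y output.h264
```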

> 
> > device selects a value in ‘MFX_IMPL_*’. Allowed values are:
> > auto sw hw auto_any hw_any hw2 hw3 hw4
> > 
> > From my understanding, that's the reason why the qsv_device
> > parameter
> > was introduced.
> > 
> > > 
> > > Unfortunately ffmpeg device selection options are quite
> > > confusing. 
> > 
> > That's true without doubt. Sadly, there's not much consistency
> > among
> > all hw accelerations in general. 
> > 
> > > This will work. But returning back to the transcoding case:
> > > there are two significantly different command lines which should
> > > be used for transcoding and for encoding to make things run on
> > > /dev/dri/renderD129.
> > > This is inconvenient to handle… And additionally there is
> > > -filter_hw_device, which also contributes to the complication.
> > > Also there are
> > > special marks in documentation for the qsv “Unlike most other
> > > values, this
> > > option does not enable accelerated decoding (that is used
> > > automatically
> > > whenever a qsv decoder is selected), but accelerated transcoding,
> > > without
> > > copying the frames into the system memory. For it to work, both
> > > the
> > > decoder and the encoder must support QSV acceleration and no
> > > filters must
> > > be used.”
> > > One missing thing seems to be documentation on the scope of
> > > -init_hw_device applicability. This seems to be a global option,
> > > but in the example from the commit message the encoder actually
> > > takes the device from it, while the decoder just ignores it and
> > > goes with the default device. Why? This does not seem right.
> > > 
> > > Can someone, please, shine the light on how all these device
> > > selection
> > > options were supposed to work?
> > 
> > From my understanding, this has evolved somewhat like this:
> > 
> > Initially, it was only possible to use a D3D or VAAPI context that
> > was internally created by the MSDK. And the MSDK's mechanism for
> > selecting a device was its enum values
> > (auto, sw, hw, auto_any, hw_any, hw2, hw3, hw4).
> 
> Not fully right. That's correct for Windows. For Linux:
> - the msdk library never created a VAAPI context internally; it
> always has to be specified externally by the application
> - Basically because of that, msdk on Linux never distinguished
> between hw, hw2, hw3, etc. - it was only concerned about sw vs. hw.
> 
> > 
> > For that reason, those values were made the possible options for
> > device selection in case of the QSV hwaccel. Probably, the initial
> > implementation of the QSV hwaccel didn't do much more than connect
> > the device context between encoder and decoder (probably joining
> > the encoder session to the decoder session).
> > 
> > I think that the documentation text you quoted was added at around
> > that time:
> > 
> > > Unlike most other values, this
> > > option does not enable accelerated decoding (that is used
> > > automatically
> > > whenever a qsv decoder is selected), but accelerated transcoding,
> > > without
> > > copying the frames into the system memory
> > 
> > Later, there were requirements that couldn't be implemented by
> > using the D3D or VAAPI context that MSDK creates internally. At
> > least these two:
> > 
> > - Interoperability with DXVA decoders on Windows
> >   (requires the D3D context to be created externally)
> > - VPP processing (IIRC it requires or required custom surface
> >   allocation, and that isn't possible with a hw context created by
> >   MSDK internally)
> > 
> > To manually create a D3D context on Windows, you need to know the
> > right adapter number of the GPU. But in earlier times of the MSDK
> > dispatcher code, there was no direct correlation between the hw,
> > hw2, hw3 and hw4 values and the D3D9 adapter numbers, because the
> > dispatcher was counting Intel devices only (this was changed
> > later, but it's still not exactly reliable).
> > 
> > Similar for VAAPI: it wasn't reliably possible to deduce the DRI
> > path from the MFX_IMPL enum value.
> > 
> > As such, there had to be a way of specifying a D3D9 adapter id on
> > Windows or a DRI node on Linux. The hwaccel_device param already
> > had its semantics, and it wasn't possible to merge the possible
> > values, because - guess what - it was, and sometimes still is,
> > required to specify hw, hw2, etc. in addition to the D3D adapter
> > or DRI node (I'd need to look up the cases).
> 
> I believe this still can be handled via -hwaccel_device to
> everyone's convenience. Basically, what we can do is extend the
> allowed set of values w/ a drm device specification on Linux, i.e.:
>    device selects a value in ‘MFX_IMPL_*’. Allowed values are:
> auto sw hw auto_any hw_any hw2 hw3 hw4 >>>or a drm device path. If
> a drm device path is specified, the 'hw' implementation is used
> with the specified drm device.<<<
> 
> This is an extension to the current specification and I don't think
> it breaks anything.
> 
> > 
> > All that led to the introduction of the 'qsv_device' parameter as
> > a separate parameter, and it should explain why it had to be a
> > separate parameter (and still has to be).
> 
> I still don't see the full picture. What I am looking for in the
> first place is how maintainers and architects envision hwaccel to
> work in general. Basically, there are a few ways to specify the
> device (-hwaccel, -init_hw_device, -filter_hw_device); the
> questions are:
> 1. What's the scope of applicability of each option, w/ an
> explanation of why each option is actually needed (as one example:
> why is -filter_hw_device needed and why can't -init_hw_device be
> used instead?)
> 2. Since there are a few methods by which a component can get a
> device: what's the priority order? (for example, if the device can
> be deduced from incoming frames, is there a way to override it w/
> some command line option?)
> 
> > 
> > That's all ugly and awful and I wouldn't want to defend it.
> > 
> > But an approach to make this any better should be well thought out
> > and should not 
> > break anything.
> > 
> > Kind regards,
> > softworkz
> > 
> > PS: Corrections welcome!
> > _______________________________________________
> > ffmpeg-devel mailing list
> > ffmpeg-devel at ffmpeg.org
> > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> > 
> > To unsubscribe, visit link above, or email
> > ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
> 
