[FFmpeg-devel] [PATCH, v2 2/2] lavc/vaapi_decode: recreate hw_frames_ctx for vp9 without destroy va_context

Tue Sep 10 03:01:34 EEST 2019

On 09/09/2019 16:40, Fu, Linjie wrote:
>> -----Original Message-----
>> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces at ffmpeg.org] On Behalf
>> Of Fu, Linjie
>> Sent: Friday, August 30, 2019 16:05
>> To: FFmpeg development discussions and patches <ffmpeg-
>> devel at ffmpeg.org>
>> Subject: Re: [FFmpeg-devel] [PATCH, v2 2/2] lavc/vaapi_decode: recreate
>> hw_frames_ctx for vp9 without destroy va_context
>>
>>> -----Original Message-----
>>> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces at ffmpeg.org] On
>> Behalf
>>> Of Fu, Linjie
>>> Sent: Friday, August 9, 2019 19:47
>>> To: FFmpeg development discussions and patches <ffmpeg-
>>> devel at ffmpeg.org>
>>> Subject: Re: [FFmpeg-devel] [PATCH, v2 2/2] lavc/vaapi_decode: recreate
>>> hw_frames_ctx for vp9 without destroy va_context
>>>
>>>> -----Original Message-----
>>>> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces at ffmpeg.org] On
>>> Behalf
>>>> Of Hendrik Leppkes
>>>> Sent: Friday, August 9, 2019 17:40
>>>> To: FFmpeg development discussions and patches <ffmpeg-
>>>> devel at ffmpeg.org>
>>>> Subject: Re: [FFmpeg-devel] [PATCH, v2 2/2] lavc/vaapi_decode:
>> recreate
>>>> hw_frames_ctx for vp9 without destroy va_context
>>>>
>>>> On Tue, Aug 6, 2019 at 10:55 AM Fu, Linjie <linjie.fu at intel.com> wrote:
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces at ffmpeg.org] On
>>>> Behalf
>>>>>> Of Hendrik Leppkes
>>>>>> Sent: Tuesday, August 6, 2019 16:27
>>>>>> To: FFmpeg development discussions and patches <ffmpeg-
>>>>>> devel at ffmpeg.org>
>>>>>> Subject: Re: [FFmpeg-devel] [PATCH, v2 2/2] lavc/vaapi_decode:
>>>> recreate
>>>>>> hw_frames_ctx for vp9 without destroy va_context
>>>>>>
>>>>>> On Thu, Jul 11, 2019 at 10:20 AM Linjie Fu <linjie.fu at intel.com> wrote:
>>>>>>>
>>>>>>> VP9 allows resolution changes per frame. Currently in VAAPI,
>>> resolution
>>>>>>> changes leads to va context destroy and reinit. This will cause
>>>>>>> reference frame surface lost and produce garbage.
>>>>>>>
>>>>>>> Though refs surface id could be passed to media driver and found in
>>>>>>> RTtbl, vp9RefList[] in hal layer has already been destroyed. Thus the
>>>>>>> new created VaContext could only got an empty RefList.
>>>>>>>
>>>>>>> As libva allows re-create surface separately without changing the
>>>>>>> context, this issue could be handled by only recreating
>>> hw_frames_ctx.
>>>>>>>
>>>>>>> Set hwaccel_priv_data_keeping flag for vp9 to only recreating
>>>>>>> hw_frame_ctx when dynamic resolution changing happens.
>>>>>>>
>>>>>>> Could be verified by:
>>>>>>> ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 -i
>>>>>>>   ./resolutions.ivf -pix_fmt p010le -f rawvideo -vframes 20 -y
>> vaapi.yuv
>>>>>>>
>>>>>>> Signed-off-by: Linjie Fu <linjie.fu at intel.com>
>>>>>>> ---
>>>>>>>  libavcodec/decode.c        | 10 +++++-----
>>>>>>>  libavcodec/internal.h      |  1 +
>>>>>>>  libavcodec/pthread_frame.c |  2 ++
>>>>>>>  libavcodec/vaapi_decode.c  | 40 ++++++++++++++++++++++-------
>> --
>>> --
>>>> -----
>>>>>> --
>>>>>>>  libavcodec/vaapi_vp9.c     |  4 ++++
>>>>>>>  5 files changed, 34 insertions(+), 23 deletions(-)
>>>>>>>
>>>>>>> diff --git a/libavcodec/decode.c b/libavcodec/decode.c
>>>>>>> index 0863b82..7b15fa5 100644
>>>>>>> --- a/libavcodec/decode.c
>>>>>>> +++ b/libavcodec/decode.c
>>>>>>> @@ -1254,7 +1254,6 @@ int
>>>>>> ff_decode_get_hw_frames_ctx(AVCodecContext *avctx,
>>>>>>>
>>>>>>>      frames_ctx = (AVHWFramesContext*)avctx->hw_frames_ctx-
>>>> data;
>>>>>>>
>>>>>>> -
>>>>>>>      if (frames_ctx->initial_pool_size) {
>>>>>>>          // We guarantee 4 base work surfaces. The function above
>>>> guarantees
>>>>>> 1
>>>>>>>          // (the absolute minimum), so add the missing count.
>>>>>>
>>>>>> Unrelated whitespace change
>>>>>
>>>>> There is  a redundant whitespace here, so I removed it within this patch.
>>>>>
>>>>>>> @@ -1333,7 +1332,7 @@ static int hwaccel_init(AVCodecContext
>>>> *avctx,
>>>>>>>          return AVERROR_PATCHWELCOME;
>>>>>>>      }
>>>>>>>
>>>>>>> -    if (hwaccel->priv_data_size) {
>>>>>>> +    if (hwaccel->priv_data_size && !avctx->internal-
>>>>> hwaccel_priv_data) {
>>>>>>>          avctx->internal->hwaccel_priv_data =
>>>>>>>              av_mallocz(hwaccel->priv_data_size);
>>>>>>>          if (!avctx->internal->hwaccel_priv_data)
>>>>>>> @@ -1396,9 +1395,10 @@ int ff_get_format(AVCodecContext
>> *avctx,
>>>>>> const enum AVPixelFormat *fmt)
>>>>>>>      memcpy(choices, fmt, (n + 1) * sizeof(*choices));
>>>>>>>
>>>>>>>      for (;;) {
>>>>>>> -        // Remove the previous hwaccel, if there was one.
>>>>>>> -        hwaccel_uninit(avctx);
>>>>>>> -
>>>>>>> +        // Remove the previous hwaccel, if there was one,
>>>>>>> +        // and no need for keeping.
>>>>>>> +        if (!avctx->internal->hwaccel_priv_data_keeping)
>>>>>>> +            hwaccel_uninit(avctx);
>>>>>>>          user_choice = avctx->get_format(avctx, choices);
>>>>>>>          if (user_choice == AV_PIX_FMT_NONE) {
>>>>>>>              // Explicitly chose nothing, give up.
>>>>>>
>>>>>> There could be a dozen special cases how stuff can go wrong here.
>>> What
>>>>>> if get_format actually returns a different format then the one
>>>>>> currently in use? Or a software format?
>>>>>> Just removing this alone is not safe.
>>>>>
>>>>> Didn't quite get your point.
>>>>> IMHO,  avctx->internal->hwaccel_priv_data_keeping won't be set in
>>> other
>>>> cases
>>>>> other than vaapi_vp9, so this patch won't break the default pipeline,
>> and
>>>>> hwaccel_uninit(avctx) will always be called.
>>>>>
>>>>
>>>> The point is that you cannot rely on get_format to return the same
>>>> format that it previously did. It could return a software format, or
>>>> in some cases possibly even a different hardware format. And you just
>>>> don't handle that.
>>>
>>> Got it. Thanks for the explanation, it should be reconsidered in case it
>>> happens.
>>>
>>>> The entire approach here smells a bit of hack. Lets try to think this
>>>> through and do it properly. One idea that comes to mind is a new
>>>> hwaccel callback that checks if a in-place re-init is possible without
>>>> destroying everything. This could be used for a multitude of different
>>>> situations, and not just this one special case.
>>>
>>> Sounds great, and just FYI, this similar issue is reproduced with
>> nvdec/dxva2
>>> as well. Clips and some details are provided on trac #8068 in case you and
>>> other developers may be interested in or need to verify your solution.
>>> http://trac.ffmpeg.org/ticket/8068
>>
>> Any step-further progress for the hwaccel callback methods or something I
>> can
>> help to fix this gap?
>>
> 
> Ping? 
> A general solution works for multitude situation is great to me, and how about having
> one solution specific for vp9 which introduces no regression as the first step, 
> since there are lots of cases(1400 +) failed/blocked and could be fixed by this patch.
> 
> This blocked quite a lot, please comment what I can do to get this step further.

I still don't understand how the error here can be in FFmpeg - it looks more like a driver problem to me.

The sequence with VAAPI using HW_CONFIG_METHOD_HW_DEVICE_CTX on your lena_resolution_change_on_inter_frame.ivf should be as follows:

1.  The header of frame 0 is read, it's a key frame with resolution is 352x288.
2.  The user-supplied get_format() is called, they pick AV_PIX_FMT_VAAPI and supply a VAAPI device.
3.  Output/reference surfaces are created in a new frames context at 352x288.
4.  A decoder context is created for VP9 at 352x288, rendering to the surfaces created in the previous step.
5.  Frame 0 is decoded and placed in all reference slots.
6.  Frames 1-49 are decoded normally, they overwrite slots 0 and 1 only.
7.  The header of frame 50 is read, it's an inter frame but with a new resolution of 240x196.
8.  The old decoder context is discarded, since it has the wrong resolution and is bound to the wrong render targets.
9.  The old frames context is unreferenced, but references remain to its frames in slots 2-7 so the actual frames themselves stay around.
10. The user-supplied get_format() is called, they pick AV_PIX_FMT_VAAPI again.
11. Output/reference surfaces are created in a new frames context at 240x196.
12. A new decoder context is created for VP9 at 240x196, rendering to the new surfaces.
13. Frame 50 is decoded with reference to frame slots 0, 1 and 2 (those are all in the old frames context and have the old resolution); the result is placed in slots 0 and 1.
14. Frames 51-100 are all decoded with reference to slots 0, 1 and 2, overwriting slots 0 and/or 1 only (in every case slot 2 still contains the original key frame).

(Using HW_CONFIG_METHOD_HW_FRAMES_CTX the main difference is that steps 3 and 11 would be removed, replaced by user action in the get_format() callback in the steps immediately preceding them.)

The iHD driver indeed returns "internal error" immediately on step 13.  However, looking at traces made with libva everything about that render call looks correct - the decoder context is the new one which matches the resolution and render target, and the right surfaces are provided as reference frames (in the old frames context, but definitely haven't been destroyed).

So, can you explain more about what is going wrong?

Thanks,

- Mark

[I do see one dubious issue with the sequence: the call to get_format() in step 10 still offers the software format.  However, if the user actually picked that it would not work - the reference frames it needs are all stored in hardware surfaces which wouldn't be accessible to the software decoder.  It definitely does still have to call get_format() because the user might want to supply an hw_frames_ctx for direct rendering, but the formats offered should probably be restricted.  That does end up being fatal if the new resolution is outside the range supported by the hardware decoder, but I don't see any way to fix that at all.]