# [FFmpeg-devel] [PATCH] libavcodec/qsv.c: Re-design session control and internal allocation

Hendrik Leppkes h.leppkes at gmail.com
Mon Oct 26 12:56:31 CET 2015

On Mon, Oct 26, 2015 at 12:44 PM, wm4 <nfxjfg at googlemail.com> wrote:
> On Mon, 26 Oct 2015 11:22:38 +0100
> Gwenole Beauchesne <gb.devel at gmail.com> wrote:
>
>
>> >> /**
>> >>  * Hardware Accelerator identifier.
>> >>  *
>> >>  * @note
>> >>  * A hardware accelerator can be device-less. This means that only the
>> >>  * underlying hardware resource, e.g. a Linux dma_buf handle, is being
>> >>  * transported in the AVFrame. That hardware resource could be mapped
>> >>  * through standard OS-dependent calls, e.g. mmap() on Linux.
>> >>  */
>> >> enum AVHWAccelId {
>> >>     AV_HWACCEL_ID_NONE = -1,
>> >>     AV_HWACCEL_ID_VAAPI,
>> >>     AV_HWACCEL_ID_NB,           ///< Not part of ABI
>> >> };
>> >>
>> >> Name to be improved if people have better suggestions, as this really
>> >> is to be seen as HW resource, not necessarily attached to a particular
>> >> HW device. i.e. this could be a dma_buf handle from a V4L2 buffer or
>> >> VA surface.
>> >
>> > OK. (Minor nit: if ID_NONE is valid and means HW API without context,
>> > maybe it should be 0, not -1. Also, if it was meant this way, maybe
>> > these should still have their own ID for other purposes.)
>>
>> In my current model, ID_NONE is not meant to be valid because the
>> hwaccel side data shall only exist for hwaccel purposes. Besides,
>> having ID_NONE set to -1 is consistent with other liavu enums and
>> convenient to have ID_NB express directly the exact number of
>> hwaccels.
>
> OK, this makes sense to me.
>
>> >> I am reworking the patch series as I changed my mind again: current
>> >> map strategy was overly complex (and required to be). There were at
>> >> least 4 map flags: AV_FRAME_MAP_{READ,WRITE,READ_FIRST,USWC_MEMORY}. I
>> >> am now preferring a unique av_hwaccel_frame_get_pixels() defined as
>> >> follow:
>> >>
>> >> /**
>> >>  * Returns AVFrame pixels into linear memory
>> >>  *
>> >>  * This function takes a snapshot of the underlying HW surface and
>> >>  * exposes it to SW backed memory. This may involve a copy from GPU
>> >>  * memory to CPU memory.
>> >>  *
>> >>  * @note
>> >>  * There is no effort underway to commit the modified pixels back to
>> >>  * GPU memory when the \ref dst AVFrame is released.
>> >>  *
>> >>  * @param[in] src       the source frame to read
>> >>  * @param[inout] dst    the target frame to update, or create if NULL
>> >>  * @param[in] flags     an optional combination of AV_FRAME_FLAG_xxx flags
>> >>  * @return 0 on success, an AVERROR code on failure.
>> >>  */
>> >> int
>> >> av_hwaccel_frame_get_pixels(AVFrame *src, AVFrame **dst, unsigned flags);
>> >>
>> >> i.e. the cost of allocating and copying AVFrame metadata should be
>> >> less than the actual work needed behind the scene. So, it looks like a
>> >> better interim solution for starters.
>> >
>> > So this is for read-access only, right? If it can map data, there
>> > also needs to be an unmap function, and the user would have to know
>> > about when to use it.
>>
>> Well, put can be implementing by reversing src/dst in this function. :)
>> Actually, this can be av_hwaccel_frame_copy(), but the benefit of
>> having get_pixels() is to leave out the allocation business to lavu
>> and just having the user to bother about _unref().
>
> Also makes sense to me.
>
> What is a problem is that mapped frames and CPU frames (let's call pure
> CPU-allocated surfaces that) are not exactly the same thing. If the API
> user assumes the frame is a CPU frame, it might reference it for a long
> time, which would cause various problems. On the other hand, you don't
> want the user to force copying a frame if it's really a CPU frame.
>
> Maybe this is not really a problem. I'm just mentioning it as another
> detail.
>
>> >> For compatibility, that's also the idea behind another generic
>> >> AV_PIX_FMT_HWACCEL that would enforce data[i] to be clear of any
>> >> user-supplied pointers, and buf[i] shall be filled in by appropriate
>> >> accessors, or while creating the side-data, e.g.
>> >> av_vaapi_frame_create_side_data(). i.e. when lavc swallows up an
>> >> AVFrame with new-style hwaccel, this is where the AV_PIX_FMT_VAAPI
>> >> would be replaced with AV_PIX_FMT_HWACCEL. Replace "swallows up" with
>> >> e.g. av_vaapi_frame_convert_in_place() if you prefer. Otherwise, IMHO,
>> >> the old-style fields should live untouched, hence the need to keep the
>> >> hwaccel side-data around.
>> >
>> > Isn't the problem more about output?
>> >
>> > Again, there's the problem with the current hwaccel API selecting the
>> > hwaccel with get_format(), just using the hwaccel-specific pixfmt.
>>
>> I also envision a need for AVCodecContext.hwaccel_id field + possibly
>> .get_hwaccel(). Just so that to depart from that pixfmt tie.
>
> There are some of us who would like this. Of course it makes the API
> change larger. Also, I do find it useful to have pixfmt distinguish
> between underlying surface types (i.e. the hwaccel API). For example,
> if we add support for hw filters to libavfilter, how would you prevent
> that a vdpau filter takes vaapi surfaces as input?
>
> So I'm not sure if a single AV_PIX_FMT_HWACCEL is the way to go, even
> if we make access to hwaccel AVFrames somewhat more uniform.
>
>> > Also, AVFrame.buf[] should cover the memory referenced by
>> > AVFrame.data[]. It's merely a refcount helper for AVFrame.data[], and
>> > should not do additional things.
>> >
>> > I think using AVFrame side data for this would be a bit awkward.
>> > Possibly it _could_ be used to store things like VADisplay if we don't
>> > find a better way, but I think having a AVHWAccelFrame would be better.
>>
>> Side data is quite simple to use, and ref-counted easily. I didn't
>> want to touch to AVFrame fields. Though, it's of course possible to
>> extend it with either public or external fields.
>
> Side data is really just for things by the "side", not information that
> is critical and central.
>
> Maybe we agree that there's no technical issue here, and it's only a
> matter of API design and "taste".
>
>> >> >
>> >> >> PS: other benefit of the AVHWAccelFrame side-data is that I can stuff
>> >> >> crop information into there. Since this is only useful to hwaccel, no
>> >> >> need to populate AVFrame with additional fields IMHO.
>> >> >
>> >> > IMHO, crop information should be generally available, even for software
>> >> > surfaces. What we currently do are terrible hacks: align the X/Y
>> >> > coordinates to chroma boundaries and adjust the pointer (meaning
>> >> > everyone has to do with possibly unaligned pointers, and non-mod-2
>> >> > crops don't work correctly), and this also means you can have a
>> >> > non-mod-2 width/height, which doesn't really make sense for chroma.
>> >>
>> >> I don't really see why this would be needed for SW. AVFrame.buf[] will
>> >> hold the buffers as in "the original allocation". Then AVFrame.data[]
>> >> shall be filled in to fit the actual cropped/decoded region. Isn't it?
>> >
>> > Yes, AVFrame.buf[] does this, but you still don't know e.g. the
>> > original width, or even the pointer to a plane's original (0, 0) pixel
>> > if AVFrame.buf[0] covers all planes.
>> >
>> > I think doing cropping as metadata would also simplify code a lot. For
>> > example, nobody has to worry about non-mod-2 yuv420 anymore, and how it
>> > should be handled. It's less tricky, more correct, more efficient.
>>
>> OK. A crop side-data is desired then. I somehow was convinced that it
>> wouldn't matter for SW. Though, it would actually be really need that
>
> At least this is my opinion. I would like to have such cropping side
> data, instead of half-broken ad-hoc cropping in the decoder and things
> like coded_width.
>
> I don't know what most others think about this. I suspect most would
> find such a change too intrusive. At least for hwaccel it's mandatory
> though. (What we currently do just barely works, and I need hacks in my
> own code to reconstruct the real surface size.)

99% of all cropping use-cases are extremely simple (only bottom/right)
and don't require any secret magic in any code.
I don't mind adding extra cropping metadata, but if you just don't
care about top/left cropping, simply adjusting width/height is as
trivial as it gets.

Adding mandatory new metadata that every user app has to support to
avoid getting 1920x1088 in the future seems a bit ill-advised.

>
>> the user doesn't have to care about anything and just use .data[]
>> appropriately. So, probably have internal_data[] fields for that SIMD
>> alignment needs?
>
> This reminds me of AVFrame.base[], which was removed 2 years ago.
>
> I'm fairly sure a cropping rect would be cleaner.
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel