[FFmpeg-devel] [PATCH] AVHWAccel infrastructure v2 (take 6) [ping^3]

Gwenole Beauchesne gbeauchesne
Wed Nov 10 16:18:45 CET 2010


Hi,

On Wed, 10 Nov 2010, Michael Niedermayer wrote:

> On Wed, Nov 10, 2010 at 11:01:11AM +0100, Gwenole Beauchesne wrote:
>> Hi,
>>
>> On Tue, 9 Nov 2010, Michael Niedermayer wrote:
>>
>>> On Tue, Nov 02, 2010 at 11:05:35AM +0100, Gwenole Beauchesne wrote:
>>>>
>>>> On Sat, 30 Oct 2010, Michael Niedermayer wrote:
>>>>
>>>>>> On Mon, 20 Sep 2010, Gwenole Beauchesne wrote:
>>>>>>
>>>>>>>>> So, HW accelerator selection works as is:
>>>>>>>>> - Set AVCodecContext.hwaccel_id to the desired accelerator
>>>>>>>>> - Set AVCodecContext.hwaccel_flags to HWACCEL_CAP_GET_PIXELS if the user
>>>>>>>>> wants AVFrame.data[0-2] populated with YV12 or NV12 pixels
>>>>>>>>>
>>>>>>>>> This means we can handle multiple accelerators automatically at once.
>>>>>>>>> Anyway, this lived in an ideal world without practical use since the
>>>>>>>>> user-application would still have to handle the HW accelerator
>>>>>>>>> specifically.
>>>>>>>>
>>>>>>>> [...]
>>>>>>>>> +    /**
>>>>>>>>> +     * Called in codec initialization.
>>>>>>>>> +     *
>>>>>>>>> +     * This is used to initialize the AVCodecContext.hwaccel_context
>>>>>>>>> +     * hardware accelerator context.
>>>>>>>>> +     *
>>>>>>>>> +     * @param avctx the codec context
>>>>>>>>> +     * @return zero if successful, a negative value otherwise
>>>>>>>>> +     */
>>>>>>>>> +    int (*init)(AVCodecContext *avctx);
>>>>>>>>> +
>>>>>>>>> +    /**
>>>>>>>>> +     * Called in codec finalization.
>>>>>>>>> +     *
>>>>>>>>> +     * This is used to clean-up any data from the hardware accelerator
>>>>>>>>> +     * context. Should this function be implemented, it shall reset
>>>>>>>>> +     * AVCodecContext.hwaccel_context to NULL.
>>>>>>>>> +     *
>>>>>>>>> +     * @param avctx the codec context
>>>>>>>>> +     * @return zero if successful, a negative value otherwise
>>>>>>>>> +     */
>>>>>>>>> +    int (*close)(AVCodecContext *avctx);
>>>>>>>>> +
>>>>>>>>> +    /**
>>>>>>>>> +     * Called at the beginning of each frame to allocate a HW surface.
>>>>>>>>> +     *
>>>>>>>>> +     * The returned surface will be stored in Picture.data[3].
>>>>>>>>> +     *
>>>>>>>>> +     * @param avctx the codec context
>>>>>>>>> +     * @param surface the pointer to the HW accelerator surface
>>>>>>>>> +     * @return zero if successful, a negative value otherwise
>>>>>>>>> +     */
>>>>>>>>> +    int (*alloc_surface)(AVCodecContext *avctx, uint8_t **surface);
>>>>>>>>> +
>>>>>>>>> +    /**
>>>>>>>>> +     * Called to free the HW surface that was allocated with
>>>>>>>>> alloc_surface().
>>>>>>>>> +     *
>>>>>>>>> +     * @param avctx the codec context
>>>>>>>>> +     * @param surface the HW accelerator surface
>>>>>>>>> +     * @return zero if successful, a negative value otherwise
>>>>>>>>> +     */
>>>>>>>>> +    void (*free_surface)(AVCodecContext *avctx, uint8_t *surface);
>>>>>>>>>  } AVHWAccel;
>>>>>>>>
>>>>>>>> iam not in favor of these. What are these good for?
>>>>>>>> Why are they needed?
>>>>>>>
>>>>>>> This is used to allocate the HW surfaces (VdpSurface, VASurface, etc.)
>>>>>>> from within FFmpeg. i.e. the AVFrame.data[3] part, hence the uint8_t *
>>>>>>> instead of the void *.
>>>>>>
>>>>>> Those functions are also used to help user implement
>>>>>> AVCodecContext.get_buffer() and release_buffer() if he wants to do so
>>>>>> manually.
>>>>>>
>>>>>> e.g. by default, AVCodecContext.get_buffer() will point to
>>>>>> avcodec_default_get_buffer(). If a user overrides that function, he will
>>>>>> lose automatic data[3] handling. So, two possible cases:
>>>>>> 1) He really knows how to do so, so he allocates the HW surface himself
>>>>>> 2) He can delegate that to FFmpeg that will know how to do that just fine
>>>>>> too. e.g. calling the right vaCreateSurfaces(), VdpVideoSurfaceCreate(),
>>>>>> etc.
>>>>>>
>>>>>>>>> +/** Identifies the hardware accelerator */
>>>>>>>>> +enum HWAccelID {
>>>>>>>>> +    HWACCEL_ID_NONE,
>>>>>>>>> +    HWACCEL_ID_VAAPI,           ///< Video Acceleration (VA) API
>>>>>>>>> +    HWACCEL_ID_NB
>>>>>>>>> +};
>>>>>>>>
>>>>>>>> this needs a any or auto for autodetection (which should be 0)
>>>>>>>
>>>>>>> I finally forgot about autodetection because it's too complex to handle
>>>>>>> at the FFmpeg level and delegated that to the player level. In
>>>>>>> particular, HWAccel context may depend on upper level info, like an X
>>>>>>> display...
>>>>>>>
>>>>>>>>> +
>>>>>>>>> +/** Identifies the hardware accelerator entry-point */
>>>>>>>>> +enum HWAccelEntrypoint {
>>>>>>>>> +    HWACCEL_ENTRYPOINT_VLD = 1, ///< Variable-length decoding
>>>>>>>>> +    HWACCEL_ENTRYPOINT_IDCT,    ///< Inverse discrete cosine transform
>>>>>>>>> +    HWACCEL_ENTRYPOINT_MOCO,    ///< Motion compensation
>>>>>>>>> +    HWACCEL_ENTRYPOINT_NB
>>>>>>>>> +};
>>>>>>>>
>>>>>>>> as said previously this is ambigous
>>>>>>>>
>>>>>>>> HWACCEL_ENTRYPOINT_VLD could mean the hw does just VLD and leaves IDCT/MC
>>>>>>>> to software
>>>>>>>
>>>>>>> Actually, I didn't find the exact meaning. But from a pipeline point of
>>>>>>> view, this would mean XXX and above. i.e. VLD and above, motion
>>>>>>> compensation and above. Flagging those would lead to more player work
>>>>>>> (well, just OR'ing more flags).
>>>>>>>
>>>>>>> I tried to express that in the comment.
>>>>>
>>>>>>  Makefile    |    2 -
>>>>>>  avcodec.h   |   87 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
>>>>>>  hwaccel.h   |   55 +++++++++++++++++++++++++++++++++++++
>>>>>>  internal.h  |   14 +++++++++
>>>>>>  mpegvideo.c |    4 ++
>>>>>>  options.c   |    2 +
>>>>>>  utils.c     |   52 +++++++++++++++++++++++++++++++++++
>>>>>>  7 files changed, 213 insertions(+), 3 deletions(-)
>>>>>> e890e0f1046043c9a28d3a64d9380146b0316139  ffmpeg.hwaccel.v2.8.patch
>>>>>> commit 8277742b00c2dbe41fd3af7f25c1c903f4477b43
>>>>>> Author: Gwenole Beauchesne <gbeauchesne at splitted-desktop.com>
>>>>>> Date:   Fri Oct 29 06:16:55 2010 +0200
>>>>>>
>>>>>>     Update AVHWAccel infrastructure to v2.
>>>>>>
>>>>>> diff --git a/libavcodec/Makefile b/libavcodec/Makefile
>>>>>> index 385ae02..1b64768 100644
>>>>>> --- a/libavcodec/Makefile
>>>>>> +++ b/libavcodec/Makefile
>>>>>> @@ -3,7 +3,7 @@ include $(SUBDIR)../config.mak
>>>>>>  NAME = avcodec
>>>>>>  FFLIBS = avutil avcore
>>>>>>
>>>>>> -HEADERS = avcodec.h avfft.h dxva2.h opt.h vaapi.h vdpau.h xvmc.h
>>>>>> +HEADERS = avcodec.h avfft.h dxva2.h hwaccel.h opt.h vaapi.h vdpau.h xvmc.h
>>>>>>
>>>>>>  OBJS = allcodecs.o                                                      \
>>>>>>         audioconvert.o                                                   \
>>>>>> diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
>>>>>> index 705259e..137f485 100644
>>>>>> --- a/libavcodec/avcodec.h
>>>>>> +++ b/libavcodec/avcodec.h
>>>>>> @@ -29,6 +29,7 @@
>>>>>>  #include <errno.h>
>>>>>>  #include "libavutil/avutil.h"
>>>>>>  #include "libavutil/cpu.h"
>>>>>> +#include "libavcodec/hwaccel.h"
>>>>>>
>>>>>>  #define LIBAVCODEC_VERSION_MAJOR 52
>>>>>>  #define LIBAVCODEC_VERSION_MINOR 93
>>>>>> @@ -2607,7 +2608,7 @@ typedef struct AVCodecContext {
>>>>>>       * provided by the user. In that case, this holds display-dependent
>>>>>>       * data FFmpeg cannot instantiate itself. Please refer to the
>>>>>>       * FFmpeg HW accelerator documentation to know how to fill this
>>>>>> -     * is. e.g. for VA API, this is a struct vaapi_context.
>>>>>> +     * is. e.g. for VA API, this is a VADisplay.
>>>>>>       * - encoding: unused
>>>>>>       * - decoding: Set by user
>>>>>>       */
>>>>>
>>>>>> @@ -2753,6 +2754,27 @@ typedef struct AVCodecContext {
>>>>>>       * - decoding: unused
>>>>>>       */
>>>>>>      int slices;
>>>>>> +
>>>>>> +    /**
>>>>>> +     * User-requested hardware accelerator. See HWACCEL_ID_xxx.
>>>>>> +     * - encoding: unused
>>>>>> +     * - decoding: Set by user
>>>>>> +     */
>>>>>> +    enum HWAccelID hwaccel_id;
>>>>>
>>>>> could you remind me why this is needed compared to the apparently more flexible
>>>>> system with pixel formats?
>>>>
>>>> Before hwaccel v1, we had: PIX_FMT_<hwaccel>_<codec>. One <hwaccel> could
>>>> easily have ~5 codecs. If you have 2 <haccel>, you get 10 pixel formats.
>>>> Besides, you can have <codec> supported at different entrypoint. e.g.
>>>> VLD, MoComp, IDCT. Thus, we could have combinatorial explosion.
>>>>
>>>> With hwaccel v1, we factored this with PIX_FMT_<hwaccel>_<entrypoint>,
>>>> the <codec> information became irrelevant. <entrypoint> was reduced to
>>>> typically 1 to 3. What does a pixel format with a <codec> name in it
>>>> actually mean anyway?
>>>>
>>>> This is better, but this was still silly. A pixel format defines... well,
>>>> a pixel format. If the user wants to extract the decoded pixels from the
>>>> GPU, he requires so with AV_HWACCEL_CAP_GET_PIXELS. The pixel format of
>>>> those needs to be expressed the usual way, e.g. avctx->pix_fmt ==
>>>> PIX_FMT_YUV420P. How would this have been expressed if avctx->pix_fmt was
>>>> PIX_FMT_<hwaccel>_<entrypoint>?
>>>
>>> "A pixel format defines... well, a pixel format."
>>> thats a useless circular definition
>>>
>>> a pixel format defines the structuring of data in AVFrame.data[]
>>> that likely differs between 2 hw accelerator APIs, so should allow selecting
>>> hw accelerators
>>
>> And how do you define and select an HW accelerator that outputs NV12 or
>> I420 frames? Or other HW accelerators that the user requested decoded
>> pixels back to .data[0-2]?
>
> easy, pix_fmt = PIX_FMT_NV12
> its fully transparent then the user app does not even need to know a hwaccel
> is used

Actually, this is not that easy. The hwaccel can require the 
user-initialized, in a system-dependent way, hwaccel_context. The user may 
also want to explicitly require a specific hwaccel to use. How do you 
express those?

You have several cases to consider:

(Kind of hwaccel)
K1: the hwaccel decodes to opaque HW surfaces. i.e. the user can't access 
to the decoded pixels directly but through some explicit readback 
function.
K2: the hwaccel decodes to SW provided memory. i.e. this memory can be 
exposed directly to Picture.data[0-2].

(Where the user wants the decoded pixels)
L1: closest to the GPU
L2: accessible through the CPU only (AV_HWACCEL_CAP_GET_PIXELS)

(What hwaccel the user specifically wants)
H1 .. Hn.

In v1, we currently have the following code flow:
lavc: avctx->pix_fmt = get_format()
user: <selects hwaccel and initializes hwaccel_context>
lavc: avctx->hwaccel = ff_find_hwaccel(avctx->codec->id, avctx->pix_fmt);

This scenario doesn't work for K2 hwaccels because an hwaccel_context 
needs to be provided and/or user wants to choose H1..Hn specifically 
through get_format().

>> Would you introduce another level with the actual pixel format. e.g.
>> PIX_FMT_YUV420P_<hwaccel>_<entrypoint>?
>>
>>>> Besides, from an implementation point of view (in libavcodec), the pre-v1
>>>> and v1 way of using pixel formats becomes tedious to handle.
>>>>
>>>> We currently have hooks called through:
>>>>
>>>> if (avctx->hwaccel)
>>>>   avctx->hwaccel->decode_slice(...);
>>>>
>>>> this needs to become:
>>>> if (avctx->hwaccel && <is entrypoint VLD>)
>>>>   avctx->hwaccel->decode_slice(...);
>>>>
>>>> so that other entry-points could be managed easily.
>>>>
>>>> With the v1 way, <is entrypoint == VLD> becomes a big evil switch:
>>>> return avctx->pix_fmt == PIX_FMT_hwaccel1_VLD ||?avctx->pix_fmt ==
>>>> PIX_FMT_hwaccel2_VLD || avcxtx->pix_fmt == PIX_FMT_hwaccel3_VLD, etc.
>>>>
>>>> Horrible, isn't it?
>>>
>>> pix_fmt_descriptor[pix_fmt].flags & FLAG_VLD
>>> doesnt look horrible to me
>>
>> if (avctx->hwaccel && avctx->hwaccel->entrypoint == HWACCEL_ENTRYPOINT_VLD)
>> looks better, less indirections, and keeps things on a single line. :)
>
> if you put entrypoint in hwaccel you need a seperate
> AVHWAccel for each entrypoint. and then you could just do
> if(avctx->hwaccel && avctx->hwaccel->decode_slice)

I'd prefer lavc to ensure the implementor did not do anything insane. If 
we make all hooks optional without proper check or notification of error 
(segfault or assert()), the implementor could miss some functions to add.

e.g. VLD requires decode_slice(), IDCT & MoComp require decode_mb(). I 
would like to ensure that in VLD mode, only decode_slice() is implemented, 
and in IDCT & MoComp mode, only decode_mb() is implemented.

Hmmm, we could check this in the ff_init_hwaccel() though.



More information about the ffmpeg-devel mailing list