[FFmpeg-devel] [PATCH] AVHWAccel infrastructure v2 (take 6) [ping^3]

Sat Oct 30 01:16:25 CEST 2010

On Sat, Oct 30, 2010 at 12:32:31AM +0200, Gwenole Beauchesne wrote:
> Hi,
>
> Le 29 oct. 10 ? 23:45, Michael Niedermayer a ?crit :
>
>> On Fri, Oct 29, 2010 at 02:52:41PM +0200, Gwenole Beauchesne wrote:
>>> Hi,
>>>
>>> Here is a new patch that applies to current SVN tree with some  
>>> further
>>> cosmetics (comments) added.
>>>
>>> Regards,
>>> Gwenole.
>>>
>>> On Mon, 20 Sep 2010, Gwenole Beauchesne wrote:
>>>
>>>> It seems I didn't reply to this one or the message got lost  
>>>> because I
>>>> had a v5 lying around for some time. Here is a v6 anyway, that
>>>> intersects with the tidsp needs.
>>>>
>>>> On Mon, 22 Feb 2010, Michael Niedermayer wrote:
>>>>
>>>>> On Mon, Jan 04, 2010 at 04:39:02PM +0100, Gwenole Beauchesne wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Mon, 4 Jan 2010, Michael Niedermayer wrote:
>>>>>>
>>>>>>>> In the new model I wanted to use, hwaccel_context was to be 
>>>>>>>> solely
>>>>>>>> maintained by libavcodec. However, it still needed a way to 
>>>>>>>> get some
>>>>>>>> hwaccel context initialized by the user-application. e.g. a 
>>>>>>>> VADisplay
>>>>>>>> (void
>>>>>>>> *), an LPDIRECTXVIDEODECODER, etc.
>>>>>>>>
>>>>>>>> I felt interesting to pass this information through hwaccel 
>>>>>>>> attrs along
>>>>>>>> with other int attributes.
>>>>>>>
>>>>>>> We have existing means to pass information, that is through
>>>>>>> AVCodecContext
>>>>>>> and using AVOption where appropriate
>>>>>>> As well as using AVFrame instead of AVCodecContext where its  
>>>>>>> per frame
>>>>>>> data
>>>>>>
>>>>>> OK, I nuked the hwaccel attrs and replaced it with
>>>>>> AVCodecContext.hwaccel_{id,flags}. hwaccel_flags is handled by  
>>>>>> AVOption. I
>>>>>> left hwaccel_id as is to match existing practise (pix_fmt et al.).
>>>>>>
>>>>>> I haven't tested with MPlayer yet but this new approach still  
>>>>>> works
>>>>>> with my
>>>>>> modified ffplay.
>>>>>>
>>>>>> So, HW accelerator selection works as is:
>>>>>> - Set AVCodecContext.hwaccel_id to the desired accelerator
>>>>>> - Set AVCodecContext.hwaccel_flags to HWACCEL_CAP_GET_PIXELS if 
>>>>>> the user
>>>>>> wants AVFrame.data[0-2] populated with YV12 or NV12 pixels
>>>>>>
>>>>>> This means we can handle multiple accelerators automatically at 
>>>>>> once.
>>>>>> Anyway, this lived in an ideal world without practical use 
>>>>>> since the
>>>>>> user-application would still have to handle the HW accelerator
>>>>>> specifically.
>>>>>>
>>>>>> Regards,
>>>>>> Gwenole.
>>>>>
>>>>> ]...]
>>>>>> @@ -2497,7 +2498,7 @@ typedef struct AVCodecContext {
>>>>>>      * provided by the user. In that case, this holds display- 
>>>>>> dependent
>>>>>>      * data FFmpeg cannot instantiate itself. Please refer to the
>>>>>>      * FFmpeg HW accelerator documentation to know how to fill  
>>>>>> this
>>>>>> -     * is. e.g. for VA API, this is a struct vaapi_context.
>>>>>> +     * is. e.g. for VA API, this is a VADisplay.
>>>>>>      * - encoding: unused
>>>>>>      * - decoding: Set by user
>>>>>>      */
>>>>>
>>>>> hunk ok probably
>>>>>
>>>>> [...]
>>>>>> +    /**
>>>>>> +     * Called in codec initialization.
>>>>>> +     *
>>>>>> +     * This is used to initialize the  
>>>>>> AVCodecContext.hwaccel_context
>>>>>> +     * hardware accelerator context.
>>>>>> +     *
>>>>>> +     * @param avctx the codec context
>>>>>> +     * @return zero if successful, a negative value otherwise
>>>>>> +     */
>>>>>> +    int (*init)(AVCodecContext *avctx);
>>>>>> +
>>>>>> +    /**
>>>>>> +     * Called in codec finalization.
>>>>>> +     *
>>>>>> +     * This is used to clean-up any data from the hardware  
>>>>>> accelerator
>>>>>> +     * context. Should this function be implemented, it shall  
>>>>>> reset
>>>>>> +     * AVCodecContext.hwaccel_context to NULL.
>>>>>> +     *
>>>>>> +     * @param avctx the codec context
>>>>>> +     * @return zero if successful, a negative value otherwise
>>>>>> +     */
>>>>>> +    int (*close)(AVCodecContext *avctx);
>>>>>> +
>>>>>> +    /**
>>>>>> +     * Called at the beginning of each frame to allocate a HW  
>>>>>> surface.
>>>>>> +     *
>>>>>> +     * The returned surface will be stored in Picture.data[3].
>>>>>> +     *
>>>>>> +     * @param avctx the codec context
>>>>>> +     * @param surface the pointer to the HW accelerator surface
>>>>>> +     * @return zero if successful, a negative value otherwise
>>>>>> +     */
>>>>>> +    int (*alloc_surface)(AVCodecContext *avctx, uint8_t  
>>>>>> **surface);
>>>>>> +
>>>>>> +    /**
>>>>>> +     * Called to free the HW surface that was allocated with
>>>>>> alloc_surface().
>>>>>> +     *
>>>>>> +     * @param avctx the codec context
>>>>>> +     * @param surface the HW accelerator surface
>>>>>> +     * @return zero if successful, a negative value otherwise
>>>>>> +     */
>>>>>> +    void (*free_surface)(AVCodecContext *avctx, uint8_t  
>>>>>> *surface);
>>>>>> } AVHWAccel;
>>>>>
>>>>> iam not in favor of these. What are these good for?
>>>>> Why are they needed?
>>>>
>>>> This is used to allocate the HW surfaces (VdpSurface, VASurface,  
>>>> etc.)
>>>> from within FFmpeg. i.e. the AVFrame.data[3] part, hence the  
>>>> uint8_t *
>>>> instead of the void *.
>>>
>>> Those functions are also used to help user implement
>>> AVCodecContext.get_buffer() and release_buffer() if he wants to do so
>>> manually.
>>>
>>> e.g. by default, AVCodecContext.get_buffer() will point to
>>> avcodec_default_get_buffer(). If a user overrides that function, he  
>>> will
>>> lose automatic data[3] handling. So, two possible cases:
>>> 1) He really knows how to do so, so he allocates the HW surface  
>>> himself
>>> 2) He can delegate that to FFmpeg that will know how to do that just 
>>> fine
>>> too. e.g. calling the right vaCreateSurfaces(),  
>>> VdpVideoSurfaceCreate(),
>>> etc.
>>
>> the user can call the default get/release buffer. And if the user  
>> doesnt know
>> how to allocate the surface what is he doing with the functions thats 
>> their
>> purpose. Your explanations are
>> so incomplete and terse that i cannot really comment. Because i do not 
>> know
>> what problem this solves or where this problem would arrise
>
> I want the default avcodec_default_get_buffer() allocate HW surfaces  
> (Picture.data[3]) itself the same way it would allocate memory for SW  
> decoding (Picture.data[0-2]). Since the way a HW surface is allocated is 
> HW-dependent, this needs per-accelerator hooks.
>
> There are two cases to consider:
> (a) The user requested the decoded pixels back to Picture.data[0-2]  
> (AV_HWACCEL_CAP_GET_PIXELS), thus we need that along with the  
> Picture.data[3].
> (b) The user wants to retain the decoded pixels in GPU memory and avoid a 
> round-trip GPU -> CPU -> GPU (for presentation). Thus, data[0-2] doesn't 
> need to be allocated but Picture.data[3] must still be allocated, the HW 
> surface.
>
> In the V6 revision of the patch, I proposed a way to account for both.  
> i.e. if HW accelerator is enabled, only allocate the Picture.data[0-2]  
> if the user requested for it through the AV_HWACCEL_CAP_GET_PIXELS flag. 
> Otherwise, don't allocate anything in there. This can save up to 50 MB of 
> memory. e.g. 16x 1080p reference frames.
>
> Now, imagine the user wants to use his own version of get_buffer() to  
> e.g. manage reordered_opaque or whatever other field. In his  
> get_buffer() for SW decoding, he would initially have:
>
> /* SW */
> pic->type = FF_BUFFER_TYPE_USER;
> pic->reordered_opaque = avctx->reordered_opaque;
> // allocate pic->data[0-2] planes
>
> /* HW: case (a) */
> pic->type = FF_BUFFER_TYPE_USER;
> pic->reordered_opaque = avctx->reordered_opaque;
> // allocate pic->data[0-2] planes with his own memory
> if (avctx->hwaccel && avctx->hwaccel->alloc_surface)
>   pic->data[3] = avctx->hwaccel->alloc_surface(avctx);
>
> /* HW: case (b) */
> pic->type = FF_BUFFER_TYPE_USER;
> pic->reordered_opaque = avctx->reordered_opaque;
> if (avctx->hwaccel) {
>   pic->data[0..2] = NULL;
>   pic->data[3] = avctx->hwaccel->alloc_surface(avctx);
> } else {
> // allocate pic->data[0-2] planes with his own memory
> }
>
> Now, if we don't have the {alloc/free}_surface() functions, we would  
> need to:
>
> /* HW: case (b) */
> pic->type = FF_BUFFER_TYPE_USER;
> pic->reordered_opaque = avctx->reordered_opaque;
> if (avctx->hwaccel) {
>   switch (avctx->hwaccel_id) {
>   case HWACCEL_ID_API_1: pic->data[3] =  
> allocate_hw_surface_for_API_1(avctx); break;
>   case HWACCEL_ID_API_2: pic->data[3] =  
> allocate_hw_surface_for_API_2(avctx); break;
>   // ...
>   }
> } else {
> // allocate pic->data[0-2] planes with his own memory
> }

hmm, i see what this improves now, so iam ok with these 2 functions
will review the patch soon

thanks for the explanation

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I hate to see young programmers poisoned by the kind of thinking
Ulrich Drepper puts forward since it is simply too narrow -- Roman Shaposhnik
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20101030/2f3e5e61/attachment.pgp>