[FFmpeg-devel] [PATCH][7/8] Add VA API accelerated H.264 decoding (take 4)

Mon Feb 9 10:25:49 CET 2009

Hi Michael,

On Fri, 6 Feb 2009, Michael Niedermayer wrote:

> Please factorize the code somehow.
> I dont want to have all codecs salted with 2 dozen video acceleration APIs

This looks a little unfair to me. I posted the VA API patches at the very 
same time as VDPAU and the only complains at that time were indentation & 
cosmetics so I thought the approach was right. Now, you say the current 
implementation is somewhat crap. I am sorry, it's pretty demotivating.

I believe the current VA API implement should get in as is (with the usual 
cosmetics changes) and have a third party couple of eyes leveraging, which 
generally works better that way. VA API and VDPAU needs different things, 
at different places, and it seems hazardous to try to factor them all at 
once.

I'd pretty would like VA API support integrated for the next FFmpeg 
release as it's the *only* means to have all 3 major vendors HW video 
decode (VLD level) acceleration working through a single API. It turns out 
the current implementation now even outperforms commercial and proprietary 
implementations (Helix, and other GStreamer-based implementation). So 
having this integrated for the upcoming release is a good step forward to 
get broader adoption by other players.

> B. a AVHWAccel struct is added similar to AVCodec and that codecs are
> only allowed to call its members like start_frame() start_slice() end_frame()
> and such

Where would that struct be initialized? Having each codec calling an 
ff_API_CODEC_decode_init() function or have avcodec_get_context_defaults() 
call an ff_API_get_defaults() to initialize the vtable? Though, 
avctx->codec wouldn't be set at this time.

Suggested struct:

typedef struct AVHWAccel {
     void *avctx;
     unsigned int flags; ///< Hardware decode pipeline configuration flags
     int (*start_frame)(AVCodecContext *avctx);
     int (*start_slice)(AVCodecContext *avctx, const uint8_t *buffer, uint32_t size);
     int (*decode_slice)(AVCodecContext *avctx);
     int (*end_slice)(AVCodecContext *avctx);
     int (*end_frame)(AVCodecContext *avctx);
} AVHWAccel;

But there is a problem with this approach. Sometimes, the size of the 
slice can't be known beforehand. e.g. MPEG-4. Unless you accept to 
duplicate code to move to the next resync marker or end of buffer, I don't 
see how to elegantly handle this code.

Probably make start_slice() take a buffer base + offset into this buffer 
(for internal optimisation => use a single whole buffer to emit) and have 
end_slice() take the buffer_end (or size wrt. the previous call to 
start_slice())?

AVHWAccel::flags could be any of:

enum {
   FF_HWACCEL_CAP_DECODE_PICTURE = 1 << 1, ///< HW decoder accepts full picture buffers offloadd. Otherwise, it's on a per-slice basis
   FF_HWACCEL_CAP_DECODE_ESCAPED = 1 << 2, ///< HW decoder accepts raw (escaped) bitstreams. Otherwise, the decoder needs unescaped buffers
};

Regards,
Gwenole.