[FFmpeg-devel] [PATCH] HWAccel infrastructure (take 3)

Wed Feb 18 16:23:22 CET 2009

On Wed, 18 Feb 2009, Michael Niedermayer wrote:

>>>> +    int (*start_slice)(AVCodecContext *avctx, const uint8_t *buf, uint32_t buf_offset);
>>>
>>> hmm this seems poorly designed
>>> first, it needs a buf_size or you will segfault
>>> second buf / buf_offset can be replaced by buf or why not?
>>
>> Actually, the correct sequence order was meant to be start_slice(),
>> decode_slice(), end_slice() so that to follow natural processing of the
>> bitstream in the codec implementation. Otherwise, we can end up
>> duplicating code to advance to the next slice and/or add other variables
>> to record the previous position then compute size. i.e. IMO, uglification
>> in some locations (in h263dec.c in particular).
>>
>> If it's OK with you to slightly uglify h263dec but make it possible to
>> merge all 3 into a single decode_slice() call, then that'd be fine with me
>> too.
>
> start&decode as in your patch can be merged without any other changes
> i cant speak about if i would accept some uglification without really seeing
> it
> also considering the special casing in mpeg1/2 it does not seem the split in
> 3 really avoids uglification

OK, I probably could do something like the following in 
h263dec.c::decode_slice():

if (s->avctx->hwaccel) {
     /* FIXME: we only handle "normal" H.263 / MPEG-4 bitstreams */
     assert(!s->msmpeg4_version);
     int current_pos = get_bits_count(&s->gb) / 8;
     int end_pos, resync_marker_pos = -1;
     if ((resync_marker_pos = ff_h263_resync(s)) < 0)
         end_pos = (s->gb.size_in_bits+7)/8;
     else
         end_pos = resync_marker_pos;
     if (s->avctx->hwaccel->start_slice(s->avctx, s->gb.buffer + current_pos, end_pos - current_pos) < 0)
         return -1;
     return s->avctx->hwaccel->end_slice(s->avctx);
}

then, don't perform the second ff_h263_resync() in the 
while(s->mb_y<s->mb_height) block from ff_h263_decode_frame() i.e.

              if(s->slice_height==0 || s->mb_x!=0 || (s->mb_y%s->slice_height)!=0 || get_bits_count(&s->gb) > s->gb.size_in_bits)
                  break;
-        }else{
+        }else if(!s->avctx->hwaccel){
              if ((resync_marker_pos = ff_h263_resync(s)) < 0)
                  break;
          }

That should be enough. Would that be OK with you (stylistically speaking)?

>> Concerning buf / buf_offset, this was meant for optimization purposes in
>> the HWAccel implementation. That way we can reduce control blocks and/or
>> (it depends on the HWAccel) copy only one buffer in the end.
>
> if you have buf + buf_size and the buffers overlap you can do that too,
> besides if you have a buf/buf_size from start_frame you could check if
> the slices buffer is within that.

It depends, if the hardware accelerator wants the unescaped data (e.g. for 
H.264 and VC-1), then the check wouldn't be valid since the passed args 
would be from another (temporary) buffer. On the other hand, VA API and 
VDPAU are working fine with the escaped (raw bitstream) data.

buf/buf_size from start_frame would be the buf/buf_size from 
AVCodec::decode_frame()? I really want start_frame() to be at a location 
where we are assured that frame info is parsed.

For h264.c, that'd be a little problematic. I could use the same mpeg12.c 
trick to use s->first_slice and:

NAL_SLICE: [...]
             if((err = decode_slice_header(hx, h)))
                break;
             if(s->avctx->hwaccel && s->first_slice){
                 s->first_slice=0;
                 s->avctx->hwaccel->start_frame(s->avctx, buf, buf_size);
             }

but I might not get the whole frame buffer. Or I could place the 
start_frame() call where ff_vdpau_h264_set_reference_frames(s); currently 
is. But in that case, I might have the following AVHWAccel hooks calling 
sequence: [?start_slice() ]+, end_slice(), start_frame(), end_frame(), 
which is not really desirable.

IMO, this complicates a little the code for no real gain. I don't actually 
need buf/buf_size in neither VA API nor VDPAU.

>>>> @@ -950,7 +950,8 @@ void MPV_frame_end(MpegEncContext *s)
>>>>      //just to make sure that all data is rendered.
>>>>      if(CONFIG_MPEG_XVMC_DECODER && s->avctx->xvmc_acceleration){
>>>
>>>>          ff_xvmc_field_end(s);
>>>> -    }else if(!(s->avctx->codec->capabilities&CODEC_CAP_HWACCEL_VDPAU)
>>>> +    }else if(!s->avctx->hwaccel
>>>> +       && !(s->avctx->codec->capabilities&CODEC_CAP_HWACCEL_VDPAU)
>>>>         && s->unrestricted_mv
>>>
>>>    }else if(!(s->avctx->codec->capabilities&CODEC_CAP_HWACCEL_VDPAU)
>>> +      !s->avctx->hwaccel
>>>       && s->unrestricted_mv
>>
>> Order does not really matter here since the next patch will remove the
>> HWACCEL_VDPAU thing. So, currently it's 1-/2+ then 1-, and with your
>> suggestion that will be 1+ then 2-/1+ -- do you prefer keeping some more
>> work for next time? ;-)
>
> you can do it as 1+ and then 1-/1+

Sorry, I don't see how. Could you please show it off to me?

Thanks,
Gwenole.