[Ffmpeg-devel] Native H.264 encoder

Panagiotis Issaris takis.issaris
Thu Jan 18 13:07:03 CET 2007


Hi,

On Mon, 2006-12-18 at 22:04 +0100, Michael Niedermayer wrote:
> Hi
> 
> On Fri, Dec 15, 2006 at 11:34:37PM +0100, Panagiotis Issaris wrote:
> [...]
> > > > It
> > > > seems most codecs reuse parts of MpegEncContext code combined with some
> > > > conditional parts added to mpegvideo.c. I would prefer to see the motion
> > > > estimation code separated from the codecs,
> > > In snow.c, I found the following comment at line 471:
> > >     MpegEncContext m; // needed for motion estimation, should not be
> > > used for anything else, the idea is to make the motion estimation
> > > eventually independant of MpegEncContext, so this will be removed then
> > > (FIXME/XXX)
> > Yep, I saw that comment too, but if I recall correctly, it has been there for
> > quite a while :)
> > 
> > > Since it seems that making the motion estimation code independent of
> > > MpegEncContext is in the plans (and can be useful for the whole ffmpeg
> > > project), I'll try to have a better look to see how to do this (as soon
> > > as I'll have some time :)...
> > > I do not promise anything, but I'll try.
> > Ah! That would be awesome! :) I'll help out whereever I can!
> 
> btw 2 tips
> 1. submit small patches, a 40k motion estimation cleanup patch would be
>    a nightmare for me and whoever submits it ...
> 2. look at snow.c and try to reduce the number of lines of code which
> depend on MpegEncContext if it reaches 0 then motion estimation can be
> used without MpegEncContext ...
I've been looking at a way to do this, and started by tracking down each
access to the MpegEncContext member (m) of the SnowContext struct.


in encode_q_branch():
MotionEstContext *c= &s->m.me;
s->m.mb_stride=2;
s->m.mb_x=
s->m.mb_y= 0;
s->m.me.skip= 0;
c->current_mv_penalty= c->mv_penalty[s->m.f_code=1] + MAX_MV;
ref_score= s->m.me.sub_motion_search(&s->m, &ref_mx, &ref_my, ref_score,
0, 0, level-LOG2_MB_SIZE+4, block_w)
c->scene_change_score+= s->m.qscale;


in get_dc():
DWTELEM *dst= (DWTELEM*)s->m.obmc_scratchpad +
plane_index*block_size*block_size*4;


in get_block_rd():
DWTELEM *pred= (DWTELEM*)s->m.obmc_scratchpad +
plane_index*block_size*block_size*4;


in ratecontrol_1pass():
s->m.current_picture.mb_var_sum= coef_sum;
s->m.current_picture.mc_mb_var_sum= 0;
s->m.current_picture.mc_mb_var_sum= coef_sum;
s->m.current_picture.mb_var_sum= 0;


in encode_init():
s->m.avctx   = avctx;
s->m.flags   = avctx->flags;
s->m.bit_rate= avctx->bit_rate;
s->m.me.scratchpad= av_mallocz((avctx->width
+64)*2*16*2*sizeof(uint8_t));
s->m.me.map       = av_mallocz(ME_MAP_SIZE*sizeof(uint32_t));
s->m.me.score_map = av_mallocz(ME_MAP_SIZE*sizeof(uint32_t));
s->m.obmc_scratchpad= av_mallocz(MB_SIZE*MB_SIZE*12*sizeof(uint32_t));


in encode_frame():
s->m.picture_number= avctx->frame_number;
s->m.pict_type = pict->pict_type=
s->m.rc_context.entry[avctx->frame_number].new_pict_type;
s->m.pict_type= pict->pict_type= s->keyframe ? FF_I_TYPE : FF_P_TYPE;
s->m.current_picture_ptr= &s->m.current_picture;
s->m.avctx= s->avctx;
s->m.current_picture.data[0]= s->current_picture.data[0];
s->m.   last_picture.data[0]= s->last_picture[0].data[0];
s->m.    new_picture.data[0]= s->  input_picture.data[0];
s->m.   last_picture_ptr= &s->m.   last_picture;
s->m.linesize=
s->m.   last_picture.linesize[0]=
s->m.    new_picture.linesize[0]=
s->m.current_picture.linesize[0]= stride;
s->m.uvlinesize= s->current_picture.linesize[1];
s->m.width = width;
s->m.height= height;
s->m.mb_width = block_width;
s->m.mb_height= block_height;
s->m.mb_stride=   s->m.mb_width+1;
s->m.b8_stride= 2*s->m.mb_width+1;
s->m.f_code=1;
s->m.pict_type= pict->pict_type;
s->m.me_method= s->avctx->me_method;
s->m.me.scene_change_score=0;
s->m.flags= s->avctx->flags;
s->m.quarter_sample= (s->avctx->flags & CODEC_FLAG_QPEL)!=0;
s->m.out_format= FMT_H263;
s->m.unrestricted_mv= 1;
s->m.lambda = s->lambda;
s->m.qscale= (s->m.lambda*139 + FF_LAMBDA_SCALE*64) >> (FF_LAMBDA_SHIFT
+ 7);
s->lambda2= s->m.lambda2= (s->m.lambda*s->m.lambda + FF_LAMBDA_SCALE/2)
>> FF_LAMBDA_SHIFT;
s->m.dsp= s->dsp; //move
ff_init_me(&s->m);
s->dsp= s->m.dsp;
s->m.pict_type = pict->pict_type;
s->m.misc_bits = 8*(s->c.bytestream - s->c.bytestream_start);
s->m.mv_bits = 8*(s->c.bytestream - s->c.bytestream_start) -
s->m.misc_bits;
... && s->m.me.scene_change_score > s->avctx->scenechange_threshold){
s->m.frame_bits = 8*(s->c.bytestream - s->c.bytestream_start);
s->m.p_tex_bits = s->m.frame_bits - s->m.misc_bits - s->m.mv_bits;
s->m.current_picture.display_picture_number =
s->m.current_picture.coded_picture_number = avctx->frame_number;
s->m.current_picture.quality = pict->quality;
s->m.total_bits += 8*(s->c.bytestream - s->c.bytestream_start);
if(s->pass1_rc)
    if (ff_rate_estimate_qscale(&s->m, 0) < 0)
        return -1;
if(avctx->flags&CODEC_FLAG_PASS1)
    ff_write_pass1_stats(&s->m);
s->m.last_pict_type = s->m.pict_type;
avctx->frame_bits = s->m.frame_bits;
avctx->mv_bits = s->m.mv_bits;
avctx->misc_bits = s->m.misc_bits;
avctx->p_tex_bits = s->m.p_tex_bits;


in common_end()
av_freep(&s->m.me.scratchpad);
av_freep(&s->m.me.map);
av_freep(&s->m.me.score_map);
av_freep(&s->m.obmc_scratchpad);



Afterwards, I filtered out all references to members of the embedded
MotionEstContext and came to this list:
avctx, b8_stride, bit_rate, current_picture, current_picture_ptr, dsp,
f_code, flags, frame_bits, height, lambda, lambda2, last_picture,
last_picture_ptr, linesize, mb_height, mb_stride, mb_width, mb_x, mb_y,
me, me_method, misc_bits, mv_bits, new_picture, obmc_scratchpad,
out_format, pict_type, picture_number, p_tex_bits, qscale,
quarter_sample, rc_context, total_bits, unrestricted_mv, uvlinesize,
width


Some of these fields also exist in SnowContext and are most likely
copied into the equivalently named fields of the MpegEncContext only to
get motion estimation working:
avctx, dsp, new_picture, current_picture, last_picture, lambda, lambda2,
b_width (==mb_width?), b_height (==mb_height?)

So the MotionEstContext apparently does not provide enough contextual
data for the motion estimation functions, and they therefore need access
to the encompassing MpegEncContext. Would that be a good part to focus
on first? I'd guess one way to solve this would be to move the required
fields into the MotionEstContext; another would be to pass the required
fields as parameters in the relevant motion estimation calls. Both would
require quite a lot of changes, as encode_frame() accesses quite a lot
of MpegEncContext fields.

Or should a new independent context be created, one that is not specific
to mpeg (and related codecs) encoding but contains enough info to allow
motion estimation? But, ... I'd guess the result would be nearly the
same as moving all the needed stuff to MotionEstContext, right?

Feel free to comment or suggest other changes, as I am familiar with
neither the Snow codec nor the motion estimation and MpegEncContext
code.

With friendly regards,
Takis

-- 
vCard: http://www.issaris.org/pi.vcf
Public key: http://www.issaris.org/pi.key




