[FFmpeg-devel] [Ffmpeg-devel][PATCH] WMV 3 encoding support (intra main profile)

Michael Niedermayer michaelni
Tue May 1 18:15:04 CEST 2007


Hi

On Tue, May 01, 2007 at 05:16:16PM +0200, Denis Fortin wrote:
> On sam, 2007-04-28 at 10:44 -0700, Mike Melanson wrote:
> > Stas Oskin wrote:
> > > Hi.
> > > 
> > > Is there any work going on WMV3 encoder in FFMPEG? Or it was superseded by
> > > VC-1?
> > 
> > To my knowledge, no one is working on WMV3/VC-1 encoding support for 
> > FFmpeg. It is an open field if you would like to jump in. Patch welcome. 
> > It will make sense to start with the intraframe coding.
> 
> I am working on WMV3/VC-1 encoding support.
> I was holding out a patch until i get at least P frames, but i have been
> out of sync from svn for too long and i still have some bugs in mc ; so
> i guess it's time to get some feedback about intra frame main profile
> VC1 support.
> I tried to reuse as much code as possible from ffmpeg, ratecontrol,
> motion estimation ...
> I tought it was a good idea at first but now i'm not that sure because
> to have correct mc i had to mess mpegvideo.c:qpel_motion which is not
> vc1 specific.
> So i'm interested to know what is the endorsed way to add a mpeg-like
> encoder inside ffmpeg (and reuse some algorithms already coded).
> 
> Summary : 
> -modify some msmpeg4.c functions to handle vc1
> -rl.h: switch table_vlc from uint16_t to uint32_t for new vlc tables

> -vc1.h: move some definitions from vc1.c to this new filw

new files should always be diffed agianst their "parent" file if one exists


> -vc1enc.c: new functions (this file is included in msmpeg4.c, like
> wmv2.c is)
> 
> Comments? 

split the patch, your summary is a good start for the points into which
it can be split


[...]
> Index: libavcodec/h263data.h
> ===================================================================
> --- libavcodec/h263data.h        (r??vision 8855)
> +++ libavcodec/h263data.h        (copie de travail)
> @@ -111,7 +111,7 @@
>  };
>  
>  /* third non intra table */
> -const uint16_t inter_vlc[103][2] = {
> +const uint32_t inter_vlc[103][2] = {
>  { 0x2, 2 },{ 0xf, 4 },{ 0x15, 6 },{ 0x17, 7 },
>  { 0x1f, 8 },{ 0x25, 9 },{ 0x24, 9 },{ 0x21, 10 },
>  { 0x20, 10 },{ 0x7, 11 },{ 0x6, 11 },{ 0x20, 11 },

iam against doubling the size of all tables because 1 new table doesnt
fit in 16bit, how "ugly" would a if() be at the one or 2 spots where
the table is used?


[...]
> @@ -248,11 +248,22 @@
>  
>  static void find_best_tables(MpegEncContext * s)
>  {
> -    int i;
> +    int i,last;
> +    int chroma_offset;
>      int best       =-1, best_size       =9999999;
>      int chroma_best=-1, best_chroma_size=9999999;
>  
> -    for(i=0; i<3; i++){
> +    i=0;
> +    last=3;
> +    chroma_offset = 4;
> +    
> +    if(s->msmpeg4_version>=6 && s->qscale<=8){
> +        i=1;
> +        last =4;
> +    }

trailing whitespace and tabs are forbidden in svn

also 

i= s->msmpeg4_version>=6 && s->qscale<=8;
last= 3+i;

does the same and is simpler


[...]

>  
> +

cosmetic change

>          if (s->dc_table_index == 0) {
>              if (n < 4) {
>                  put_bits(&s->pb, ff_table0_dc_lum[code][1], ff_table0_dc_lum[code][0]);
> @@ -839,10 +858,21 @@
>              }
>          }
>  
> -        if (code == DC_MAX)
> +        if (level != 0) {
> +            if (code == DC_MAX){
> +                if(s->qscale==1 && s->msmpeg4_version>=6)
> +                    put_bits(&s->pb, 10, level);
> +                else if (s->qscale==2 && s->msmpeg4_version>=6)
> +                    put_bits(&s->pb, 9, level);
> +                else 
>              put_bits(&s->pb, 8, level);
> +            } else {  
> +                if(s->qscale==1 && s->msmpeg4_version>=6)
> +                    put_bits(&s->pb, 2, extquant);
> +                else if(s->qscale==2  && s->msmpeg4_version>=6 )
> +                    put_bits(&s->pb, 1, extquant);
> +            }

int extrabits=0;
if(s->msmpeg4_version>=6 && s->qscale<=2)
    extrabits= 3-s->qscale;

if (code == DC_MAX){
    put_bits(&s->pb, 8+extrabits, level);
}else{
    put_bits(&s->pb, extrabits, extquant);
}


[...]
>          } else {
> -            rl = &rl_table[3 + s->rl_chroma_table_index];
> +            rl = &rl_table[4 + s->rl_chroma_table_index];
>          }

these changes could be in a seperate patch


> +        if(s->msmpeg4_version>=6)
> +            run_diff = 1;
> +        else
>          run_diff = 0;

run_diff= s->msmpeg4_version>=6;


> -        scantable= s->intra_scantable.permutated;
> +        scantable= s->intra_scantable.scantable;

why?


[...]
>  };
>  
> -const uint8_t wmv3_dc_scale_table[32]={
> +static const uint8_t wmv3_dc_scale_table[32]={
>      0, 2, 4, 8, 8, 8, 9, 9,10,10,11,11,12,12,13,13,14,14,15,15,16,16,17,17,18,18,19,19,20,20,21,21
>  };

this looks like it belongs in a seperate patch


>  
> @@ -649,4 +649,7 @@
>      0x14E6, 0x147B, 0x1414, 0x13B1, 0x1352, 0x12F7, 0x129E, 0x1249,
>      0x11F7, 0x11A8, 0x115B, 0x1111, 0x10C9, 0x1084, 0x1000
>  };
> +
> +
> +
>  #endif /* VC1DATA_H */

cosmetic


[...]

> +    /* dc pred */
> +    if(s->msmpeg4_version>=6){
>      s->dc_val[0][xy           ] =
>      s->dc_val[0][xy + 1       ] =
>      s->dc_val[0][xy     + wrap] =
> +        s->dc_val[0][xy + 1 + wrap] = 0;
> +    } else {
> +        s->dc_val[0][xy           ] =
> +        s->dc_val[0][xy + 1       ] =
> +        s->dc_val[0][xy     + wrap] =
>      s->dc_val[0][xy + 1 + wrap] = 1024;
> +    }

code duplication


[...]
> +void ff_vc1_unquantize_c(MpegEncContext *s, DCTELEM *block, int n, int qscale)
> +{
> +    VC1Context * const t= s->avctx->priv_data;
> +    int i, level, nCoeffs, q;
> +    ScanTable scantable;
> +
> +    if(s->mb_intra){
> +        if( P_TYPE == s->pict_type ){
> +            for(i=0;i<64;i++) 
> +                block[i] += 128;
> +            scantable = s->inter_scantable;
> +        }
> +        else 
> +            scantable = s->intra_scantable;
> +    } else {
> +        scantable = s->inter_scantable;
> +    }

if(s->pict_type == I_TYPE)
    scantable = s->intra_scantable;
else
    scantable = s->inter_scantable;


> +
> +    nCoeffs= s->block_last_index[n];
> +
> +    if (n < 4)
> +        block[0] *= s->y_dc_scale;
> +    else
> +        block[0] *= s->c_dc_scale;
> +
> +    q = 2 * qscale + t->halfpq;
> +
> +    for(i=1; i<= nCoeffs; i++) {
> +        int j= scantable.permutated[i];
> +        level = block[j];
> +        if (level) {
> +            level = level * q + t->pquantizer*(FFSIGN(block[j]) * qscale);
> +        }
> +        block[j]=level;
> +    }

duplicate of dct_unquantize_h263* ?


[...]
> +int ff_vc1_quantize_c(MpegEncContext *s, DCTELEM *block, int n, int qscale, int *overflow)
> +{
> +    VC1Context * const t= s->avctx->priv_data;
> +    const uint8_t *scantable;
> +    int q, i, j, level, last_non_zero, start_i;
> +    
> +    if(s->mb_intra){
> +        if( P_TYPE == s->pict_type ){
> +            for(i=0;i<64;i++) 
> +                block[i] -= 128;
> +            scantable = s->inter_scantable.scantable;
> +        }
> +        else 
> +            scantable = s->intra_scantable.scantable;
> +        last_non_zero = 0;
> +        start_i = 1;            
> +    } else {
> +        scantable = s->inter_scantable.scantable;
> +        last_non_zero = -1;
> +        start_i = 0;
> +    }
> +
> +
> +    s->dsp.vc1_fwd_trans_8x8(block);
> +
> +    if (n < 4)
> +        q = s->y_dc_scale;
> +    else
> +        q = s->c_dc_scale;
> +    block[0] /= q;
> +
> +
> +    q = 2 * qscale + t->halfpq;
> +    
> +    for(i=63;i>=start_i;i--) {
> +        j = scantable[i];
> +        level =  (block[j] - t->pquantizer*(FFSIGN(block[j]) * qscale)) / q;
> +
> +        if(level){
> +            last_non_zero = i;
> +            break;
> +        }
> +    }
> +
> +    for(i=start_i; i<=last_non_zero; i++) {
> +        j = scantable[i];
> +        block[j] =  (block[j] - t->pquantizer*(FFSIGN(block[j]) * qscale)) / q ;
> +    }
> +
> +    *overflow = 0;
> +
> +    return last_non_zero;
> +}

looks like a duplicate of the existing quantization code


> +
> +
> +
> +void ff_vc1_init_mv_table()
> +{
> +    static int vc1_mv_init = 0;
> +    int size_table [6]   = {0  ,2  ,3  ,4  ,5  ,8};
> +    int offset_table [6] = {0  ,1  ,3  ,7 ,15 ,31};

static const


> +    int i, j, size, offset;
> +    
> +    if(!vc1_mv_init){
> +        vc1_mv_init = 1;
> +        
> +        ff_vc1_table_mvx_idx = (uint8_t *)av_malloc(sizeof(uint8_t) * VC1_TABLE_MV );
> +        ff_vc1_table_mvy_idx = (uint8_t *)av_malloc(sizeof(uint8_t) * VC1_TABLE_MV );

unneeded casts


[...]
> +void ff_vc1_find_best_mv_table(MpegEncContext * s)
> +{
> +    int i, mb_y, mb_x, xy;
> +    int idx, motion_x, motion_y;
> +    int best = -1 , best_size = 9999999;
> +    VC1_MVTable *mv;
> +
> +    for(i=0;i<4;i++){
> +        int size = 0;
> +        mv = (VC1_MVTable *)&vc1_mv_tables[i];        

unneeded cast


[...]
> +static inline int ff_vc1_get_mv_index(MpegEncContext * s, int motion_x, int motion_y)
> +{
> +    assert(motion_x >= 0);
> +    assert(motion_y >= 0);
> +    
> +    return (  ff_vc1_table_mvx_idx[ motion_x ] + ff_vc1_table_mvy_idx[ motion_y ] );
> +}

MpegEncContext is unused
() is unneeded

also if the funcion is static then it doesnt need a ff_ prefix


> +
> +
> +static inline int ff_vc1_get_mv_size(MpegEncContext * s, int motion_x, 
> +                                     int motion_y, int more_present_flag )
> +{
> +    int mv_x, mv_y, idx;
> +    VC1_MVTable *mv;
> +    mv = (VC1_MVTable *)&vc1_mv_tables[s->mv_table_index];        
> +    mv_x = motion_x ;
> +    mv_y = motion_y ; 
> +    idx = ff_vc1_get_mv_index( s, mv_x, mv_y) ;
> +    idx += 37 * more_present_flag;
> +    return ( mv->table_mv_bits[idx]) ;
> +}

mv_x/mv_y are uneeded


> +
> +
> +/**
> + *@return 1 if explicitly coded, 0 otherwise
> + */
> +int ff_vc1_mv1_predictor(MpegEncContext * s, const int more_present_flag, 
> +                         int *motion_x, int *motion_y, int *hybrid_pred )
> +{
> +    int predA, predB, predC;
> +    int predictor_pre_x, predictor_pre_y;
> +
> +    const int stride = s->mb_stride;
> +    const int xy= s->mb_x + s->mb_y * stride;
> +
> +    if( 1 == s->mb_width ){
> +        *motion_x  = s->p_mv_table[xy - stride][0];
> +        *motion_y  = s->p_mv_table[xy - stride][1];
> +        return 0;
> +    }
> +
> +    predA = xy - stride;
> +    predC = xy - 1;
> +   
> +    if(s->mb_x < s->mb_width - 1){
> +        predB = xy - stride + 1;
> +    } else {
> +        predB = xy - stride - 1;
> +    }
> +
> +    if( predA >= 0 ){ // predictor A is not out of bound
> +        if( (predC < 0) && (predB < 0 ) ) { // B and C out of bound
> +            predictor_pre_x = s->p_mv_table[ predA ][0];
> +            predictor_pre_y = s->p_mv_table[ predA ][1];
> +        } else {
> +            int px = mid_pred(s->p_mv_table[ predA ][0],
> +                              s->p_mv_table[ predB ][0],
> +                              s->p_mv_table[ predC ][0]);
> +            int py = mid_pred(s->p_mv_table[ predA ][1],
> +                              s->p_mv_table[ predB ][1],
> +                              s->p_mv_table[ predC ][1]);
> +            predictor_pre_x = px;
> +            predictor_pre_y = py;
> +        }
> +    } else if ( predC >= 0 ) { // predictor C is not out of bound
> +        predictor_pre_x = s->p_mv_table[ predC ][0];
> +        predictor_pre_y = s->p_mv_table[ predC ][1];
> +    } else {
> +        predictor_pre_x = predictor_pre_y = 0;
> +    }
> +
> +    //Pullback
> +    {
> +        int qx, qy, X, Y;
> +        qx = s->mb_x << 6;
> +        qy = s->mb_y << 6;
> +        X = (s->mb_width << 6) - 4;
> +        Y = (s->mb_height << 6) - 4;
> +        
> +        if(qx + predictor_pre_x < -60)
> +            predictor_pre_x = -60 - qx;
> +        if(qx + predictor_pre_y < -60)
> +            predictor_pre_y = -60 - qy;
> +        if(qx + predictor_pre_x > X)
> +            predictor_pre_x = X - qx;
> +        if(qy + predictor_pre_y > Y)
> +            predictor_pre_y = Y - qy;
> +    }
> +    { //HYBRIDPRED
> +        int sum;
> +        if(predA < 0 || predC < 0){
> +            *motion_x = predictor_pre_x;
> +            *motion_y = predictor_pre_y;
> +            return 0;
> +        } else {
> +            int sizeA, sizeC;
> +            int dmv_x, dmv_y;
> +
> +            if(IS_INTRA(s->mb_type[predA]))
> +                sum = FFABS(predictor_pre_x) + FFABS(predictor_pre_y);
> +            else
> +                sum = FFABS(predictor_pre_x - s->p_mv_table[ predA ][0]) 
> +                    + FFABS(predictor_pre_x - s->p_mv_table[ predA ][1]) ;
> +            if(sum > 32) {
> +                dmv_x = s->p_mv_table[ xy ][0] - s->p_mv_table[ predA ][0] ;
> +                dmv_y = s->p_mv_table[ xy ][1] - s->p_mv_table[ predA ][1] ;
> +                sizeA = ff_vc1_get_mv_size(s, dmv_x, dmv_y, more_present_flag );
> +
> +                dmv_x = s->p_mv_table[ xy ][0] - s->p_mv_table[ predC ][0] ;
> +                dmv_y = s->p_mv_table[ xy ][1] - s->p_mv_table[ predC ][1] ;
> +                sizeC = ff_vc1_get_mv_size(s, dmv_x, dmv_y, more_present_flag );
> +
> +                if(sizeC<sizeA){
> +                    *hybrid_pred = 0;
> +                    *motion_x =  s->p_mv_table[ predC ][0] ;
> +                    *motion_y =  s->p_mv_table[ predC ][1] ;
> +                } else {
> +                    *hybrid_pred = 1;
> +                    *motion_x =  s->p_mv_table[ predA ][0] ;
> +                    *motion_y =  s->p_mv_table[ predA ][1] ;
> +                }
> +                return 1;
> +            } else {
> +                if(IS_INTRA(s->mb_type[predC]))
> +                    sum = FFABS(predictor_pre_x) + FFABS(predictor_pre_y);
> +                else
> +                    sum = FFABS(predictor_pre_x - s->p_mv_table[ predC ][0]) 
> +                        + FFABS(predictor_pre_x - s->p_mv_table[ predC ][1]) ;
> +                if(sum > 32){
> +                    dmv_x = s->p_mv_table[ xy ][0] - s->p_mv_table[ predA ][0] ;
> +                    dmv_y = s->p_mv_table[ xy ][1] - s->p_mv_table[ predA ][1] ;
> +                    sizeA = ff_vc1_get_mv_size(s, dmv_x, dmv_y, more_present_flag );
> +
> +                    dmv_x = s->p_mv_table[ xy ][0] - s->p_mv_table[ predC ][0] ;
> +                    dmv_y = s->p_mv_table[ xy ][1] - s->p_mv_table[ predC ][1] ;
> +                    sizeC = ff_vc1_get_mv_size(s, dmv_x, dmv_y, more_present_flag );
> +
> +                    if(sizeC<sizeA){
> +                        *hybrid_pred = 0;
> +                        *motion_x =  s->p_mv_table[ predC ][0] ;
> +                        *motion_y =  s->p_mv_table[ predC ][1] ;
> +                    } else {
> +                        *hybrid_pred = 1;
> +                        *motion_x =  s->p_mv_table[ predA ][0] ;
> +                        *motion_y =  s->p_mv_table[ predA ][1] ;
> +                    }
> +                    return 1;
> +                } else {
> +                    *motion_x = predictor_pre_x;
> +                    *motion_y = predictor_pre_y;
> +                    return 0;
> +                }    
> +            }
> +        }
> +    }
> +    return 0;
> +}

i assume this function calculates the mv predictor? if so its a duplicate as
the decoder must already contain code to do this


[...]
> @@ -646,6 +830,7 @@
>          table4_run,
>          table4_level,
>      },
> +    /* mid rate inter */
>      {

these changes belong into a seperate patch


[...]
> +    static const int norm_8[64] = { 
> +    82944, 83232, 84096, 83232, 82944, 83232, 84096, 83232,
> +    83232, 83521, 84388, 83521, 83232, 83521, 84388, 83521,
> +    84096, 84388, 85264, 84388, 84096, 84388, 85264, 84388,
> +    83232, 83521, 84388, 83521, 83232, 83521, 84388, 83521,
> +    82944, 83232, 84096, 83232, 82944, 83232, 84096, 83232,
> +    83232, 83521, 84388, 83521, 83232, 83521, 84388, 83521,
> +    84096, 84388, 85264, 84388, 84096, 84388, 85264, 84388,
> +    83232, 83521, 84388, 83521, 83232, 83521, 84388, 83521
> +    };

duplicate


> +
> +    src  = block;
> +    dest = block;
> +    
> +    for(i=0 ; i < 8 ; i++){
> +        t1 = src[0]  + src[56];
> +        t2 = src[0]  - src[56];
> +        t3 = src[8]  + src[48];
> +        t4 = src[8]  - src[48];
> +        t5 = src[16] + src[40];
> +        t6 = src[16] - src[40];
> +        t7 = src[24] + src[32];
> +        t8 = src[24] - src[32];
> +        
> +        dest[0]  = 12 * (t1 + t3 + t5 + t7);
> +        dest[8]  = 16 * t2 + 15 * t4 + 9 * t6 + 4 * t8;
> +        dest[16] = 16 * (t1 - t7) + 6 * ( t3 - t5);
> +        dest[24] = 15 * t2 - 4 * t4 - 16 * t6 - 9 * t8;
> +        dest[32] = 12 * ( t1 - t3 - t5 + t7);
> +        dest[40] = 9 * t2 - 16 * t4 + 4 * t6 + 15 * t8;
> +        dest[48] = 6 * (t1 - t7) - 16 * (t3 - t5);
> +        dest[56] = 4 * t2 - 9 * t4 + 15 * t6 - 16 * t8;

i bet this can be simplified alot, i suggest you look at the exiting dct code
as the same factorizations are likely applicable here


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Why not whip the teacher when the pupil misbehaves? -- Diogenes of Sinope
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20070501/94c3903f/attachment.pgp>



More information about the ffmpeg-devel mailing list