[FFmpeg-devel] [Ffmpeg-devel][PATCH] WMV 3 encoding support (intra main profile)
Michael Niedermayer
michaelni
Tue May 1 18:15:04 CEST 2007
Hi
On Tue, May 01, 2007 at 05:16:16PM +0200, Denis Fortin wrote:
> On sam, 2007-04-28 at 10:44 -0700, Mike Melanson wrote:
> > Stas Oskin wrote:
> > > Hi.
> > >
> > > Is there any work going on WMV3 encoder in FFMPEG? Or it was superseded by
> > > VC-1?
> >
> > To my knowledge, no one is working on WMV3/VC-1 encoding support for
> > FFmpeg. It is an open field if you would like to jump in. Patch welcome.
> > It will make sense to start with the intraframe coding.
>
> I am working on WMV3/VC-1 encoding support.
> I was holding out a patch until i get at least P frames, but i have been
> out of sync from svn for too long and i still have some bugs in mc ; so
> i guess it's time to get some feedback about intra frame main profile
> VC1 support.
> I tried to reuse as much code as possible from ffmpeg, ratecontrol,
> motion estimation ...
> I tought it was a good idea at first but now i'm not that sure because
> to have correct mc i had to mess mpegvideo.c:qpel_motion which is not
> vc1 specific.
> So i'm interested to know what is the endorsed way to add a mpeg-like
> encoder inside ffmpeg (and reuse some algorithms already coded).
>
> Summary :
> -modify some msmpeg4.c functions to handle vc1
> -rl.h: switch table_vlc from uint16_t to uint32_t for new vlc tables
> -vc1.h: move some definitions from vc1.c to this new filw
new files should always be diffed agianst their "parent" file if one exists
> -vc1enc.c: new functions (this file is included in msmpeg4.c, like
> wmv2.c is)
>
> Comments?
split the patch, your summary is a good start for the points into which
it can be split
[...]
> Index: libavcodec/h263data.h
> ===================================================================
> --- libavcodec/h263data.h (r??vision 8855)
> +++ libavcodec/h263data.h (copie de travail)
> @@ -111,7 +111,7 @@
> };
>
> /* third non intra table */
> -const uint16_t inter_vlc[103][2] = {
> +const uint32_t inter_vlc[103][2] = {
> { 0x2, 2 },{ 0xf, 4 },{ 0x15, 6 },{ 0x17, 7 },
> { 0x1f, 8 },{ 0x25, 9 },{ 0x24, 9 },{ 0x21, 10 },
> { 0x20, 10 },{ 0x7, 11 },{ 0x6, 11 },{ 0x20, 11 },
iam against doubling the size of all tables because 1 new table doesnt
fit in 16bit, how "ugly" would a if() be at the one or 2 spots where
the table is used?
[...]
> @@ -248,11 +248,22 @@
>
> static void find_best_tables(MpegEncContext * s)
> {
> - int i;
> + int i,last;
> + int chroma_offset;
> int best =-1, best_size =9999999;
> int chroma_best=-1, best_chroma_size=9999999;
>
> - for(i=0; i<3; i++){
> + i=0;
> + last=3;
> + chroma_offset = 4;
> +
> + if(s->msmpeg4_version>=6 && s->qscale<=8){
> + i=1;
> + last =4;
> + }
trailing whitespace and tabs are forbidden in svn
also
i= s->msmpeg4_version>=6 && s->qscale<=8;
last= 3+i;
does the same and is simpler
[...]
>
> +
cosmetic change
> if (s->dc_table_index == 0) {
> if (n < 4) {
> put_bits(&s->pb, ff_table0_dc_lum[code][1], ff_table0_dc_lum[code][0]);
> @@ -839,10 +858,21 @@
> }
> }
>
> - if (code == DC_MAX)
> + if (level != 0) {
> + if (code == DC_MAX){
> + if(s->qscale==1 && s->msmpeg4_version>=6)
> + put_bits(&s->pb, 10, level);
> + else if (s->qscale==2 && s->msmpeg4_version>=6)
> + put_bits(&s->pb, 9, level);
> + else
> put_bits(&s->pb, 8, level);
> + } else {
> + if(s->qscale==1 && s->msmpeg4_version>=6)
> + put_bits(&s->pb, 2, extquant);
> + else if(s->qscale==2 && s->msmpeg4_version>=6 )
> + put_bits(&s->pb, 1, extquant);
> + }
int extrabits=0;
if(s->msmpeg4_version>=6 && s->qscale<=2)
extrabits= 3-s->qscale;
if (code == DC_MAX){
put_bits(&s->pb, 8+extrabits, level);
}else{
put_bits(&s->pb, extrabits, extquant);
}
[...]
> } else {
> - rl = &rl_table[3 + s->rl_chroma_table_index];
> + rl = &rl_table[4 + s->rl_chroma_table_index];
> }
these changes could be in a seperate patch
> + if(s->msmpeg4_version>=6)
> + run_diff = 1;
> + else
> run_diff = 0;
run_diff= s->msmpeg4_version>=6;
> - scantable= s->intra_scantable.permutated;
> + scantable= s->intra_scantable.scantable;
why?
[...]
> };
>
> -const uint8_t wmv3_dc_scale_table[32]={
> +static const uint8_t wmv3_dc_scale_table[32]={
> 0, 2, 4, 8, 8, 8, 9, 9,10,10,11,11,12,12,13,13,14,14,15,15,16,16,17,17,18,18,19,19,20,20,21,21
> };
this looks like it belongs in a seperate patch
>
> @@ -649,4 +649,7 @@
> 0x14E6, 0x147B, 0x1414, 0x13B1, 0x1352, 0x12F7, 0x129E, 0x1249,
> 0x11F7, 0x11A8, 0x115B, 0x1111, 0x10C9, 0x1084, 0x1000
> };
> +
> +
> +
> #endif /* VC1DATA_H */
cosmetic
[...]
> + /* dc pred */
> + if(s->msmpeg4_version>=6){
> s->dc_val[0][xy ] =
> s->dc_val[0][xy + 1 ] =
> s->dc_val[0][xy + wrap] =
> + s->dc_val[0][xy + 1 + wrap] = 0;
> + } else {
> + s->dc_val[0][xy ] =
> + s->dc_val[0][xy + 1 ] =
> + s->dc_val[0][xy + wrap] =
> s->dc_val[0][xy + 1 + wrap] = 1024;
> + }
code duplication
[...]
> +void ff_vc1_unquantize_c(MpegEncContext *s, DCTELEM *block, int n, int qscale)
> +{
> + VC1Context * const t= s->avctx->priv_data;
> + int i, level, nCoeffs, q;
> + ScanTable scantable;
> +
> + if(s->mb_intra){
> + if( P_TYPE == s->pict_type ){
> + for(i=0;i<64;i++)
> + block[i] += 128;
> + scantable = s->inter_scantable;
> + }
> + else
> + scantable = s->intra_scantable;
> + } else {
> + scantable = s->inter_scantable;
> + }
if(s->pict_type == I_TYPE)
scantable = s->intra_scantable;
else
scantable = s->inter_scantable;
> +
> + nCoeffs= s->block_last_index[n];
> +
> + if (n < 4)
> + block[0] *= s->y_dc_scale;
> + else
> + block[0] *= s->c_dc_scale;
> +
> + q = 2 * qscale + t->halfpq;
> +
> + for(i=1; i<= nCoeffs; i++) {
> + int j= scantable.permutated[i];
> + level = block[j];
> + if (level) {
> + level = level * q + t->pquantizer*(FFSIGN(block[j]) * qscale);
> + }
> + block[j]=level;
> + }
duplicate of dct_unquantize_h263* ?
[...]
> +int ff_vc1_quantize_c(MpegEncContext *s, DCTELEM *block, int n, int qscale, int *overflow)
> +{
> + VC1Context * const t= s->avctx->priv_data;
> + const uint8_t *scantable;
> + int q, i, j, level, last_non_zero, start_i;
> +
> + if(s->mb_intra){
> + if( P_TYPE == s->pict_type ){
> + for(i=0;i<64;i++)
> + block[i] -= 128;
> + scantable = s->inter_scantable.scantable;
> + }
> + else
> + scantable = s->intra_scantable.scantable;
> + last_non_zero = 0;
> + start_i = 1;
> + } else {
> + scantable = s->inter_scantable.scantable;
> + last_non_zero = -1;
> + start_i = 0;
> + }
> +
> +
> + s->dsp.vc1_fwd_trans_8x8(block);
> +
> + if (n < 4)
> + q = s->y_dc_scale;
> + else
> + q = s->c_dc_scale;
> + block[0] /= q;
> +
> +
> + q = 2 * qscale + t->halfpq;
> +
> + for(i=63;i>=start_i;i--) {
> + j = scantable[i];
> + level = (block[j] - t->pquantizer*(FFSIGN(block[j]) * qscale)) / q;
> +
> + if(level){
> + last_non_zero = i;
> + break;
> + }
> + }
> +
> + for(i=start_i; i<=last_non_zero; i++) {
> + j = scantable[i];
> + block[j] = (block[j] - t->pquantizer*(FFSIGN(block[j]) * qscale)) / q ;
> + }
> +
> + *overflow = 0;
> +
> + return last_non_zero;
> +}
looks like a duplicate of the existing quantization code
> +
> +
> +
> +void ff_vc1_init_mv_table()
> +{
> + static int vc1_mv_init = 0;
> + int size_table [6] = {0 ,2 ,3 ,4 ,5 ,8};
> + int offset_table [6] = {0 ,1 ,3 ,7 ,15 ,31};
static const
> + int i, j, size, offset;
> +
> + if(!vc1_mv_init){
> + vc1_mv_init = 1;
> +
> + ff_vc1_table_mvx_idx = (uint8_t *)av_malloc(sizeof(uint8_t) * VC1_TABLE_MV );
> + ff_vc1_table_mvy_idx = (uint8_t *)av_malloc(sizeof(uint8_t) * VC1_TABLE_MV );
unneeded casts
[...]
> +void ff_vc1_find_best_mv_table(MpegEncContext * s)
> +{
> + int i, mb_y, mb_x, xy;
> + int idx, motion_x, motion_y;
> + int best = -1 , best_size = 9999999;
> + VC1_MVTable *mv;
> +
> + for(i=0;i<4;i++){
> + int size = 0;
> + mv = (VC1_MVTable *)&vc1_mv_tables[i];
unneeded cast
[...]
> +static inline int ff_vc1_get_mv_index(MpegEncContext * s, int motion_x, int motion_y)
> +{
> + assert(motion_x >= 0);
> + assert(motion_y >= 0);
> +
> + return ( ff_vc1_table_mvx_idx[ motion_x ] + ff_vc1_table_mvy_idx[ motion_y ] );
> +}
MpegEncContext is unused
() is unneeded
also if the funcion is static then it doesnt need a ff_ prefix
> +
> +
> +static inline int ff_vc1_get_mv_size(MpegEncContext * s, int motion_x,
> + int motion_y, int more_present_flag )
> +{
> + int mv_x, mv_y, idx;
> + VC1_MVTable *mv;
> + mv = (VC1_MVTable *)&vc1_mv_tables[s->mv_table_index];
> + mv_x = motion_x ;
> + mv_y = motion_y ;
> + idx = ff_vc1_get_mv_index( s, mv_x, mv_y) ;
> + idx += 37 * more_present_flag;
> + return ( mv->table_mv_bits[idx]) ;
> +}
mv_x/mv_y are uneeded
> +
> +
> +/**
> + *@return 1 if explicitly coded, 0 otherwise
> + */
> +int ff_vc1_mv1_predictor(MpegEncContext * s, const int more_present_flag,
> + int *motion_x, int *motion_y, int *hybrid_pred )
> +{
> + int predA, predB, predC;
> + int predictor_pre_x, predictor_pre_y;
> +
> + const int stride = s->mb_stride;
> + const int xy= s->mb_x + s->mb_y * stride;
> +
> + if( 1 == s->mb_width ){
> + *motion_x = s->p_mv_table[xy - stride][0];
> + *motion_y = s->p_mv_table[xy - stride][1];
> + return 0;
> + }
> +
> + predA = xy - stride;
> + predC = xy - 1;
> +
> + if(s->mb_x < s->mb_width - 1){
> + predB = xy - stride + 1;
> + } else {
> + predB = xy - stride - 1;
> + }
> +
> + if( predA >= 0 ){ // predictor A is not out of bound
> + if( (predC < 0) && (predB < 0 ) ) { // B and C out of bound
> + predictor_pre_x = s->p_mv_table[ predA ][0];
> + predictor_pre_y = s->p_mv_table[ predA ][1];
> + } else {
> + int px = mid_pred(s->p_mv_table[ predA ][0],
> + s->p_mv_table[ predB ][0],
> + s->p_mv_table[ predC ][0]);
> + int py = mid_pred(s->p_mv_table[ predA ][1],
> + s->p_mv_table[ predB ][1],
> + s->p_mv_table[ predC ][1]);
> + predictor_pre_x = px;
> + predictor_pre_y = py;
> + }
> + } else if ( predC >= 0 ) { // predictor C is not out of bound
> + predictor_pre_x = s->p_mv_table[ predC ][0];
> + predictor_pre_y = s->p_mv_table[ predC ][1];
> + } else {
> + predictor_pre_x = predictor_pre_y = 0;
> + }
> +
> + //Pullback
> + {
> + int qx, qy, X, Y;
> + qx = s->mb_x << 6;
> + qy = s->mb_y << 6;
> + X = (s->mb_width << 6) - 4;
> + Y = (s->mb_height << 6) - 4;
> +
> + if(qx + predictor_pre_x < -60)
> + predictor_pre_x = -60 - qx;
> + if(qx + predictor_pre_y < -60)
> + predictor_pre_y = -60 - qy;
> + if(qx + predictor_pre_x > X)
> + predictor_pre_x = X - qx;
> + if(qy + predictor_pre_y > Y)
> + predictor_pre_y = Y - qy;
> + }
> + { //HYBRIDPRED
> + int sum;
> + if(predA < 0 || predC < 0){
> + *motion_x = predictor_pre_x;
> + *motion_y = predictor_pre_y;
> + return 0;
> + } else {
> + int sizeA, sizeC;
> + int dmv_x, dmv_y;
> +
> + if(IS_INTRA(s->mb_type[predA]))
> + sum = FFABS(predictor_pre_x) + FFABS(predictor_pre_y);
> + else
> + sum = FFABS(predictor_pre_x - s->p_mv_table[ predA ][0])
> + + FFABS(predictor_pre_x - s->p_mv_table[ predA ][1]) ;
> + if(sum > 32) {
> + dmv_x = s->p_mv_table[ xy ][0] - s->p_mv_table[ predA ][0] ;
> + dmv_y = s->p_mv_table[ xy ][1] - s->p_mv_table[ predA ][1] ;
> + sizeA = ff_vc1_get_mv_size(s, dmv_x, dmv_y, more_present_flag );
> +
> + dmv_x = s->p_mv_table[ xy ][0] - s->p_mv_table[ predC ][0] ;
> + dmv_y = s->p_mv_table[ xy ][1] - s->p_mv_table[ predC ][1] ;
> + sizeC = ff_vc1_get_mv_size(s, dmv_x, dmv_y, more_present_flag );
> +
> + if(sizeC<sizeA){
> + *hybrid_pred = 0;
> + *motion_x = s->p_mv_table[ predC ][0] ;
> + *motion_y = s->p_mv_table[ predC ][1] ;
> + } else {
> + *hybrid_pred = 1;
> + *motion_x = s->p_mv_table[ predA ][0] ;
> + *motion_y = s->p_mv_table[ predA ][1] ;
> + }
> + return 1;
> + } else {
> + if(IS_INTRA(s->mb_type[predC]))
> + sum = FFABS(predictor_pre_x) + FFABS(predictor_pre_y);
> + else
> + sum = FFABS(predictor_pre_x - s->p_mv_table[ predC ][0])
> + + FFABS(predictor_pre_x - s->p_mv_table[ predC ][1]) ;
> + if(sum > 32){
> + dmv_x = s->p_mv_table[ xy ][0] - s->p_mv_table[ predA ][0] ;
> + dmv_y = s->p_mv_table[ xy ][1] - s->p_mv_table[ predA ][1] ;
> + sizeA = ff_vc1_get_mv_size(s, dmv_x, dmv_y, more_present_flag );
> +
> + dmv_x = s->p_mv_table[ xy ][0] - s->p_mv_table[ predC ][0] ;
> + dmv_y = s->p_mv_table[ xy ][1] - s->p_mv_table[ predC ][1] ;
> + sizeC = ff_vc1_get_mv_size(s, dmv_x, dmv_y, more_present_flag );
> +
> + if(sizeC<sizeA){
> + *hybrid_pred = 0;
> + *motion_x = s->p_mv_table[ predC ][0] ;
> + *motion_y = s->p_mv_table[ predC ][1] ;
> + } else {
> + *hybrid_pred = 1;
> + *motion_x = s->p_mv_table[ predA ][0] ;
> + *motion_y = s->p_mv_table[ predA ][1] ;
> + }
> + return 1;
> + } else {
> + *motion_x = predictor_pre_x;
> + *motion_y = predictor_pre_y;
> + return 0;
> + }
> + }
> + }
> + }
> + return 0;
> +}
i assume this function calculates the mv predictor? if so its a duplicate as
the decoder must already contain code to do this
[...]
> @@ -646,6 +830,7 @@
> table4_run,
> table4_level,
> },
> + /* mid rate inter */
> {
these changes belong into a seperate patch
[...]
> + static const int norm_8[64] = {
> + 82944, 83232, 84096, 83232, 82944, 83232, 84096, 83232,
> + 83232, 83521, 84388, 83521, 83232, 83521, 84388, 83521,
> + 84096, 84388, 85264, 84388, 84096, 84388, 85264, 84388,
> + 83232, 83521, 84388, 83521, 83232, 83521, 84388, 83521,
> + 82944, 83232, 84096, 83232, 82944, 83232, 84096, 83232,
> + 83232, 83521, 84388, 83521, 83232, 83521, 84388, 83521,
> + 84096, 84388, 85264, 84388, 84096, 84388, 85264, 84388,
> + 83232, 83521, 84388, 83521, 83232, 83521, 84388, 83521
> + };
duplicate
> +
> + src = block;
> + dest = block;
> +
> + for(i=0 ; i < 8 ; i++){
> + t1 = src[0] + src[56];
> + t2 = src[0] - src[56];
> + t3 = src[8] + src[48];
> + t4 = src[8] - src[48];
> + t5 = src[16] + src[40];
> + t6 = src[16] - src[40];
> + t7 = src[24] + src[32];
> + t8 = src[24] - src[32];
> +
> + dest[0] = 12 * (t1 + t3 + t5 + t7);
> + dest[8] = 16 * t2 + 15 * t4 + 9 * t6 + 4 * t8;
> + dest[16] = 16 * (t1 - t7) + 6 * ( t3 - t5);
> + dest[24] = 15 * t2 - 4 * t4 - 16 * t6 - 9 * t8;
> + dest[32] = 12 * ( t1 - t3 - t5 + t7);
> + dest[40] = 9 * t2 - 16 * t4 + 4 * t6 + 15 * t8;
> + dest[48] = 6 * (t1 - t7) - 16 * (t3 - t5);
> + dest[56] = 4 * t2 - 9 * t4 + 15 * t6 - 16 * t8;
i bet this can be simplified alot, i suggest you look at the exiting dct code
as the same factorizations are likely applicable here
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Why not whip the teacher when the pupil misbehaves? -- Diogenes of Sinope
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20070501/94c3903f/attachment.pgp>
More information about the ffmpeg-devel
mailing list