[FFmpeg-devel] [PATCH] RV40 Loop Filter
Michael Niedermayer
michaelni
Mon Oct 27 09:07:31 CET 2008
On Sun, Oct 26, 2008 at 03:41:09PM +0200, Kostya wrote:
> On Sat, Oct 25, 2008 at 11:14:25AM +0200, Michael Niedermayer wrote:
> > On Sat, Oct 25, 2008 at 10:08:44AM +0300, Kostya wrote:
> > > On Wed, Oct 22, 2008 at 10:53:23AM +0200, Michael Niedermayer wrote:
> > > > On Tue, Oct 21, 2008 at 09:23:21AM +0300, Kostya wrote:
> > [...]
> > > > [...]
> > > > > +static int rv40_set_deblock_coef(RV34DecContext *r)
> > > > > +{
> > > > > + MpegEncContext *s = &r->s;
> > > > > + int mvmask = 0, i, j, dx, dy;
> > > > > + int midx = s->mb_x * 2 + s->mb_y * 2 * s->b8_stride;
> > > >
> > > > > + if(s->pict_type == FF_I_TYPE)
> > > > > + return 0;
> > > >
> > > > why is this even called for i frames?
> > >
> > > I intend to use it for calculating macroblock-specific deblock
> > > strength in RV30.
> >
> > fine but how is that related to having the pict_type check inside the
> > function compared to outside?
>
> For RV30 setting deblock coefficients would be performed for
> I-frames as well.
so there are 2 different functions

if(rv30)
    rv30_set_deblock_coef()
else if(!I)
    rv40_set_deblock_coef()

clean, simple, fast, ...
vs.

ctx->func_ptr()

init(){
    if(rv30)
        ctx->func_ptr= func30
    else
        ctx->func_ptr= func40
}
func40(){
    if(I)
        return;
}

This is not simple, and calling functions that just return is IMHO also
not clean.
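A minimal compilable sketch of the first variant, for illustration only. DemoContext, the PICT_* enum and the demo_* functions are hypothetical stand-ins, not the real RV34DecContext or the functions from the patch, and the returned values are placeholders.

```c
#include <assert.h>

/* Hypothetical stand-ins for the decoder state (not the real RV34DecContext). */
enum DemoPictType { PICT_I, PICT_P, PICT_B };

typedef struct DemoContext {
    int is_rv30;                  /* 1 for RV30, 0 for RV40 */
    enum DemoPictType pict_type;
} DemoContext;

/* Placeholder bodies: the real functions would derive a mask from the
 * motion vectors; fixed values are returned so the control flow is visible. */
static int demo_rv30_set_deblock_coef(const DemoContext *c)
{
    (void)c;
    return 0x30;
}

static int demo_rv40_set_deblock_coef(const DemoContext *c)
{
    /* The caller guarantees this is never reached for I-frames,
     * so no pict_type check is needed inside the function. */
    assert(c->pict_type != PICT_I);
    return 0x40;
}

/* The decision is made once, at the call site. */
static int demo_deblock_coef(const DemoContext *c)
{
    if (c->is_rv30)
        return demo_rv30_set_deblock_coef(c);
    else if (c->pict_type != PICT_I)
        return demo_rv40_set_deblock_coef(c);
    return 0;
}
```

With this layout the RV40 function is simply never entered for I-frames, while the RV30 one still runs for them.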
>
> > [...]
> > > > > + if(dx > 3 || dy > 3){
> > > > > + mvmask |= 0x03 << (i*2 + j*8);
> > > > > + }
> > > > > + }
> > > > > + }
> > > > > + midx += s->b8_stride;
> > > > > + }
> > > >
> > > > i think the if() can be moved out of the loop like
> > > > if(first_slice_line)
> > > > mvmask &= 123;
> > >
> > > IMO it can't.
> > > It constructs mask based on motion vectors difference in the
> > > horizontal/vertical neighbouring blocks after all.
> >
> > one way (there surely are a thousand others)
> >
> > get_mask(int delta)
> >     for()
> >         for()
> >             v0= motion_val[x+y*stride]
> >             v1= motion_val[x+y*stride+delta]
> >             if(FFABS(v0[0]-v1[0])>3 || FFABS(v0[1]-v1[1])>3)
> >                 mask |= 1<<(2*x+8*y);
> >     return mask
> >
> > hmask= get_mask(1 );
> > vmask= get_mask(stride);
> > if(!mb_x)
> >     hmask &= 0x...
> > if(first_slice_line)
> >     vmask &= 0x...
> > mask = hmask | (hmask<<1) | vmask | (vmask<<4);
> >
> > besides, the way mask bits are combined looks strange/wrong
>
> Per my understanding it sets edges for 2x2 groups of 4x4 subblocks.
>
> > >
> > > > > + return mvmask;
> > > > > +}
> > > > > +
> > > > > +static void rv40_loop_filter(RV34DecContext *r)
> > > > > +{
> > > > > + MpegEncContext *s = &r->s;
> > > > > + int mb_pos;
> > > > > + int i, j;
> > > > > + uint8_t *Y, *C;
> > > > > + int alpha, beta, betaY, betaC;
> > > > > + int q;
> > > > > + // 0 - cur block, 1 - top, 2 - left, 3 - bottom
> > > > > + int btype[4], clip[4], mvmasks[4], cbps[4], uvcbps[4][2];
> > > > > +
> > > >
> > > > > + if(s->pict_type == FF_B_TYPE)
> > > > > + return;
> > > >
> > > > why is this even called for b frames?
> > >
> > > Because the spec says so :)
> > > RV40 has many special cases for B-frame loop filter which
> > > I didn't care to implement.
> >
> > :/
> > i hope it cannot use B frames as reference?
>
> Looks like it does not
>
> > [...]
> > > [lots of loop filter invoking]
> > > >
> > > > the word mess is probably the best way to describe this
> > > > as far as i can tell you are packing all the bits related to deblocking
> > > > and then later duplicate code each with hardcoded masks to extract them
> > > > again.
> > >
> > > We have a saying here "To make a candy from crap", which I think describes
> > > current situation. I'd like to shot the group of men who proposed the loop
> > > filter in the form RV40 has it.
> >
> > there aren't many codecs around that are cleanly designed ...
> > Some things here and there are ok but terrible messes like this are more
> > common.
> > We don't have too much of a choice, to support things the mess has to be
> > implemented. If it can be done cleaner/simpler that's a big advantage in the
> > long term, easier to maintain, understand, optimize; smaller and faster, ...
>
> Also I think that forcing someone to understand it counts as
> a psychological abuse and the sentence on it should be
> debugging X8 frames or implementing interlaced mode in VC-1
> (sorry, can't remember more evil codecs).
>
> > >
> > > The problem is that edges should be filtered in that order, with clipping
> > > values selected depending on whether the neighbouring block is coded or
> > > not and whether it belongs to the same MB or not.
> > > It's possible to put all of them into a loop, but it will have too many additional
> > > conditions for my taste. I've merged some of them though.
> >
> > I am not suggesting to build a complex and ugly loop, rather something like
> > storing all the numbers that might differ in a 2d array and then
> > having a loop go over this.
> > the mb edge flags, coded info and all that would be in the array so that
> > reading it is a matter of coded[y][x], mb_edge[y][x], mb_type[y][x]
> > i think this would be cleaner IMHO
>
> done
>
> > I'll review the new patch soon
>
> here it is
>
> > [...]
> > --
> > Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> Index: libavcodec/rv40.c
> ===================================================================
> --- libavcodec/rv40.c (revision 15305)
> +++ libavcodec/rv40.c (working copy)
> @@ -247,7 +247,462 @@
> return 0;
> }
>
> +#define CLIP_SYMM(a, b) av_clip(a, -(b), b)
> /**
> + * weaker deblocking very similar to the one described in 4.4.2 of JVT-A003r1
> + */
> +static inline void rv40_weak_loop_filter(uint8_t *src, const int step,
> + const int flag0, const int flag1,
> + const int alpha,
> + const int lim0, const int lim1,
> + const int difflim, const int beta,
> + const int S0, const int S1,
> + const int S2, const int S3)
> +{
> + uint8_t *cm = ff_cropTbl + MAX_NEG_CROP;
> + int t, u, diff;
> +
> + t = src[0*step] - src[-1*step];
> + if(!t){
> + return;
> + }
> + u = (alpha * FFABS(t)) >> 7;
> + if(u > 3 - (flag0 && flag1)){
> + return;
> + }
> +
> + t <<= 2;
> + if(flag0 && flag1)
> + t += src[-2*step] - src[1*step];
> + diff = CLIP_SYMM((t + 4) >> 3, difflim);
> + src[-1*step] = cm[src[-1*step] + diff];
> + src[ 0*step] = cm[src[ 0*step] - diff];
> + if(FFABS(S2) <= beta && flag0){
> + t = (S0 + S2 - diff) >> 1;
> + src[-2*step] = cm[src[-2*step] - CLIP_SYMM(t, lim1)];
> + }
> + if(FFABS(S3) <= beta && flag1){
> + t = (S1 + S3 + diff) >> 1;
> + src[ 1*step] = cm[src[ 1*step] - CLIP_SYMM(t, lim0)];
> + }
> +}
rename flag0/1 to filter_first / filter_last or some other name that is
related to what they do!
> +
> +/**
> + * This macro is used for calculating 25*x0+26*x1+26*x2+26*x3+25*x4
> + * or 25*x0+26*x1+51*x2+26*x3
> + * @param sub - index of the value with coefficient = 25
idx25 maybe
> + * @param last - index of the value with coefficient 25 or 51
idx25_51
but the doxygen comment is still not sufficient to understand what the macro
does, or how the 2 index parameters behave and are used when they overlap.
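To make the overlap concrete, here is a small sketch that probes which weight the macro gives each tap, as I read the expression in the quoted patch. The macro body is copied from the patch with extra parentheses added; demo_tap_weight() is a hypothetical helper that exists only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Same expression as in the patch, fully parenthesized. */
#define RV40_STRONG_FILTER(src, step, start, last, sub) \
    (26*((src)[(start)*(step)]     + (src)[((start)+1)*(step)] \
       + (src)[((start)+2)*(step)] + (src)[((start)+3)*(step)] \
       + (src)[(last)*(step)]) - (src)[(last)*(step)] - (src)[(sub)*(step)])

/* Weight the macro gives to the sample at offset k (step = 1), found by
 * filtering a buffer that is zero everywhere except for a 1 at k. */
static int demo_tap_weight(int start, int last, int sub, int k)
{
    uint8_t buf[16] = {0};
    uint8_t *p = buf + 8;   /* p[-8] .. p[7] are valid */

    assert(k >= -8 && k < 8);
    p[k] = 1;
    return RV40_STRONG_FILTER(p, 1, start, last, sub);
}
```

When `last` lies outside start..start+3 (for example start = -3, last = 1, sub = -3), five distinct samples are read and the ones at `last` and `sub` get weight 25. When `last` falls inside that range (for example start = -4, last = -3, sub = -1), that sample is counted twice and ends up with weight 51, while `sub` gets 25. In both cases the weights sum to 128, which matches the `>> 7` applied afterwards.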
> + */
> +#define RV40_STRONG_FILTER(src, step, start, last, sub) \
> + 26*(src[start *step] + src[(start+1)*step] + src[(start+2)*step] \
> + + src[(start+3)*step] + src[last *step]) - src[last *step] \
> + - src[sub *step]
> +
> +/**
> + * Deblocking filter, the altered version from JVT-A003r1 H.26L draft.
> + */
> +static inline void rv40_adaptive_loop_filter(uint8_t *src, const int step,
> + const int stride, const int dmode,
> + const int lim0, const int lim1,
> + const int alpha,
> + const int beta, const int beta2,
> + const int chroma, const int edge)
> +{
> + int diffs[4][4];
> + int s0 = 0, s1 = 0, s2 = 0, s3 = 0;
> + uint8_t *ptr;
> + int flag0 = 1, flag1 = 1;
> + int strength0 = 3, strength1 = 3;
> + int i;
> + int lims;
> +
> + for(i = 0, ptr = src; i < 4; i++, ptr += stride){
> + diffs[i][0] = ptr[-2*step] - ptr[-1*step];
> + diffs[i][1] = ptr[ 1*step] - ptr[ 0*step];
> + s0 += diffs[i][0];
> + s1 += diffs[i][1];
> + }
> + if(FFABS(s0) >= (beta<<2)){
> + strength0 = 1;
> + }
> + if(FFABS(s1) >= (beta<<2)){
> + strength1 = 1;
> + }
> + if(strength0 + strength1 <= 2){
> + return;
> + }
> +
> + for(i = 0, ptr = src; i < 4; i++, ptr += stride){
> + diffs[i][2] = ptr[-2*step] - ptr[-3*step];
> + diffs[i][3] = ptr[ 1*step] - ptr[ 2*step];
> + s2 += diffs[i][2];
> + s3 += diffs[i][3];
> + }
> +
> + if(!edge)
> + flag0 = flag1 = 0;
> + else{
> + flag0 = (strength0 == 3) && (FFABS(s2) < beta2);
> + flag1 = (strength1 == 3) && (FFABS(s3) < beta2);
> + }
> +
> + lims = (lim0 + lim1 + strength0 + strength1) >> 1;
> + if(flag0 && flag1){ /* strong filtering */
> + for(i = 0; i < 4; i++, src += stride){
> + int diff[2], sflag, p0, p1;
> + int t = src[0*step] - src[-1*step];
> +
> + if(!t) continue;
> + sflag = (alpha * FFABS(t)) >> 7;
> + if(sflag > 1) continue;
> +
> + p0 = (RV40_STRONG_FILTER(src, step, -3, 1, -3) + rv40_dither_l[dmode + i]) >> 7;
> + p1 = (RV40_STRONG_FILTER(src, step, -2, 2, -2) + rv40_dither_r[dmode + i]) >> 7;
> + diff[0] = src[-1*step];
> + diff[1] = src[ 0*step];
> + src[-1*step] = sflag ? av_clip(p0, src[-1*step] - lims, src[-1*step] + lims) : p0;
> + src[ 0*step] = sflag ? av_clip(p1, src[ 0*step] - lims, src[ 0*step] + lims) : p1;
> + diff[0] -= src[-1*step];
> + diff[1] -= src[ 0*step];
> + p0 = (RV40_STRONG_FILTER(src, step, -4, 0, -4) + rv40_dither_l[dmode + i] + diff[1]*25) >> 7;
> + p1 = (RV40_STRONG_FILTER(src, step, -1, 3, -1) + rv40_dither_r[dmode + i] + diff[0]*25) >> 7;
> + src[-2*step] = sflag ? av_clip(p0, src[-2*step] - lims, src[-2*step] + lims) : p0;
> + src[ 1*step] = sflag ? av_clip(p1, src[ 1*step] - lims, src[ 1*step] + lims) : p1;
> + if(!chroma){
> + src[-3*step] = (RV40_STRONG_FILTER(src, step, -4, -3, -1) + 64) >> 7;
> + src[ 2*step] = (RV40_STRONG_FILTER(src, step, 0, 2, 0) + 64) >> 7;
> + }
> + }
> + }else if(strength0 == 3 && strength1 == 3){
> + for(i = 0; i < 4; i++, src += stride)
> + rv40_weak_loop_filter(src, step, 1, 1, alpha, lim0, lim1, lims, beta,
> + diffs[i][0], diffs[i][1], diffs[i][2], diffs[i][3]);
> + }else{
> + for(i = 0; i < 4; i++, src += stride)
> + rv40_weak_loop_filter(src, step, strength0==3, strength1==3,
> + alpha, lim0>>1, lim1>>1, lims>>1, beta,
> + diffs[i][0], diffs[i][1], diffs[i][2], diffs[i][3]);
> + }
> +}
> +
> +static void rv40_v_loop_filter(uint8_t *src, int stride, int dmode, int lim0, int lim1,
> + int alpha, int beta, int beta2, int chroma, int edge){
> + rv40_adaptive_loop_filter(src, 1, stride, dmode, lim0, lim1, alpha, beta, beta2, chroma, edge);
> +}
> +static void rv40_h_loop_filter(uint8_t *src, int stride, int dmode, int lim0, int lim1,
> + int alpha, int beta, int beta2, int chroma, int edge){
> + rv40_adaptive_loop_filter(src, stride, 1, dmode, lim0, lim1, alpha, beta, beta2, chroma, edge);
> +}
> +
> +static int check_mv(int16_t (*motion_val)[2], int step)
> +{
> + int d;
> + d = motion_val[0][0] - motion_val[-step][0];
> + if(d < -3 || d > 3)
> + return 1;
> + d = motion_val[0][1] - motion_val[-step][1];
> + if(d < -3 || d > 3)
> + return 1;
> + return 0;
> +}
the name check_mv() is too generic
> +
> +static int rv40_set_deblock_coef(RV34DecContext *r)
> +{
> + MpegEncContext *s = &r->s;
> + int mvmask = 0, i, j, dx, dy;
> + int midx = s->mb_x * 2 + s->mb_y * 2 * s->b8_stride;
> + int16_t (*motion_val)[2] = s->current_picture_ptr->motion_val[0][midx];
> + if(s->pict_type == FF_I_TYPE)
> + return 0;
> + for(j = 0; j < 2; j++){
> + for(i = 0; i < 2; i++){
> + if(i || s->mb_x){
> + if(check_mv(motion_val, 1)){
> + mvmask |= 0x11 << (i*2 + j*8);
> + }
> + }
> + if(j || !s->first_slice_line){
> + if(check_mv(motion_val, s->b8_stride)){
> + mvmask |= 0x03 << (i*2 + j*8);
> + }
> + }
> + }
> + motion_val += s->b8_stride;
> + }
> + return mvmask;
> +}
this is still doing the s->mb_x and first_slice_line checks in the inner loop
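A sketch of how the border checks could be hoisted out of the loops, under some assumptions. It keeps the 0x11 / 0x03 bit patterns from the patch rather than the mask combination from the earlier pseudocode, and the clearing masks 0x1111 and 0x000F are derived from those patterns. It also assumes the motion vector array is padded so that the left and top neighbours can be read even at the picture border. All demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* 1 if either MV component differs by more than 3 from the entry
 * `step` positions earlier (same threshold as check_mv() in the patch). */
static int demo_mv_differs(int16_t (*mv)[2], int step)
{
    return abs(mv[0][0] - mv[-step][0]) > 3 ||
           abs(mv[0][1] - mv[-step][1]) > 3;
}

/* OR `pattern` into the mask for every 8x8 block of the macroblock whose
 * neighbour `step` entries back has a sufficiently different vector.
 * No border tests here: all neighbours are assumed to be readable. */
static int demo_get_mask(int16_t (*mv)[2], int stride, int step, int pattern)
{
    int mask = 0, i, j;

    assert(stride >= 2);
    for (j = 0; j < 2; j++)
        for (i = 0; i < 2; i++)
            if (demo_mv_differs(mv + i + j * stride, step))
                mask |= pattern << (i * 2 + j * 8);
    return mask;
}

/* Border handling is applied once, after the loops. */
static int demo_set_deblock_coef(int16_t (*mv)[2], int stride,
                                 int mb_x, int first_slice_line)
{
    int hmask = demo_get_mask(mv, stride, 1,      0x11);
    int vmask = demo_get_mask(mv, stride, stride, 0x03);

    if (!mb_x)
        hmask &= ~0x1111;   /* bits produced by the i == 0 column */
    if (first_slice_line)
        vmask &= ~0x000F;   /* bits produced by the j == 0 row */
    return hmask | vmask;
}
```

The i == 1 column only produces bits in 0x4444 and the j == 1 row only bits in 0x0F00, so clearing 0x1111 and 0x000F removes exactly what the in-loop checks would have skipped.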
> +
> +/** This structure holds conditions on applying loop filter to some edge */
> +typedef struct RV40LoopFilterCond{
> + int x; ///< x coordinate of edge start
> + int y; ///< y coordinate of edge start
> + int dir; ///< edge filtering direction (horizontal or vertical)
and what value does dir have for each?
> + int filt_mask; ///< mask specifying what deblock pattern bit should be tested for filtering
> + int edge_mbtype; ///< edge condition testing - number of neighbouring mbtype or -1
> + int nonedge_mbtype; ///< not at edge condition testing - number of neighbouring mbtype or -1
> + int next_clip_mask; ///< mask specifying bit to test to select neighbour block clip value
> + int dither; ///< dither parameter for the current loop filtering
> +}RV40LoopFilterCond;
> +
> +#define RV40_LUMA_LOOP_FIRST 13
> +static const RV40LoopFilterCond rv40_loop_cond_luma_first_row[RV40_LUMA_LOOP_FIRST] = {
> + { 0, 4, 0, 0x0010, -1, -1, 0x0001, 0 }, // subblock 0
> + { 0, 0, 1, 0x0001, -1, 2, 0x0008, 0 },
> + { 0, 0, 0, 0x0001, 1, -1, 0x1000, 0 },
> + { 0, 0, 1, 0x0001, 2, -1, 0x0008, 0 },
> + { 4, 4, 0, 0x0020, -1, -1, 0x0002, 4 }, // subblocks 1-3
> + { 4, 0, 1, 0x0002, -1, -1, 0x0001, 4 },
> + { 4, 0, 0, 0x0002, 1, -1, 0x2000, 4 },
> + { 8, 4, 0, 0x0040, -1, -1, 0x0004, 8 },
> + { 8, 0, 1, 0x0004, -1, -1, 0x0002, 8 },
> + { 8, 0, 0, 0x0004, 1, -1, 0x4000, 8 },
> + { 12, 4, 0, 0x0080, -1, -1, 0x0008, 12 },
> + { 12, 0, 1, 0x0008, -1, -1, 0x0004, 12 },
> + { 12, 0, 0, 0x0008, 1, -1, 0x8000, 12 }
> +};
> +
> +#define RV40_LUMA_LOOP_NEXT 9
> +static const RV40LoopFilterCond rv40_loop_cond_luma_next_rows[RV40_LUMA_LOOP_NEXT] = {
> + { 0, 4, 0, 0x0010, -1, -1, 0x0001, 0 }, // first subblock of the row
> + { 0, 0, 1, 0x0001, 2, -1, 0x0008, 0 },
> + { 0, 0, 1, 0x0001, -1, 2, 0x0008, 0 },
> + { 4, 4, 0, 0x0020, -1, -1, 0x0002, 1 }, // the rest of subblocks
> + { 4, 0, 1, 0x0002, -1, -1, 0x0001, 1 },
> + { 8, 4, 0, 0x0040, -1, -1, 0x0004, 2 },
> + { 8, 0, 1, 0x0004, -1, -1, 0x0002, 2 },
> + { 12, 4, 0, 0x0080, -1, -1, 0x0008, 3 },
> + { 12, 0, 1, 0x0008, -1, -1, 0x0004, 3 }
> +};
> +
> +#define RV40_CHROMA_LOOP 12
> +static const RV40LoopFilterCond rv40_loop_cond_chroma[RV40_CHROMA_LOOP] = {
> + { 0, 4, 0, 0x04, -1, -1, 0x01, 0 }, // subblock 0
> + { 0, 0, 1, 0x01, -1, 2, 0x02, 0 },
> + { 0, 0, 0, 0x01, 1, -1, 0x04, 0 },
> + { 0, 0, 1, 0x01, 2, -1, 0x02, 0 },
> + { 4, 4, 0, 0x08, -1, -1, 0x02, 8 }, // subblock 1
> + { 4, 4, 1, 0x02, -1, -1, 0x01, 0 },
> + { 4, 4, 0, 0x02, 1, -1, 0x08, 8 },
> + { 0, 8, 0, 0x10, -1, -1, 0x04, 0 }, // subblock 2
> + { 0, 4, 1, 0x04, -1, 2, 0x08, 8 },
> + { 0, 4, 1, 0x04, 2, -1, 0x08, 8 },
> + { 4, 8, 0, 0x20, -1, -1, 0x08, 8 }, // subblock 3
> + { 4, 4, 1, 0x08, -1, -1, 0x04, 8 },
> +};
> +
> +static void rv40_loop_filter(RV34DecContext *r)
> +{
> + MpegEncContext *s = &r->s;
> + int mb_pos;
> + int i, j, k;
> + uint8_t *Y, *C;
> + int alpha, beta, betaY, betaC;
> + int q;
> + // 0 - cur block, 1 - top, 2 - left, 3 - bottom
> + int mbtype[4], clip[4], mvmasks[4], cbp[4], uvcbp[4][2];
> +
> + if(s->pict_type == FF_B_TYPE)
> + return;
> +
> + for(s->mb_y = 0; s->mb_y < s->mb_height; s->mb_y++){
> + mb_pos = s->mb_y * s->mb_stride;
> + for(s->mb_x = 0; s->mb_x < s->mb_width; s->mb_x++, mb_pos++){
> + int btype = s->current_picture_ptr->mb_type[mb_pos];
> + if(IS_INTRA(btype) || IS_SEPARATE_DC(btype)){
> + r->cbp_luma [mb_pos] = 0xFFFF;
> + }
> + if(IS_INTRA(btype)){
> + r->cbp_chroma[mb_pos] = 0xFF;
> + }
> + }
> + }
> + for(s->mb_y = 0; s->mb_y < s->mb_height; s->mb_y++){
> + mb_pos = s->mb_y * s->mb_stride;
> + for(s->mb_x = 0; s->mb_x < s->mb_width; s->mb_x++, mb_pos++){
> + int y_h_deblock, y_v_deblock;
> + int c_v_deblock[2], c_h_deblock[2];
> +
> + ff_init_block_index(s);
> + ff_update_block_index(s);
> + Y = s->dest[0];
> + q = s->current_picture_ptr->qscale_table[mb_pos];
> + alpha = rv40_alpha_tab[q];
> + beta = rv40_beta_tab [q];
> + betaY = betaC = beta * 3;
> + if(s->width * s->height <= 0x6300){
> + betaY += beta;
> + }
> +
> + mvmasks[0] = r->deblock_coefs[mb_pos];
> + mbtype [0] = s->current_picture_ptr->mb_type[mb_pos];
> + cbp [0] = r->cbp_luma[mb_pos];
> + uvcbp[0][0] = r->cbp_chroma[mb_pos] & 0xF;
> + uvcbp[0][1] = r->cbp_chroma[mb_pos] >> 4;
> + for(i = 1; i < 4; i++){
> + mvmasks[i] = 0;
> + mbtype [i] = mbtype[0];
> + cbp [i] = 0;
> + uvcbp[1][0] = uvcbp[1][1] = 0;
> + }
> + if(s->mb_y){
> + mvmasks[1] = r->deblock_coefs[mb_pos - s->mb_stride] & 0xF000;
> + mbtype [1] = s->current_picture_ptr->mb_type[mb_pos - s->mb_stride];
> + cbp [1] = r->cbp_luma[mb_pos - s->mb_stride] & 0xF000;
> + uvcbp[1][0] = r->cbp_chroma[mb_pos - s->mb_stride] & 0xC;
> + uvcbp[1][1] = (r->cbp_chroma[mb_pos - s->mb_stride] >> 4) & 0xC;
> + }
> + if(s->mb_x){
> + mvmasks[2] = r->deblock_coefs[mb_pos - 1] & 0x8888;
> + mbtype [2] = s->current_picture_ptr->mb_type[mb_pos - 1];
> + cbp [2] = r->cbp_luma[mb_pos - 1] & 0x8888;
> + uvcbp[2][0] = r->cbp_chroma[mb_pos - 1] & 0xA;
> + uvcbp[2][1] = (r->cbp_chroma[mb_pos - 1] >> 4) & 0xA;
> + }
> + if(s->mb_y < s->mb_height - 1){
> + mvmasks[3] = r->deblock_coefs[mb_pos + s->mb_stride] & 0x000F;
> + mbtype [3] = s->current_picture_ptr->mb_type[mb_pos + s->mb_stride];
> + cbp [3] = r->cbp_luma[mb_pos + s->mb_stride] & 0x000F;
> + uvcbp[3][0] = r->cbp_chroma[mb_pos + s->mb_stride] & 0x3;
> + uvcbp[3][1] = (r->cbp_chroma[mb_pos + s->mb_stride] >> 4) & 0x3;
> + }
> + for(i = 0; i < 4; i++){
> + mbtype[i] = (IS_INTRA(mbtype[i]) || IS_SEPARATE_DC(mbtype[i])) ? 2 : 1;
> + clip[i] = rv40_filter_clip_tbl[mbtype[i]][q];
> + }
> + y_h_deblock = cbp[0] | ((cbp[0] << 4) & ~0x000F) | (cbp[1] >> 12)
> + | ((cbp[3] << 20) & ~0x000F) | (cbp[3] << 16)
> + | mvmasks[0] | (mvmasks[3] << 16);
> + y_v_deblock = ((cbp[0] << 1) & ~0x1111) | (cbp[2] >> 3)
> + | cbp[0] | (cbp[3] << 16)
> + | mvmasks[0] | (mvmasks[3] << 16);
> + if(!s->mb_x){
> + y_v_deblock &= ~0x1111;
> + }
> + if(!s->mb_y){
> + y_h_deblock &= ~0x000F;
> + }
> + if(s->mb_y == s->mb_height - 1 || (mbtype[0] == 2 || mbtype[3] == 2)){
> + y_h_deblock &= ~0xF0000;
> + }
> + cbp[0] = cbp[0] | (cbp[3] << 16)
> + | mvmasks[0] | (mvmasks[3] << 16);
> + for(i = 0; i < 2; i++){
> + c_v_deblock[i] = ((uvcbp[0][i] << 1) & ~0x5) | (uvcbp[2][i] >> 1)
> + | (uvcbp[3][i] << 4) | uvcbp[0][i];
> + c_h_deblock[i] = (uvcbp[3][i] << 4) | uvcbp[0][i] | (uvcbp[1][i] >> 2)
> + | (uvcbp[3][i] << 6) | (uvcbp[0][i] << 2);
> + uvcbp[0][i] = (uvcbp[3][i] << 4) | uvcbp[0][i];
> + if(!s->mb_x){
> + c_v_deblock[i] &= ~0x5;
> + }
> + if(!s->mb_y){
> + c_h_deblock[i] &= ~0x3;
> + }
> + if(s->mb_y == s->mb_height - 1 || mbtype[0] == 2 || mbtype[3] == 2){
> + c_h_deblock[i] &= ~0x30;
> + }
> + }
> +
> + for(j = 0; j < RV40_LUMA_LOOP_FIRST; j++){
> + RV40LoopFilterCond *loop = rv40_loop_cond_luma_first_row + j;
> + int cond, edgecond = 1, nonedgecond = 1, clip_cur, clip_next;
> + Y = s->dest[0] + loop->x + loop->y * s->linesize;
> + cond = (loop->dir ? y_v_deblock : y_h_deblock) & loop->filt_mask;
> + if(loop->edge_mbtype != -1){
> + edgecond = (mbtype[0] == 2 || mbtype[loop->edge_mbtype] == 2);
> + }
> + if(loop->nonedge_mbtype != -1){
> + nonedgecond = !(mbtype[0] == 2 || mbtype[loop->nonedge_mbtype] == 2);
> + }
> + clip_cur = cbp[0] & loop->filt_mask ? clip[0] : 0;
> + if(!loop->x && loop->dir){
> + clip_next = (cbp[2] | mvmasks[2]) & loop->next_clip_mask ? clip[2] : 0;
> + }else if(!loop->y && !loop->dir){
> + clip_next = (cbp[1] | mvmasks[1]) & loop->next_clip_mask ? clip[1] : 0;
> + }else{
> + clip_next = cbp[0] & loop->next_clip_mask ? clip[0] : 0;
> + }
> + if(cond && edgecond && nonedgecond){
> + if(loop->dir){
> + rv40_v_loop_filter(Y, s->linesize, loop->dither,
> + clip_cur, clip_next,
> + alpha, beta, betaY, 0, loop->edge_mbtype != -1);
> + }else{
> + rv40_h_loop_filter(Y, s->linesize, loop->dither,
> + clip_cur, clip_next,
> + alpha, beta, betaY, 0, loop->edge_mbtype != -1);
> + }
> + }
> + }
> + for(j = 4; j < 12; j++){
> + for(k = 0; k < RV40_LUMA_LOOP_NEXT; k++){
> + RV40LoopFilterCond *loop = rv40_loop_cond_luma_next_rows + k;
> + int cond, edgecond = 1, nonedgecond = 1, clip_cur, clip_next;
> + Y = s->dest[0] + loop->x + (loop->y + j) * s->linesize;
> + cond = (loop->dir ? y_v_deblock : y_h_deblock) & (loop->filt_mask << j);
> + if(loop->edge_mbtype != -1){
> + edgecond = (mbtype[0] == 2 || mbtype[loop->edge_mbtype] == 2);
> + }
> + if(loop->nonedge_mbtype != -1){
> + nonedgecond = !(mbtype[0] == 2 || mbtype[loop->nonedge_mbtype] == 2);
> + }
> + clip_cur = cbp[0] & (loop->filt_mask << j) ? clip[0] : 0;
> + if(!loop->x && loop->dir){
> + clip_next = (cbp[2] | mvmasks[2]) & (loop->next_clip_mask << j) ? clip[2] : 0;
> + }else{
> + clip_next = cbp[0] & (loop->next_clip_mask << j) ? clip[0] : 0;
> + }
> + if(cond && edgecond && nonedgecond){
> + if(loop->dir){
> + rv40_v_loop_filter(Y, s->linesize, loop->dither + j,
> + clip_cur, clip_next,
> + alpha, beta, betaY, 0, loop->edge_mbtype != -1);
> + }else{
> + rv40_h_loop_filter(Y, s->linesize, loop->dither + j,
> + clip_cur, clip_next,
> + alpha, beta, betaY, 0, loop->edge_mbtype != -1);
> + }
> + }
> + }
> + }
> + for(i = 0; i < 2; i++){
> + for(j = 0; j < RV40_CHROMA_LOOP; j++){
> + RV40LoopFilterCond *loop = rv40_loop_cond_chroma + j;
> + int cond, edgecond = 1, nonedgecond = 1, clip_cur, clip_next;
> + C = s->dest[i+1] + loop->x + loop->y * s->uvlinesize;
> + cond = (loop->dir ? c_v_deblock[i] : c_h_deblock[i]) & loop->filt_mask;
> + if(loop->edge_mbtype != -1){
> + edgecond = (mbtype[0] == 2 || mbtype[loop->edge_mbtype] == 2);
> + }
> + if(loop->nonedge_mbtype != -1){
> + nonedgecond = !(mbtype[0] == 2 || mbtype[loop->edge_mbtype] == 2);
> + }
> + clip_cur = uvcbp[0][i] & loop->filt_mask ? clip[0] : 0;
> + if(!loop->x && loop->dir){
> + clip_next = uvcbp[2][i] & loop->next_clip_mask ? clip[2] : 0;
> + }else if(!loop->y && !loop->dir){
> + clip_next = uvcbp[1][i] & loop->next_clip_mask ? clip[1] : 0;
> + }else{
> + clip_next = uvcbp[0][i] & loop->next_clip_mask ? clip[0] : 0;
> + }
> + if(cond && edgecond && nonedgecond){
> + if(loop->dir){
> + rv40_v_loop_filter(C, s->uvlinesize, loop->dither,
> + clip_cur, clip_next,
> + alpha, beta, betaC, 1, loop->edge_mbtype != -1);
> + }else{
> + rv40_h_loop_filter(C, s->uvlinesize, loop->dither,
> + clip_cur, clip_next,
> + alpha, beta, betaC, 1, loop->edge_mbtype != -1);
> + }
> + }
> + }
> + }
> + }
> + }
> +}
I will not accept this mess, sorry.
If you don't (or can't) clean this up I will try eventually, but that might not
be soon.
As is, this is too much of a mess and I am unwilling to believe that the H.264
drafts required such a mess.
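For illustration, one possible shape of the array-driven approach suggested earlier in this thread. This is only a sketch under heavy assumptions: the filtering rule used here (filter an edge when either adjacent block is coded) is a simplified placeholder, not the actual RV40 rule, and DemoBlockInfo, DemoEdge and demo_collect_edges() are hypothetical names. The point is only that per-block facts are read as info[y][x] inside one loop instead of being unpacked from hardcoded masks.

```c
#include <assert.h>

#define DEMO_N 4   /* 4x4 luma blocks per macroblock row/column */

typedef struct DemoBlockInfo {
    /* Row 0 and column 0 hold the neighbouring macroblocks' blocks,
     * so [y + 1][x + 1] describes block (x, y) of the current macroblock. */
    unsigned char coded[DEMO_N + 1][DEMO_N + 1];
    unsigned char intra[DEMO_N + 1][DEMO_N + 1];
} DemoBlockInfo;

typedef struct DemoEdge {
    int x, y;        /* block coordinates inside the macroblock */
    int vertical;    /* 1: edge towards the left block, 0: towards the top one */
    int mb_edge;     /* 1 if the edge lies on the macroblock border */
    int clip_cur;    /* clip value for the current block's side */
    int clip_next;   /* clip value for the neighbouring block's side */
} DemoEdge;

/* Clip value for one block: 0 if it is not coded, else chosen by block type. */
static int demo_clip(const DemoBlockInfo *b, int y, int x,
                     int clip_inter, int clip_intra)
{
    if (!b->coded[y][x])
        return 0;
    return b->intra[y][x] ? clip_intra : clip_inter;
}

/* Walk all blocks and both edge directions; returns the number of edges
 * written to out[]. Placeholder rule: an edge is filtered if either
 * adjacent block is coded. */
static int demo_collect_edges(const DemoBlockInfo *b,
                              int clip_inter, int clip_intra,
                              DemoEdge *out, int max_out)
{
    int n = 0, x, y, vertical;

    for (y = 1; y <= DEMO_N; y++)
        for (x = 1; x <= DEMO_N; x++)
            for (vertical = 0; vertical < 2; vertical++) {
                int nx = x - vertical;    /* left block for vertical edges */
                int ny = y - !vertical;   /* top block for horizontal edges */
                DemoEdge e;

                if (!b->coded[y][x] && !b->coded[ny][nx])
                    continue;
                assert(n < max_out);
                e.x         = x - 1;
                e.y         = y - 1;
                e.vertical  = vertical;
                e.mb_edge   = nx == 0 || ny == 0;
                e.clip_cur  = demo_clip(b, y,  x,  clip_inter, clip_intra);
                e.clip_next = demo_clip(b, ny, nx, clip_inter, clip_intra);
                out[n++] = e;
            }
    return n;
}
```

The real filter order and conditions would have to be encoded in the loop order and in additional per-block arrays; the sketch does not attempt that.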
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
In a rich man's house there is no place to spit but his face.
-- Diogenes of Sinope