[FFmpeg-devel] [PATCH] RV40 Loop Filter
Kostya
kostya.shishkov
Sun Oct 26 14:41:09 CET 2008
On Sat, Oct 25, 2008 at 11:14:25AM +0200, Michael Niedermayer wrote:
> On Sat, Oct 25, 2008 at 10:08:44AM +0300, Kostya wrote:
> > On Wed, Oct 22, 2008 at 10:53:23AM +0200, Michael Niedermayer wrote:
> > > On Tue, Oct 21, 2008 at 09:23:21AM +0300, Kostya wrote:
> [...]
> > > [...]
> > > > +static int rv40_set_deblock_coef(RV34DecContext *r)
> > > > +{
> > > > + MpegEncContext *s = &r->s;
> > > > + int mvmask = 0, i, j, dx, dy;
> > > > + int midx = s->mb_x * 2 + s->mb_y * 2 * s->b8_stride;
> > >
> > > > + if(s->pict_type == FF_I_TYPE)
> > > > + return 0;
> > >
> > > why is this even called for i frames?
> >
> > I intend to use it for calculating macroblock-specific deblock
> > strength in RV30.
>
> fine but how is that related to having the pict_type check inside the
> function compared to outside?
For RV30 setting deblock coefficients would be performed for
I-frames as well.
> [...]
> > > > + if(dx > 3 || dy > 3){
> > > > + mvmask |= 0x03 << (i*2 + j*8);
> > > > + }
> > > > + }
> > > > + }
> > > > + midx += s->b8_stride;
> > > > + }
> > >
> > > i think the if() can be moved out of the loop like
> > > if(first_slice_line)
> > > mvmask &= 123;
> >
> > IMO it can't.
> > It constructs mask based on motion vectors difference in the
> > horizontal/vertical neighbouring blocks after all.
>
> one way (there surely are thousand others)
>
> get_mask(int delta)
> for()
> for()
> v0= motion_val[x+y*stride]
> v1= motion_val[x+y*stride+delta]
> if(FFABS(v0[0]-v1[0])>3 || FFABS(v0[1]-v1[1])>3)
> mask |= 1<<(2*x+8*y);
> return mask
>
> hmask= get_mask(1 );
> vmask= get_mask(stride);
> if(!mb_x)
> hmask &= 0x...
> if(first_slice_line)
> vmask &= 0x...
> mask = hmask | (hmask<<1) | vmask | (vmask<<4);
>
> besides, the way mask bits are combined looks strange/wrong
Per my understanding it sets edges for 2x2 groups of 4x4 subblocks.
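For reference, a minimal standalone sketch of the get_mask() idea quoted above, written against the bit layout the patch uses: bit 2*x + 8*y per 8x8 block, widened to 0x11 for a left-neighbour difference and to 0x03 for a top-neighbour difference. The names get_mask and mv_edge_mask, the boundary masks and the choice of looking at the entry `delta` positions earlier are assumptions made for illustration, not what the patch does:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: bit (2*x + 8*y) is set when the motion vector of
 * 8x8 block (x, y) differs by more than 3 in either component from the
 * vector `delta` entries before it (delta = 1: left, delta = stride: top).
 * The caller must make sure mv[-delta] is readable. */
static int get_mask(int16_t (*mv)[2], int stride, int delta)
{
    int mask = 0;
    for (int y = 0; y < 2; y++) {
        for (int x = 0; x < 2; x++) {
            const int16_t *v0 = mv[x + y * stride];
            const int16_t *v1 = mv[x + y * stride - delta];
            if (abs(v0[0] - v1[0]) > 3 || abs(v0[1] - v1[1]) > 3)
                mask |= 1 << (2 * x + 8 * y);
        }
    }
    return mask;
}

/* Hypothetical wrapper: boundary handling happens once, outside the loops */
static int mv_edge_mask(int16_t (*mv)[2], int stride,
                        int first_col, int first_row)
{
    int lmask = get_mask(mv, stride, 1);      /* differences to the left */
    int tmask = get_mask(mv, stride, stride); /* differences to the top  */

    if (first_col)
        lmask &= ~0x0101; /* blocks with x == 0 have no left neighbour */
    if (first_row)
        tmask &= ~0x0005; /* blocks with y == 0 have no top neighbour  */

    /* widen each 8x8 bit to the pair of 4x4 subblocks along the edge */
    return lmask | (lmask << 4) | tmask | (tmask << 1);
}
```

The only point of the sketch is that the mb_x/first_slice_line tests become two mask operations after the loops instead of two branches inside them.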
> >
> > > > + return mvmask;
> > > > +}
> > > > +
> > > > +static void rv40_loop_filter(RV34DecContext *r)
> > > > +{
> > > > + MpegEncContext *s = &r->s;
> > > > + int mb_pos;
> > > > + int i, j;
> > > > + uint8_t *Y, *C;
> > > > + int alpha, beta, betaY, betaC;
> > > > + int q;
> > > > + // 0 - cur block, 1 - top, 2 - left, 3 - bottom
> > > > + int btype[4], clip[4], mvmasks[4], cbps[4], uvcbps[4][2];
> > > > +
> > >
> > > > + if(s->pict_type == FF_B_TYPE)
> > > > + return;
> > >
> > > why is this even called for b frames?
> >
> > Because the spec says so :)
> > RV40 has many special cases for B-frame loop filter which
> > I didn't care to implement.
>
> :/
> i hope it cannot use B frames as reference?
Looks like it does not
> [...]
> > [lots of loop filter invoking]
> > >
> > > the word mess is probably the best way to describe this
> > > as far as i can tell you are packing all the bits related to deblocking
> > > and then later duplicate code each with hardcoded masks to extract them
> > > again.
> >
> > We have a saying here "To make a candy from crap", which I think describes
> > current situation. I'd like to shoot the group of men who proposed the loop
> > filter in the form RV40 has it.
>
> there aren't many codecs around that are cleanly designed ...
> Some things here and there are ok but terrible messes like this are more
> common.
> We don't have too much of a choice, to support things the mess has to be
> implemented. If it can be done cleaner/simpler that's a big advantage in the
> long term, easier to maintain, understand, optimize; smaller and faster, ...
Also, I think that forcing someone to understand it counts as
psychological abuse, and the sentence for it should be
debugging X8 frames or implementing interlaced mode in VC-1
(sorry, can't remember more evil codecs).
> >
> > The problem is that edges should be filtered in that order, with clipping
> > values selected depending on whether the neighbouring block is coded or
> > not and whether it belongs to the same MB or not.
> > It's possible to put all of them into a loop, but it will have too many additional
> > conditions to my taste. I've merged some of them though.
>
> I am not suggesting to build a complex and ugly loop, rather something like
> storing all the numbers that might differ in a 2d array and then
> having a loop go over this.
> the mb edge flags, coded info and all that would be in the array so that
> reading it is a matter of coded[y][x], mb_edge[y][x], mb_type[y][x]
> i think this would be cleaner IMHO
done
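A reduced sketch of the table-driven shape, for readers who do not want to dig through the patch first. EdgeCond, edge_conds and select_edges are hypothetical names and the three table rows are made up; the real table carries coordinates, clip masks and dither values as well:

```c
#include <stddef.h>

/* Hypothetical, reduced version of a per-edge condition record */
typedef struct EdgeCond {
    int dir;        /* 0: test the horizontal pattern, 1: the vertical one */
    int filt_mask;  /* pattern bit that enables this edge */
    int strong_nb;  /* neighbour index that must be "strong" (type 2), or -1 */
} EdgeCond;

/* Illustrative rows only, not taken from the RV40 tables */
static const EdgeCond edge_conds[] = {
    { 0, 0x0010, -1 },
    { 1, 0x0001,  2 },
    { 0, 0x0001,  1 },
};
#define NB_EDGE_CONDS (sizeof(edge_conds) / sizeof(edge_conds[0]))

/* Returns a bitmask with bit n set when table entry n would be filtered.
 * mbtype[] uses the same indexing as in the patch:
 * 0 - current block, 1 - top, 2 - left, 3 - bottom */
static unsigned select_edges(int h_pattern, int v_pattern, const int mbtype[4])
{
    unsigned selected = 0;
    for (size_t n = 0; n < NB_EDGE_CONDS; n++) {
        const EdgeCond *c = &edge_conds[n];
        int cond   = (c->dir ? v_pattern : h_pattern) & c->filt_mask;
        int strong = 1;
        if (c->strong_nb != -1)
            strong = mbtype[0] == 2 || mbtype[c->strong_nb] == 2;
        if (cond && strong)
            selected |= 1u << n;
    }
    return selected;
}
```

All per-edge special cases live in the table rows, so the loop body stays the same for every edge.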
> I'll review the new patch soon
here it is
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
-------------- next part --------------
Index: libavcodec/rv40.c
===================================================================
--- libavcodec/rv40.c (revision 15305)
+++ libavcodec/rv40.c (working copy)
@@ -247,7 +247,462 @@
return 0;
}
+#define CLIP_SYMM(a, b) av_clip(a, -(b), b)
/**
+ * weaker deblocking very similar to the one described in 4.4.2 of JVT-A003r1
+ */
+static inline void rv40_weak_loop_filter(uint8_t *src, const int step,
+ const int flag0, const int flag1,
+ const int alpha,
+ const int lim0, const int lim1,
+ const int difflim, const int beta,
+ const int S0, const int S1,
+ const int S2, const int S3)
+{
+ uint8_t *cm = ff_cropTbl + MAX_NEG_CROP;
+ int t, u, diff;
+
+ t = src[0*step] - src[-1*step];
+ if(!t){
+ return;
+ }
+ u = (alpha * FFABS(t)) >> 7;
+ if(u > 3 - (flag0 && flag1)){
+ return;
+ }
+
+ t <<= 2;
+ if(flag0 && flag1)
+ t += src[-2*step] - src[1*step];
+ diff = CLIP_SYMM((t + 4) >> 3, difflim);
+ src[-1*step] = cm[src[-1*step] + diff];
+ src[ 0*step] = cm[src[ 0*step] - diff];
+ if(FFABS(S2) <= beta && flag0){
+ t = (S0 + S2 - diff) >> 1;
+ src[-2*step] = cm[src[-2*step] - CLIP_SYMM(t, lim1)];
+ }
+ if(FFABS(S3) <= beta && flag1){
+ t = (S1 + S3 + diff) >> 1;
+ src[ 1*step] = cm[src[ 1*step] - CLIP_SYMM(t, lim0)];
+ }
+}
+
+/**
+ * This macro is used for calculating 25*x0+26*x1+26*x2+26*x3+25*x4
+ * or 25*x0+26*x1+51*x2+26*x3
+ * @param sub - index of the value with coefficient = 25
+ * @param last - index of the value with coefficient 25 or 51
+ */
+#define RV40_STRONG_FILTER(src, step, start, last, sub) \
+ 26*(src[start *step] + src[(start+1)*step] + src[(start+2)*step] \
+ + src[(start+3)*step] + src[last *step]) - src[last *step] \
+ - src[sub *step]
+
+/**
+ * Deblocking filter, the altered version from JVT-A003r1 H.26L draft.
+ */
+static inline void rv40_adaptive_loop_filter(uint8_t *src, const int step,
+ const int stride, const int dmode,
+ const int lim0, const int lim1,
+ const int alpha,
+ const int beta, const int beta2,
+ const int chroma, const int edge)
+{
+ int diffs[4][4];
+ int s0 = 0, s1 = 0, s2 = 0, s3 = 0;
+ uint8_t *ptr;
+ int flag0 = 1, flag1 = 1;
+ int strength0 = 3, strength1 = 3;
+ int i;
+ int lims;
+
+ for(i = 0, ptr = src; i < 4; i++, ptr += stride){
+ diffs[i][0] = ptr[-2*step] - ptr[-1*step];
+ diffs[i][1] = ptr[ 1*step] - ptr[ 0*step];
+ s0 += diffs[i][0];
+ s1 += diffs[i][1];
+ }
+ if(FFABS(s0) >= (beta<<2)){
+ strength0 = 1;
+ }
+ if(FFABS(s1) >= (beta<<2)){
+ strength1 = 1;
+ }
+ if(strength0 + strength1 <= 2){
+ return;
+ }
+
+ for(i = 0, ptr = src; i < 4; i++, ptr += stride){
+ diffs[i][2] = ptr[-2*step] - ptr[-3*step];
+ diffs[i][3] = ptr[ 1*step] - ptr[ 2*step];
+ s2 += diffs[i][2];
+ s3 += diffs[i][3];
+ }
+
+ if(!edge)
+ flag0 = flag1 = 0;
+ else{
+ flag0 = (strength0 == 3) && (FFABS(s2) < beta2);
+ flag1 = (strength1 == 3) && (FFABS(s3) < beta2);
+ }
+
+ lims = (lim0 + lim1 + strength0 + strength1) >> 1;
+ if(flag0 && flag1){ /* strong filtering */
+ for(i = 0; i < 4; i++, src += stride){
+ int diff[2], sflag, p0, p1;
+ int t = src[0*step] - src[-1*step];
+
+ if(!t) continue;
+ sflag = (alpha * FFABS(t)) >> 7;
+ if(sflag > 1) continue;
+
+ p0 = (RV40_STRONG_FILTER(src, step, -3, 1, -3) + rv40_dither_l[dmode + i]) >> 7;
+ p1 = (RV40_STRONG_FILTER(src, step, -2, 2, -2) + rv40_dither_r[dmode + i]) >> 7;
+ diff[0] = src[-1*step];
+ diff[1] = src[ 0*step];
+ src[-1*step] = sflag ? av_clip(p0, src[-1*step] - lims, src[-1*step] + lims) : p0;
+ src[ 0*step] = sflag ? av_clip(p1, src[ 0*step] - lims, src[ 0*step] + lims) : p1;
+ diff[0] -= src[-1*step];
+ diff[1] -= src[ 0*step];
+ p0 = (RV40_STRONG_FILTER(src, step, -4, 0, -4) + rv40_dither_l[dmode + i] + diff[1]*25) >> 7;
+ p1 = (RV40_STRONG_FILTER(src, step, -1, 3, -1) + rv40_dither_r[dmode + i] + diff[0]*25) >> 7;
+ src[-2*step] = sflag ? av_clip(p0, src[-2*step] - lims, src[-2*step] + lims) : p0;
+ src[ 1*step] = sflag ? av_clip(p1, src[ 1*step] - lims, src[ 1*step] + lims) : p1;
+ if(!chroma){
+ src[-3*step] = (RV40_STRONG_FILTER(src, step, -4, -3, -1) + 64) >> 7;
+ src[ 2*step] = (RV40_STRONG_FILTER(src, step, 0, 2, 0) + 64) >> 7;
+ }
+ }
+ }else if(strength0 == 3 && strength1 == 3){
+ for(i = 0; i < 4; i++, src += stride)
+ rv40_weak_loop_filter(src, step, 1, 1, alpha, lim0, lim1, lims, beta,
+ diffs[i][0], diffs[i][1], diffs[i][2], diffs[i][3]);
+ }else{
+ for(i = 0; i < 4; i++, src += stride)
+ rv40_weak_loop_filter(src, step, strength0==3, strength1==3,
+ alpha, lim0>>1, lim1>>1, lims>>1, beta,
+ diffs[i][0], diffs[i][1], diffs[i][2], diffs[i][3]);
+ }
+}
+
+static void rv40_v_loop_filter(uint8_t *src, int stride, int dmode, int lim0, int lim1,
+ int alpha, int beta, int beta2, int chroma, int edge){
+ rv40_adaptive_loop_filter(src, 1, stride, dmode, lim0, lim1, alpha, beta, beta2, chroma, edge);
+}
+static void rv40_h_loop_filter(uint8_t *src, int stride, int dmode, int lim0, int lim1,
+ int alpha, int beta, int beta2, int chroma, int edge){
+ rv40_adaptive_loop_filter(src, stride, 1, dmode, lim0, lim1, alpha, beta, beta2, chroma, edge);
+}
+
+static int check_mv(int16_t (*motion_val)[2], int step)
+{
+ int d;
+ d = motion_val[0][0] - motion_val[-step][0];
+ if(d < -3 || d > 3)
+ return 1;
+ d = motion_val[0][1] - motion_val[-step][1];
+ if(d < -3 || d > 3)
+ return 1;
+ return 0;
+}
+
+static int rv40_set_deblock_coef(RV34DecContext *r)
+{
+ MpegEncContext *s = &r->s;
+ int mvmask = 0, i, j, dx, dy;
+ int midx = s->mb_x * 2 + s->mb_y * 2 * s->b8_stride;
+ int16_t (*motion_val)[2] = &s->current_picture_ptr->motion_val[0][midx];
+ if(s->pict_type == FF_I_TYPE)
+ return 0;
+ for(j = 0; j < 2; j++){
+ for(i = 0; i < 2; i++){
+ if(i || s->mb_x){
+ if(check_mv(motion_val + i, 1)){
+ mvmask |= 0x11 << (i*2 + j*8);
+ }
+ }
+ if(j || !s->first_slice_line){
+ if(check_mv(motion_val + i, s->b8_stride)){
+ mvmask |= 0x03 << (i*2 + j*8);
+ }
+ }
+ }
+ motion_val += s->b8_stride;
+ }
+ return mvmask;
+}
+
+/** This structure holds conditions on applying loop filter to some edge */
+typedef struct RV40LoopFilterCond{
+ int x; ///< x coordinate of edge start
+ int y; ///< y coordinate of edge start
+ int dir; ///< edge filtering direction (horizontal or vertical)
+ int filt_mask; ///< mask specifying what deblock pattern bit should be tested for filtering
+ int edge_mbtype; ///< edge condition testing - number of neighbouring mbtype or -1
+ int nonedge_mbtype; ///< not at edge condition testing - number of neighbouring mbtype or -1
+ int next_clip_mask; ///< mask specifying bit to test to select neighbour block clip value
+ int dither; ///< dither parameter for the current loop filtering
+}RV40LoopFilterCond;
+
+#define RV40_LUMA_LOOP_FIRST 13
+static const RV40LoopFilterCond rv40_loop_cond_luma_first_row[RV40_LUMA_LOOP_FIRST] = {
+ { 0, 4, 0, 0x0010, -1, -1, 0x0001, 0 }, // subblock 0
+ { 0, 0, 1, 0x0001, -1, 2, 0x0008, 0 },
+ { 0, 0, 0, 0x0001, 1, -1, 0x1000, 0 },
+ { 0, 0, 1, 0x0001, 2, -1, 0x0008, 0 },
+ { 4, 4, 0, 0x0020, -1, -1, 0x0002, 4 }, // subblocks 1-3
+ { 4, 0, 1, 0x0002, -1, -1, 0x0001, 4 },
+ { 4, 0, 0, 0x0002, 1, -1, 0x2000, 4 },
+ { 8, 4, 0, 0x0040, -1, -1, 0x0004, 8 },
+ { 8, 0, 1, 0x0004, -1, -1, 0x0002, 8 },
+ { 8, 0, 0, 0x0004, 1, -1, 0x4000, 8 },
+ { 12, 4, 0, 0x0080, -1, -1, 0x0008, 12 },
+ { 12, 0, 1, 0x0008, -1, -1, 0x0004, 12 },
+ { 12, 0, 0, 0x0008, 1, -1, 0x8000, 12 }
+};
+
+#define RV40_LUMA_LOOP_NEXT 9
+static const RV40LoopFilterCond rv40_loop_cond_luma_next_rows[RV40_LUMA_LOOP_NEXT] = {
+ { 0, 4, 0, 0x0010, -1, -1, 0x0001, 0 }, // first subblock of the row
+ { 0, 0, 1, 0x0001, 2, -1, 0x0008, 0 },
+ { 0, 0, 1, 0x0001, -1, 2, 0x0008, 0 },
+ { 4, 4, 0, 0x0020, -1, -1, 0x0002, 1 }, // the rest of subblocks
+ { 4, 0, 1, 0x0002, -1, -1, 0x0001, 1 },
+ { 8, 4, 0, 0x0040, -1, -1, 0x0004, 2 },
+ { 8, 0, 1, 0x0004, -1, -1, 0x0002, 2 },
+ { 12, 4, 0, 0x0080, -1, -1, 0x0008, 3 },
+ { 12, 0, 1, 0x0008, -1, -1, 0x0004, 3 }
+};
+
+#define RV40_CHROMA_LOOP 12
+static const RV40LoopFilterCond rv40_loop_cond_chroma[RV40_CHROMA_LOOP] = {
+ { 0, 4, 0, 0x04, -1, -1, 0x01, 0 }, // subblock 0
+ { 0, 0, 1, 0x01, -1, 2, 0x02, 0 },
+ { 0, 0, 0, 0x01, 1, -1, 0x04, 0 },
+ { 0, 0, 1, 0x01, 2, -1, 0x02, 0 },
+ { 4, 4, 0, 0x08, -1, -1, 0x02, 8 }, // subblock 1
+ { 4, 4, 1, 0x02, -1, -1, 0x01, 0 },
+ { 4, 4, 0, 0x02, 1, -1, 0x08, 8 },
+ { 0, 8, 0, 0x10, -1, -1, 0x04, 0 }, // subblock 2
+ { 0, 4, 1, 0x04, -1, 2, 0x08, 8 },
+ { 0, 4, 1, 0x04, 2, -1, 0x08, 8 },
+ { 4, 8, 0, 0x20, -1, -1, 0x08, 8 }, // subblock 3
+ { 4, 4, 1, 0x08, -1, -1, 0x04, 8 },
+};
+
+static void rv40_loop_filter(RV34DecContext *r)
+{
+ MpegEncContext *s = &r->s;
+ int mb_pos;
+ int i, j, k;
+ uint8_t *Y, *C;
+ int alpha, beta, betaY, betaC;
+ int q;
+ // 0 - cur block, 1 - top, 2 - left, 3 - bottom
+ int mbtype[4], clip[4], mvmasks[4], cbp[4], uvcbp[4][2];
+
+ if(s->pict_type == FF_B_TYPE)
+ return;
+
+ for(s->mb_y = 0; s->mb_y < s->mb_height; s->mb_y++){
+ mb_pos = s->mb_y * s->mb_stride;
+ for(s->mb_x = 0; s->mb_x < s->mb_width; s->mb_x++, mb_pos++){
+ int btype = s->current_picture_ptr->mb_type[mb_pos];
+ if(IS_INTRA(btype) || IS_SEPARATE_DC(btype)){
+ r->cbp_luma [mb_pos] = 0xFFFF;
+ }
+ if(IS_INTRA(btype)){
+ r->cbp_chroma[mb_pos] = 0xFF;
+ }
+ }
+ }
+ for(s->mb_y = 0; s->mb_y < s->mb_height; s->mb_y++){
+ mb_pos = s->mb_y * s->mb_stride;
+ for(s->mb_x = 0; s->mb_x < s->mb_width; s->mb_x++, mb_pos++){
+ int y_h_deblock, y_v_deblock;
+ int c_v_deblock[2], c_h_deblock[2];
+
+ ff_init_block_index(s);
+ ff_update_block_index(s);
+ Y = s->dest[0];
+ q = s->current_picture_ptr->qscale_table[mb_pos];
+ alpha = rv40_alpha_tab[q];
+ beta = rv40_beta_tab [q];
+ betaY = betaC = beta * 3;
+ if(s->width * s->height <= 0x6300){
+ betaY += beta;
+ }
+
+ mvmasks[0] = r->deblock_coefs[mb_pos];
+ mbtype [0] = s->current_picture_ptr->mb_type[mb_pos];
+ cbp [0] = r->cbp_luma[mb_pos];
+ uvcbp[0][0] = r->cbp_chroma[mb_pos] & 0xF;
+ uvcbp[0][1] = r->cbp_chroma[mb_pos] >> 4;
+ for(i = 1; i < 4; i++){
+ mvmasks[i] = 0;
+ mbtype [i] = mbtype[0];
+ cbp [i] = 0;
+ uvcbp[i][0] = uvcbp[i][1] = 0;
+ }
+ if(s->mb_y){
+ mvmasks[1] = r->deblock_coefs[mb_pos - s->mb_stride] & 0xF000;
+ mbtype [1] = s->current_picture_ptr->mb_type[mb_pos - s->mb_stride];
+ cbp [1] = r->cbp_luma[mb_pos - s->mb_stride] & 0xF000;
+ uvcbp[1][0] = r->cbp_chroma[mb_pos - s->mb_stride] & 0xC;
+ uvcbp[1][1] = (r->cbp_chroma[mb_pos - s->mb_stride] >> 4) & 0xC;
+ }
+ if(s->mb_x){
+ mvmasks[2] = r->deblock_coefs[mb_pos - 1] & 0x8888;
+ mbtype [2] = s->current_picture_ptr->mb_type[mb_pos - 1];
+ cbp [2] = r->cbp_luma[mb_pos - 1] & 0x8888;
+ uvcbp[2][0] = r->cbp_chroma[mb_pos - 1] & 0xA;
+ uvcbp[2][1] = (r->cbp_chroma[mb_pos - 1] >> 4) & 0xA;
+ }
+ if(s->mb_y < s->mb_height - 1){
+ mvmasks[3] = r->deblock_coefs[mb_pos + s->mb_stride] & 0x000F;
+ mbtype [3] = s->current_picture_ptr->mb_type[mb_pos + s->mb_stride];
+ cbp [3] = r->cbp_luma[mb_pos + s->mb_stride] & 0x000F;
+ uvcbp[3][0] = r->cbp_chroma[mb_pos + s->mb_stride] & 0x3;
+ uvcbp[3][1] = (r->cbp_chroma[mb_pos + s->mb_stride] >> 4) & 0x3;
+ }
+ for(i = 0; i < 4; i++){
+ mbtype[i] = (IS_INTRA(mbtype[i]) || IS_SEPARATE_DC(mbtype[i])) ? 2 : 1;
+ clip[i] = rv40_filter_clip_tbl[mbtype[i]][q];
+ }
+ y_h_deblock = cbp[0] | ((cbp[0] << 4) & ~0x000F) | (cbp[1] >> 12)
+ | ((cbp[3] << 20) & ~0x000F) | (cbp[3] << 16)
+ | mvmasks[0] | (mvmasks[3] << 16);
+ y_v_deblock = ((cbp[0] << 1) & ~0x1111) | (cbp[2] >> 3)
+ | cbp[0] | (cbp[3] << 16)
+ | mvmasks[0] | (mvmasks[3] << 16);
+ if(!s->mb_x){
+ y_v_deblock &= ~0x1111;
+ }
+ if(!s->mb_y){
+ y_h_deblock &= ~0x000F;
+ }
+ if(s->mb_y == s->mb_height - 1 || (mbtype[0] == 2 || mbtype[3] == 2)){
+ y_h_deblock &= ~0xF0000;
+ }
+ cbp[0] = cbp[0] | (cbp[3] << 16)
+ | mvmasks[0] | (mvmasks[3] << 16);
+ for(i = 0; i < 2; i++){
+ c_v_deblock[i] = ((uvcbp[0][i] << 1) & ~0x5) | (uvcbp[2][i] >> 1)
+ | (uvcbp[3][i] << 4) | uvcbp[0][i];
+ c_h_deblock[i] = (uvcbp[3][i] << 4) | uvcbp[0][i] | (uvcbp[1][i] >> 2)
+ | (uvcbp[3][i] << 6) | (uvcbp[0][i] << 2);
+ uvcbp[0][i] = (uvcbp[3][i] << 4) | uvcbp[0][i];
+ if(!s->mb_x){
+ c_v_deblock[i] &= ~0x5;
+ }
+ if(!s->mb_y){
+ c_h_deblock[i] &= ~0x3;
+ }
+ if(s->mb_y == s->mb_height - 1 || mbtype[0] == 2 || mbtype[3] == 2){
+ c_h_deblock[i] &= ~0x30;
+ }
+ }
+
+ for(j = 0; j < RV40_LUMA_LOOP_FIRST; j++){
+ const RV40LoopFilterCond *loop = rv40_loop_cond_luma_first_row + j;
+ int cond, edgecond = 1, nonedgecond = 1, clip_cur, clip_next;
+ Y = s->dest[0] + loop->x + loop->y * s->linesize;
+ cond = (loop->dir ? y_v_deblock : y_h_deblock) & loop->filt_mask;
+ if(loop->edge_mbtype != -1){
+ edgecond = (mbtype[0] == 2 || mbtype[loop->edge_mbtype] == 2);
+ }
+ if(loop->nonedge_mbtype != -1){
+ nonedgecond = !(mbtype[0] == 2 || mbtype[loop->nonedge_mbtype] == 2);
+ }
+ clip_cur = cbp[0] & loop->filt_mask ? clip[0] : 0;
+ if(!loop->x && loop->dir){
+ clip_next = (cbp[2] | mvmasks[2]) & loop->next_clip_mask ? clip[2] : 0;
+ }else if(!loop->y && !loop->dir){
+ clip_next = (cbp[1] | mvmasks[1]) & loop->next_clip_mask ? clip[1] : 0;
+ }else{
+ clip_next = cbp[0] & loop->next_clip_mask ? clip[0] : 0;
+ }
+ if(cond && edgecond && nonedgecond){
+ if(loop->dir){
+ rv40_v_loop_filter(Y, s->linesize, loop->dither,
+ clip_cur, clip_next,
+ alpha, beta, betaY, 0, loop->edge_mbtype != -1);
+ }else{
+ rv40_h_loop_filter(Y, s->linesize, loop->dither,
+ clip_cur, clip_next,
+ alpha, beta, betaY, 0, loop->edge_mbtype != -1);
+ }
+ }
+ }
+ for(j = 4; j < 12; j++){
+ for(k = 0; k < RV40_LUMA_LOOP_NEXT; k++){
+ const RV40LoopFilterCond *loop = rv40_loop_cond_luma_next_rows + k;
+ int cond, edgecond = 1, nonedgecond = 1, clip_cur, clip_next;
+ Y = s->dest[0] + loop->x + (loop->y + j) * s->linesize;
+ cond = (loop->dir ? y_v_deblock : y_h_deblock) & (loop->filt_mask << j);
+ if(loop->edge_mbtype != -1){
+ edgecond = (mbtype[0] == 2 || mbtype[loop->edge_mbtype] == 2);
+ }
+ if(loop->nonedge_mbtype != -1){
+ nonedgecond = !(mbtype[0] == 2 || mbtype[loop->nonedge_mbtype] == 2);
+ }
+ clip_cur = cbp[0] & (loop->filt_mask << j) ? clip[0] : 0;
+ if(!loop->x && loop->dir){
+ clip_next = (cbp[2] | mvmasks[2]) & (loop->next_clip_mask << j) ? clip[2] : 0;
+ }else{
+ clip_next = cbp[0] & (loop->next_clip_mask << j) ? clip[0] : 0;
+ }
+ if(cond && edgecond && nonedgecond){
+ if(loop->dir){
+ rv40_v_loop_filter(Y, s->linesize, loop->dither + j,
+ clip_cur, clip_next,
+ alpha, beta, betaY, 0, loop->edge_mbtype != -1);
+ }else{
+ rv40_h_loop_filter(Y, s->linesize, loop->dither + j,
+ clip_cur, clip_next,
+ alpha, beta, betaY, 0, loop->edge_mbtype != -1);
+ }
+ }
+ }
+ }
+ for(i = 0; i < 2; i++){
+ for(j = 0; j < RV40_CHROMA_LOOP; j++){
+ const RV40LoopFilterCond *loop = rv40_loop_cond_chroma + j;
+ int cond, edgecond = 1, nonedgecond = 1, clip_cur, clip_next;
+ C = s->dest[i+1] + loop->x + loop->y * s->uvlinesize;
+ cond = (loop->dir ? c_v_deblock[i] : c_h_deblock[i]) & loop->filt_mask;
+ if(loop->edge_mbtype != -1){
+ edgecond = (mbtype[0] == 2 || mbtype[loop->edge_mbtype] == 2);
+ }
+ if(loop->nonedge_mbtype != -1){
+ nonedgecond = !(mbtype[0] == 2 || mbtype[loop->nonedge_mbtype] == 2);
+ }
+ clip_cur = uvcbp[0][i] & loop->filt_mask ? clip[0] : 0;
+ if(!loop->x && loop->dir){
+ clip_next = uvcbp[2][i] & loop->next_clip_mask ? clip[2] : 0;
+ }else if(!loop->y && !loop->dir){
+ clip_next = uvcbp[1][i] & loop->next_clip_mask ? clip[1] : 0;
+ }else{
+ clip_next = uvcbp[0][i] & loop->next_clip_mask ? clip[0] : 0;
+ }
+ if(cond && edgecond && nonedgecond){
+ if(loop->dir){
+ rv40_v_loop_filter(C, s->uvlinesize, loop->dither,
+ clip_cur, clip_next,
+ alpha, beta, betaC, 1, loop->edge_mbtype != -1);
+ }else{
+ rv40_h_loop_filter(C, s->uvlinesize, loop->dither,
+ clip_cur, clip_next,
+ alpha, beta, betaC, 1, loop->edge_mbtype != -1);
+ }
+ }
+ }
+ }
+ }
+ }
+}
+
+/**
* Initialize decoder.
*/
static av_cold int rv40_decode_init(AVCodecContext *avctx)
@@ -261,6 +716,8 @@
r->parse_slice_header = rv40_parse_slice_header;
r->decode_intra_types = rv40_decode_intra_types;
r->decode_mb_info = rv40_decode_mb_info;
+ r->loop_filter = rv40_loop_filter;
+ r->set_deblock_coef = rv40_set_deblock_coef;
r->luma_dc_quant_i = rv40_luma_dc_quant[0];
r->luma_dc_quant_p = rv40_luma_dc_quant[1];
return 0;
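
A small check of the RV40_STRONG_FILTER arithmetic from the patch above, in case the macro arguments look opaque. The expression is copied into a function (strong_sum is a hypothetical name used only here). It seems the weights always add up to 128, which would explain the >> 7 after adding the dither or rounding term:

```c
#include <stdint.h>

/* Same expression as RV40_STRONG_FILTER in the patch, as a function.
 * With last outside start..start+3 the weights are 25,26,26,26,25
 * (sub == start); with last inside that range the sample at `last`
 * gets 51 and the sample at `sub` gets 25. */
static int strong_sum(const uint8_t *src, int step,
                      int start, int last, int sub)
{
    return 26 * (src[ start      * step] + src[(start + 1) * step] +
                 src[(start + 2) * step] + src[(start + 3) * step] +
                 src[ last       * step])
           - src[last * step] - src[sub * step];
}
```

On a flat row of samples with value v both variants give 128*v, so (sum + 64) >> 7 returns v again and the filter leaves flat areas untouched.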