[FFmpeg-devel] [PATCH] delogo filter: new "uglarm" interpolation mode added

Nicolas George george at nsup.org
Wed Jan 2 16:37:50 EET 2019

Uwe Freese (2019-01-01):
> > This can be optimized, and since it is the inner loop of the filter, I
> > think it is worth it. You can declare a pointer that will stay the same
> > for the whole y loop:
> > 
> > 	double *xweight = uglarmtable + (y - logo_y1) * (logo_w - 1);
> > 
> > and then use it in that loop:
> > 
> > 	interpd += ... * xweight[abs(bx - (x - logo_x1))];
> > 
> > It avoids a lot of multiplications.
> I'm not sure about this point. First, I would absolutely assume from the
> compiler that it optimizes expressions like "(logo_w - 1)" or a fixed offset
> to access the array in the loop and that they are only calculated once.

Relying a lot on compiler optimizations is a sure way of being
disappointed. But the logo_w_minus_one variable was not about
optimization but about organization. Each time there is the same
computation at several places, it is a sign that a specific variable
should probably be used. That way, if the design changes a little, only
the initialization of the variable needs to be changed.

In this particular instance, logo_w_minus_one was not a good name (it
was for explaining); table_stride would be better. That way, if you
decide to change the structure of the table, you do not need to look at
all uses of logo_w, you just need to update the initialization of

> Secondly, I don't exactly understand how *xweight in your example should be
> used (and would mean smaller or easier code).


> > > +                interp = (uint64_t)(interpd /
> > > +                    uglarmweightsum[(x - logo_x1) - 1 + (y - logo_y1 - 1) * (logo_w - 2)]);
> > The cast to uint64_t seems suspicious. Can you explain?
> Every pixel value of the inner logo area is an integer. Only for the
> calculation of the weighted average, all values use floats. At the end, it
> is rounded (truncated) to an int.

int and uint64_t are not the same thing. Why uint64_t?

> > 	pow(x * x * aspect2 + y * y, exponent / 2);
> Hm. Again, I'm unsure if this "optimization" is worth it. I wouldn't do this
> normally.

In this particular instance, definitely yes. Compilers have very little
latitude to optimize floating point operations, and they will certainly
not optimize mathematical functions based on subtle arithmetic
properties. This is a C compiler, not a TI-89.

> > But I have a better suggestion:
> > 
> > #define MAX_SUB 2
> > 
> > 	double uglarmtable[MAX_SUB + 1][MAX_SUB + 1]
> > 
> > and use uglarmtable[hsub][vsub]. That way, for YUVA420P for example, the
> > table will be computed only once for Y and A and once for U and V.
> > 
> > The assert is still needed with that solution, though.
> I don't understand this. Please provide a complete example, or it stays as
> it is. - and could of course be optimized later.

I do not see how it could be more complete without including code that
is irrelevant to the example. But since you insist:

#define MAX_SUB 2

typedef struct DelogoContext {
    double uglarmtable[MAX_SUB + 1][MAX_SUB + 1]
    av_assert0(hsub <= MAX_SUB && vsub <= MAX_SUB);
    if (!s->uglarmtable[hsub][vsub])
	s->uglarmtable[hsub][vsub] = av_malloc_array(...);

> But why should the for loop run in xy mode and check "planes count" times to
> free the memory? The code without the "if" also looks to me more like this
> is not checked by mistake.

This is the usual way this project does things: free everything
unconditionally. The rationale is that it is less likely to lead to
leaks in case of code change.

> Hope that the code is correct like this?
> s->uglarmtable[plane] =
>                 av_malloc_array((logo_w - 1) * (logo_h - 1), sizeof(s->uglarmtable[plane]));

Sorry, no: you are taking the size of the pointer uglarmtable[plane];
you need to take the size of the pointed objects *uglarmtable[plane].


  Nicolas George

More information about the ffmpeg-devel mailing list