[FFmpeg-devel] [PATCH] avfilter/vf_bwdif_cuda: CUDA implementation of bwdif

Tue Oct 13 18:18:15 EEST 2020

On Mon, 12 Oct 2020 at 21:42, Philip Langdale <philipl at overt.org> wrote:

> On Sun, 11 Oct 2020 18:36:42 +0200
> Thomas Mundt <tmundt75 at gmail.com> wrote:
>
> > Hi Philip,
> >
> > On Fri, 9 Oct 2020 at 18:33, Philip Langdale <philipl at overt.org>
> > wrote:
> >
> > > I've been sitting on this for a couple of years now, and I figured I
> > > should just send it out. This is what I believe is a conceptually
> > > correct port of bwdif to cuda (modulo edge handling which is not
> > > done in the same way because the conditional checks for edges are
> > > expensive in cuda, but that's the same as for yadif_cuda).
> > >
> > > However, I see glitches in some samples where black or white pixels
> > > appear in white or black areas respectively. This seems like some
> > > sort of under/overflow. I've tried to use the largest cuda types
> > > everywhere, and that did appear to improve things but didn't make
> > > it go away. This is what led to me never sending this diff over the
> > > years, but maybe someone else has insights about this.
> > >
> >
> > I am not familiar with CUDA, so here is just one difference I
> > noticed compared to the C code.
> > Maybe that is the reason for the glitches.
> >
> > > +
> > > +template<typename T>
> > > +__inline__ __device__ T filter(T A, T B, T C, T D,
> > > +                               T a, T b, T c, T d, T e, T f, T g,
> > > +                               T h, T i, T j, T k, T l, T m, T n,
> > > +                               int clip_max)
> > > +{
> > > +    T final;
> > > +
> > > +    int fc = C;
> > > +    int fd = (c + l) >> 1;
> > > +    int fe = B;
> > >
> >
> > In the following you sometimes use B and C directly and sometimes fc
> > and fe. Is there a reason for this?
>
> Unfortunately, I can't remember. This may have had something to do with
> wanting those calculations to be done with smaller data types, but why
> do that? Switching them did not have any obvious visual effect.
>
> >
> > > +
> > > +    int temporal_diff0 = abs(c - l);
> > > +    int temporal_diff1 = (abs(g - fc) + abs(f - fe)) >> 1;
> > > +    int temporal_diff2 = (abs(i - fc) + abs(h - fe)) >> 1;
> > > +    int diff = max3(temporal_diff0 >> 1, temporal_diff1, temporal_diff2);
> > > +
> > > +    if (!diff) {
> > > +        final = fd;
> > > +    } else {
> > > +        int fb = ((d + m) >> 1) - fc;
> > > +        int ff = ((c + l) >> 1) - fe;
> > >
> >
> > If I'm not missing anything, this should be:
> > int ff = ((b + k) >> 1) - fe;
>
> I think you're right. This also doesn't seem to change things
> significantly; the glitches are still there, but that's not surprising.
> This fix would make the non-glitched parts more correct.
>
> Thanks for taking a look. I'll keep banging my head against this one.
>
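
For reference, here is a minimal C sketch of the two terms from the quoted
else-branch with the second line corrected as discussed above. The helper
name spatial_terms is hypothetical and only there for illustration; the
variable names follow the patch. Intermediates are plain int, so the
subtraction can go negative without wrapping.

```c
#include <stdio.h>

/* Hypothetical helper (not part of the patch): computes the two terms
 * from the quoted else-branch, using the patch's variable names. */
static void spatial_terms(int d, int m, int b, int k, int fc, int fe,
                          int *fb, int *ff)
{
    /* Average of d and m, relative to fc. */
    *fb = ((d + m) >> 1) - fc;
    /* Average of b and k, relative to fe.
     * The patch used (c + l) here, which is the same value as fd. */
    *ff = ((b + k) >> 1) - fe;
}
```

With d=100, m=104, b=60, k=64, fc=90 and fe=70 this gives fb = 12 and
ff = -8, so both terms are signed quantities.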

Could you please point out in the description of the bwdif_cuda filter that
the top and bottom edges and the first and last fields are processed
differently from the bwdif filter? This can lead to glitches at the upper
and lower edges and ghosting in the first and last fields.
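
To illustrate the kind of difference I mean, the sketch below assumes that
lines outside the picture are read with clamped coordinates instead of going
through a separate, conditional edge path. That is my assumption and I have
not verified that the kernel works this way. The helpers clamp_row and
fetch_clamped are hypothetical names, not functions from the patch.

```c
/* Hypothetical helper: clamp a row index into [0, height - 1]. */
static int clamp_row(int y, int height)
{
    if (y < 0)
        return 0;
    if (y >= height)
        return height - 1;
    return y;
}

/* Hypothetical helper: read pixel (x, y) from an 8-bit plane.
 * Rows above or below the picture are replaced by the nearest valid row,
 * so no edge branch is needed, but the filter then sees repeated lines. */
static int fetch_clamped(const unsigned char *plane, int stride, int height,
                         int x, int y)
{
    return plane[clamp_row(y, height) * stride + x];
}
```

With reads like these, a filter tap at row -2 or row height+1 returns the
first or last picture line, which is why the output near the top and bottom
can differ from a filter that switches to a dedicated edge case.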

Regards,
Thomas