[FFmpeg-devel] [PATCH v2] lavfi: add nlmeans CUDA filter

Timo Rothenpieler timo at rothenpieler.org
Sun Nov 28 14:16:32 EET 2021


> +    for (i = 0; i < nb_pixel / 4; i++) {
> +
> +        int *dx_cur = dxdy + 8 * i;
> +        int *dy_cur = dxdy + 8 * i + 4;
> +
> +        call_horiz(ctx, 1, src_dptr, src_width, src_height, src_pitch,
> +                   integ_img, dx_cur, dy_cur, pixel_size);
> +
> +        call_vert(ctx, 1, src_width, src_height, integ_img, pixel_size);
> +
> +        call_weight(ctx, 1, src_dptr, src_width, src_height, src_pitch, integ_img, (float*)s->sum, (float*)s->weight, p, dx_cur, dy_cur, pixel_size);
> +    }
> +
> +    call_average(ctx, 1, src_dptr, src_width, src_height, src_pitch, (float*)s->sum, (float*)s->weight,
> +                   dst_dptr, dst_width, dst_height, dst_pitch, pixel_size);

My immediate thought when seeing that block is "move this all to the 
CUDA side", but you're calling all those with different block layouts?

I don't understand the algorithm well enough, so I guess this is necessary.

How well does it perform? Every one of those jumps between C and CUDA 
code is a separate kernel launch, and each launch adds overhead.
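
To put a rough number on that, the launches for the quoted block can be
tallied on the host. This is only a back-of-the-envelope sketch:
count_launches is a hypothetical helper, not part of the patch, and it
assumes each call_* wrapper issues exactly one kernel launch.

```c
/* Hypothetical helper, not part of the patch. Assumes every call_*
 * wrapper in the quoted block issues exactly one kernel launch. */
static int count_launches(int nb_pixel)
{
    int launches = 0;

    /* Same trip count as the quoted loop: nb_pixel / 4 iterations. */
    for (int i = 0; i < nb_pixel / 4; i++)
        launches += 3; /* call_horiz + call_vert + call_weight */

    return launches + 1; /* the trailing call_average */
}
```

With nb_pixel = 48, for instance, that is 12 iterations and 37 launches
each time the block runs.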


Some other nits:
I'm not a fan of functions just called "init", "uninit" and so on. 
It's not wrong, given they're static, but it's usually nicer to give all 
functions a common prefix. "cunlmeans_" or something like that.
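
Roughly what I mean, as a standalone sketch rather than the real filter:
DemoContext and the function bodies are made up for illustration, only
the "cunlmeans_" prefix is the actual suggestion.

```c
/* Stand-in for the filter's private context; purely illustrative. */
typedef struct DemoContext {
    int initialised;
} DemoContext;

/* With a shared prefix the names are easier to tell apart from other
 * filters' functions when grepping or reading a backtrace, even though
 * they are static. */
static int cunlmeans_init(DemoContext *s)
{
    s->initialised = 1;
    return 0;
}

static void cunlmeans_uninit(DemoContext *s)
{
    s->initialised = 0;
}
```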

What's up with that if(!s->initialised) block in filter_frame? I would 
have thought it's logically impossible that it gets that far without 
init being called?



Otherwise, the filter looks fine to me.