[FFmpeg-devel] [PATCH v1] lavfi: add nlmeans CUDA filter

Dylan Fernando dylanf123 at gmail.com
Sun Aug 15 20:19:18 EEST 2021


On Sat, Aug 14, 2021 at 1:03 PM Timo Rothenpieler <timo at rothenpieler.org>
wrote:

> On 14.08.2021 07:49, Dylan Fernando wrote:
> > On Sat, Aug 14, 2021 at 9:11 AM Timo Rothenpieler <timo at rothenpieler.org> wrote:
> >
> >> On 13.08.2021 10:42, Dylan Fernando wrote:
> >>> Any update on this?
> >>>
> >>> Kind Regards,
> >>> Dylan
> >>
> >> Also, are you sure that exp() function is correct?
> >>
> >> The CUDA-Function exp() is defined as "double exp(double x)" and
> >> calculates the base e exponential.
> >>
> >> __nvvm_ex2_approx_f, on the other hand, reads to me like it works on
> >> floats and in base 2, for which the CUDA equivalent would be
> >> "float exp2f(float)".
> >>
> >> _______________________________________________
> >> ffmpeg-devel mailing list
> >> ffmpeg-devel at ffmpeg.org
> >> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >>
> >> To unsubscribe, visit link above, or email
> >> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
> >>
> >
> > I wasn't sure about the exp() function. Is there a function like
> > __nvvm_exp_approx_d? I can't seem to find a function for this.
>
> Looking into it some more, that's simply because ex2 is the only fast
> approximate exp instruction there is.
> If I use __expf() with nvcc, it generates the following code:
>
>         ld.param.f32    %f1, [param];
>         mul.f32         %f2, %f1, 0f3FB8AA3B;
>         ex2.approx.f32  %f3, %f2;
>
> So it multiplies the input value by a constant factor, and then runs the
> result through ex2.approx.
> Mathematically, that factor must be log2(e), since
> exp(x) = 2^(x * log2(e)). I write it as log2(exp(1)) below, for lack of
> the constant being defined.
>
> So the implementation of __expf() would look like this:
>
> static inline __device__ float __expf(float a) {
>     return __nvvm_ex2_approx_f(a * (float)__builtin_log2(__builtin_exp(1)));
> }
>
> With llvm, this now generates the exact same code:
>
>         ld.param.f32    %f1, [param];
>         mul.f32         %f2, %f1, 0f3FB8AA3B;
>         ex2.approx.f32  %f3, %f2;
>
>
> I will push that function soon, so you can just use __expf() in your
> code, assuming you want exp to base e.
>
> Attached updated patch.
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-lavfi-add-nlmeans_cuda-filter.patch
Type: text/x-patch
Size: 42389 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20210815/1a5e1cba/attachment.bin>

