[Libav-user] hw_decode.c example, help getting cuda decoded frame into application cuda memory

Thu Jun 12 13:42:00 EEST 2025

Hello Cole,

We've done this sort of stuff in our company.

> My questions are:
>
> How can I copy the hw (NV12 format) frame into my own cuda memory (cudaMalloc from cuda_runtime.h)?

You have to use one of the CUDA memcpy functions. The first two
entries of the AVFrame's data member point to the Y plane and the
interleaved UV plane. You will also need the pitch of each plane (the
matching linesize entries), since rows may be padded beyond the pixel
width, and from that the total size of the memory. (The width and
height give you the frame's pixel dimensions, of course.) The hw codec
also uses a pool of frames, with their own CUDA context, so you are
fine to use your own CUDA stream, etc. Just don't hang on to the
AVFrame for too long, as it needs to go back to the pool.
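
A rough sketch of the pitch handling, as an illustration only. It is
plain host C++ so it can be compiled anywhere; copy_plane and
copy_nv12_tight are made-up names, and the frame is assumed to be
8-bit NV12. On the GPU the loop in copy_plane would be one
cudaMemcpy2D (or cudaMemcpy2DAsync on your own stream) per plane, with
the same pitch, row-width and row-count arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy `rows` rows of `row_bytes` bytes each between two pitched buffers.
// Host-side stand-in for cudaMemcpy2D, which takes the same quantities:
//   cudaMemcpy2D(dst, dst_pitch, src, src_pitch, row_bytes, rows,
//                cudaMemcpyDeviceToDevice);
void copy_plane(uint8_t* dst, size_t dst_pitch,
                const uint8_t* src, size_t src_pitch,
                size_t row_bytes, size_t rows) {
    assert(dst_pitch >= row_bytes && src_pitch >= row_bytes);
    for (size_t r = 0; r < rows; ++r)
        std::memcpy(dst + r * dst_pitch, src + r * src_pitch, row_bytes);
}

// Pack a pitched NV12 frame into one tight buffer: Y plane, then UV plane.
// For a CUDA AVFrame, y/uv would be frame->data[0]/data[1] and the pitches
// frame->linesize[0]/linesize[1] (assumption: 8-bit NV12 frames).
std::vector<uint8_t> copy_nv12_tight(const uint8_t* y, size_t y_pitch,
                                     const uint8_t* uv, size_t uv_pitch,
                                     int width, int height) {
    assert(width > 0 && height > 0);
    const size_t y_rows       = static_cast<size_t>(height);
    const size_t y_row_bytes  = static_cast<size_t>(width);
    const size_t uv_rows      = static_cast<size_t>((height + 1) / 2);
    // One U and one V byte per 2x2 pixel block, interleaved in each row.
    const size_t uv_row_bytes = 2 * static_cast<size_t>((width + 1) / 2);
    const size_t y_size       = y_row_bytes * y_rows;

    std::vector<uint8_t> out(y_size + uv_row_bytes * uv_rows);
    copy_plane(out.data(), y_row_bytes, y, y_pitch, y_row_bytes, y_rows);
    copy_plane(out.data() + y_size, uv_row_bytes, uv, uv_pitch,
               uv_row_bytes, uv_rows);
    return out;
}
```

The destination here is tightly packed (pitch equal to the row width).
If you allocate it with cudaMallocPitch instead, pass the pitch it
returns as dst_pitch.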

> How can I do the colorspace conversion? Is there a way to do this with libav* libraries prior to copying into my application memory? If not, I could probably write my own cuda kernel or something.
>

We had to write our own kernels too, because we couldn't find
anything off-the-shelf. Because we wanted to convert to sRGB pixels,
we had to figure out the colorspace matrices as well. So you could
start with something basic and then compare it with what FFmpeg's
swscale is doing. This can be done by converting the NV12 frame into
a YUV420 frame (CUDA kernel) and copying that YUV420 frame to the CPU.
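
Something basic could look like the sketch below. It is host C++ for
illustration only; in a CUDA kernel each loop iteration would be one
thread. All names are made up, and it assumes 8-bit limited-range
("TV range") video. The matrix is derived from the luma coefficients
kr and kb rather than hard-coded, so the same code covers BT.709
(kr = 0.2126, kb = 0.0722) and BT.601 (kr = 0.299, kb = 0.114).
split_uv is the NV12 to planar YUV420 step mentioned above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb { uint8_t r, g, b; };

// Scale a nominal [0,1] value to 8 bits with rounding and clamping.
static uint8_t to_u8(double v) {
    const double s = v * 255.0;
    if (s <= 0.0) return 0;
    if (s >= 255.0) return 255;
    return static_cast<uint8_t>(std::lround(s));
}

// Limited-range Y'CbCr -> R'G'B' (assumption: Y in 16..235, C in 16..240).
Rgb yuv_to_rgb_limited(uint8_t y, uint8_t cb, uint8_t cr,
                       double kr, double kb) {
    const double kg = 1.0 - kr - kb;
    const double yn = (y - 16) / 219.0;    // luma   -> [0, 1]
    const double pb = (cb - 128) / 224.0;  // chroma -> [-0.5, 0.5]
    const double pr = (cr - 128) / 224.0;
    const double r = yn + 2.0 * (1.0 - kr) * pr;
    const double b = yn + 2.0 * (1.0 - kb) * pb;
    const double g = (yn - kr * r - kb * b) / kg;  // from Y = kr*R+kg*G+kb*B
    return { to_u8(r), to_u8(g), to_u8(b) };
}

// De-interleave the NV12 UV plane (U first, then V) into planar U and V,
// i.e. the chroma half of an NV12 -> YUV420P conversion.
void split_uv(const uint8_t* uv, size_t uv_pitch, int chroma_w, int chroma_h,
              std::vector<uint8_t>& u, std::vector<uint8_t>& v) {
    u.clear();
    v.clear();
    for (int row = 0; row < chroma_h; ++row) {
        const uint8_t* line = uv + static_cast<size_t>(row) * uv_pitch;
        for (int col = 0; col < chroma_w; ++col) {
            u.push_back(line[2 * col]);
            v.push_back(line[2 * col + 1]);
        }
    }
}

// Convert a pitched NV12 frame to tightly packed RGB24, reusing each
// chroma sample for its 2x2 block of pixels (no chroma interpolation).
std::vector<uint8_t> nv12_to_rgb24(const uint8_t* y, size_t y_pitch,
                                   const uint8_t* uv, size_t uv_pitch,
                                   int width, int height,
                                   double kr, double kb) {
    assert(width > 0 && height > 0);
    std::vector<uint8_t> out;
    out.reserve(static_cast<size_t>(width) * height * 3);
    for (int py = 0; py < height; ++py) {
        const uint8_t* y_line  = y  + static_cast<size_t>(py) * y_pitch;
        const uint8_t* uv_line = uv + static_cast<size_t>(py / 2) * uv_pitch;
        for (int px = 0; px < width; ++px) {
            const uint8_t cb = uv_line[2 * (px / 2)];
            const uint8_t cr = uv_line[2 * (px / 2) + 1];
            const Rgb p = yuv_to_rgb_limited(y_line[px], cb, cr, kr, kb);
            out.push_back(p.r);
            out.push_back(p.g);
            out.push_back(p.b);
        }
    }
    return out;
}
```

Which kr/kb pair and which range apply depends on the stream; the
AVFrame's colorspace and color_range fields are the place to check
before trusting the defaults assumed here.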

- Nolan.