[FFmpeg-user] Nvidia Transcoding: Failing Using xstack (When Running Under systemd)

Shane Warren shanew at innovsys.com
Mon Nov 18 22:33:14 EET 2024


I have been trying to track down why I get strange decoding issues in ffmpeg when transcoding with xstack and NVIDIA hardware decoding and encoding.

Note: I use two 1-minute-long .ts files for this example. If you want my inputs, they are available here (as input1.ts and input2.ts):

https://drive.google.com/drive/folders/1mZ8xiNvz5ez1ULlNsy5a3KhnhaqQ2Hgo?usp=drive_link

I got the latest ffmpeg and tried this command (xstacking 2 videos into 1 output):

ffmpeg -y -threads 2 -nostats -loglevel verbose -probesize 5M -filter_threads 4 -threads 2 -re -fflags +genpts -fflags discardcorrupt  \
-extra_hw_frames 2 -hwaccel cuda -hwaccel_output_format cuda -threads 2 -thread_queue_size 4096 -heavy_compr 1 -thread_queue_size 4096 -re -i input1.ts \
-extra_hw_frames 2 -hwaccel cuda -hwaccel_output_format cuda -threads 2 -thread_queue_size 4096 -heavy_compr 1 -thread_queue_size 4096 -re -i input2.ts \
-filter_complex "\
[0:v:0]yadif_cuda=deint=interlaced,scale_cuda=768:432,hwdownload,format=nv12,fps=60000/1001[v0]; \
[1:v:0]yadif_cuda=deint=interlaced,scale_cuda=768:432,hwdownload,format=nv12,fps=60000/1001[v1]; \
[v0][v1] xstack=inputs=2:layout=0_0|0_h0[mosaic];\
[mosaic]hwupload_cuda,scale_cuda=w=1280:h=720:format=yuv420p:force_original_aspect_ratio=decrease,hwdownload,format=yuv420p,pad=1280:720:(ow-iw)/2:(oh-ih)/2,hwupload_cuda[out0]" \
-filter:a:0 "aresample=async=10000,volume=1.00" -c:a:0 ac3 -threads 2 -ac:a:0 6 -ar:a:0 48000 -b:a:0 384k \
-filter:a:1 "aresample=async=10000,volume=1.00" -c:a:1 ac3 -threads 2 -ac:a:1 6 -ar:a:1 48000 -b:a:1 384k \
-map "[out0]" -map "0:a:0" -map "1:a:0" \
-c:v h264_nvenc -b:v 6000k -minrate:v 6000k -maxrate:v 6000k -bufsize:v 12000k -a53cc 1 -tune ll -zerolatency 1 -cbr 1 -forced-idr 1 -strict_gop 1 -threads 2 -profile:v high -level:v 4.2 -bf:v 0 -g:v 30 \
-f mpegts -muxrate 8238520 -pes_payload_size 1528 "udp://@225.105.0.37:10102?pkt_size=1316&bitrate=8238520&burst_bits=10528&ttl=64"

If you run that command on Ubuntu 22.04, it works fine and transcodes to the end of the input files.

What doesn't work is starting that process under systemd non-interactively, like so:

systemd-run -S

If you then run that same command, it fails in a strange way.
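Since the two cases differ only in how the shell is started, comparing the environment of the two shells may help narrow this down. The following is a minimal sketch, not a confirmed diagnosis. `missing_vars` is a hypothetical helper and the `/tmp/*.env` file names are assumptions. It lists the variables that are set in the first dump but absent from the second, and it assumes no variable value contains a newline.

```shell
# Hypothetical helper: print the names of variables present in the first
# `env` dump but missing from the second.
missing_vars() {
    a=$(mktemp) && b=$(mktemp) || return 1
    # Keep only the variable names (text before the first '='), sorted.
    cut -d= -f1 "$1" | sort -u > "$a"
    cut -d= -f1 "$2" | sort -u > "$b"
    # comm -23 prints the lines that appear only in the first file.
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
}

# Usage (file names are assumptions):
#   in the normal terminal:        env > /tmp/interactive.env
#   inside the systemd-run shell:  env > /tmp/systemd.env
#   then: missing_vars /tmp/interactive.env /tmp/systemd.env
```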

Note: It's important to output to multicast. If I run the same command with a file as the output, it works fine (my guess is that any network-based output exhibits this behavior).

You will see logs like this:

[Parsed_scale_cuda_1 @ 0x55da86a03340] w:1920 h:1080 fmt:nv12 -> w:768 h:432 fmt:nv12

Then there is a pause of about 1-2 seconds before the next log line comes out.

Eventually (after many stalls and logs) this log comes out and the transcode stops:

[vost#0:0/h264_nvenc @ 0x55da86a3f780] Error submitting a packet to the muxer: Cannot allocate memory
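To quantify the stalls, it may help to prefix every log line with the wall-clock time at which it arrived. This is a minimal sketch assuming a POSIX shell. `stamp_lines` is a hypothetical helper name, not part of ffmpeg.

```shell
# Hypothetical helper: prefix each line read from stdin with the current time,
# so the gaps between consecutive log lines become visible.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date +%H:%M:%S)" "$line"
    done
}

# Usage (ffmpeg writes its log to stderr, so redirect it into the pipe):
#   ffmpeg ...same arguments as above... 2>&1 | stamp_lines
```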

I attached GDB to ffmpeg while it was stalled, and it is inside code that is trying to compile a CUDA script.

If I don't use xstack, NVIDIA does not stall (I'm pretty sure this has to do with multiple inputs).

Does anyone have any idea what is happening here? I launch ffmpeg from a C++ wrapper daemon. If that daemon is started via systemd, NVIDIA transcoding with multiple inputs fails. However, if I launch my daemon by hand at a terminal, it works fine.
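For the daemon case, one way to see what the ffmpeg child actually inherited is to read its entry under /proc while it is running. This is a minimal sketch assuming Linux. `proc_env` is a hypothetical helper name, and the usage line assumes only one ffmpeg process is running.

```shell
# Hypothetical helper: print the environment a process was started with,
# one variable per line. /proc/PID/environ is NUL-separated, so NULs are
# translated to newlines.
proc_env() {
    tr '\0' '\n' < "/proc/$1/environ"
}

# Usage (run once with the daemon started by systemd, once started by hand,
# then compare the two outputs):
#   proc_env "$(pidof ffmpeg)"
#   cat "/proc/$(pidof ffmpeg)/limits"   # resource limits of the same process
```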

Thanks


