[FFmpeg-user] NVDEC/NVENC resources underutilization

Garri Djavadyan garryd at comnet.uz
Wed Feb 28 15:44:47 EET 2018


Hello FFmpeg community,


I have run into a problem: NVDEC/NVENC resources are underutilized
when running a single ffmpeg instance.

We use ffmpeg to convert videos in various formats to MP4 (H.264/AAC)
and to apply a logo overlay. Decoding and encoding are performed in
hardware by the NVDEC and NVENC engines. For example, our command line
is:

/usr/local/ffmpeg-dev/bin/ffmpeg -y -c:v mpeg4_cuvid \
  -i input.avi -i logo.png \
  -filter_complex '[0:v:0][1:v:0]overlay=10:10[out1]' \
  -map '[out1]' -map 0:a:0 -map_metadata -1 -map_chapters -1 \
  -c:v h264_nvenc -b:v 1024k -r 25 \
  -c:a libfdk_aac -b:a 128k \
  -movflags faststart out.mp4


The overall transcoding process is greatly accelerated, but I see that
the NVDEC/NVENC engines are not fully utilized. For example:

# nvidia-smi dmon
# gpu   pwr  temp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     %     %     %     %   MHz   MHz
    0    32    50    11     3    22    16  3802  1632
    0    32    50    11     3    23    14  3802  1632
    0    32    50    12     3    22    15  3802  1632
    0    33    51    13     3    18    15  3802  1632
    0    32    51    12     3    17    11  3802  1632
    0    32    51    10     2    20    13  3802  1632
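
To make such samples easier to compare between runs, the enc and dec
columns can be averaged. This is only a sketch: the helper name dmon_avg
is made up for illustration, and it assumes the default nvidia-smi dmon
column order shown above (enc in field 6, dec in field 7). For the six
samples above it gives roughly 20% enc and 14% dec.

```shell
# Average the enc and dec columns of `nvidia-smi dmon` output read from stdin.
# Assumption: default column order (gpu pwr temp sm mem enc dec mclk pclk),
# so enc is field 6 and dec is field 7. Header lines start with '#'.
# LC_ALL=C keeps '.' as the decimal separator regardless of system locale.
dmon_avg() {
  LC_ALL=C awk '
    !/^#/ && NF >= 7 { enc += $6; dec += $7; n++ }
    END { if (n) printf "enc=%.1f dec=%.1f samples=%d\n", enc / n, dec / n, n }
  '
}
```

Usage, for example over 30 one-second samples while a transcode is
running: nvidia-smi dmon -c 30 | dmon_avg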


I tried to find a bottleneck, but all system resources look fine. For
example, CPU (top output; every core is at least 59% idle):

Tasks: 296 total,   3 running, 292 sleeping,   0 stopped,   1 zombie
%Cpu0  : 14,6 us,  0,7 sy,  0,0 ni, 84,7 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu1  : 11,2 us,  1,4 sy,  0,0 ni, 87,5 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu2  : 14,5 us,  1,3 sy,  0,0 ni, 84,2 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu3  :  4,4 us,  0,7 sy,  0,0 ni, 94,9 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu4  : 39,6 us,  1,3 sy,  0,0 ni, 59,1 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu5  :  2,4 us,  0,7 sy,  0,0 ni, 96,6 id,  0,3 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu6  :  2,7 us,  4,3 sy,  0,0 ni, 93,0 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu7  : 21,0 us,  0,7 sy,  0,0 ni, 78,3 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
KiB Mem:  12261384 total, 10578352 used,  1683032 free,  4809968 buffers
KiB Swap:  7999484 total,   584072 used,  7415412 free.  1431448 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
21245 user      20   0 17,516g 235792 187228 R 101,3  1,9   1:21.15 ffmpeg
 1512 root     -51   0       0      0      0 S   7,0  0,0  17:29.39 irq/47-nvidia

-----------------
Memory:

# free -m
             total       used       free     shared    buffers     cached
Mem:         11974       8619       3354        133       3758        653
-/+ buffers/cache:       4207       7766
Swap:         7811        570       7241

-----------------
Storage I/O:

# iostat -x 1 3
Linux 3.13.0-142-generic (user-desktop) 	28.02.2018 	_x86_64_	(8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          10,86    3,03    1,43    4,75    0,00   79,93

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,49     0,00    0,01    0,00     0,29     0,00    45,90     0,00   47,45   47,45    0,00  14,55   0,02
sdb             849,21   769,16   47,81   22,18  3859,21  3935,51   222,74     9,18  131,22   23,62  363,20   4,05  28,38

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          14,65    0,00    1,78    0,00    0,00   83,57

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00    0,00    0,00   0,00   0,00
sdb               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00    0,00    0,00   0,00   0,00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          21,24    0,00    2,53    0,51    0,00   75,73

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00    0,00    0,00   0,00   0,00
sdb               0,00     7,00    0,00    3,00     0,00    88,00    58,67     0,04   13,33    0,00   13,33  13,33   4,00


--------------------
FFmpeg version and configuration options:

# /usr/local/ffmpeg-dev/bin/ffmpeg -version
ffmpeg version N-90054-g474194a Copyright (c) 2000-2018 the FFmpeg developers
built with gcc 4.8 (Ubuntu 4.8.4-2ubuntu1~14.04.4)
configuration: --prefix=/usr/local/ffmpeg-dev --enable-gpl --enable-nonfree --enable-libfdk-aac --enable-libx264 --enable-nvenc --enable-libnpp
libavutil      56.  7.101 / 56.  7.101
libavcodec     58. 11.101 / 58. 11.101
libavformat    58.  9.100 / 58.  9.100
libavdevice    58.  1.100 / 58.  1.100
libavfilter     7. 12.100 /  7. 12.100
libswscale      5.  0.101 /  5.  0.101
libswresample   3.  0.101 /  3.  0.101
libpostproc    55.  0.100 / 55.  0.100


---------------------
NVIDIA driver and card information:

# nvidia-smi
Wed Feb 28 17:20:13 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.111                Driver Version: 384.111                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   53C    P2    32W / 150W |    939MiB /  6071MiB |     11%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1423      G   /usr/bin/X                                   443MiB |
|    0      2555      G   compiz                                       224MiB |
|    0     14879      G   ...-token=ACEXXXXXXXXX XX2E9DDXXXXXXXE41     107MiB |
|    0     21458      C   /usr/local/ffmpeg-dev/bin/ffmpeg             159MiB |
+-----------------------------------------------------------------------------+


I believe I have overlooked something, or perhaps there are limitations
I am not aware of. I would appreciate any suggestions. Many thanks in
advance!


Garri

