[FFmpeg-user] Using an AMD Radeon RX5xx with ffmpeg

Dennis Mungai dmngaie at gmail.com
Wed Jul 25 07:16:51 EEST 2018


From what I can gather, AMD's driver implementation for VAAPI (Gallium,
through Mesa) is a work in progress and, compared to Intel's i915, is
quite behind.

On your system, are you able to build FFmpeg to utilize OMX IL? AMD has
support for it via the VCE block. See this for an example on enabling it:
https://github.com/legotheboss/YouTube-files/wiki/(RPi)-Compile-FFmpeg-with-the-OpenMAX-H.264-GPU-acceleration

The guide was written for the Raspberry Pi, but what we're interested in is
OpenMAX Bellagio and the configuration switches that enable the OpenMAX IL
encoders.
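
As a quick first check before rebuilding anything, you can ask the ffmpeg
binary whether it already lists the OpenMAX IL H.264 encoder (h264_omx). This
is only a minimal sketch: the has_encoder helper name is my own, and a listed
encoder only means FFmpeg was built with OMX support, not that a working
OpenMAX component for the VCE block is installed on the system.

```shell
# Hypothetical helper: succeed if the given ffmpeg binary lists an encoder.
# Usage: has_encoder ENCODER_NAME [FFMPEG_BINARY]
has_encoder() {
    enc="$1"
    bin="${2:-ffmpeg}"
    # Each encoder line looks like " V..... h264_omx   <description>",
    # so the encoder name is the second whitespace-separated field.
    "$bin" -hide_banner -encoders 2>/dev/null |
        awk -v e="$enc" '$2 == e { found = 1 } END { exit !found }'
}
```

For example: has_encoder h264_omx && echo "h264_omx is listed". Your build
already shows --enable-omx in its configuration line, so the encoder may well
be listed; whether it can actually open an OpenMAX core is a separate question.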



On 24 July 2018 at 12:01, Lukas Obermann <obermann.lukas at gmail.com> wrote:

> Hello Dennis,
>
> thank you for your help! Much appreciated.
>
> Using your command I get a 1.9x speed. So a slight improvement, but not
> much.
> I pasted the debug output here; maybe you can see something useful?
> https://pastebin.com/W0KKjZbN
>
> ad 1. Yes, there is the onboard Intel device plus six RX570s, which in
> the end I want to have all transcoding in parallel.
>
> lukas at transcoder:~$ vainfo --display drm --device /dev/dri/card1
> libva info: VA-API version 1.1.0
> libva info: va_getDriverName() returns 0
> libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/radeonsi_drv_video.so
> libva info: Found init function __vaDriverInit_1_1
> libva info: va_openDriver() returns 0
> vainfo: VA-API version: 1.1 (libva 2.1.0)
> vainfo: Driver version: mesa gallium vaapi
> vainfo: Supported profile and entrypoints
>       VAProfileMPEG2Simple            : VAEntrypointVLD
>       VAProfileMPEG2Main              : VAEntrypointVLD
>       VAProfileVC1Simple              : VAEntrypointVLD
>       VAProfileVC1Main                : VAEntrypointVLD
>       VAProfileVC1Advanced            : VAEntrypointVLD
>       VAProfileH264ConstrainedBaseline: VAEntrypointVLD
>       VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
>       VAProfileH264Main               : VAEntrypointVLD
>       VAProfileH264Main               : VAEntrypointEncSlice
>       VAProfileH264High               : VAEntrypointVLD
>       VAProfileH264High               : VAEntrypointEncSlice
>       VAProfileHEVCMain               : VAEntrypointVLD
>       VAProfileHEVCMain10             : VAEntrypointVLD
>       VAProfileJPEGBaseline           : VAEntrypointVLD
>       VAProfileNone                   : VAEntrypointVideoProc
>
> lukas at transcoder:~$ ls -la /dev/dri/
> total 0
> drwxr-xr-x   3 root root       340 Jul 23 15:47 .
> drwxr-xr-x  19 root root      5020 Jul 23 15:47 ..
> drwxr-xr-x   2 root root       320 Jul 23 15:47 by-path
> crw-rw----+  1 root video 226,   0 Jul 23 15:47 card0
> crw-rw----+  1 root video 226,   1 Jul 23 15:47 card1
> crw-rw----+  1 root video 226,   2 Jul 23 15:47 card2
> crw-rw----+  1 root video 226,   3 Jul 23 15:47 card3
> crw-rw----+  1 root video 226,   4 Jul 23 15:47 card4
> crw-rw----+  1 root video 226,   5 Jul 23 15:47 card5
> crw-rw----+  1 root video 226,   6 Jul 23 15:47 card6
> crw-rw----+  1 root video 226, 128 Jul 23 15:47 renderD128
> crw-rw----+  1 root video 226, 129 Jul 23 15:47 renderD129
> crw-rw----+  1 root video 226, 130 Jul 23 15:47 renderD130
> crw-rw----+  1 root video 226, 131 Jul 23 15:47 renderD131
> crw-rw----+  1 root video 226, 132 Jul 23 15:47 renderD132
> crw-rw----+  1 root video 226, 133 Jul 23 15:47 renderD133
> crw-rw----+  1 root video 226, 134 Jul 23 15:47 renderD134
>
>
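
Since the goal is to run all six RX570s in parallel, here is a rough sketch
of launching one transcode per render node. It reuses the command I suggested
earlier (without the -sei and -level options) and assumes that renderD129
through renderD134 are the AMD cards and renderD128 is the Intel device;
please confirm that with vainfo on each node first. The function names, the
FFMPEG variable and the output_<n>.avi file names are placeholders of mine.

```shell
# Which ffmpeg binary to run; override FFMPEG to point at a custom build.
FFMPEG="${FFMPEG:-ffmpeg}"

# Run one VAAPI decode + encode on a single render node.
# Usage: transcode_on RENDER_NODE INPUT OUTPUT
transcode_on() {
    node="$1"; input="$2"; output="$3"
    "$FFMPEG" -init_hw_device "vaapi=amd:$node" -hwaccel vaapi \
        -hwaccel_output_format vaapi -hwaccel_device amd -filter_hw_device amd \
        -i "$input" -vf 'format=nv12|vaapi,hwupload' -y \
        -c:v h264_vaapi -qp:v 21 -profile:v main "$output"
}

# Start one background job per assumed AMD render node, then wait for all.
# Usage: transcode_all INPUT
transcode_all() {
    input="$1"
    for n in 129 130 131 132 133 134; do
        transcode_on "/dev/dri/renderD$n" "$input" "output_$n.avi" &
    done
    wait
}
```

Calling transcode_all fs_experiental_method.avi feeds the same file to every
card, so it is only useful as a load test; what it shows is the aggregate
throughput, which matters more for your use case than single-stream speed.
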
> ad 2. ok, understand. Is there a benefit of doing it that way?
>
> ad 3. I have now run two tests with only the decoder running, and the
> results confuse me even more.
>
> Running the following command:
> ffmpeg -init_hw_device vaapi=amd:/dev/dri/renderD129 -hwaccel vaapi
> -hwaccel_output_format vaapi -hwaccel_device amd -filter_hw_device amd -i
> fs_experiental_method.avi  -f null -
>
> Results in ~ 12x speed
> frame= 5824 fps=340 q=-0.0 Lsize=N/A time=00:03:14.47 bitrate=N/A
> speed=11.4x
>
> But, using the CPU (a dual core pentium from last year)
> ffmpeg -i fs_experiental_method.avi  -f null -
>
> Results in ~ 14x speed
> frame=10570 fps=408 q=-0.0 Lsize=N/A time=00:05:52.90 bitrate=N/A
> speed=13.6x
>
> Of course the vaapi one uses only like 10% of CPU while the CPU one uses
> 100%.
>
> The graph looks like this for vaapi:
>
> [graph_1_in_0_1 @ 0x557f5d1cd5c0] Setting 'time_base' to value '1/32000'
> [graph_1_in_0_1 @ 0x557f5d1cd5c0] Setting 'sample_rate' to value '32000'
> [graph_1_in_0_1 @ 0x557f5d1cd5c0] Setting 'sample_fmt' to value 'fltp'
> [graph_1_in_0_1 @ 0x557f5d1cd5c0] Setting 'channel_layout' to value '0x4'
> [graph_1_in_0_1 @ 0x557f5d1cd5c0] tb:1/32000 samplefmt:fltp
> samplerate:32000 chlayout:0x4
> [format_out_0_1 @ 0x557f5d29fe40] Setting 'sample_fmts' to value 's16'
> [format_out_0_1 @ 0x557f5d29fe40] auto-inserting filter 'auto_resampler_0'
> between the filter 'Parsed_anull_0' and the filter 'format_out_0_1'
> [AVFilterGraph @ 0x557f5d1ceec0] query_formats: 4 queried, 6 merged, 3
> already done, 0 delayed
> [auto_resampler_0 @ 0x557f5d2be600] [SWR @ 0x557f5d1d8200] Using fltp
> internally between filters
> [auto_resampler_0 @ 0x557f5d2be600] ch:1 chl:mono fmt:fltp r:32000Hz ->
> ch:1 chl:mono fmt:s16 r:32000Hz
> [graph 0 input from stream 0:0 @ 0x557f5d2c1e40] Setting 'video_size' to
> value '1920x1080'
> [graph 0 input from stream 0:0 @ 0x557f5d2c1e40] Setting 'pix_fmt' to
> value '46'
> [graph 0 input from stream 0:0 @ 0x557f5d2c1e40] Setting 'time_base' to
> value '1/30'
> [graph 0 input from stream 0:0 @ 0x557f5d2c1e40] Setting 'pixel_aspect' to
> value '0/1'
> [graph 0 input from stream 0:0 @ 0x557f5d2c1e40] Setting 'sws_param' to
> value 'flags=2'
> [graph 0 input from stream 0:0 @ 0x557f5d2c1e40] Setting 'frame_rate' to
> value '30/1'
> [graph 0 input from stream 0:0 @ 0x557f5d2c1e40] w:1920 h:1080
> pixfmt:vaapi_vld tb:1/30 fr:30/1 sar:0/1 sws_param:flags=2
> [AVFilterGraph @ 0x557f5d17e480] query_formats: 3 queried, 2 merged, 0
> already done, 0 delayed
>
> and like this for the cpu:
>
> [graph_1_in_0_1 @ 0x55986d1cab40] Setting 'time_base' to value '1/32000'
> [graph_1_in_0_1 @ 0x55986d1cab40] Setting 'sample_rate' to value '32000'
> [graph_1_in_0_1 @ 0x55986d1cab40] Setting 'sample_fmt' to value 'fltp'
> [graph_1_in_0_1 @ 0x55986d1cab40] Setting 'channel_layout' to value '0x4'
> [graph_1_in_0_1 @ 0x55986d1cab40] tb:1/32000 samplefmt:fltp
> samplerate:32000 chlayout:0x4
> [format_out_0_1 @ 0x55986d19c740] Setting 'sample_fmts' to value 's16'
> [format_out_0_1 @ 0x55986d19c740] auto-inserting filter 'auto_resampler_0'
> between the filter 'Parsed_anull_0' and the filter 'format_out_0_1'
> [AVFilterGraph @ 0x55986d090fc0] query_formats: 4 queried, 6 merged, 3
> already done, 0 delayed
> [auto_resampler_0 @ 0x55986d1ab680] [SWR @ 0x55986d0cd740] Using fltp
> internally between filters
> [auto_resampler_0 @ 0x55986d1ab680] ch:1 chl:mono fmt:fltp r:32000Hz ->
> ch:1 chl:mono fmt:s16 r:32000Hz
> [graph 0 input from stream 0:0 @ 0x55986d15f2c0] Setting 'video_size' to
> value '1920x1080'
> [graph 0 input from stream 0:0 @ 0x55986d15f2c0] Setting 'pix_fmt' to
> value '0'
> [graph 0 input from stream 0:0 @ 0x55986d15f2c0] Setting 'time_base' to
> value '1/30'
> [graph 0 input from stream 0:0 @ 0x55986d15f2c0] Setting 'pixel_aspect' to
> value '0/1'
> [graph 0 input from stream 0:0 @ 0x55986d15f2c0] Setting 'sws_param' to
> value 'flags=2'
> [graph 0 input from stream 0:0 @ 0x55986d15f2c0] Setting 'frame_rate' to
> value '30/1'
> [graph 0 input from stream 0:0 @ 0x55986d15f2c0] w:1920 h:1080
> pixfmt:yuv420p tb:1/30 fr:30/1 sar:0/1 sws_param:flags=2
> [AVFilterGraph @ 0x55986d196c40] query_formats: 3 queried, 2 merged, 0
> already done, 0 delayed
>
>
> I find it very strange that CPU decoding is faster than GPU decoding. Or
> is there a bottleneck somewhere? I am a bit lost right now, I have to say.
>
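
One way to read these numbers: for constant-frame-rate material, the speed
ffmpeg reports is roughly the processing fps divided by the source frame rate
(30 fps for this file). A small sketch of that arithmetic; speed_from_fps is
a hypothetical helper name, not anything ffmpeg provides:

```shell
# Approximate ffmpeg's "speed" figure from processing fps and source fps.
# Usage: speed_from_fps PROCESSING_FPS SOURCE_FPS
speed_from_fps() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk -v f="$1" -v r="$2" 'BEGIN { printf "%.1f\n", f / r }'
}
```

With the figures in this thread, speed_from_fps 408 30 gives 13.6 and
speed_from_fps 52 30 gives 1.7. The 6x to 8x you were hoping for therefore
corresponds to roughly 180 to 240 fps, and since decode alone reaches about
340 fps on the GPU, the decoder is probably not what holds the full transcode
at 52 fps.
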
>
>
> > On 23.07.2018, at 22:51, Dennis Mungai <dmngaie at gmail.com> wrote:
> >
> > Hello there,
> >
> > Here's something you can try:
> >
> > ffmpeg -init_hw_device vaapi=amd:/dev/dri/renderD129 -hwaccel vaapi
> > -hwaccel_output_format vaapi -hwaccel_device amd -filter_hw_device amd -i
> > fs_experiental_method.avi -vf 'format=nv12|vaapi,hwupload' -y -c:v
> > h264_vaapi -qp:v 21 -sei +identifier+timing+recovery_point -profile:v main
> > -level 4 output.avi
> >
> > Assumptions made:
> >
> > 1. You have another GPU on the system. The DRI device you highlighted
> > (/dev/dri/card1) is implied to be the second render node, because the
> > first ordinal device would have been /dev/dri/card0, mapped to
> > /dev/dri/renderD128.
> >
> > Confirm this by providing the output of:
> >
> > (a). vainfo
> > (b). ls -al /dev/dri/
> >
> > 2. We explicitly initialize the hardware device (/dev/dri/renderD129),
> > name it 'amd', and pass it to the decoder, the encoder and the video
> > filtergraph.
> >
> > 3. Observe the video filter graph. Here's what it does: the decoder will
> > output either vaapi surfaces (if the hwaccel is usable) or software frames
> > (if it isn't). In the first case, it matches the vaapi format and hwupload
> > does nothing (it passes through hardware frames unchanged). In the second
> > case, it matches the nv12 format and converts whatever the input is to
> > that, then uploads.
> >
> > This is done for safety reasons: either way, the encoder will run. However,
> > depending on the path chosen (upload to memory vs native VAAPI hwdec), your
> > performance may vary.
> >
> > References used:
> >
> > 1. The VAAPI entry on FFmpeg wiki:
> > https://trac.ffmpeg.org/wiki/Hardware/VAAPI
> >
> > 2. The VAAPI encoders entry in the docs:
> > http://www.ffmpeg.org/ffmpeg-codecs.html#VAAPI-encoders
> >
> > On 23 July 2018 at 22:30, Lukas Obermann <obermann.lukas at gmail.com> wrote:
> >
> >> Hi all,
> >>
> >> I want to use an RX570 for transcoding with ffmpeg. I have been looking
> >> into this for some time now and testing various things.
> >> I use Ubuntu 18.04 and I have it running with VAAPI, but the performance
> >> is not good imo. For a 1080p file I only get about 1.8x speed. I was
> >> expecting something around 6x to 8x.
> >> Is VAAPI the right way to go here? I see that AMF is not yet ready for
> >> Linux and VDPAU only supports decoding, not encoding.
> >>
> >> Following is the command:
> >> ffmpeg -hwaccel vaapi -vaapi_device /dev/dri/card1 -hwaccel_output_format
> >> vaapi -i fs_experiental_method.avi -y -c:v h264_vaapi -profile:v main
> >> output.avi
> >>
> >> ffmpeg version n4.0.2
> >> mesa 18
> >> amdgpu-pro-18.20-606296
> >> libva: VA-API version 1.1.0
> >>
> >> And here below the non-debug output of the command, to show the formats.
> >> I would appreciate any help on this.
> >>
> >> Thanks!
> >> Lukas
> >>
> >>
> >> ffmpeg version n4.0.2-2 Copyright (c) 2000-2018 the FFmpeg developers
> >>  built with gcc 7 (Ubuntu 7.3.0-16ubuntu3)
> >>  configuration: --prefix=/usr --extra-version=2 --toolchain=hardened
> >> --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu
> >> --extra-cflags=-I/usr/local/include --extra-ldflags=-L/usr/local/lib
> >> --enable-gpl --disable-stripping --enable-avresample --enable-avisynth
> >> --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray
> >> --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite
> >> --enable-libfontconfig --enable-libfreetype --enable-libfribidi
> >> --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa
> >> --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse
> >> --enable-librubberband --enable-librsvg --enable-libshine
> >> --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh
> >> --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx
> >> --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2
> >> --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx
> >> --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394
> >> --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r
> >> --enable-libx264 --enable-shared --enable-vaapi --enable-vdpau
> >>  libavutil      56. 14.100 / 56. 14.100
> >>  libavcodec     58. 18.100 / 58. 18.100
> >>  libavformat    58. 12.100 / 58. 12.100
> >>  libavdevice    58.  3.100 / 58.  3.100
> >>  libavfilter     7. 16.100 /  7. 16.100
> >>  libavresample   4.  0.  0 /  4.  0.  0
> >>  libswscale      5.  1.100 /  5.  1.100
> >>  libswresample   3.  1.100 /  3.  1.100
> >>  libpostproc    55.  1.100 / 55.  1.100
> >> Input #0, avi, from 'fs_experiental_method.avi':
> >>  Metadata:
> >>    encoder         : Lavf57.83.100
> >>  Duration: 00:33:38.10, start: 0.000000, bitrate: 8133 kb/s
> >>    Stream #0:0: Video: h264 (Constrained Baseline) (H264 / 0x34363248),
> >> yuv420p(progressive), 1920x1080, 8057 kb/s, 30 fps, 30 tbr, 30 tbn, 60 tbc
> >>    Stream #0:1: Audio: mp3 (U[0][0][0] / 0x0055), 32000 Hz, mono, fltp,
> >> 64 kb/s
> >> Stream mapping:
> >>  Stream #0:0 -> #0:0 (h264 (native) -> h264 (h264_vaapi))
> >>  Stream #0:1 -> #0:1 (mp3 (mp3float) -> mp3 (libmp3lame))
> >> Press [q] to stop, [?] for help
> >> [h264_vaapi @ 0x55fcc47055c0] B frames are not supported (0x1) by the
> >> underlying driver.
> >> [h264_vaapi @ 0x55fcc47055c0] Warning: some packed headers are not
> >> supported (want 0xd, got 0).
> >> Output #0, avi, to 'output.avi':
> >>  Metadata:
> >>    ISFT            : Lavf58.12.100
> >>    Stream #0:0: Video: h264 (h264_vaapi) (Main) (H264 / 0x34363248),
> >> vaapi_vld, 1920x1080, q=0-31, 30 fps, 30 tbn, 30 tbc
> >>    Metadata:
> >>      encoder         : Lavc58.18.100 h264_vaapi
> >>    Stream #0:1: Audio: mp3 (libmp3lame) (U[0][0][0] / 0x0055), 32000 Hz,
> >> mono, fltp
> >>    Metadata:
> >>      encoder         : Lavc58.18.100 libmp3lame
> >> frame=  202 fps= 52 q=-0.0 Lsize=    4309kB time=00:00:06.80
> >> bitrate=5187.5kbits/s speed=1.74x
> >> video:4249kB audio:40kB subtitle:0kB other streams:0kB global headers:0kB
> >> muxing overhead: 0.444606%
> >> _______________________________________________
> >> ffmpeg-user mailing list
> >> ffmpeg-user at ffmpeg.org
> >> http://ffmpeg.org/mailman/listinfo/ffmpeg-user
> >>
> >> To unsubscribe, visit link above, or email
> >> ffmpeg-user-request at ffmpeg.org with subject "unsubscribe".