[FFmpeg-user] amd hwaccel encoding + overlay filter is not working together

György Pásztor coruscant0+ffmpeg at gmail.com
Fri Mar 29 12:30:44 EET 2024


Hi,

I tried your approach. Unfortunately, it does not work:
time ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
-hwaccel_output_format vaapi -i ../WK.0217/2024_0217_S0n000.mp4 -r 1 -i
2024_0217_S0n000-f%04d.tiff -filter_complex
'[1:v]hwupload,format=vaapi[1v];[0:v][1v]overlay_vaapi' -r 7.5 -an -c:v
hevc_vaapi -crf 22 2024_0217_S0n000f-hwa2.mp4
...
[out#0/mp4 @ 0xe43b0cf1f80] Codec AVOption crf (Select the quality for
constant quality mode) has not been used for any stream. The most likely
reason is either wrong type (e.g. a video option with no video streams) or
that it is a private option of some encoder which was not actually used for
any stream.
Stream mapping:
  Stream #0:0 (hevc) -> overlay_vaapi
  Stream #1:0 (tiff) -> hwupload:default
  overlay_vaapi:default -> Stream #0:0 (hevc_vaapi)
Press [q] to stop, [?] for help
[in#1/image2 @ 0xe43aee2ca00] Thread message queue blocking; consider
raising the thread_queue_size option (current value: 8)
[Parsed_overlay_vaapi_2 @ 0xe43b0cf2340] VAAPI driver doesn't support
overlay
[Parsed_overlay_vaapi_2 @ 0xe43b0cf2340] Failed to configure output pad on
Parsed_overlay_vaapi_2
[fc#0 @ 0xe43aee6a080] Error reinitializing filters!
Failed to inject frame into filter network: Invalid argument
Error while filtering: Invalid argument
[Parsed_overlay_vaapi_2 @ 0xe43b0cf2400] VAAPI driver doesn't support
overlay
[Parsed_overlay_vaapi_2 @ 0xe43b0cf2400] Failed to configure output pad on
Parsed_overlay_vaapi_2
[fc#0 @ 0xe43aee6a080] Error reinitializing filters!
Failed to inject frame into filter network: Invalid argument
[out#0/mp4 @ 0xe43b0cf1f80] Nothing was written into output file, because
at least one of its streams received no packets.
frame=    0 fps=0.0 q=0.0 Lsize=       0kB time=N/A bitrate=N/A speed=N/A

Conversion failed!

real 3m57.578s
user 0m2.788s
sys 3m57.688s

The result is an empty output file. Is it possible that overlay_vaapi is
not supported on my CPU after all?
It's a Ryzen 5800U.
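
One way to narrow this down is to check what the VAAPI driver itself
advertises. The sketch below is only a rough check: the function name
has_videoproc is made up for illustration, and it assumes that vainfo from
libva-utils is installed. As far as I understand, overlay_vaapi prints
"VAAPI driver doesn't support overlay" when the driver's video-processing
pipeline reports no blending capability, so a listed VideoProc entrypoint
is necessary but does not by itself prove that overlay will work.

```shell
#!/bin/sh
# Reads `vainfo` output on stdin and reports whether the driver lists the
# video-processing entrypoint that VAAPI filters such as overlay_vaapi use.
# Note: this is a necessary condition only; it says nothing about blending.
has_videoproc() {
    if grep -q 'VAEntrypointVideoProc'; then
        echo "VideoProc entrypoint present"
    else
        echo "VideoProc entrypoint missing"
        return 1
    fi
}

# Typical use (assumes vainfo from libva-utils is installed):
#   vainfo --display drm --device /dev/dri/renderD128 2>/dev/null | has_videoproc
```

If the entrypoint is present and overlay_vaapi still fails, the limitation
is more likely in the driver's blending support than in the ffmpeg command
line.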

In the meantime, I came up with a different approach: do the decode and
the overlay on the CPU, and use the GPU only to encode the stream:
time ffmpeg -vaapi_device /dev/dri/renderD128 -i
../WK.0217/2024_0217_S0n000.mp4 -r 1 -i 2024_0217_S0n000-f%04d.tiff
-filter_complex 'overlay,format=yuv420p,hwupload,scale_vaapi=format=nv12'
-r 7.5 -an -c:v hevc_vaapi -qp 22 2024_0217_S0n000f-hwa1.mp4
[fc#0 @ 0x3a0550953a80] Error while filtering: Cannot allocate
memory3311.3kbits/s dup=0 drop=2922 speed=1.18x
Failed to inject frame into filter network: Cannot allocate memory
Error while filtering: Cannot allocate memory
[out#0/mp4 @ 0x3a0550986f80] video:212882kB audio:0kB subtitle:0kB other
streams:0kB global headers:0kB muxing overhead: 0.003005%
frame=  981 fps=8.8 q=-0.0 Lsize=  212888kB time=00:02:10.66
bitrate=13346.8kbits/s dup=0 drop=2933 speed=1.17x
Conversion failed!

real 1m51.626s
user 20m40.850s
sys 0m14.040s
It says the conversion failed, yet it encoded 981 frames. That is 2 frames
fewer than in the run with no acceleration at all.
If the only consequence is that the result is 2 frames shorter than I
planned, I can work around that: I just need to add some bogus ending to
the videos, or cut the original slices a bit longer. (Or just not care
about the 2 missing frames. After all, in a later phase I add the audio
back, concatenate a lot of slices like this, then speed the result up to
30 fps by dumping the video stream and generating new PTS, plus atempo*4.
At that speed, 2 missing frames here and there don't really hurt.)
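
For repeated use on many slices, the hybrid command above can be wrapped in
a small function. This is only a sketch: the names hybrid_encode,
VAAPI_DEVICE and DRY_RUN are made up for illustration, and the ffmpeg
arguments are the ones from the command above. It uses -qp rather than
-crf, because the logs of the other runs show that -crf "has not been used
for any stream" with hevc_vaapi.

```shell
#!/bin/sh
# Hybrid pipeline: decode and overlay on the CPU, encode on the GPU.
# DRY_RUN=1 prints the command instead of running it.
hybrid_encode() {
    video=$1 overlay_pattern=$2 output=$3
    device=${VAAPI_DEVICE:-/dev/dri/renderD128}
    # overlay runs in system memory; hwupload then moves the composited
    # frame to the GPU, and scale_vaapi converts it to nv12 for the encoder.
    graph='overlay,format=yuv420p,hwupload,scale_vaapi=format=nv12'
    set -- ffmpeg -vaapi_device "$device" \
        -i "$video" -r 1 -i "$overlay_pattern" \
        -filter_complex "$graph" \
        -r 7.5 -an -c:v hevc_vaapi -qp 22 "$output"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

Running it with DRY_RUN=1 first shows the exact command line before any
encoding starts.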

I tried to fine-tune your solution further:
let the hwaccel do the decode, hwdownload the video frames, use the CPU
for the overlay, then hwupload again...
But no luck this time either:
time ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
-hwaccel_output_format vaapi -vaapi_device /dev/dri/renderD128 -i
../WK.0217/2024_0217_S0n000.mp4 -r 1 -i 2024_0217_S0n000-f%04d.tiff
-filter_complex
'[0:v]hwdownload,format=yuv420p[0v];[0v][1:v]overlay,hwupload,format=vaapi'
-r 7.5 -an -c:v hevc_vaapi -crf 22 2024_0217_S0n000f-hwa3.mp4
....
[out#0/mp4 @ 0x2f69bfa76a40] Codec AVOption crf (Select the quality for
constant quality mode) has not been used for any stream. The most likely
reason is either wrong type (e.g. a video option with no video streams) or
that it is a private option of some encoder which was not actually used for
any stream.
File '2024_0217_S0n000f-hwa3.mp4' already exists. Overwrite? [y/N] y
Stream mapping:
  Stream #0:0 (hevc) -> hwdownload:default
  Stream #1:0 (tiff) -> overlay
  format:default -> Stream #0:0 (hevc_vaapi)
Press [q] to stop, [?] for help
[in#1/image2 @ 0x2f69b8a7d400] Thread message queue blocking; consider
raising the thread_queue_size option (current value: 8)
There are 2 hardware devices. device vaapi1 of type vaapi is picked for
filters by default. Set hardware device explicitly with the
filter_hw_device option if device vaapi1 is not usable for filters.
[AVHWFramesContext @ 0x2f69cce68080] Failed to read image from surface
0x24: 1 (operation failed).
[hwdownload @ 0x2f69c9b494c0] Failed to download frame: -5.
[fc#0 @ 0x2f69b8a7ba00] Error while filtering: Input/output error
Failed to inject frame into filter network: Input/output error
Error while filtering: Input/output error
[AVHWFramesContext @ 0x2f69cce68080] Failed to read image from surface
0x23: 1 (operation failed).
[hwdownload @ 0x2f69c9b494c0] Failed to download frame: -5.
[fc#0 @ 0x2f69b8a7ba00] Error while filtering: Input/output error
Failed to inject frame into filter network: Input/output error
[out#0/mp4 @ 0x2f69bfa76a40] Nothing was written into output file, because
at least one of its streams received no packets.
frame=    0 fps=0.0 q=0.0 Lsize=       0kB time=N/A bitrate=N/A speed=N/A

Conversion failed!

real 0m25.042s
user 0m2.755s
sys 0m23.686s
Without the -vaapi_device option I get the same result; it just fails
faster:
time ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
-hwaccel_output_format vaapi  -i ../WK.0217/2024_0217_S0n000.mp4 -r 1 -i
2024_0217_S0n000-f%04d.tiff -filter_complex
'[0:v]hwdownload,format=yuv420p[0v];[0v][1:v]overlay,hwupload,format=vaapi'
-r 7.5 -an -c:v hevc_vaapi -crf 22 2024_0217_S0n000f-hwa3.mp4
...
[out#0/mp4 @ 0x2456f270cf80] Codec AVOption crf (Select the quality for
constant quality mode) has not been used for any stream. The most likely
reason is either wrong type (e.g. a video option with no video streams) or
that it is a private option of some encoder which was not actually used for
any stream.
Stream mapping:
  Stream #0:0 (hevc) -> hwdownload:default
  Stream #1:0 (tiff) -> overlay
  format:default -> Stream #0:0 (hevc_vaapi)
Press [q] to stop, [?] for help
[in#1/image2 @ 0x2456f202ca00] Thread message queue blocking; consider
raising the thread_queue_size option (current value: 8)
[AVHWFramesContext @ 0x245703c68080] Failed to read image from surface
0x24: 1 (operation failed).
[hwdownload @ 0x2456fc95f7c0] Failed to download frame: -5.
[fc#0 @ 0x2456f206a080] Error while filtering: Input/output error
Failed to inject frame into filter network: Input/output error
Error while filtering: Input/output error
[AVHWFramesContext @ 0x245703c68080] Failed to read image from surface
0x23: 1 (operation failed).
[hwdownload @ 0x2456fc95f7c0] Failed to download frame: -5.
[fc#0 @ 0x2456f206a080] Error while filtering: Input/output error
Failed to inject frame into filter network: Input/output error
[out#0/mp4 @ 0x2456f270cf80] Nothing was written into output file, because
at least one of its streams received no packets.
frame=    0 fps=0.0 q=0.0 Lsize=       0kB time=N/A bitrate=N/A speed=N/A

Conversion failed!

real 0m8.649s
user 0m2.715s
sys 0m8.742s
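
One variation that may be worth trying is sketched below. It is untested on
this hardware and rests on an assumption: VAAPI decoders commonly produce
NV12 surfaces for 8-bit video, so the "Failed to read image from surface"
error might come from asking hwdownload for yuv420p directly. The sketch
downloads as nv12 instead and also converts back to nv12 before hwupload.
The function name sw_overlay_graph and the file names in the usage comment
are placeholders.

```shell
#!/bin/sh
# Prints a filter graph that downloads the decoded video from the GPU,
# overlays on the CPU, and uploads the result again for the encoder.
# $1 is the software pixel format to download to (default: nv12).
sw_overlay_graph() {
    fmt=${1:-nv12}
    printf '%s\n' "[0:v]hwdownload,format=${fmt}[0v];[0v][1:v]overlay,format=${fmt},hwupload"
}

# Hypothetical usage (input.mp4, overlay%04d.tiff and output.mp4 are placeholders):
#   ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
#     -hwaccel_output_format vaapi -i input.mp4 -r 1 -i 'overlay%04d.tiff' \
#     -filter_complex "$(sw_overlay_graph nv12)" \
#     -r 7.5 -an -c:v hevc_vaapi -qp 22 output.mp4
```

If the source turns out to be 10-bit, the format argument would need to
change accordingly; the nv12 default is only a guess for 8-bit input.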

Anyway, compare the 1m51s run with the version that uses no hardware
acceleration at all:
time ffmpeg -i /store/vedit/WK.0217/2024_0217_S0n000.mp4 -r 1 -i
2024_0217_S0n000-f%04d.tiff -filter_complex overlay -r 7.5 -an -c:v libx265
-crf 22 2024_0217_S0n000f.mp4
...
encoded 985 frames in 374.00s (2.63 fps), 13357.59 kb/s, Avg QP:21.38

real 6m15.015s
user 90m1.528s
sys 0m10.971s

The hardware-accelerated version is more than 3 times faster. It may have
some slight bugs, but I might be able to find workarounds for them.

Obviously, if I could tune it further so that the decode were also done by
the GPU, that would be awesome.
If the overlay could be done by the GPU too, that would be even better.
If not, I have to accept this partial solution I've found. It is still more
than 3 times faster.
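
The "more than 3 times" figure follows from the two "real" times reported
above (6m15.015s for libx265, 1m51.626s for the hybrid run). A small sketch
of the arithmetic, with the helper names to_seconds and speedup made up for
illustration:

```shell
#!/bin/sh
# Converts a `time`-style duration such as 6m15.015s into seconds.
to_seconds() {
    echo "$1" | LC_ALL=C awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# Prints how many times faster the second run was than the first.
speedup() {
    LC_ALL=C awk -v slow="$(to_seconds "$1")" -v fast="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", slow / fast }'
}

speedup 6m15.015s 1m51.626s   # software run vs. hybrid run; prints 3.36
```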

Thanks a million!
Gyu

Chen, Wenbin <wenbin.chen-at-intel.com at ffmpeg.org> wrote (on Fri,
Mar 29, 2024 at 2:52):

> > I just successfully tested how to use AMD GPU's hwaccel to re-encode a
> > file with ffmpeg. Using this command:
> > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > -hwaccel_output_format vaapi -i input.mp4 -c:v hevc_vaapi -c:a copy -crf 23
> > output.mp4
> > encoding is much faster.
> > Testing it on a sample file and wrapping around with time:
> > time ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > -hwaccel_output_format vaapi -i input.mp4 -c:v hevc_vaapi -c:a copy -crf 23
> > -t 60 output.mp4
> > frame= 1800 fps= 35 q=-0.0 Lsize=  294431kB time=00:00:59.98
> > bitrate=40209.7kbits/s dup=2 drop=0 speed=1.16x
> >
> > real 0m52.024s
> > user 0m2.304s
> > sys 0m10.866s
> >  ^ This is definitely using the GPU and hwaccel. During the encoding, I see
> > no cpu usage spike in top, or increase of loadavg, etc.
> >
> > While the "classic" re-encode for the same file:
> > time ffmpeg  -i input.mp4 -c:v libx265 -c:a copy -crf 23 -t 60 output.mp4
> > ...
> >
> > encoded 1800 frames in 460.63s (3.91 fps), 27046.56 kb/s, Avg QP:27.49
> >
> > real 7m41.169s
> > user 109m18.008s
> > sys 0m5.764s
> >
> > This is definitely using the CPU. During these 7 minutes and 41 seconds,
> > all the CPU cores were 100% used, the load was around the number of CPU
> > cores, etc.
> >
> > The actual command I try to execute:
> > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > -filter_complex overlay -r 7.5 -an -c:v hevc_vaapi -crf 22 output.mp4
> > But it's not working.
> > I've got this output:
> > ---------<snip>--------
> > $ ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > -filter_complex overlay -r 7.5 -an -c:v hevc_vaapi -crf 22  output.mp4
> > ffmpeg version 6.1.1 Copyright (c) 2000-2023 the FFmpeg developers
> >   built with FreeBSD clang version 16.0.6 (
> > https://github.com/llvm/llvm-project.git llvmorg-16.0.6-0-g7cbf1a259152)
> >   configuration: --prefix=/usr/local --mandir=/usr/local/man
> > --datadir=/usr/local/share/ffmpeg --docdir=/usr/local/share/doc/ffmpeg
> > --pkgconfigdir=/usr/local/libdata/pkgconfig --disable-static
> > --disable-libcelt --enable-shared --enable-pic --enable-gpl --cc=cc
> > --cxx=c++ --disable-alsa --disable-libopencore-amrnb
> > --disable-libopencore-amrwb --enable-libaom --disable-libaribb24
> > --disable-libaribcaption --enable-asm --enable-libass --disable-libbs2b
> > --disable-libcaca --disable-libcdio --disable-libcodec2 --enable-libdav1d
> > --disable-libdavs2 --disable-libdc1394 --disable-debug --enable-htmlpages
> > --enable-libdrm --disable-libfdk-aac --disable-libflite --enable-fontconfig
> > --enable-libfreetype --enable-frei0r --disable-libfribidi --disable-gcrypt
> > --disable-libglslang --disable-libgme --enable-gmp --enable-gnutls
> > --enable-version3 --disable-libgsm --enable-libharfbuzz --enable-iconv
> > --disable-libilbc --disable-libjack --enable-libjxl --disable-libklvanc
> > --disable-libkvazaar --disable-ladspa --enable-libmp3lame --enable-lcms2
> > --disable-liblensfun --disable-libbluray --enable-libplacebo
> > --disable-librsvg --disable-librtmp --enable-libxml2 --disable-lv2
> > --disable-mbedtls --disable-libmfx --disable-libmodplug --disable-libmysofa
> > --enable-network --disable-nonfree --enable-nvenc --disable-openal
> > --disable-opencl --disable-opengl --disable-libopenh264
> > --disable-libopenjpeg --disable-libopenmpt --disable-openssl
> > --disable-libopenvino --enable-optimizations --enable-libopus
> > --disable-pocketsphinx --disable-libpulse --disable-librabbitmq
> > --disable-librav1e --disable-librist --enable-runtime-cpudetect
> > --disable-librubberband --disable-sdl2 --enable-libshaderc
> > --disable-libsmbclient --disable-libsnappy --disable-sndio
> > --disable-libsoxr --disable-libspeex --disable-libsrt --disable-libssh
> > --enable-libsvtav1 --disable-libtensorflow --disable-libtesseract
> > --disable-libtheora --disable-libtwolame --disable-libuavs3d
> > --enable-libv4l2 --enable-vaapi --disable-vapoursynth --enable-vdpau
> > --disable-libvidstab --enable-libvmaf --enable-libvorbis
> > --disable-libvo-amrwbenc --disable-libvpl --enable-libvpx --enable-vulkan
> > --enable-libwebp --enable-libx264 --enable-libx265 --disable-libxavs2
> > --enable-libxcb --disable-libxvid --disable-outdev=xv --disable-libzimg
> > --disable-libzmq --disable-libzvbi
> >   libavutil      58. 29.100 / 58. 29.100
> >   libavcodec     60. 31.102 / 60. 31.102
> >   libavformat    60. 16.100 / 60. 16.100
> >   libavdevice    60.  3.100 / 60.  3.100
> >   libavfilter     9. 12.100 /  9. 12.100
> >   libswscale      7.  5.100 /  7.  5.100
> >   libswresample   4. 12.100 /  4. 12.100
> >   libpostproc    57.  3.100 / 57.  3.100
> > Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.mp4':
> >   Metadata:
> >     major_brand     : isom
> >     minor_version   : 512
> >     compatible_brands: isomiso2avc1mp41
> >     creation_time   : 2023-09-03T23:12:04.000000Z
> >     encoder         : Lavf60.3.100
> >   Duration: 00:01:15.01, start: 0.000000, bitrate: 14849 kb/s
> >   Stream #0:0[0x1](eng): Video: h264 (High) (avc1 / 0x31637661),
> > yuv420p(progressive), 1920x1080, 14750 kb/s, 30 fps, 30 tbr, 60k tbn
> > (default)
> >     Metadata:
> >       creation_time   : 2023-09-03T23:12:04.000000Z
> >       handler_name    : VideoHandler
> >       vendor_id       : [0][0][0][0]
> >       encoder         : h264
> >   Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 32000 Hz,
> > mono, fltp, 96 kb/s (default)
> >     Metadata:
> >       creation_time   : 2023-09-03T23:12:04.000000Z
> >       handler_name    : SoundHandler
> >       vendor_id       : [0][0][0][0]
> > Input #1, image2, from 'overlay%04d.png':
> >   Duration: 00:00:03.00, start: 0.000000, bitrate: N/A
> >   Stream #1:0: Video: png, rgba(pc, gbr/unknown/unknown), 1920x1080, 25
> > fps, 25 tbr, 25 tbn
> > amdgpu: os_same_file_description couldn't determine if two DRM fds
> > reference the same file description.
> > If they do, bad things may happen!
> > [out#0/mp4 @ 0x1e333e6b5600] Codec AVOption crf (Select the quality for
> > constant quality mode) has not been used for any stream. The most likely
> > reason is either wrong type (e.g. a video option with no video streams) or
> > that it is a private option of some encoder which was not actually used for
> > any stream.
> > Stream mapping:
> >   Stream #0:0 (h264) -> overlay
> >   Stream #1:0 (png) -> overlay
> >   overlay:default -> Stream #0:0 (hevc_vaapi)
> > Press [q] to stop, [?] for help
> > [in#1/image2 @ 0x1e333e62c600] Thread message queue blocking; consider
> > raising the thread_queue_size option (current value: 8)
> > Impossible to convert between the formats supported by the filter 'graph 0
> > input from stream 0:0' and the filter 'auto_scale_0'
> > [fc#0 @ 0x1e333e66a080] Error reinitializing filters!
> > Failed to inject frame into filter network: Function not implemented
> > Error while filtering: Function not implemented
> > Impossible to convert between the formats supported by the filter 'graph 0
> > input from stream 0:0' and the filter 'auto_scale_0'
> > [fc#0 @ 0x1e333e66a080] Error reinitializing filters!
> > Failed to inject frame into filter network: Function not implemented
> > [out#0/mp4 @ 0x1e333e6b5600] Nothing was written into output file, because
> > at least one of its streams received no packets.
> > frame=    0 fps=0.0 q=0.0 Lsize=       0kB time=N/A bitrate=N/A speed=N/A
> >
> > Conversion failed!
> > ---------<snap>--------
> > I'm not sure whether it's completely impossible because some feature is
> > missing from ffmpeg, from the DRI driver, or from FreeBSD (though the
> > simple task of re-encoding an mp4 file worked on this same host), or
> > whether I just need to add some extra filter so that the output of the
> > overlay filter is compatible with the encoder's / VAAPI's input.
> > Based on the error message, I can't even tell whether it's the input or
> > the output of the overlay filter that is problematic.
> > Can someone please help?
> > By googling the error message I found this, but I'm not sure how to apply
> > it to my filter, or whether it's even the same problem:
> > https://superuser.com/questions/1633883/ffmpeg-hevc-vaapi-impossible-to-convert-between-the-formats-supported-by-the-fi
> > Based on the answers there, I tried several variations, but without any
> > luck:
> > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > -filter_complex overlay -r 7.5 -an -c:v hevc_vaapi -crf 22  output.mp4
> > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > -filter_complex 'overlay,format=nv12|vaapi,hwupload' -r 7.5 -an -c:v
> > hevc_vaapi -crf 22  output.mp4
> > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > -filter_complex 'overlay,format=yuv420p|vaapi,hwupload' -r 7.5 -an -c:v
> > hevc_vaapi -crf 22  output.mp4
> > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > -filter_complex 'overlay,hwupload' -r 7.5 -an -c:v hevc_vaapi -crf 22
> >  output.mp4
> > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > -filter_complex 'overlay,format=nv12|vaapi,hwupload' -r 7.5 -an -c:v
> > hevc_vaapi -crf 22  output.mp4
> > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > -filter_complex 'overlay,format=yuv420p|vaapi,hwupload' -r 7.5 -an -c:v
> > hevc_vaapi -crf 22  output.mp4
> > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > -filter_complex 'overlay,format=yuv420p,hwupload' -r 7.5 -an -c:v
> > hevc_vaapi -crf 22  output.mp4
> >
> > Thank you for any help!
> > Gyu
>
> The command should be changed to something like this:
>  ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
>  -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
>  -filter_complex "[1:v]hwupload,format=vaapi[1v];[0:v][1v]overlay_vaapi"
>  -r 7.5 -an -c:v hevc_vaapi -crf 22 output.mp4
>
> When you use "-hwaccel_output_format vaapi", the decoded frames are stored
> in device memory. The classic overlay filter cannot handle frames in device
> memory, so you need to use overlay_vaapi.
> PNG images are decoded to system memory, so for the same reason you need to
> upload them to device memory.
>
> - Wenbin
>
> > _______________________________________________
> > ffmpeg-user mailing list
> > ffmpeg-user at ffmpeg.org
> > https://ffmpeg.org/mailman/listinfo/ffmpeg-user
> >
> > To unsubscribe, visit link above, or email
> > ffmpeg-user-request at ffmpeg.org with subject "unsubscribe".

