[FFmpeg-user] amd hwaccel encoding + overlay filter is not working together
György Pásztor
coruscant0+ffmpeg at gmail.com
Tue Apr 2 00:12:11 EEST 2024
Trying your way, my command was:
ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
-hwaccel_output_format vaapi -i /store/vedit/WK.0217/2024_0217_S0n000.mp4 \
-r 1 -i 2024_0217_S0n000-f%04d.tiff -filter_complex \
'[0:v]hwdownload,format=nv12[0v];[0v][1:v]overlay,hwupload,scale_vaapi=format=vaapi' \
-r 7.5 -an -c:v hevc_vaapi -qp 22 2024_0217_S0n000f.mp4
-----<snip>-----
Stream mapping:
Stream #0:0 (hevc) -> hwdownload:default
Stream #1:0 (tiff) -> overlay
scale_vaapi:default -> Stream #0:0 (hevc_vaapi)
Press [q] to stop, [?] for help
[in#1/image2 @ 0x3316b542ca00] Thread message queue blocking; consider
raising the thread_queue_size option (current value: 8)
[Parsed_scale_vaapi_4 @ 0x3316b5b12580] Hardware does not support output
format vaapi.
[Parsed_scale_vaapi_4 @ 0x3316b5b12580] Failed to configure output pad on
Parsed_scale_vaapi_4
[fc#0 @ 0x3316b546a080] Error reinitializing filters!
Failed to inject frame into filter network: Invalid argument
Error while filtering: Invalid argument
[Parsed_scale_vaapi_4 @ 0x3316b5b12640] Hardware does not support output
format vaapi.
[Parsed_scale_vaapi_4 @ 0x3316b5b12640] Failed to configure output pad on
Parsed_scale_vaapi_4
[fc#0 @ 0x3316b546a080] Error reinitializing filters!
Failed to inject frame into filter network: Invalid argument
[out#0/mp4 @ 0x3316b5b12040] Nothing was written into output file, because
at least one of its streams received no packets.
frame= 0 fps=0.0 q=0.0 Lsize= 0kB time=N/A bitrate=N/A speed=N/A
Conversion failed!
-----<snap>-----
I think I tried format=vaapi earlier and got the same error message;
that is why I switched to format=nv12.
So I tried this:
ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
-hwaccel_output_format vaapi -i /store/vedit/WK.0217/2024_0217_S0n000.mp4 \
-r 1 -i 2024_0217_S0n000-f%04d.tiff -filter_complex \
'[0:v]hwdownload[0v];[0v][1:v]overlay,hwupload,scale_vaapi=format=nv12' \
-r 7.5 -an -c:v hevc_vaapi -qp 22 2024_0217_S0n000f.mp4
But for this command, I got this error message:
-----<snip>-----
Stream mapping:
Stream #0:0 (hevc) -> hwdownload:default
Stream #1:0 (tiff) -> overlay
scale_vaapi:default -> Stream #0:0 (hevc_vaapi)
Press [q] to stop, [?] for help
[in#1/image2 @ 0x1863c6c2da00] Thread message queue blocking; consider
raising the thread_queue_size option (current value: 8)
[hwdownload @ 0x1863d1158700] Invalid output format yuva420p for hwframe
download.
[Parsed_hwdownload_0 @ 0x1863c730d280] Failed to configure output pad on
Parsed_hwdownload_0
[fc#0 @ 0x1863c6c24080] Error reinitializing filters!
Failed to inject frame into filter network: Invalid argument
Error while filtering: Invalid argument
[hwdownload @ 0x1863d1158900] Invalid output format yuva420p for hwframe
download.
[Parsed_hwdownload_0 @ 0x1863c730d340] Failed to configure output pad on
Parsed_hwdownload_0
[fc#0 @ 0x1863c6c24080] Error reinitializing filters!
Failed to inject frame into filter network: Invalid argument
[out#0/mp4 @ 0x1863c730d040] Nothing was written into output file, because
at least one of its streams received no packets.
frame= 0 fps=0.0 q=0.0 Lsize= 0kB time=N/A bitrate=N/A speed=N/A
Conversion failed!
-----<snap>-----
So, because of this error message, I tried this version:
ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
-hwaccel_output_format vaapi -i /store/vedit/WK.0217/2024_0217_S0n000.mp4 \
-r 1 -i 2024_0217_S0n000-f%04d.tiff -filter_complex \
'[0:v]hwdownload[0v];[0v][1:v]overlay,format=yuv420p,hwupload,scale_vaapi=format=nv12' \
-r 7.5 -an -c:v hevc_vaapi -qp 22 2024_0217_S0n000f.mp4
But that gave me this error message:
-----<snip>-----
Stream mapping:
Stream #0:0 (hevc) -> hwdownload:default
Stream #1:0 (tiff) -> overlay
scale_vaapi:default -> Stream #0:0 (hevc_vaapi)
Press [q] to stop, [?] for help
[in#1/image2 @ 0xea47fc2ca00] Thread message queue blocking; consider
raising the thread_queue_size option (current value: 8)
[AVHWFramesContext @ 0xea491068080] Failed to read image from surface 0x24:
1 (operation failed).
[hwdownload @ 0xea48a34f740] Failed to download frame: -5.
[fc#0 @ 0xea47fc6a080] Error while filtering: Input/output error
Failed to inject frame into filter network: Input/output error
Error while filtering: Input/output error
[AVHWFramesContext @ 0xea491068080] Failed to read image from surface 0x23:
1 (operation failed).
[hwdownload @ 0xea48a34f740] Failed to download frame: -5.
[fc#0 @ 0xea47fc6a080] Error while filtering: Input/output error
Failed to inject frame into filter network: Input/output error
[out#0/mp4 @ 0xea480312040] Nothing was written into output file, because
at least one of its streams received no packets.
frame= 0 fps=0.0 q=0.0 Lsize= 0kB time=N/A bitrate=N/A speed=N/A
Conversion failed!
-----<snap>-----
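One combination the three attempts above do not cover is pinning nv12 on both sides of the CPU overlay: format=nv12 directly after hwdownload (so hwdownload is not asked for yuva420p) and format=nv12 again before hwupload (so no scale_vaapi is needed). The sketch below only builds and prints such a command. It is untested on this hardware, the file names in.mp4, ov%04d.tiff and out.mp4 are placeholders, and whether hwdownload can read the surface at all on this driver stack remains the open question.

```shell
#!/bin/sh
# Dry run only: build the command and print it; nothing is executed.
# in.mp4, ov%04d.tiff and out.mp4 are placeholder names (assumptions).
dev=/dev/dri/renderD128
# Pin nv12 right after hwdownload, overlay on the CPU, convert the
# (possibly yuva420p) result back to nv12, then upload for the encoder.
filter='[0:v]hwdownload,format=nv12[0v];[0v][1:v]overlay,format=nv12,hwupload'
cmd="ffmpeg -hwaccel vaapi -hwaccel_device $dev -hwaccel_output_format vaapi -i in.mp4 -r 1 -i ov%04d.tiff -filter_complex '$filter' -r 7.5 -an -c:v hevc_vaapi -qp 22 out.mp4"
printf '%s\n' "$cmd"
```

The scale_vaapi format option names the software layout of the hardware surface (nv12 and similar), which would explain why format=vaapi was rejected as an output format in the first attempt.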
Now I'm out of ideas.
At least the approach I found earlier works, with some fine-tuning.
Originally, I had this:
ffmpeg -vaapi_device /dev/dri/renderD128 -i
/store/vedit/WK.0217/2024_0217_S0n000.mp4 -r 1 -i
2024_0217_S0n000-f%04d.tiff -filter_complex
overlay,format=yuv420p,hwupload,scale_vaapi=format=nv12 -r 30000/1001 -an
-c:v hevc_vaapi -qp 22 2024_0217_S0n000f.mp4
I added format=yuv420p because otherwise ffmpeg complained about the
conversion from yuva420p.
As far as I remember, the output was also somewhat wrong. I can't recall
exactly how; it may have looked as if it had been converted to b&w or
grayscale.
That's why I added format=yuv420p to the filter chain.
That fixed the problem, and the conversion was also 5-15% faster.
The problem with this was that my next steps are to add the audio back,
concatenate the files, and then speed the result up 4 times.
This process produced a file that made my desktop freeze. This was in my
dmesg:
-----<snip>-----
Apr 1 03:35:38 orion kernel: [drm ERROR :amdgpu_job_timedout] ring vcn_dec
timeout, signaled seq=5990, emitted seq=5992
Apr 1 03:35:38 orion kernel: [drm ERROR :amdgpu_job_timedout] Process
information: process pid 101419 thread pid 101419
Apr 1 03:35:38 orion kernel: drmn0: GPU reset begin!
Apr 1 03:35:38 orion kernel: [drm] Register(0) [mmUVD_POWER_STATUS] failed
to reach value 0x00000001 != 0x00000002
Apr 1 03:35:38 orion kernel: [drm] Register(0) [mmUVD_RBC_RB_RPTR] failed
to reach value 0x00000100 != 0x000000e0
Apr 1 03:35:39 orion kernel: [drm] Register(0) [mmUVD_POWER_STATUS] failed
to reach value 0x00000001 != 0x00000002
Apr 1 03:35:39 orion kernel: [drm] free PSP TMR buffer
Apr 1 03:35:39 orion kernel: drmn0: MODE2 reset
Apr 1 03:35:39 orion kernel: drmn0: GPU reset succeeded, trying to resume
Apr 1 03:35:39 orion kernel: [drm] PCIE GART of 1024M enabled.
Apr 1 03:35:39 orion kernel: [drm] PTB located at 0x000000F400900000
Apr 1 03:35:39 orion kernel: [drm] PSP is resuming...
Apr 1 03:35:39 orion kernel: [drm] reserve 0x400000 from 0xf41f800000 for
PSP TMR
Apr 1 03:35:39 orion kernel: drmn0: RAS: optional ras ta ucode is not
available
Apr 1 03:35:39 orion kernel: drmn0: RAP: optional rap ta ucode is not
available
Apr 1 03:35:39 orion kernel: drmn0: SECUREDISPLAY: securedisplay ta ucode
is not available
Apr 1 03:35:39 orion kernel: drmn0: SMU is resuming...
Apr 1 03:35:39 orion kernel: drmn0: SMU is resumed successfully!
Apr 1 03:35:39 orion kernel: [drm] kiq ring mec 2 pipe 1 q 0
Apr 1 03:35:39 orion kernel: [drm] DMUB hardware initialized:
version=0x01010027
Apr 1 03:35:39 orion kernel: [drm] VCN decode and encode initialized
successfully(under DPG Mode).
Apr 1 03:35:39 orion kernel: [drm] JPEG decode initialized successfully.
Apr 1 03:35:39 orion kernel: drmn0: ring gfx uses VM inv eng 0 on hub 0
Apr 1 03:35:39 orion kernel: drmn0: ring comp_1.0.0 uses VM inv eng 1 on
hub 0
Apr 1 03:35:39 orion kernel: drmn0: ring comp_1.1.0 uses VM inv eng 4 on
hub 0
Apr 1 03:35:39 orion kernel: drmn0: ring comp_1.2.0 uses VM inv eng 5 on
hub 0
Apr 1 03:35:39 orion kernel: drmn0: ring comp_1.3.0 uses VM inv eng 6 on
hub 0
Apr 1 03:35:39 orion kernel: drmn0: ring comp_1.0.1 uses VM inv eng 7 on
hub 0
Apr 1 03:35:39 orion kernel: drmn0: ring comp_1.1.1 uses VM inv eng 8 on
hub 0
Apr 1 03:35:39 orion kernel: drmn0: ring comp_1.2.1 uses VM inv eng 9 on
hub 0
Apr 1 03:35:39 orion kernel: drmn0: ring comp_1.3.1 uses VM inv eng 10 on
hub 0
Apr 1 03:35:39 orion kernel: drmn0: ring kiq_2.1.0 uses VM inv eng 11 on
hub 0
Apr 1 03:35:39 orion kernel: drmn0: ring sdma0 uses VM inv eng 0 on hub 1
Apr 1 03:35:39 orion kernel: drmn0: ring vcn_dec uses VM inv eng 1 on hub 1
Apr 1 03:35:39 orion kernel: drmn0: ring vcn_enc0 uses VM inv eng 4 on hub
1
Apr 1 03:35:39 orion kernel: drmn0: ring vcn_enc1 uses VM inv eng 5 on hub
1
Apr 1 03:35:39 orion kernel: drmn0: ring jpeg_dec uses VM inv eng 6 on hub
1
Apr 1 03:35:39 orion kernel: drmn0: recover vram bo from shadow start
Apr 1 03:35:39 orion kernel: drmn0: recover vram bo from shadow done
Apr 1 03:35:39 orion kernel: [drm] Skip scheduling IBs!
Apr 1 03:35:39 orion syslogd: last message repeated 1 times
Apr 1 03:35:39 orion kernel: drmn0: GPU reset(1) succeeded!
Apr 1 03:35:49 orion kernel: [drm ERROR :amdgpu_job_timedout] ring vcn_dec
timeout, signaled seq=5994, emitted seq=5994
Apr 1 03:35:49 orion kernel: [drm ERROR :amdgpu_job_timedout] Process
information: process pid 101419 thread pid 101419
Apr 1 03:35:49 orion kernel: drmn0: GPU reset begin!
-----<snap>-----
At least I could ssh in and do a graceful reboot remotely.
I checked: it wasn't a one-off. That process produced a file which can
freeze my desktop at any time:
-----<snip>-----
[pasztor at orion ~]$ grep GPU.reset /var/log/messages
Apr 1 02:55:40 orion kernel: drmn0: GPU reset begin!
Apr 1 02:55:41 orion kernel: drmn0: GPU reset succeeded, trying to resume
Apr 1 02:55:42 orion kernel: drmn0: GPU reset(1) succeeded!
Apr 1 02:55:52 orion kernel: drmn0: GPU reset begin!
Apr 1 03:06:40 orion kernel: drmn0: GPU reset begin!
Apr 1 03:06:41 orion kernel: drmn0: GPU reset succeeded, trying to resume
Apr 1 03:06:41 orion kernel: drmn0: GPU reset(1) succeeded!
Apr 1 03:06:51 orion kernel: drmn0: GPU reset begin!
Apr 1 03:35:38 orion kernel: drmn0: GPU reset begin!
Apr 1 03:35:39 orion kernel: drmn0: GPU reset succeeded, trying to resume
Apr 1 03:35:39 orion kernel: drmn0: GPU reset(1) succeeded!
Apr 1 03:35:49 orion kernel: drmn0: GPU reset begin!
-----<snap>-----
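A quick way to see the pattern in that output is to count started versus completed resets. The sketch below does this on the first four lines of the excerpt, copied into a temporary file; on the real system the grep calls would be pointed at /var/log/messages instead.

```shell
#!/bin/sh
# Sample lines copied from the log excerpt above; the temp file stands in
# for /var/log/messages so the sketch runs anywhere.
log=$(mktemp)
cat > "$log" <<'EOF'
Apr 1 02:55:40 orion kernel: drmn0: GPU reset begin!
Apr 1 02:55:41 orion kernel: drmn0: GPU reset succeeded, trying to resume
Apr 1 02:55:42 orion kernel: drmn0: GPU reset(1) succeeded!
Apr 1 02:55:52 orion kernel: drmn0: GPU reset begin!
EOF
# Count how many resets were started and how many reported completion.
begun=$(grep -c 'GPU reset begin' "$log")
completed=$(grep -c 'GPU reset(1) succeeded' "$log")
echo "resets started: $begun, completed: $completed"
rm -f "$log"
```

In each episode the first reset completes and a second one starts about ten seconds later without a matching completion line.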
To speed up the concatenated result I used this function, which produced
the mp4 file that can freeze my desktop:
https://github.com/pasztor/sjdemux/blob/v1.0.8/vedit.sh#L59
Instead of that, I tried re-encoding, just to see whether that would also
freeze my desktop: https://github.com/pasztor/sjdemux/blob/v1.0.8/vedit.sh#L68
That command used hw acceleration end to end and produced a playable file,
which doesn't crash my desktop/gpu.
However, re-encoding the video took a lot of time. (I mean the full one
I'm currently working on, from this weekend. The one from the 17th of
February is "released": https://youtu.be/utONQaRCgfg )
I didn't take notes, but for a test I just used the slice that I generated
earlier with the command I shared previously.
Then I tried to add the audio back:
https://github.com/pasztor/sjdemux/blob/v1.0.8/vedit.sh#L96
[pasztor at orion /store/vedit/WKO.0217]$ replaceaudio
2024_0217_S0n000f-hwa1.mp4 ../WK.0217/2024_0217_S0n000.mp4
Then I sped it up:
https://github.com/pasztor/sjdemux/blob/v1.0.8/vedit.sh#L59 again
[pasztor at orion /store/vedit/WKO.0217]$ speedup4raw265af
2024_0217_S0n000f-hwa1a.mp4
This produced the file 2024_0217_S0n000f-hwa1as4.mp4, which can also
crash my desktop.
So it seems I don't even need to concatenate this with other files. It is
enough to apply ffmpeg's instructions for speeding up without re-encoding
to a file that was overlaid and hw-encoded:
https://trac.ffmpeg.org/wiki/How%20to%20speed%20up%20/%20slow%20down%20a%20video#rawbitstreammethod
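For reference, the raw bitstream method on that wiki page comes down to two stream-copy steps: dump the video to a raw Annex B stream, then remux it with freshly generated timestamps at the new frame rate. The sketch below prints the HEVC variant as a dry run. The file names are placeholders, audio is not handled, and it is not necessarily identical to what the vedit.sh function does.

```shell
#!/bin/sh
# Dry run: print the two steps of the wiki's raw bitstream method for HEVC.
# in.mp4, raw.h265 and fast.mp4 are placeholder names (assumptions).
fps=30   # new frame rate; 7.5 fps material played at 30 fps is 4x faster
# Step 1: copy the video stream out of the mp4 into a raw Annex B file.
step1="ffmpeg -i in.mp4 -map 0:v -c:v copy -bsf:v hevc_mp4toannexb raw.h265"
# Step 2: read the raw stream at the new rate and regenerate timestamps.
step2="ffmpeg -fflags +genpts -r $fps -i raw.h265 -c:v copy fast.mp4"
printf '%s\n' "$step1" "$step2"
```

Neither step decodes or encodes anything, so any oddity in the original hw-encoded bitstream is carried over unchanged into the sped-up file.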
I've shared the resulting file here: https://pasztor.xyz/BSDebug/
I'm not sure where to report this. Probably some drm-dev list would be the
right place. But if my understanding is correct, FreeBSD just reuses the
drm modules straight from the Linux source.
As I shared earlier, I generated the file this way:
ffmpeg -vaapi_device /dev/dri/renderD128 -i
/store/vedit/WK.0217/2024_0217_S0n000.mp4 -r 1 -i
2024_0217_S0n000-f%04d.tiff -filter_complex
overlay,format=yuv420p,hwupload,scale_vaapi=format=nv12 -r 7.5 -an -c:v
hevc_vaapi -qp 22 2024_0217_S0n000f-hwa1.mp4
Now I have modified it to do the speed-up in the same pass, so I won't
need to dump the stream and regenerate the PTS info for the frames.
ffmpeg -vaapi_device /dev/dri/renderD128 -i
/store/vedit/WK.0217/2024_0217_S0n000.mp4 -r 1 -i
2024_0217_S0n000-f%04d.tiff -filter_complex
overlay,setpts=0.25*PTS,format=yuv420p,hwupload,scale_vaapi=format=nv12 -r
30000/1001 -an -c:v hevc_vaapi -qp 22 2024_0217_S0n000f.mp4
This keeps the original frame rate but speeds up the video at the same time,
and it produces a file that I can concatenate with others without worrying
about ending up with a file which crashes my desktop.
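The relation between the speed-up and the two filter arguments is simple: setpts multiplies every timestamp, so it takes the reciprocal of the speed-up, while atempo takes the speed-up itself. A minimal sketch that derives both filter strings from one hypothetical variable, speed:

```shell
#!/bin/sh
speed=4   # desired speed-up factor (hypothetical variable name)
# setpts scales timestamps, so a 4x speed-up needs a factor of 1/4.
pts_factor=$(LC_ALL=C awk -v s="$speed" 'BEGIN { printf "%g", 1 / s }')
vfilter="setpts=${pts_factor}*PTS"
# atempo is expressed as the tempo multiplier directly.
afilter="atempo=${speed}"
echo "$vfilter"
echo "$afilter"
```

With speed=4 this yields the setpts=0.25*PTS used in the command above and the atempo=4 used in the recode commands below.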
Btw, full hw assisted recode:
time ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
-hwaccel_output_format vaapi -i 2024_0217_S0n000f-hwa1a.mp4 -filter_complex
'[0:v]setpts=0.25*PTS[v];[0:a]atempo=4[a]' -map '[v]' -map '[a]' -r 30 -c:v
hevc_vaapi -qp 22 2024_0217_S0n000f-hwa1as4.re.fhw.mp4
...
real 0m23.791s
user 0m2.260s
sys 0m0.669s
During this, there was no increase in cpu usage.
Vs. sw decode, hw encode. At least, that is what I presume this command is
doing:
time ffmpeg -vaapi_device /dev/dri/renderD128 -i
2024_0217_S0n000f-hwa1a.mp4 -filter_complex
'[0:v]setpts=0.25*PTS,hwupload,scale_vaapi=format=nv12[v];[0:a]atempo=4[a]'
-map '[v]' -map '[a]' -r 30 -c:v hevc_vaapi -qp 22
2024_0217_S0n000f-hwa1as4.re.swdechwenc.mp4
...
real 0m16.580s
user 1m57.586s
sys 0m1.398s
During this, all 16 cpu threads were at roughly 50% for the duration of
the re-encode.
Based on this, I don't mind if the decode (and the overlay) is done by the
cpu and only the encode is done by the gpu.
Looking at the big picture, this way I gain time and save electricity.
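Putting numbers on the two runs above: the sketch below takes the real and user times as printed by time and computes the wall-clock ratio and the average number of busy CPU threads for the sw-decode run. Only the variable names are made up; the values are the ones from the two outputs.

```shell
#!/bin/sh
# Wall-clock ("real") seconds of the full-hw run, and real/user seconds of
# the sw-decode + hw-encode run, taken from the two outputs above.
hw_real=23.791
sw_real=16.580
sw_user=117.586   # 1m57.586s
# How much longer the full-hw run took in wall-clock time.
ratio=$(LC_ALL=C awk -v a="$hw_real" -v b="$sw_real" 'BEGIN { printf "%.2f", a / b }')
# user/real approximates the average number of fully busy CPU threads.
busy=$(LC_ALL=C awk -v u="$sw_user" -v r="$sw_real" 'BEGIN { printf "%.1f", u / r }')
echo "full-hw run takes ${ratio}x the wall time of sw-decode/hw-encode"
echo "sw-decode run keeps about ${busy} CPU threads busy on average"
```

About 7 busy threads on average is in line with the observation that all 16 threads sat at around 50%.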
What I think are the remaining interesting problems:
- ffmpeg producing a file which can crash my desktop
- my desktop / drivers allowing a userland program (mpv) to put the system
into an undesired state
Btw, I have a few Intels at home, but the NAS where I store all these files
for the long term is a ~10 yrs old Xeon E3-1230 v3. I wanted to change it
back to my previous SuperMicro motherboard which has a Xeon D1528, but I
had more urgent things to do.
I don't think my old D54250WYK nuc desktop (i5-4250U) would outperform this
Topton with the Ryzen 5800U :D
I also have two router miniPCs (just for the sake of redundancy). One has
an N100 cpu, and the other has a Pentium Gold 8505.
I installed the drm-kmod package on both.
When I ran the kldload i915kms command on the N100, no /dev/dri directory
appeared on the system.
When I ran the kldload i915kms command on the Pentium Gold router box...
Well, it froze so badly that I had to long-press the power button to reset
it. (God bless the redundant network core design in my home :D)
That's also a Topton minipc. Cool stuff as a router. Has 2 Intel 10G SFP+
ports and 4 intel i226v 2.5Gbit ports. Nothing beats Intel on the
networking chips.
But if you have any good advice on where to look for something with low
power usage but a great desktop experience, like the Ryzen 5800U, I'm all
ears!
It's all my home hobby gear, I don't have an enterprise budget for
electricity. (Besides, I need new chains and sprockets.)
Thanks a million,
Gyu
Chen, Wenbin <wenbin.chen-at-intel.com at ffmpeg.org> wrote (on Mon, Apr 1,
2024, 2:30):
> > I tried your way. Unfortunately it's not working.
> > time ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > -hwaccel_output_format vaapi -i ../WK.0217/2024_0217_S0n000.mp4 -r 1 -i
> > 2024_0217_S0n000-f%04d.tiff -filter_complex
> > '[1:v]hwupload,format=vaapi[1v];[0:v][1v]overlay_vaapi' -r 7.5 -an -c:v
> > hevc_vaapi -crf 22 2024_0217_S0n000f-hwa2.mp4
> > ...
> > [out#0/mp4 @ 0xe43b0cf1f80] Codec AVOption crf (Select the quality for
> > constant quality mode) has not been used for any stream. The most likely
> > reason is either wrong type (e.g. a video option with no video streams)
> or
> > that it is a private option of some encoder which was not actually used
> for
> > any stream.
> > Stream mapping:
> > Stream #0:0 (hevc) -> overlay_vaapi
> > Stream #1:0 (tiff) -> hwupload:default
> > overlay_vaapi:default -> Stream #0:0 (hevc_vaapi)
> > Press [q] to stop, [?] for help
> > [in#1/image2 @ 0xe43aee2ca00] Thread message queue blocking; consider
> > raising the thread_queue_size option (current value: 8)
> > [Parsed_overlay_vaapi_2 @ 0xe43b0cf2340] VAAPI driver doesn't support
> > [Parsed_overlay_vaapi_2 @ 0xe43b0cf2340] Failed to configure output pad
> > on
> > Parsed_overlay_vaapi_2
> > [fc#0 @ 0xe43aee6a080] Error reinitializing filters!
> > Failed to inject frame into filter network: Invalid argument
> > Error while filtering: Invalid argument
> > [Parsed_overlay_vaapi_2 @ 0xe43b0cf2400] VAAPI driver doesn't support
> > overlay
> > [Parsed_overlay_vaapi_2 @ 0xe43b0cf2400] Failed to configure output pad
> > on
> > Parsed_overlay_vaapi_2
> > [fc#0 @ 0xe43aee6a080] Error reinitializing filters!
> > Failed to inject frame into filter network: Invalid argument
> > [out#0/mp4 @ 0xe43b0cf1f80] Nothing was written into output file, because
> > at least one of its streams received no packets.
> > frame= 0 fps=0.0 q=0.0 Lsize= 0kB time=N/A bitrate=N/A speed=N/A
> >
> > Conversion failed!
> >
> > real 3m57.578s
> > user 0m2.788s
> > sys 3m57.688s
> >
> > And an empty output. Is it possible that overlay_vaapi is not supported
> > on my cpu after all?
> > It's a Ryzen 5800U.
> >
> > In the meantime, I came up with a different approach: do the decode and
> > overlay by the cpu, and use the gpu to encode the stream:
> > time ffmpeg -vaapi_device /dev/dri/renderD128 -i
> > ../WK.0217/2024_0217_S0n000.mp4 -r 1 -i 2024_0217_S0n000-f%04d.tiff
> > -filter_complex 'overlay,format=yuv420p,hwupload,scale_vaapi=format=nv12'
> > -r 7.5 -an -c:v hevc_vaapi -qp 22 2024_0217_S0n000f-hwa1.mp4
> > [fc#0 @ 0x3a0550953a80] Error while filtering: Cannot allocate
> > memory3311.3kbits/s dup=0 drop=2922 speed=1.18x
> > Failed to inject frame into filter network: Cannot allocate memory
> > Error while filtering: Cannot allocate memory
> > [out#0/mp4 @ 0x3a0550986f80] video:212882kB audio:0kB subtitle:0kB
> > other
> > streams:0kB global headers:0kB muxing overhead: 0.003005%
> > frame= 981 fps=8.8 q=-0.0 Lsize= 212888kB time=00:02:10.66
> > bitrate=13346.8kbits/s dup=0 drop=2933 speed=1.17x
> > Conversion failed!
> >
> > real 1m51.626s
> > user 20m40.850s
> > sys 0m14.040s
> > Though it says the conversion failed, it encoded 981 frames, 2 frames
> > fewer than in the case with no acceleration at all.
> > If the only consequence is that the result will be 2 frames shorter than I
> > planned, I can solve that: I just need to add some bogus ending to the
> > videos, or cut the original slices longer. (Or just not care about the
> > missing 2 frames. After all, in a later phase I add the audio back, concat
> > a lot of slices like this, then speed up to 30 fps by dumping the video
> > stream, then generating PTS + atempo*4 -> at that speed, 2 missing frames
> > here and there don't really hurt.)
> >
> > I tried to fine tune your solution further:
> > Let the hwaccel do the decode, and hwdownload the video frame, then use
> > the
> > cpu for the overlay, then hwupload again...
> > But no luck again:
> > time ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > -hwaccel_output_format vaapi -vaapi_device /dev/dri/renderD128 -i
> > ../WK.0217/2024_0217_S0n000.mp4 -r 1 -i 2024_0217_S0n000-f%04d.tiff
> > -filter_complex
> > '[0:v]hwdownload,format=yuv420p[0v];[0v][1:v]overlay,hwupload,format=vaapi'
> > -r 7.5 -an -c:v hevc_vaapi -crf 22 2024_0217_S0n000f-hwa3.mp4
> > ....
> > [out#0/mp4 @ 0x2f69bfa76a40] Codec AVOption crf (Select the quality for
> > constant quality mode) has not been used for any stream. The most likely
> > reason is either wrong type (e.g. a video option with no video streams)
> or
> > that it is a private option of some encoder which was not actually used
> for
> > any stream.
> > File '2024_0217_S0n000f-hwa3.mp4' already exists. Overwrite? [y/N] y
> > Stream mapping:
> > Stream #0:0 (hevc) -> hwdownload:default
> > Stream #1:0 (tiff) -> overlay
> > format:default -> Stream #0:0 (hevc_vaapi)
> > Press [q] to stop, [?] for help
> > [in#1/image2 @ 0x2f69b8a7d400] Thread message queue blocking; consider
> > raising the thread_queue_size option (current value: 8)
> > There are 2 hardware devices. device vaapi1 of type vaapi is picked for
> > filters by default. Set hardware device explicitly with the
> > filter_hw_device option if device vaapi1 is not usable for filters.
> > [AVHWFramesContext @ 0x2f69cce68080] Failed to read image from surface
> > 0x24: 1 (operation failed).
> > [hwdownload @ 0x2f69c9b494c0] Failed to download frame: -5.
> > [fc#0 @ 0x2f69b8a7ba00] Error while filtering: Input/output error
> > Failed to inject frame into filter network: Input/output error
> > Error while filtering: Input/output error
> > [AVHWFramesContext @ 0x2f69cce68080] Failed to read image from surface
> > 0x23: 1 (operation failed).
> > [hwdownload @ 0x2f69c9b494c0] Failed to download frame: -5.
> > [fc#0 @ 0x2f69b8a7ba00] Error while filtering: Input/output error
> > Failed to inject frame into filter network: Input/output error
> > [out#0/mp4 @ 0x2f69bfa76a40] Nothing was written into output file,
> because
> > at least one of its streams received no packets.
> > frame= 0 fps=0.0 q=0.0 Lsize= 0kB time=N/A bitrate=N/A speed=N/A
> >
> > Conversion failed!
> >
> > real 0m25.042s
> > user 0m2.755s
> > sys 0m23.686s
> > Without the vaapi_device option, I get the same result, just it fails
> > faster:
> > time ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > -hwaccel_output_format vaapi -i ../WK.0217/2024_0217_S0n000.mp4 -r 1 -i
> > 2024_0217_S0n000-f%04d.tiff -filter_complex
> > '[0:v]hwdownload,format=yuv420p[0v];[0v][1:v]overlay,hwupload,format=vaapi'
> > -r 7.5 -an -c:v hevc_vaapi -crf 22 2024_0217_S0n000f-hwa3.mp4
> > ...
> > [out#0/mp4 @ 0x2456f270cf80] Codec AVOption crf (Select the quality for
> > constant quality mode) has not been used for any stream. The most likely
> > reason is either wrong type (e.g. a video option with no video streams)
> or
> > that it is a private option of some encoder which was not actually used
> for
> > any stream.
> > Stream mapping:
> > Stream #0:0 (hevc) -> hwdownload:default
> > Stream #1:0 (tiff) -> overlay
> > format:default -> Stream #0:0 (hevc_vaapi)
> > Press [q] to stop, [?] for help
> > [in#1/image2 @ 0x2456f202ca00] Thread message queue blocking; consider
> > raising the thread_queue_size option (current value: 8)
> > [AVHWFramesContext @ 0x245703c68080] Failed to read image from surface
> > 0x24: 1 (operation failed).
> > [hwdownload @ 0x2456fc95f7c0] Failed to download frame: -5.
> > [fc#0 @ 0x2456f206a080] Error while filtering: Input/output error
> > Failed to inject frame into filter network: Input/output error
> > Error while filtering: Input/output error
> > [AVHWFramesContext @ 0x245703c68080] Failed to read image from surface
> > 0x23: 1 (operation failed).
> > [hwdownload @ 0x2456fc95f7c0] Failed to download frame: -5.
> > [fc#0 @ 0x2456f206a080] Error while filtering: Input/output error
> > Failed to inject frame into filter network: Input/output error
> > [out#0/mp4 @ 0x2456f270cf80] Nothing was written into output file,
> because
> > at least one of its streams received no packets.
> > frame= 0 fps=0.0 q=0.0 Lsize= 0kB time=N/A bitrate=N/A speed=N/A
> >
> > Conversion failed!
>
> It seems that your environment doesn't support hardware overlay. I don't
> know whether that is caused by your hardware or by your driver. I tested
> on my machine, and it works. I am using an Intel platform.
>
> I see that you download and upload frames and try to use software overlay,
> but it still fails.
> Can you try a command like this (change the download format to nv12)? The
> vaapi decoded frame is nv12 rather than yuv420p. I am not sure whether this
> will fix your problem, because your failed command
> ("hwdownload,format=yuv420p") still works on my machine.
>
> ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> -hwaccel_output_format vaapi -i ../WK.0217/2024_0217_S0n000.mp4 -r 1 -i
> 2024_0217_S0n000-f%04d.tiff -filter_complex
> '[0:v]hwdownload,format=nv12[0v];[0v][1:v]overlay,hwupload,format=vaapi'
> -r 7.5 -an -c:v hevc_vaapi -crf 22 2024_0217_S0n000f-hwa3.mp4
>
> >
> > real 0m8.649s
> > user 0m2.715s
> > sys 0m8.742s
> >
> > Anyway... Comparing the 1m51s to the version, where I had no hw
> > acceleration et al:
> > time ffmpeg -i /store/vedit/WK.0217/2024_0217_S0n000.mp4 -r 1 -i
> > 2024_0217_S0n000-f%04d.tiff -filter_complex overlay -r 7.5 -an -c:v
> libx265
> > -crf 22 2024_0217_S0n000f.mp4
> > ...
> > encoded 985 frames in 374.00s (2.63 fps), 13357.59 kb/s, Avg QP:21.38
> >
> > real 6m15.015s
> > user 90m1.528s
> > sys 0m10.971s
> >
> > The hw accelerated version is more than 3 times faster. Maybe it has some
> > slight bugs... But I might be able to find a workaround for those bugs.
> >
> > Obviously, If I could tune it further, so the decode would be done by the
> > GPU that would be awesome.
> > If the overlay could be done by the gpu, that would be even better.
> > If not, I have to accept this partial solution I've found. It is still
> more
> > than 3 times faster.
> >
> > Thanks a million!
> > Gyu
> >
> > Chen, Wenbin <wenbin.chen-at-intel.com at ffmpeg.org> wrote (on Fri,
> > Mar 29, 2024, 2:52):
> >
> > > > I just successfully tested how to use AMD GPU's hwaccel to re-encode
> a
> > > file
> > > > with ffmpeg. Using this command:
> > > > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > > > -hwaccel_output_format vaapi -i input.mp4 -c:v hevc_vaapi -c:a copy
> -crf
> > > 23
> > > > output.mp4
> > > > encoding is much faster.
> > > > Testing it on a sample file and wrapping around with time:
> > > > time ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > > > -hwaccel_output_format vaapi -i input.mp4 -c:v hevc_vaapi -c:a copy
> -crf
> > > 23
> > > > -t 60 output.mp4
> > > > frame= 1800 fps= 35 q=-0.0 Lsize= 294431kB time=00:00:59.98
> > > > bitrate=40209.7kbits/s dup=2 drop=0 speed=1.16x
> > > >
> > > > real 0m52.024s
> > > > user 0m2.304s
> > > > sys 0m10.866s
> > > > ^ This is definitely using the GPU and hwaccel. During the
> encoding, I
> > > see
> > > > no cpu usage spike in top, or increase of loadavg, etc.
> > > >
> > > > While the "classic" re-encode for the same file:
> > > > time ffmpeg -i input.mp4 -c:v libx265 -c:a copy -crf 23 -t 60
> output.mp4
> > > > ...
> > > >
> > > > encoded 1800 frames in 460.63s (3.91 fps), 27046.56 kb/s, Avg
> QP:27.49
> > > >
> > > > real 7m41.169s
> > > > user 109m18.008s
> > > > sys 0m5.764s
> > > >
> > > > This is definitely using the cpu. During these 7 minutes and 41
> > > > seconds, all the cpu cores were 100% used, load was around the
> > > > number of cpu cores, etc.
> > > >
> > > > The actual command I try to execute:
> > > > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > > > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > > > -filter_complex overlay -r 7.5 -an -c:v hevc_vaapi -crf 22 output.mp4
> > > > But it's not working.
> > > > I've got this output:
> > > > ---------<snip>--------
> > > > $ ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > > > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > > > -filter_complex overlay -r 7.5 -an -c:v hevc_vaapi -crf 22
> output.mp4
> > > > ffmpeg version 6.1.1 Copyright (c) 2000-2023 the FFmpeg developers
> > > > built with FreeBSD clang version 16.0.6 (
> > > > https://github.com/llvm/llvm-project.git
> llvmorg-16.0.6-0-g7cbf1a259152)
> > > > configuration: --prefix=/usr/local --mandir=/usr/local/man
> > > > --datadir=/usr/local/share/ffmpeg
> --docdir=/usr/local/share/doc/ffmpeg
> > > > --pkgconfigdir=/usr/local/libdata/pkgconfig --disable-static
> > > > --disable-libcelt --enable-shared --enable-pic --enable-gpl --cc=cc
> > > > --cxx=c++ --disable-alsa --disable-libopencore-amrnb
> > > > --disable-libopencore-amrwb --enable-libaom --disable-libaribb24
> > > > --disable-libaribcaption --enable-asm --enable-libass
> --disable-libbs2b
> > > > --disable-libcaca --disable-libcdio --disable-libcodec2
> --enable-libdav1d
> > > > --disable-libdavs2 --disable-libdc1394 --disable-debug
> --enable-htmlpages
> > > > --enable-libdrm --disable-libfdk-aac --disable-libflite
> > > --enable-fontconfig
> > > > --enable-libfreetype --enable-frei0r --disable-libfribidi
> > > --disable-gcrypt
> > > > --disable-libglslang --disable-libgme --enable-gmp --enable-gnutls
> > > > --enable-version3 --disable-libgsm --enable-libharfbuzz
> --enable-iconv
> > > > --disable-libilbc --disable-libjack --enable-libjxl
> --disable-libklvanc
> > > > --disable-libkvazaar --disable-ladspa --enable-libmp3lame
> --enable-lcms2
> > > > --disable-liblensfun --disable-libbluray --enable-libplacebo
> > > > --disable-librsvg --disable-librtmp --enable-libxml2 --disable-lv2
> > > > --disable-mbedtls --disable-libmfx --disable-libmodplug
> > > --disable-libmysofa
> > > > --enable-network --disable-nonfree --enable-nvenc --disable-openal
> > > > --disable-opencl --disable-opengl --disable-libopenh264
> > > > --disable-libopenjpeg --disable-libopenmpt --disable-openssl
> > > > --disable-libopenvino --enable-optimizations --enable-libopus
> > > > --disable-pocketsphinx --disable-libpulse --disable-librabbitmq
> > > > --disable-librav1e --disable-librist --enable-runtime-cpudetect
> > > > --disable-librubberband --disable-sdl2 --enable-libshaderc
> > > > --disable-libsmbclient --disable-libsnappy --disable-sndio
> > > > --disable-libsoxr --disable-libspeex --disable-libsrt
> --disable-libssh
> > > > --enable-libsvtav1 --disable-libtensorflow --disable-libtesseract
> > > > --disable-libtheora --disable-libtwolame --disable-libuavs3d
> > > > --enable-libv4l2 --enable-vaapi --disable-vapoursynth --enable-vdpau
> > > > --disable-libvidstab --enable-libvmaf --enable-libvorbis
> > > > --disable-libvo-amrwbenc --disable-libvpl --enable-libvpx
> --enable-vulkan
> > > > --enable-libwebp --enable-libx264 --enable-libx265 --disable-libxavs2
> > > > --enable-libxcb --disable-libxvid --disable-outdev=xv
> --disable-libzimg
> > > > --disable-libzmq --disable-libzvbi
> > > > libavutil 58. 29.100 / 58. 29.100
> > > > libavcodec 60. 31.102 / 60. 31.102
> > > > libavformat 60. 16.100 / 60. 16.100
> > > > libavdevice 60. 3.100 / 60. 3.100
> > > > libavfilter 9. 12.100 / 9. 12.100
> > > > libswscale 7. 5.100 / 7. 5.100
> > > > libswresample 4. 12.100 / 4. 12.100
> > > > libpostproc 57. 3.100 / 57. 3.100
> > > > Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.mp4':
> > > > Metadata:
> > > > major_brand : isom
> > > > minor_version : 512
> > > > compatible_brands: isomiso2avc1mp41
> > > > creation_time : 2023-09-03T23:12:04.000000Z
> > > > encoder : Lavf60.3.100
> > > > Duration: 00:01:15.01, start: 0.000000, bitrate: 14849 kb/s
> > > > Stream #0:0[0x1](eng): Video: h264 (High) (avc1 / 0x31637661),
> > > > yuv420p(progressive), 1920x1080, 14750 kb/s, 30 fps, 30 tbr, 60k tbn
> > > > (default)
> > > > Metadata:
> > > > creation_time : 2023-09-03T23:12:04.000000Z
> > > > handler_name : VideoHandler
> > > > vendor_id : [0][0][0][0]
> > > > encoder : h264
> > > > Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 32000
> Hz,
> > > > mono, fltp, 96 kb/s (default)
> > > > Metadata:
> > > > creation_time : 2023-09-03T23:12:04.000000Z
> > > > handler_name : SoundHandler
> > > > vendor_id : [0][0][0][0]
> > > > Input #1, image2, from 'overlay%04d.png':
> > > > Duration: 00:00:03.00, start: 0.000000, bitrate: N/A
> > > > Stream #1:0: Video: png, rgba(pc, gbr/unknown/unknown), 1920x1080,
> > > > 25 fps, 25 tbr, 25 tbn
> > > > amdgpu: os_same_file_description couldn't determine if two DRM fds
> > > > reference the same file description.
> > > > If they do, bad things may happen!
> > > > [out#0/mp4 @ 0x1e333e6b5600] Codec AVOption crf (Select the quality for
> > > > constant quality mode) has not been used for any stream. The most likely
> > > > reason is either wrong type (e.g. a video option with no video streams) or
> > > > that it is a private option of some encoder which was not actually used for
> > > > any stream.
> > > > Stream mapping:
> > > > Stream #0:0 (h264) -> overlay
> > > > Stream #1:0 (png) -> overlay
> > > > overlay:default -> Stream #0:0 (hevc_vaapi)
> > > > Press [q] to stop, [?] for help
> > > > [in#1/image2 @ 0x1e333e62c600] Thread message queue blocking; consider
> > > > raising the thread_queue_size option (current value: 8)
> > > > Impossible to convert between the formats supported by the filter 'graph 0
> > > > input from stream 0:0' and the filter 'auto_scale_0'
> > > > [fc#0 @ 0x1e333e66a080] Error reinitializing filters!
> > > > Failed to inject frame into filter network: Function not implemented
> > > > Error while filtering: Function not implemented
> > > > Impossible to convert between the formats supported by the filter 'graph 0
> > > > input from stream 0:0' and the filter 'auto_scale_0'
> > > > [fc#0 @ 0x1e333e66a080] Error reinitializing filters!
> > > > Failed to inject frame into filter network: Function not implemented
> > > > [out#0/mp4 @ 0x1e333e6b5600] Nothing was written into output file, because
> > > > at least one of its streams received no packets.
> > > > frame= 0 fps=0.0 q=0.0 Lsize= 0kB time=N/A bitrate=N/A speed=N/A
> > > >
> > > > Conversion failed!
> > > > ---------<snap>--------
> > > > I'm not sure whether this is completely impossible because some feature is
> > > > missing from ffmpeg, from the DRI driver, or from FreeBSD (though simply
> > > > re-encoding an mp4 file worked on this same host), or whether I just need
> > > > to add some extra filter so that the output of the overlay filter is
> > > > compatible with the input of the encoder / VAAPI.
> > > > Based on the error message, I can't even tell whether the problem is the
> > > > input or the output of the overlay filter.
> > > > Can someone please help?
> > > > By googling the error message, I found this, but I'm not sure how to apply
> > > > it to my filter, or whether it's even the same problem:
> > > > https://superuser.com/questions/1633883/ffmpeg-hevc-vaapi-impossible-to-convert-between-the-formats-supported-by-the-fi
> > > > Based on the answers there, I tried to adapt my command further, but
> > > > without any luck:
> > > > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > > > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > > > -filter_complex overlay -r 7.5 -an -c:v hevc_vaapi -crf 22 output.mp4
> > > > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > > > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > > > -filter_complex 'overlay,format=nv12|vaapi,hwupload' -r 7.5 -an -c:v
> > > > hevc_vaapi -crf 22 output.mp4
> > > > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > > > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > > > -filter_complex 'overlay,format=yuv420p|vaapi,hwupload' -r 7.5 -an -c:v
> > > > hevc_vaapi -crf 22 output.mp4
> > > > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > > > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > > > -filter_complex 'overlay,hwupload' -r 7.5 -an -c:v hevc_vaapi -crf 22
> > > > output.mp4
> > > > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > > > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > > > -filter_complex 'overlay,format=nv12|vaapi,hwupload' -r 7.5 -an -c:v
> > > > hevc_vaapi -crf 22 output.mp4
> > > > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > > > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > > > -filter_complex 'overlay,format=yuv420p|vaapi,hwupload' -r 7.5 -an -c:v
> > > > hevc_vaapi -crf 22 output.mp4
> > > > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > > > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > > > -filter_complex 'overlay,format=yuv420p,hwupload' -r 7.5 -an -c:v
> > > > hevc_vaapi -crf 22 output.mp4
> > > >
> > > > Thank you for any help!
> > > > Gyu
> > >
> > > The command should be changed to something like this:
> > > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128
> > > -hwaccel_output_format vaapi -i input.mp4 -r 1 -i overlay%04d.png
> > > -filter_complex "[1:v]hwupload,format=vaapi[1v];[0:v][1v]overlay_vaapi"
> > > -r 7.5 -an -c:v hevc_vaapi -crf 22 output.mp4
> > >
> > > When you use "-hwaccel_output_format vaapi", the decoded frames are
> > > stored in device memory. The classic overlay filter cannot handle frames
> > > in device memory, so you need to use overlay_vaapi.
> > > PNG images are decoded to system memory, so for the same reason you need
> > > to upload them to device memory.
> > >
> > > - Wenbin
> > >
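For reference, the two ways of reconciling system-memory and device-memory
frames can be put side by side in a small dry-run script. This is only a
sketch under assumptions: it prints the command lines rather than running
them, the variable and function names (DEVICE, GRAPH_GPU, GRAPH_CPU,
build_cmd) are made up for illustration, and the CPU variant is a commonly
used pattern, not something confirmed to work on this host. It uses -qp 22
instead of -crf 22, since the log above reports that crf was not used by
any stream. Whether overlay_vaapi works with a given driver is not
something this thread settles.

```shell
#!/bin/sh
# Dry-run sketch: prints two candidate ffmpeg command lines, runs nothing.
# All names below are illustrative, not part of ffmpeg.
DEVICE=/dev/dri/renderD128      # render node used in the thread
INPUT=input.mp4
OVERLAY='overlay%04d.png'
OUTPUT=output.mp4

# Variant A (from the reply above): keep the video in device memory,
# upload the PNG frames, and blend on the GPU with overlay_vaapi.
GRAPH_GPU='[1:v]hwupload,format=vaapi[1v];[0:v][1v]overlay_vaapi'

# Variant B (assumption): download the decoded video to system memory,
# blend with the classic overlay filter, then upload for the encoder.
GRAPH_CPU='[0:v]hwdownload,format=nv12[bg];[bg][1:v]overlay,format=nv12,hwupload'

build_cmd() {
    # $1 is the filter graph; it is printed in single quotes so the
    # resulting line can be pasted into a shell as-is.
    printf '%s ' ffmpeg -hwaccel vaapi -hwaccel_device "$DEVICE" \
        -hwaccel_output_format vaapi -i "$INPUT" -r 1 -i "$OVERLAY" \
        -filter_complex "'$1'" -r 7.5 -an -c:v hevc_vaapi -qp 22 "$OUTPUT"
    printf '\n'
}

build_cmd "$GRAPH_GPU"
build_cmd "$GRAPH_CPU"
```

Variant A avoids copying the video frames back and forth, while variant B
trades that copy for the more widely supported software overlay filter.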
> > > > _______________________________________________
> > > > ffmpeg-user mailing list
> > > > ffmpeg-user at ffmpeg.org
> > > > https://ffmpeg.org/mailman/listinfo/ffmpeg-user
> > > >
> > > > To unsubscribe, visit link above, or email
> > > > ffmpeg-user-request at ffmpeg.org with subject "unsubscribe".