[FFmpeg-user] Increased latency from filter_complex

Tue Jul 27 13:36:38 EEST 2021

Hi everyone,

We are using ffmpeg to create HLS streams in multiple resolutions and
languages from incoming RTMP streams.
When encoding without the filter_complex for ducking, we get a minimum latency of
around 8 seconds, but as soon as we add the autoducking, it jumps to around 15
seconds. It also seems that every additional audio stream adds another 8 seconds
of latency.
The filter_complex is the following:
```
-filter_complex "format=nv12,hwupload,scale_vaapi=w=1280:h=720,split=2[720p][480];[480]scale_vaapi=w=854:h=480,split=2[480p][240];[240]scale_vaapi=w=426:h=240[240p]; \
        [0:a:0] loudnorm=i=-16.0:lra=12.0:tp=-3.0, asplit=2 [normalize] [out]; \
        [normalize] volume=0.1, asplit=2 [bg1] [bg2]; \
        [1:a:0] agate, adelay=2900:all=1 [tmp1]; \
        [bg1][tmp1] amix, dynaudnorm=b=1:p=0.35:r=1:f=150, loudnorm=i=-16.0:lra=12.0:tp=-3.0 [out1]; \
        [2:a:0] agate, adelay=8000:all=1 [tmp2]; \
        [bg2][tmp2] amix, dynaudnorm=b=1:p=0.35:r=1:f=150, loudnorm=i=-16.0:lra=12.0:tp=-3.0 [out2]"
```
To us it looks like the audio streams are processed sequentially. Each branch
needs to analyze the secondary audio levels for a few seconds before it can
duck the main audio accordingly, which increases the overall latency.

Is this correct and if so, is there a way to achieve this in parallel?

Or is there any other way to get the autoducking that doesn't introduce additional latency?
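
For reference, since the latency seems to grow with each audio stream: every language track gets its own gate/delay/mix/normalize branch in the graph, fed from a copy of the attenuated main audio. A minimal sketch of how the audio part of the graph is extended per language follows. `build_audio_graph` is a hypothetical shell helper made up for illustration (it is not part of ffmpeg), and its arguments are the `adelay` values in milliseconds for inputs 1, 2, and so on; it expects at least one argument.

```shell
#!/bin/sh
# Sketch: build the audio part of the filter graph for N language tracks.
# build_audio_graph is a hypothetical helper, not an ffmpeg feature. Each
# argument is the adelay value (ms) for one language input (1, 2, ...).
build_audio_graph() (
    n=$#
    # Labels [bg1]..[bgN] for the attenuated copies of the main audio.
    labels=""
    i=1
    while [ "$i" -le "$n" ]; do
        labels="$labels [bg$i]"
        i=$((i + 1))
    done
    graph="[0:a:0] loudnorm=i=-16.0:lra=12.0:tp=-3.0, asplit=2 [normalize] [out];"
    graph="$graph [normalize] volume=0.1, asplit=$n$labels;"
    # One independent gate/delay/mix/normalize branch per language input.
    i=1
    for delay in "$@"; do
        graph="$graph [$i:a:0] agate, adelay=$delay:all=1 [tmp$i];"
        graph="$graph [bg$i][tmp$i] amix, dynaudnorm=b=1:p=0.35:r=1:f=150, loudnorm=i=-16.0:lra=12.0:tp=-3.0 [out$i];"
        i=$((i + 1))
    done
    # Drop the trailing semicolon before printing.
    printf '%s\n' "${graph%;}"
)

build_audio_graph 2900 8000
```

Called with `2900 8000`, this prints the same audio chains we use, on a single line; a third language would be one more argument and one more `-i` input.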

Thank you!

Philipp

PS: The full ffmpeg call is in the attached text file.
-------------- next part --------------

This is the full call we use:
```
ffmpeg -hide_banner -loglevel info \
    -y -threads 0 -init_hw_device vaapi=intel:/dev/dri/renderD128 -hwaccel vaapi -hwaccel_device intel \
    -filter_complex_threads 8 \
    -thread_queue_size 8291 \
    -i "rtmp://127.0.0.1:1935/live/stream" \
    -i "rtmp://127.0.0.1:1935/live/stream-de" \
    -i "rtmp://127.0.0.1:1935/live/stream-en" \
    -level 41 -qp 23 -color_range tv \
    -async 1 -vsync -1 -r 25 -g 50 -keyint_min 50 -force_key_frames "expr:gte(t,n_forced*50)" -sc_threshold 0 -bf 3 -b_strategy 2 \
    -tune zerolatency -crf 23 \
    -muxdelay 0 -muxpreload 0 \
    -max_muxing_queue_size 9999 \
    -filter_complex "format=nv12,hwupload,scale_vaapi=w=1280:h=720,split=2[720p][480];[480]scale_vaapi=w=854:h=480,split=2[480p][240];[240]scale_vaapi=w=426:h=240[240p]; \
      [0:a:0] loudnorm=i=-16.0:lra=12.0:tp=-3.0, asplit=2 [normalize] [out]; \
      [normalize] volume=0.1, asplit=2 [bg1] [bg2]; \
      [1:a:0] agate, adelay=2900:all=1 [tmp1]; \
      [bg1][tmp1] amix, dynaudnorm=b=1:p=0.35:r=1:f=150, loudnorm=i=-16.0:lra=12.0:tp=-3.0 [out1]; \
      [2:a:0] agate, adelay=8000:all=1 [tmp2]; \
      [bg2][tmp2] amix, dynaudnorm=b=1:p=0.35:r=1:f=150, loudnorm=i=-16.0:lra=12.0:tp=-3.0 [out2]" \
    -map [720p] -c:v:0 h264_vaapi -b:v:0 2800k -maxrate:v:0 2996k -bufsize:v:0 4200k \
    -map [480p] -c:v:1 h264_vaapi -b:v:1 700k -maxrate:v:1 900k -bufsize:v:1 850k \
    -map [240p] -c:v:2 h264_vaapi -b:v:2 250k -maxrate:v:2 267k -bufsize:v:2 325k \
    -map [out] \
    -map [out1] \
    -map [out2] \
    -c:a aac -b:a 64k -ar 48000 -ac 1 \
    -var_stream_map "v:1,agroup:aac v:0,agroup:aac v:2,agroup:aac a:0,agroup:aac,default:yes a:1,agroup:aac a:2,agroup:aac" \
    -strict experimental -lhls 1 \
    -f hls -hls_flags second_level_segment_index+second_level_segment_duration+append_list+split_by_time+program_date_time+discont_start -hls_start_number_source epoch -hls_time 2 -hls_list_size 3 \
    -master_pl_name play2.m3u8 -strftime 1 -hls_segment_filename /var/lib/streaming/hls/live/stream/stream%v/%s_%%d_%%t.ts /var/lib/streaming/hls/live/stream/stream%v/play.m3u8
```