[FFmpeg-user] linear loudnorm

Jonathan Baecker jonbae77 at gmail.com
Mon Feb 22 19:21:40 EET 2021


Hello!

I'm trying to normalize an audio file with FFmpeg. I'm using the loudnorm 
filter. The source loudness is -23 LUFS and I want to make it -17 LUFS. 
As far as I know, loudnorm has 2 modes of normalizing audio: linear (one 
constant gain for the whole file) and dynamic (gain adjusted over short 
sections of the file).

The problem is that when I have an audio file where someone is speaking, 
dynamic mode makes the pauses in the speech louder and louder, so a 
hissing noise becomes clearly audible. That's why I need linear 
normalization. But for some reason I can't explain, FFmpeg always 
switches to dynamic mode.

I've gone through all the requirements for linear scaling in the loudnorm 
documentation, but I can't figure out what's wrong. I've specified all 4 
measured values, the target LRA isn't lower than the input LRA, and when 
I normalize the file in Adobe Audition to -17 LUFS, I can't see any 
peaking.
What would be the best way to get linear normalization with FFmpeg?

Here is what I'm doing:

1. Analyze the source audio file:

ffmpeg -i input.wav -filter:a loudnorm=I=-17:TP=-1:LRA=9:print_format=json -f null -
ffmpeg version N-100942-gc596e82155-gb7251aed46+3 Copyright (c) 
2000-2021 the FFmpeg developers
   built with gcc 10.2.0 (Rev6, Built by MSYS2 project)
   configuration:  --pkg-config=pkgconf --cc='ccache gcc' --cxx='ccache 
g++' --ld='ccache g++' --disable-autodetect --enable-amf --enable-bzlib 
--enable-cuda --enable-cuvid --enable-d3d11va --enable-dxva2 
--enable-iconv --enable-lzma --enable-nvenc --enable-zlib --enable-sdl2 
--enable-ffnvcodec --enable-nvdec --enable-cuda-llvm --enable-libmp3lame 
--enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 
--enable-libx265 --enable-libdav1d --enable-libaom --disable-debug 
--enable-fontconfig --enable-libass --enable-libbluray 
--enable-libfreetype --enable-libmfx --enable-libmysofa 
--enable-libopencore-amrnb --enable-libopencore-amrwb 
--enable-libopenjpeg --enable-libsnappy --enable-libsoxr 
--enable-libspeex --enable-libtheora --enable-libtwolame 
--enable-libvidstab --enable-libvo-amrwbenc --enable-libwebp 
--enable-libxml2 --enable-libzimg --enable-libshine --enable-gpl 
--enable-avisynth --enable-libxvid --enable-libopenmpt --enable-version3 
--enable-chromaprint --enable-decklink --enable-frei0r --enable-libbs2b 
--enable-libcaca --enable-libcdio --enable-libfdk-aac --enable-libflite 
--enable-libfribidi --enable-libgme --enable-libgsm --enable-libilbc 
--enable-libsvthevc --enable-libsvtav1 --enable-libkvazaar 
--enable-libmodplug --enable-librtmp --enable-librubberband 
--enable-libtesseract --enable-libxavs --enable-libzmq --enable-libzvbi 
--enable-openal --enable-libvmaf --enable-libcodec2 --enable-libsrt 
--enable-ladspa --enable-librav1e --enable-libglslang --enable-vulkan 
--enable-openssl --extra-cflags=-fopenmp --extra-libs=-lgomp 
--extra-cflags=-DLIBTWOLAME_STATIC --extra-libs=-lstdc++ 
--extra-cflags=-DCACA_STATIC --extra-cflags=-DMODPLUG_STATIC 
--extra-cflags=-DCHROMAPRINT_NODLL --extra-libs=-lstdc++ 
--extra-cflags=-DZMQ_STATIC --extra-libs=-lpsapi 
--extra-cflags=-DLIBXML_STATIC --extra-libs=-liconv --disable-w32threads 
--extra-cflags=-DKVZ_STATIC_LIB --enable-nonfree 
--extra-cflags=-DAL_LIBTYPE_STATIC 
--extra-cflags='-ID:/ab-suite/local64/include/AL'
   libavutil      56. 64.100 / 56. 64.100
   libavcodec     58.120.100 / 58.120.100
   libavformat    58. 65.101 / 58. 65.101
   libavdevice    58. 11.103 / 58. 11.103
   libavfilter     7.101.100 /  7.101.100
   libswscale      5.  8.100 /  5.  8.100
   libswresample   3.  8.100 /  3.  8.100
   libpostproc    55.  8.100 / 55.  8.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'input.wav':
   Metadata:
     encoded_by      : Adobe Adobe Media Encoder 2020.0
     encoder         : Adobe Adobe Media Encoder 2020.0 (Windows)
     date            : 2021-02-15
     creation_time   : 15:31:34
     time_reference  : 0
   Duration: 00:37:57.52, bitrate: 1539 kb/s
   Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, 
stereo, s16, 1536 kb/s
Stream mapping:
   Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
   Metadata:
     encoded_by      : Adobe Adobe Media Encoder 2020.0
     time_reference  : 0
     date            : 2021-02-15
     encoder         : Lavf58.65.101
   Stream #0:0: Audio: pcm_s16le, 192000 Hz, stereo, s16, 6144 kb/s
     Metadata:
       encoder         : Lavc58.120.100 pcm_s16le
size=N/A time=00:37:54.62 bitrate=N/A speed=33.9x
video:0kB audio:1708140kB subtitle:0kB other streams:0kB global 
headers:0kB muxing overhead: unknown
[Parsed_loudnorm_0 @ 00000121fcf41e00]
{
         "input_i" : "-22.72",
         "input_tp" : "-2.67",
         "input_lra" : "6.10",
         "input_thresh" : "-33.31",
         "output_i" : "-16.95",
         "output_tp" : "-1.00",
         "output_lra" : "6.00",
         "output_thresh" : "-27.53",
         "normalization_type" : "dynamic",
         "target_offset" : "-0.05"
}
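
For reference, the conditions for linear mode can be checked against the measured values above. The following is a minimal Python sketch, assuming the check works the way the loudnorm documentation describes it: the target LRA must not be lower than the source LRA, and the true peak after applying the gain must not exceed the target TP. The function name linear_mode_possible is hypothetical and not part of FFmpeg.

```python
def linear_mode_possible(measured_i, measured_tp, measured_lra,
                         target_i, target_tp, target_lra):
    """Return (gain_db, projected_tp, ok) for a constant-gain normalization.

    Assumption: linear mode is only kept when the gain-shifted true peak
    stays at or below the target TP and the source LRA fits the target LRA.
    """
    gain_db = target_i - measured_i          # constant gain needed to hit the target
    projected_tp = measured_tp + gain_db     # true peak after applying that gain
    ok = bool(projected_tp <= target_tp and measured_lra <= target_lra)
    return gain_db, projected_tp, ok


# Values taken from the first-pass JSON output and the command line.
gain_db, projected_tp, ok = linear_mode_possible(
    measured_i=-22.72, measured_tp=-2.67, measured_lra=6.10,
    target_i=-17.0, target_tp=-1.0, target_lra=9.0,
)
print(f"gain: {gain_db:+.2f} dB")
print(f"projected true peak: {projected_tp:+.2f} dBTP")
print(f"linear mode possible: {ok}")
```

With these numbers the required gain is about 5.72 dB, which would move the true peak from -2.67 dBTP to roughly +3.05 dBTP, above the -1 dBTP target. Under the stated assumption that would be enough to trigger the fallback to dynamic mode; the highest integrated target that keeps the peak at -1 dBTP would be about -21.05 LUFS.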

2. Encode the audio with:

ffmpeg -i input.wav -filter:a loudnorm=I=-17:TP=-1:LRA=9:measured_I=-22.72:measured_TP=-2.67:measured_LRA=6.10:measured_thresh=-33.31:offset=-0.05:linear=true:print_format=summary output.wav
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'input.wav':
   Metadata:
     encoded_by      : Adobe Adobe Media Encoder 2020.0
     encoder         : Adobe Adobe Media Encoder 2020.0 (Windows)
     date            : 2021-02-15
     creation_time   : 15:31:34
     time_reference  : 0
   Duration: 00:37:57.52, bitrate: 1539 kb/s
   Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, 
stereo, s16, 1536 kb/s
File 'output.wav' already exists. Overwrite? [y/N] y
Stream mapping:
   Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'output.wav':
   Metadata:
     ITCH            : Adobe Adobe Media Encoder 2020.0
     time_reference  : 0
     ICRD            : 2021-02-15
     ISFT            : Lavf58.65.101
   Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 192000 Hz, 
stereo, s16, 6144 kb/s
     Metadata:
       encoder         : Lavc58.120.100 pcm_s16le
size= 1708140kB time=00:37:54.62 bitrate=6151.8kbits/s speed=33.6x
video:0kB audio:1708140kB subtitle:0kB other streams:0kB global 
headers:0kB muxing overhead: 0.000009%
[Parsed_loudnorm_0 @ 000001a3673c0780]
Input Integrated:    -22.7 LUFS
Input True Peak:      -2.7 dBTP
Input LRA:             6.1 LU
Input Threshold:     -33.3 LUFS

Output Integrated:   -17.0 LUFS
Output True Peak:     -1.0 dBTP
Output LRA:            6.0 LU
Output Threshold:    -27.6 LUFS

Normalization Type:   Dynamic
Target Offset:        -0.0 LU
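
Copying the measured values from the first pass into the second command by hand is error-prone, so that step can be scripted. Below is a minimal Python sketch, assuming the JSON block printed by the first pass has already been captured as text. The helper name second_pass_filter is hypothetical; the option names are the ones used in the second command above.

```python
import json


def second_pass_filter(stats_json, target_i=-17, target_tp=-1, target_lra=9):
    """Build the second-pass loudnorm filter string from first-pass JSON."""
    stats = json.loads(stats_json)
    return (
        f"loudnorm=I={target_i}:TP={target_tp}:LRA={target_lra}"
        f":measured_I={stats['input_i']}"
        f":measured_TP={stats['input_tp']}"
        f":measured_LRA={stats['input_lra']}"
        f":measured_thresh={stats['input_thresh']}"
        f":offset={stats['target_offset']}"
        ":linear=true:print_format=summary"
    )


# JSON block as printed by the first pass.
first_pass = """
{
    "input_i" : "-22.72",
    "input_tp" : "-2.67",
    "input_lra" : "6.10",
    "input_thresh" : "-33.31",
    "output_i" : "-16.95",
    "output_tp" : "-1.00",
    "output_lra" : "6.00",
    "output_thresh" : "-27.53",
    "normalization_type" : "dynamic",
    "target_offset" : "-0.05"
}
"""

print(second_pass_filter(first_pass))
```

The printed string can be passed to -filter:a as in step 2. Note that this only rebuilds the filter argument; it does not change whether loudnorm accepts linear mode for the given values.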


