[FFmpeg-user] Preserving perceived loudness when downmixing audio from 5.1 AC3 to stereo AAC

Wed Aug 14 16:31:46 CEST 2013

> -----Original Message-----
> From: ffmpeg-user-bounces at ffmpeg.org [mailto:ffmpeg-user-
> bounces at ffmpeg.org] On Behalf Of Nicolas George
> Sent: 14 August 2013 14:44
> To: FFmpeg user questions
> Subject: Re: [FFmpeg-user] Preserving perceived loudness when
> downmixing audio from 5.1 AC3 to stereo AAC
> 
> On septidi 27 Thermidor, year CCXXI, Francois Visagie wrote:
> > Is it possible to normalise audio levels using ffmpeg? The 'pan'
> > filter documentation mentions:
> 
> Define "normalize".
> 
> > "If the ‘=’ in a channel specification is replaced by ‘<’, then the
> > gains for that specification will be renormalized so that the total is
> > 1, thus avoiding clipping noise."
> 
> Read carefully: "so that the total is 1", "avoiding clipping". That is exactly what
> was discussed in this thread.
> 
> > I.e., having downmixed to stereo, can one expect correct normalisation
> > from '-filter:a pan=stereo:c0<c0:c1<c1'?
> 
> This filter does nothing, obviously.
> 
> > If not, does ffmpeg provide a better mechanism, or is something like
> > that in planning?
> 
> There are mechanisms, and they are mostly good, but you are not specific
> enough.
> 

Apologies, I do not know enough about this field to express myself clearly. I'll try to improve: I'm looking for an ffmpeg mechanism that automatically determines a single gain factor and applies it uniformly to the whole audio stream, so that the output is as loud as possible without clipping. An added benefit would be the ability to specify the maximum permitted output level as a percentage of the maximum possible output level. In other words, something similar to DGIndex's normalisation:
"Immediately, a pass will be made over the input files and a gain factor (pre-scale ratio) will be determined and stored for later use. This gain factor is such that when the audio is amplified by this factor, the highest sound peak will be set to the percentage of the maximum possible audio level set by the normalization selected. So, if your normalization is set to 50%, the loudest sound will be set to 50% of maximum."

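To make the request concrete, here is a minimal Python sketch of the arithmetic I understand the DGIndex description to mean. It is not DGIndex's or ffmpeg's code; the function names (peak_gain, apply_gain) and the assumption that samples are floats in the range -1.0 to 1.0 are mine, purely for illustration.

```python
def peak_gain(samples, target_fraction=1.0, full_scale=1.0):
    """Return one gain factor that puts the loudest peak at
    target_fraction of full scale (hypothetical helper, not an ffmpeg API)."""
    if not 0.0 < target_fraction <= 1.0:
        raise ValueError("target_fraction must be in (0, 1]")
    # First pass: find the highest absolute sample value in the stream.
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Pure silence: no meaningful gain exists, leave the audio unchanged.
        return 1.0
    return target_fraction * full_scale / peak


def apply_gain(samples, gain):
    """Second pass: scale every sample by the same factor."""
    return [s * gain for s in samples]


if __name__ == "__main__":
    audio = [0.1, -0.25, 0.2]          # assumed float samples in [-1.0, 1.0]
    gain = peak_gain(audio, target_fraction=0.5)
    print(gain, apply_gain(audio, gain))
```

With the samples [0.1, -0.25, 0.2] and a 50% target, the gain works out to 2.0 and the loudest sample lands at 0.5 of full scale, which is the behaviour I am after.
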
How can this be accomplished in ffmpeg?

Thanks,
Francois