[FFmpeg-user] Preserving perceived loudness when downmixing audio from 5.1 AC3 to stereo AAC

Wed Aug 7 01:45:34 CEST 2013

Francois Visagie wrote:

> Therein lies part of the problem, not all input files are AC3. Up to at
> least 30 June -filter:a aformat=channel_layouts=stereo could be used in a
> standard command line to produce stereo from multi-channel inputs with input
> and output volumes perceivably equal. Now each encode needs to be inspected
> individually for input/output differences, and the remedy will in each case
> also differ according to input type and/or volume differences. Really
> sub-optimal in my view, one which I expect to be more widely shared once
> these implications are more widely understood.

I had a look at the old behavior and it clipped, which is not good.

It was also inconsistent - WAV and 7ch TrueHD inputs behaved like -ac 2.

I don't know exactly what it did - maybe there is a way to explicitly
recreate it, or perhaps you could just blindly boost the levels by x dB
as part of the processing, if you don't care about clipping.
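
To make the "blind boost" idea concrete, here is a minimal Python sketch of what a fixed gain in dB does to samples. It is not what FFmpeg does internally; the function names and the float sample range of -1.0..1.0 are assumptions for illustration only.

```python
def db_to_linear(gain_db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples, gain_db):
    """Scale float samples (nominal range -1.0..1.0) and hard-clip them.

    Returns the processed samples and the number of samples that clipped.
    """
    factor = db_to_linear(gain_db)
    out = []
    clipped = 0
    for s in samples:
        v = s * factor
        if v > 1.0 or v < -1.0:
            clipped += 1
            v = max(-1.0, min(1.0, v))  # hard limit at full scale
        out.append(v)
    return out, clipped


if __name__ == "__main__":
    boosted, n_clipped = apply_gain([0.25, 0.6, -0.9], 6.0)
    print(boosted, n_clipped)
```

A boost of 6 dB roughly doubles the amplitude, so any sample already above about half of full scale gets flattened. That is the risk of applying one fixed boost to every file.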

I don't know about your use case, but if I were mixing for myself I
would take care to process each file individually, because that's what's
needed to get correct results.
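
One way to think about per-file processing: measure the peak of the downmixed audio first, then work out how much gain that particular file can take before it clips. The Python sketch below is only an illustration under that assumption; `max_safe_gain_db` is a hypothetical helper, not an FFmpeg function, and it looks at sample peaks only, not perceived loudness.

```python
import math


def max_safe_gain_db(samples, ceiling=1.0):
    """Return the largest gain (in dB) that keeps the peak at or below ceiling.

    samples are floats in the nominal range -1.0..1.0.
    Pure silence can take any gain, so it returns infinity.
    """
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return math.inf
    return 20.0 * math.log10(ceiling / peak)


if __name__ == "__main__":
    # A file peaking at half of full scale has about 6 dB of headroom.
    print(max_safe_gain_db([0.5, -0.25, 0.1]))
```

Two files with different peaks get different answers, which is why a single command line cannot be right for every input.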

> I sincerely appreciate the trouble you took with outlining various
> principles involved, but, on a more practical level: rather than making
> -filter:a aformat=channel_layouts=stereo now share the mechanism of -ac 2
> and -filter:a aresample=ocl=3 (incorrectly so wrt. volume levels in my
> view), what is the feasibility of making the other two behave like -filter:a
> aformat=channel_layouts=stereo instead?

I am not a developer, but IMHO the old behavior was wrong. I haven't
tested enough to work out what it did or why.

It's possible that it was intended by someone - it does seem to downmix
in the sense that it's not just blindly putting 100% of each channel in,
but it's not normalised enough to prevent clipping.
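
The trade-off between clipping and low levels can be sketched in Python. The coefficients below are the commonly cited ITU-style values (front at 1.0, center and surround at about 0.707); they are an assumption for illustration, not a statement of what either the old or the new FFmpeg code path uses. The LFE channel is simply dropped here.

```python
import math

# Assumed ITU-style coefficients, for illustration only; not taken from FFmpeg.
FRONT = 1.0
CENTER = math.sqrt(0.5)    # about 0.707, i.e. -3 dB
SURROUND = math.sqrt(0.5)  # about 0.707, i.e. -3 dB


def downmix_left(fl, fc, sl, normalize=True):
    """Mix front-left, center and surround-left into one stereo-left sample.

    With normalize=True the coefficients are divided by their sum, so
    full-scale input on every channel cannot exceed 1.0.
    """
    mixed = FRONT * fl + CENTER * fc + SURROUND * sl
    if normalize:
        mixed /= FRONT + CENTER + SURROUND
    return mixed


def to_db(ratio):
    """Express a linear amplitude ratio in decibels."""
    return 20.0 * math.log10(ratio)


if __name__ == "__main__":
    print(downmix_left(1.0, 1.0, 1.0, normalize=False))  # exceeds 1.0, would clip
    print(downmix_left(1.0, 1.0, 1.0))                   # stays within full scale
    print(to_db(downmix_left(1.0, 0.0, 0.0)))            # level of a lone front signal
```

With every channel at full scale the unnormalised sum is about 2.41, so it clips. Fully normalising prevents that, but a signal present only in the front-left channel then comes out roughly 7.7 dB quieter, which may be the sort of level drop people notice.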

I must admit I saw a little bit of clipping on some of the 6ch masters I
looked at - but there was even more after the "old" downmix.

FWIW I also consider the new behavior wrong in that the description of 
aformat says -

"Set output format constraints for the input audio. The framework will 
negotiate the most appropriate format to minimize conversions"

I think it should use -request_channels (where possible) and it doesn't, 
so anyone using -

aformat=channel_layouts=stereo

on, say, a 7.1 TrueHD stream will not get the best result (a proper
studio stereo mix), but instead a 7 -> 2 conversion and very low levels.