[FFmpeg-user] 5.1 downmix to 2.0 (again) and buried dialogs
pehache
pehache.7 at gmail.com
Fri Aug 26 20:32:44 EEST 2022
Hi,
Strictly speaking this is not an ffmpeg question (and it is a recurring
one), but ffmpeg is often used for that...
Since I have only a stereo setup (albeit a decent one) attached to my
TV, I started a while ago to generate downmixed 2.0 tracks with ffmpeg
on my video files with 5.1 (or 7.1) tracks.
My original motivation was that the dialogs sounded too quiet
compared to the music/ambient sound in *some* movies (not all of them!).
My hypothesis at that time was that the built-in downmixing of my
equipment was overweighting the left and right channels (both front and
side) compared to the central channel, where most dialogs are supposed
to be placed.
So I started with the "-ac 2" option in ffmpeg... which basically
changed nothing (as far as I could tell, at least). Investigating
further, I then found the -af "pan=stereo| FL< ... | FR< ..." syntax to
choose the weighting coefficient of each 5.1 channel used to build the
stereo channels.
There were recommended coefficients:
FL < 1.0*FL + 0.707*FC + 0.707*SL (and similarly for FR)
These gave the same result as -ac 2 to my ears.
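
For reference, here is a minimal sketch of how such a pan expression and
the matching ffmpeg command line could be assembled in Python. The helper
names (pan_stereo, ffmpeg_args) and the file names are hypothetical, and
the SL/SR channel names assume the input uses ffmpeg's 5.1(side) layout;
a plain 5.1 input names its surround channels BL/BR instead. LFE is left
out, as in the formula above. The command is only printed, not executed.

```python
import shlex


def pan_stereo(front, center, surround, surround_prefix="S"):
    """Build a pan filter expression for a 5.1 -> stereo downmix.

    surround_prefix is "S" for SL/SR (5.1(side)) or "B" for BL/BR (5.1).
    The "<" asks the pan filter to renormalize the gains of each output.
    """
    left = f"FL<{front}*FL+{center}*FC+{surround}*{surround_prefix}L"
    right = f"FR<{front}*FR+{center}*FC+{surround}*{surround_prefix}R"
    return f"pan=stereo|{left}|{right}"


def ffmpeg_args(src, dst, pan_filter):
    """Argument list that copies the video and applies the audio filter."""
    return ["ffmpeg", "-i", src, "-c:v", "copy", "-af", pan_filter, dst]


# Coefficients from the "recommended" formula quoted above.
recommended = pan_stereo(1.0, 0.707, 0.707)

# Hypothetical file names; shlex.join quotes the filter for a POSIX shell.
print(shlex.join(ffmpeg_args("input.mkv", "output.mkv", recommended)))
```

Other weightings only change the three numbers passed to pan_stereo.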
There were also tons of alternative formulas described on various web
sites... I ended up with
FL < 0.707*FL + 1.0*FC + 0.707*SL
It was doing what it was supposed to do: louder dialogs compared to
music and ambient sounds.
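
One detail worth keeping in mind when comparing these formulas: with the
"<" form, the pan filter renormalizes the gains of each output channel so
that they sum to 1 (to avoid clipping), so only the ratios between the
coefficients matter. A small Python sketch of that arithmetic, with a
hypothetical helper named normalize and only the left channel shown:

```python
def normalize(weights):
    """Scale the weights so they sum to 1, as pan does for the '<' form."""
    total = sum(weights.values())
    return {channel: w / total for channel, w in weights.items()}


# Left-channel weights of the two formulas quoted above.
recommended = {"FL": 1.0, "FC": 0.707, "SL": 0.707}
dialog_boost = {"FL": 0.707, "FC": 1.0, "SL": 0.707}

for name, weights in [("recommended", recommended),
                      ("dialog boost", dialog_boost)]:
    gains = normalize(weights)
    print(name, {channel: round(g, 3) for channel, g in gains.items()})
```

With these numbers FC goes from about 0.29 to about 0.41 of the left
output, while FL drops from about 0.41 to about 0.29 and SL stays put.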
However, I eventually noticed that it was also narrowing the stereo image.
Indeed, FC does not contain only voices but also a large part of the
music and ambient sounds. Overweighting FC would not narrow the stereo
image if it contained only the voices, but this is not the case.
I kept wondering why the dialog loudness is sometimes perceived too low
after downmixing, and I have a possible explanation: the brain is very
good at isolating a voice buried in the ambient noise because it can
locate where it comes from. That's why people with hearing aids still
have difficulty following a conversation when multiple people speak at
the same time: the hearing aids can restore the volume, but the
directivity is (mostly) lost... So, with a real 5.1 or 7.1 setup the
brain is not bothered by the side/rear channels when it comes to focus
on the central dialogs, because they come from fully different
directions. But after downmix, what was coming from the side/rear
channels is now coming from the front channels, making the separation
task more difficult for the brain. The solution is hence to downweight
the side/rear channels... Therefore I am now using:
FL < 1.0*FL + 0.707*FC + 0.4*SL
And it seems better to me: the dialogs are clearer, without narrowing
the stereo image. But maybe this is just what I desperately want to hear...
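
To put numbers on that comparison, the sketch below expresses the front
and surround weights of the three formulas quoted above in dB relative to
FC. It is plain arithmetic on the coefficients (the names mixes and db
are just for illustration) and says nothing about perceived loudness.

```python
import math


def db(ratio):
    """Convert an amplitude ratio to decibels."""
    return 20 * math.log10(ratio)


# (front, center, surround) weights of the three formulas in this post.
mixes = {
    "recommended": (1.0, 0.707, 0.707),
    "dialog boost": (0.707, 1.0, 0.707),
    "reduced surround": (1.0, 0.707, 0.4),
}

for name, (front, center, surround) in mixes.items():
    print(f"{name}: front {db(front / center):+.1f} dB, "
          f"surround {db(surround / center):+.1f} dB relative to FC")
```

Under these assumptions the last formula keeps the roughly +3 dB
front-to-center ratio of the recommended one and only lowers the
surrounds, by about 5 dB relative to FC.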
Any thoughts on all of this?