[FFmpeg-devel] [PATCH] avcodec/aacdec: don't force HE-AACv2 profile if no PS info is present

Sat Jul 23 02:10:58 EEST 2022

On 7/22/2022 8:00 PM, Alex Converse wrote:
> On Fri, Jul 22, 2022 at 8:37 AM James Almer <jamrial at gmail.com> wrote:
>> On 7/22/2022 12:14 PM, Andreas Rheinhardt wrote:
>>> James Almer:
>>>> On 7/22/2022 11:56 AM, Andreas Rheinhardt wrote:
>>>>> James Almer:
>>>>>> On 7/22/2022 11:23 AM, Andreas Rheinhardt wrote:
>>>>>>> James Almer:
>>>>>>>> On 7/18/2022 10:57 AM, Andreas Rheinhardt wrote:
>>>>>>>>> James Almer:
>>>>>>>>>> On 7/14/2022 9:10 AM, Andreas Rheinhardt wrote:
>>>>>>>>>>> James Almer:
>>>>>>>>>>>> Should fix ticket #3361
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: James Almer <jamrial at gmail.com>
>>>>>>>>>>>> ---
>>>>>>>>>>>> This also needs an update to some fate ref samples i'll upload
>>>>>>>>>>>> before
>>>>>>>>>>>> pushing
>>>>>>>>>>>> (fate-aac-al_sbr_ps_04_ur and fate-aac-al_sbr_ps_06_ur which
>>>>>>>>>>>> are now
>>>>>>>>>>>> decoded
>>>>>>>>>>>> properly as he_aac mono, so the .s16 files need to be replaced).
>>>>>>>>>>>>
>>>>>>>>>>>
> 
> [snip]
> 
>>>>>>>>
>>>>>>>>
>>>>>>>> it seems at least for these samples the fixed decoder does not
>>>>>>>> generate
>>>>>>>> a decoded stream comparable to the float one, so I'll just upload a
>>>>>>>> new
>>>>>>>> raw pcm file.
>>>>>>>
>>>>>>> When I decode both of these streams with git master, the left
>>>>>>> channel is
>>>>>>> pretty much identical, yet the right channel of the fixed-point decoder
>>>>>>> is silent and the right channel of the floating point decoder is not.
>>>>>>> With this patch applied, the result are two mono streams that are
>>>>>>> pretty
>>>>>>> much identical: The test sample created by the floating-point decoder
>>>>>>> works with the fixed-point decoder test (if one uncomments and modifies
>>>>>>> the latter). So the issue with aac-al_sbr_ps_06_ur is not a reason to
>>>>>>> upload new samples.
>>>>>>
>>>>>> Ok, can you suggest how to add a test that decodes with the fixed point
>>>>>> decoder then compares that with the output of the float decoder? Is
>>>>>> there a helper in fate.sh already for this?
>>>>>>
>>>>>
>>>>> There is currently no helper in fate-run.sh for this.
>>>>
>>>> Yeah, figures that's the case. Can you suggest how one would work for this?
>>>>
>>>
>>> A new function in fate-run.sh seems to be necessary. Given that a
>>> similar situation exists for AC-3 we should not hardcode aac; instead it
>>> should have two parameters for the float and the fixed decoder. Then it
>>> should decode the input file twice and do the same comparison that the
>>> current tests use (they use the oneoff method, which results in
>>> do_tiny_psnr with MAXDIFF being called).
>>> I think the tests for the fixed-point decoder (with checksums) should be
>>> separate, so that one can test this without the floating-point decoder
>>> being present.
>>>
>>>>>
>>>>>>>
>>>>>>> - Andreas
>>>>>>>
>>>>>>> PS: libfdk-aac produces a file that looks pretty much like the floating
>>>>>>> point decoder from git master. Are you sure your patch is correct?
>>>>>>
>>>>>> Yes, they duplicate the single channel in the stream and output it as
>>>>>> stereo, something that should be done by a filter if that's what the
>>>>>> user wants. Decoding a mono sample should generate a mono stream.
>>>>>
>>>>> Not really. The channels are different.
>>>>
>>>> ./ffmpeg -c:a libfdk_aac -i ../samples/aac/al_sbr_ps_04_new.mp4 -af
>>>> channelmap=channel_layout=mono:map=0 -f md5 -
>>>>
>>>> has the same result as
>>>>
>>>> ./ffmpeg -c:a libfdk_aac -i ../samples/aac/al_sbr_ps_04_new.mp4 -af
>>>> channelmap=channel_layout=mono:map=1 -f md5 -
>>>>
>>>> Same with the samples in the ticket.
>>>>
>>>
>>> This seems to be true for al_sbr_ps_04_new.mp4; but it is not true for
>>> al_sbr_ps_06_new.mp4.
>>
>> So looks like nearly a hundred samples into al_sbr_ps_06_new.mp4 frames
>> start containing PS info. With this patch the decoder properly decodes
>> the first hundred as mono, but since the decoder is locked, it will keep
>> decoding the stereo samples as mono.
>>
> 
> Hey all,
> 
> I thought I should share a little bit of context about this problem,
> but I don't mean to come back from nowhere and try to overrule you
> all. Do what you decide is best.
> 
> An HE-AACv2 decoder treating unsignaled mono as stereo is an
> intentional design trade-off that the MPEG audio committee made. It is
> a tradeoff that the FFmpeg decoder has mimicked for a number of years.
> If you want to revisit the trade-off (and there may very well be good
> reasons to do that) that's fine, but I think treating the current
> behavior as a "bug" is the wrong approach.
> 
> In fact, those fate tests are based on a Coding Technologies test
> suite designed to validate a decoder conformed to the MPEG behavior.
> 
> Parametric Stereo is nested inside the SBR extension after the main
> SBR payload which itself is nested inside the AAC raw data block after
> the main channel elements. It takes a full bitstream parse of both the
> AAC and SBR layers and finding an SBR intra-frame to even see if PS is
> present.
> 
> 
> As for why I think MPEG made this trade-off (my opinion of why they did this) :
> - It enables cutting (or joining a stream of) audio at arbitrary
> frames without losing PS content or stereo detection.
> - Most devices with mono output can support downmixing from stereo.
> - Down mixing stereo to mono can be ugly with regard to phase but it
> doesn't have nearly the complexity of the taste/environment/judgment
> factor that surround sound to stereo downmix does.
> - In transcode scenarios, most output codecs can support encoding
> stereo formatted audio where both channels are identical quite
> efficiently (even in AAC-LC with mid-side coding the overhead for
> mono-coded as stereo is a single digit number of bits per frame).
> - On playback on a stereo device it doesn't matter if the decoded
> signal is one channel played out of both speakers or two channels that
> are identical.
> - This behavior can be overridden with more complicated signaling the
> extradata (but this requires a transport that supports such signaling
> and doesn't simply wrap ADTS).
> - The folks working on iterating "MPEG-2 NBC" into the "MPEG-4 Audio"
> monstrosity were primarily focused on getting their new features used.

Thanks a lot for clarifying this. This "bug" has been pending for nearly 
a decade now...

So i withdraw this patch and I'll close the ticket as invalid. I'll then 
see if adding a downmix input option is feasible, but the user could 
just request to downmix to mono by a filter down the chain (which is 
what the audiomatch tests do), so maybe it's not that useful.