[FFmpeg-devel] [PATCH v2 8/8] aacdec: add a decoder for AAC USAC (xHE-AAC)
Marton Balint
cus at passwd.hu
Tue May 21 23:12:03 EEST 2024
On Tue, 21 May 2024, Lynne via ffmpeg-devel wrote:
> On 21/05/2024 21:40, Marton Balint wrote:
>>
>>
>> On Tue, 21 May 2024, Lynne via ffmpeg-devel wrote:
>>
>>> On 21/05/2024 09:16, Marton Balint wrote:
>>>>
>>>>
>>>> On Sun, 19 May 2024, Lynne via ffmpeg-devel wrote:
>>>>
>>>>> On 19/05/2024 21:39, Marton Balint wrote:
>>>>>>
>>>>>>
>>>>>> On Sun, 19 May 2024, Lynne via ffmpeg-devel wrote:
>>>>>>
>>>>>>> This commit adds a decoder for the frequency-domain part of USAC.
>>>>>>>
>>>>>> [...]
>>>>>>
>>>>>>>
>>>>>>> +/* Finish later */
>>>>>>> +static const enum AVChannel usac_ch_pos_to_av[64] = {
>>>>>>> + [0] = AV_CHAN_FRONT_LEFT,
>>>>>>> + [1] = AV_CHAN_FRONT_RIGHT,
>>>>>>> + [2] = AV_CHAN_FRONT_CENTER,
>>>>>>> + [3] = AV_CHAN_LOW_FREQUENCY,
>>>>>>> + [4] = AV_CHAN_BACK_LEFT, // unsure
>>>>>>> + [5] = AV_CHAN_BACK_RIGHT, // unsure
>>>>>>> + [6] = AV_CHAN_FRONT_LEFT_OF_CENTER,
>>>>>>> + [7] = AV_CHAN_FRONT_RIGHT_OF_CENTER,
>>>>>>> + [8] = 0, /* rear surround left is missing */
>>>>>>> + [9] = 0, /* rear surround right is missing */
>>>>>>> + [10] = AV_CHAN_BACK_CENTER,
>>>>>>> + [11] = AV_CHAN_SURROUND_DIRECT_LEFT,
>>>>>>> + [12] = AV_CHAN_SURROUND_DIRECT_RIGHT,
>>>>>>> + [13] = AV_CHAN_SIDE_LEFT, // fairly sure
>>>>>>> + [14] = AV_CHAN_SIDE_RIGHT, // fairly sure
>>>>>>> + [15] = AV_CHAN_WIDE_LEFT, // somewhat confident
>>>>>>> + [16] = AV_CHAN_WIDE_RIGHT, // somewhat confident
>>>>>>> + [17] = AV_CHAN_TOP_FRONT_LEFT,
>>>>>>> + [18] = AV_CHAN_TOP_FRONT_RIGHT,
>>>>>>> + [19] = AV_CHAN_TOP_FRONT_CENTER,
>>>>>>> + [20] = AV_CHAN_TOP_BACK_LEFT,
>>>>>>> + [21] = AV_CHAN_TOP_BACK_RIGHT,
>>>>>>> + [22] = AV_CHAN_TOP_BACK_CENTER,
>>>>>>> + [23] = AV_CHAN_TOP_SIDE_LEFT,
>>>>>>> + [24] = AV_CHAN_TOP_SIDE_RIGHT,
>>>>>>> + [25] = AV_CHAN_TOP_CENTER,
>>>>>>> + [26] = AV_CHAN_LOW_FREQUENCY, // actually LFE2
>>>>>>> + [27] = AV_CHAN_BOTTOM_FRONT_LEFT,
>>>>>>> + [28] = AV_CHAN_BOTTOM_FRONT_RIGHT,
>>>>>>> + [29] = AV_CHAN_BOTTOM_FRONT_CENTER,
>>>>>>> + [30] = 0, /* top left surround is missing */
>>>>>>> + [31] = 0, /* top right surround is missing */
>>>>>>> +};
>>>>>>
>>>>>> Some comment would be nice about the source of this table (which
>>>>>> document,
>>>>>> which table).
>>>>>>
>>>>>> It looks very similar to the ISO channel positons used in mov_chan.
>>>>>> I
>>>>>> think we follow this mapping in most cases:
>>>>>>
>>>>>> Left Surround is SIDE_LEFT
>>>>>> Right Surround is SIDE_RIGHT
>>>>>> Rear Surround Left is BACK_LEFT
>>>>>> Rear Surround Right is BACK_RIGHT
>>>>>>
>>>>>> So in your table [4] and [5] should be SIDE, [8] and [9] should be
>>>>>> BACK.
>>>>>> [26] can be AV_CHAN_LOW_FREQUENCY_2, we do have that.
>>>>>>
>>>>>> Yes, Left/Right Surround and Left/Right Side Surround will be the
>>>>>> same,
>>>>>> but those are not present in commonly used layouts at the same time.
>>>>>>
>>>>>> Regards,
>>>>>> Marton
>>>>>> _______________________________________________
>>>>>> ffmpeg-devel mailing list
>>>>>> ffmpeg-devel at ffmpeg.org
>>>>>> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>>>>>
>>>>>> To unsubscribe, visit link above, or email
>>>>>> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
>>>>>
>>>>> Source of the table is ISO/IEC 23003-3, Table 74 — bsOutputChannelPos:
>>>>>
>>>>> 0 L left front FL front left
>>>>> 1 R right front FR front right
>>>>> 2 C center front FC front centre
>>>>> 3 LFE low frequency enhancement LFE1 low frequency effects-1
>>>>> 4 Ls left surround LS left surround
>>>>> 5 Rs right surround RS right surround
>>>>> 6 Lc left front center FLc front left centre
>>>>> 7 Rc right front center FRc front right centre
>>>>> 8 Lsr rear surround left BL back left
>>>>> 9 Rsr rear surround right BR back right
>>>>> 10 Cs rear center BC back centre
>>>>> 11 Lsd left surround direct LSd left surround direct
>>>>> 12 Rsd right surround direct RSd right surround direct
>>>>> 13 Lss left side surround SL side left
>>>>> 14 Rss right side surround SR side right
>>>>> 15 Lw left wide front FLw front left wide
>>>>> 16 Rw right wide front FRw front right wide
>>>>> 17 Lv left front vertical height TpFL top front left
>>>>> 18 Rv right front vertical height TpFR top front right
>>>>> 19 Cv center front vertical height TpFC top front centre
>>>>> 20 Lvr left surround vertical height rear TpBL top back left
>>>>> 21 Rvr right surround vertical height rear TpBR top back right
>>>>> 22 Cvr center vertical height rear TpBC top back centre
>>>>> 23 Lvss left vertical height side surround TpSiL top side left
>>>>> 24 Rvss right vertical height side surround TpSiR top side right
>>>>> 25 Ts top center surround TpC top centre
>>>>> 26 LFE2 low frequency enhancement 2 LFE2 low frequency effects-2
>>>>> 27 Lb left front vertical bottom BtFL bottom front left
>>>>> 28 Rb right front vertical bottom BtFR bottom front right
>>>>> 29 Cb center front vertical bottom BtFC bottom front centre
>>>>> 30 Lvs left vertical height surround TpLS top left surround
>>>>> 31 Rvs right vertical height surround TpRS top right surround
>>>>>
>>>>> Third field is "Loudspeaker position", last field is "Loudspeaker
>>>>> position according to IEC 100/1706/CDV/IEC 62574 (TC100)", each
>>>>> prefixed
>>>>> with an abbreviation.
>>>>>
>>>>> I've added the source to the table comment in the code.
>>>>>
>>>>> I've also fixed the SIDE/BACK/LFE2 issue in my github repo I linked
>>>>> earlier.
>>>>
>>>> Thanks. Later in the code when you actually use this I can see that you
>>>> are creating a native layout:
>>>>
>>>>> + channel_config_idx = get_bits(gb, 5); /*
>>>>> channelConfigurationIndex
>>>>> */
>>>>> + if (!channel_config_idx) {
>>>>> + /* UsacChannelConfig() */
>>>>> + uint8_t channel_pos[64];
>>>>> + uint8_t nb_channels = get_escaped_value(gb, 5, 8, 16); /*
>>>>> numOutChannels */
>>>>> + if (nb_channels >= 64)
>>>>> + return AVERROR(EINVAL);
>>>>> +
>>>>> + av_channel_layout_uninit(&ac->oc[1].ch_layout);
>>>>> + for (int i = 0; i < nb_channels; i++)
>>>>> + channel_pos[i] = get_bits(gb, 5); /* bsOutputChannelPos
>>>>> */
>>>>> +
>>>>> + ac->oc[1].ch_layout.order = AV_CHANNEL_ORDER_NATIVE;
>>>>> + ac->oc[1].ch_layout.nb_channels = nb_channels;
>>>>> + ac->oc[1].ch_layout.u.mask = 0;
>>>>> +
>>>>> + for (int i = 0; i < nb_channels; i++)
>>>>> + ac->oc[1].ch_layout.u.mask |= 1 <<
>>>>> usac_ch_pos_to_av[channel_pos[i]];
>>>>> +
>>>>> + av_channel_layout_copy(&avctx->ch_layout,
>>>>> & ac->oc[1].ch_layout);
>>>>> + } else {
>>>>
>>>> Probably you should create a custom layout here, because the channels
>>>> are
>>>> not necessary in native order. We already have a relatively simple way
>>>> to
>>>> do that and to fall back to native layouts if possible, here is an
>>>> example
>>>> copied from mov_chan:
>>>>
>>>> ret = av_channel_layout_custom_init(ch_layout, channels);
>>>> if (ret < 0)
>>>> return ret;
>>>> for (i = 0; i < channels; i++) {
>>>> enum AVChannel id = layout_map[i].id;
>>>> ch_layout->u.map[i].id = (id != AV_CHAN_NONE ? id :
>>>> AV_CHAN_UNKNOWN);
>>>> }
>>>> return av_channel_layout_retype(ch_layout, 0,
>>>> AV_CHANNEL_LAYOUT_RETYPE_FLAG_CANONICAL);
>>>>
>>>> So you should adapt this accodingly to aac.
>>>>
>>>> Regards,
>>>> Marton
>>>> _______________________________________________
>>>> ffmpeg-devel mailing list
>>>> ffmpeg-devel at ffmpeg.org
>>>> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>>>
>>>> To unsubscribe, visit link above, or email
>>>> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
>>>
>>> According to the spec:
>>>
>>>> In case of multiple channel elements the index i of
>>>> bsOutputChannelPos[i] indicates the position in which the channel
>>>> appears in the bitstream.
>>>
>>> So the channels will always be in native order, as far as I understand.
>>>
>>
>> AV_CHANNEL_ORDER_NATIVE expects channels in the order of the channel IDs
>> in the AVChannel enum. Surely that is not the case here, unless the
>> decoder reorders the channels. Or what if an output position appears
>> multiple times?
>>
>> Regards,
>> Marton
>> _______________________________________________
>> ffmpeg-devel mailing list
>> ffmpeg-devel at ffmpeg.org
>> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>
>> To unsubscribe, visit link above, or email
>> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
>
>
> It should be the case here, we shouldn't need reordering as NATIVE just lets
> you specify what order the elements appear in the bitstream.
I don't get it.
bsOutputChannelPos[0] = 0
bsOutputChannelPos[1] = 1
will map to the same AVChannelLayout layout as
bsOutputChannelPos[0] = 1
bsOutptuChannelPos[1] = 0
which will be AV_CHANNEL_ORDER_NATIVE with a mask of 0x3, which means the
first channel is LEFT, the second channel is RIGHT.
Regards,
Marton
More information about the ffmpeg-devel
mailing list