[FFmpeg-devel] [PATCH 4/9] avutil: introduce an Immersive Audio Model and Formats API

Anton Khirnov anton at khirnov.net
Thu Nov 30 13:01:31 EET 2023


Quoting James Almer (2023-11-26 02:28:53)
> diff --git a/libavutil/iamf.h b/libavutil/iamf.h
> new file mode 100644
> index 0000000000..1f4919efdb
> --- /dev/null
> +++ b/libavutil/iamf.h
> +enum AVIAMFAudioElementType {
> +    AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL,
> +    AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE,

'audio' in the names is redundant and makes already long identifiers
unnecessarily longer

> +};
> +
> +/**
> + * @defgroup lavf_iamf_params Parameter Definition
> + * @{
> + * Parameters as defined in section 3.6.1 and 3.8

of what?

> +/**
> + * Mix Gain Parameter Data as defined in section 3.8.1
> + *
> + * Subblocks in AVIAMFParamDefinition use this struct when the value or
> + * @ref AVIAMFParamDefinition.param_definition_type param_definition_type is
> + * AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN.
> + */
> +typedef struct AVIAMFMixGainParameterData {

Does 'ParameterData' at the end really serve any purpose?

> +    const AVClass *av_class;
> +
> +    // AVOption enabled fields
> +    unsigned int subblock_duration;
> +    enum AVIAMFAnimationType animation_type;
> +    AVRational start_point_value;
> +    AVRational end_point_value;
> +    AVRational control_point_value;
> +    unsigned int control_point_relative_time;

All these should really be documented. Also, some vertical alignment
would improve readability.

> +/**
> + * Parameters as defined in section 3.6.1

This really REALLY needs more documentation.

> + */
> +typedef struct AVIAMFParamDefinition {
> +    const AVClass *av_class;
> +
> +    size_t subblocks_offset;
> +    size_t subblock_size;
> +
> +    enum AVIAMFParamDefinitionType param_definition_type;
> +    unsigned int num_subblocks;

We use nb_foo generally.

> +AVIAMFParamDefinition *av_iamf_param_definition_alloc(enum AVIAMFParamDefinitionType param_definition_type,
> +                                                            AVDictionary **options,
> +                                                            unsigned int num_subblocks, AVDictionary **subblock_options,

What are the dicts for?

> + *
> + * When audio_element_type is AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, this
> + * corresponds to an Scalable Channel Layout layer as defined in section 3.6.2.
> + * For AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, it is an Ambisonics channel
> + * layout as defined in section 3.6.3
> + */
> +typedef struct AVIAMFLayer {
> +    const AVClass *av_class;
> +
> +    // AVOption enabled fields
> +    AVChannelLayout ch_layout;
> +
> +    unsigned int recon_gain_is_present;

Every time you dedicate 4 bytes to storing one bit, God kills a kitten.

> +    /**
> +     * Output gain flags as defined in section 3.6.2

It would be really really nice if people could understand the struct
contents without some external document.

> +     * This field is defined only if audio_element_type is

presumably the parent's audio_element_type

> +     * AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, must be 0 otherwise.
> +     */
> +    unsigned int output_gain_flags;
> +    /**
> +     * Output gain as defined in section 3.6.2
> +     *
> +     * Must be 0 if @ref output_gain_flags is 0.
> +     */
> +    AVRational output_gain;
> +    /**
> +     * Ambisonics mode as defined in section 3.6.3
> +     *
> +     * This field is defined only if audio_element_type is
> +     * AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, must be 0 otherwise.
> +     *
> +     * If 0, channel_mapping is defined implicitly (Ambisonic Order)
> +     * or explicitly (Custom Order with ambi channels) in @ref ch_layout.
> +     * If 1, @ref demixing_matrix must be set.
> +     */
> +    enum AVIAMFAmbisonicsMode ambisonics_mode;
> +
> +    // End of AVOption enabled fields

What purpose does this comment serve?

> +    /**
> +     * Demixing matrix as defined in section 3.6.3
> +     *
> +     * Set only if @ref ambisonics_mode == 1, must be NULL otherwise.
> +     */
> +    AVRational *demixing_matrix;

Who sets this?

> +typedef struct AVIAMFAudioElement {
> +    const AVClass *av_class;
> +
> +    AVIAMFLayer **layers;
> +    /**
> +     * Number of layers, or channel groups, in the Audio Element.
> +     * For audio_element_type AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, there
> +     * may be exactly 1.
> +     *
> +     * Set by av_iamf_audio_element_add_layer(), must not be
> +     * modified by any other code.
> +     */
> +    unsigned int num_layers;
> +
> +    unsigned int codec_config_id;

???

> +int av_iamf_audio_element_add_layer(AVIAMFAudioElement *audio_element, AVDictionary **options);

I would much prefer to have the caller call av_opt_set* manually rather
than sprinkle AVDictionary function arguments everywhere.
Do note that their usage in lavc and lavf APIs is out of necessity, not
because it's very pretty.

-- 
Anton Khirnov


More information about the ffmpeg-devel mailing list