[FFmpeg-devel] [PATCH 000/279 v2] New channel layout API

Sun Dec 19 13:35:11 EET 2021

On Sat, 18 Dec 2021, Michael Niedermayer wrote:

> On Sat, Dec 18, 2021 at 02:36:12PM +0100, Michael Niedermayer wrote:
>> On Fri, Dec 17, 2021 at 07:04:08PM +0100, Marton Balint wrote:
>>>
>>>
>>> On Fri, 17 Dec 2021, Michael Niedermayer wrote:
>>>
>>>> On Fri, Dec 17, 2021 at 01:04:19AM +0100, Marton Balint wrote:
>>>>>
>>>>>
>>>>> On Thu, 16 Dec 2021, James Almer wrote:
>>>>>
>>>>>> Resending the first two patches only, since this is meant to
>>>>>> show the implementation of one of the several suggestions made
>>>>>> in the previous set that need to be discussed and hopefully
>>>>>> resolved in a call.
>>>>>
>>>>> Can you push the full branch somewhere?
>>>>>
>>>>>>
>>>>>> The proposals so far to extend the API to support either custom
>>>>>> labels for channels, or some form of extra user information, are:
>>>>>>
>>>>>> - Fixed array of bytes to hold a label. Simple solution, but
>>>>>>  the labels will have a hard limit that can only be extended
>>>>>>  with a major bump. This is what I implemented in this version.
>>>>>> - "char *name" per channel that the user may allocate and the
>>>>>>  API will manage, duplicate and free. Simple solution, and the
>>>>>>  name can be arbitrarily long, but inefficient (av_strdup() per
>>>>>>  channel with a custom label on layout copy).
>>>>>> - "const char *name" per channel for compile time constants, or
>>>>>>  that the user may allocate and free. Very efficient, but for
>>>>>>  non compile time strings ensuring they outlive the layout can
>>>>>>  be tricky.
>>>>>> - Refcounted AVChannelCustom with a dictionary. This can't be
>>>>>>  done with AVBufferRef, so it would require some other form
>>>>>>  of reference counting. And a dictionary may add quite a bit of
>>>>>>  complexity to the API, as you can set anything on them.
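
To make the copy-cost difference between the first two proposals concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: FixedChannel, OwnedChannel, LABEL_SIZE and the helper names are hypothetical and are not the structs or limits from the patch set.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LABEL_SIZE 16 /* hypothetical hard limit, not the value from the patch */

/* Proposal 1: fixed array of bytes. Copying is plain struct assignment. */
typedef struct FixedChannel {
    int  id;
    char name[LABEL_SIZE];
} FixedChannel;

/* Proposal 2: heap string owned by the channel. Copying needs an allocation. */
typedef struct OwnedChannel {
    int   id;
    char *name;
} OwnedChannel;

/* Store a label, silently truncating it to LABEL_SIZE - 1 characters. */
static void fixed_set_name(FixedChannel *ch, const char *name)
{
    snprintf(ch->name, LABEL_SIZE, "%s", name);
}

/* Deep copy: one allocation per labelled channel. Returns 0 on success, -1 on failure. */
static int owned_copy(OwnedChannel *dst, const OwnedChannel *src)
{
    dst->id   = src->id;
    dst->name = NULL;
    if (src->name) {
        size_t len = strlen(src->name) + 1;
        dst->name = malloc(len);
        if (!dst->name)
            return -1;
        memcpy(dst->name, src->name, len);
    }
    return 0;
}

/* Release the label owned by the channel. */
static void owned_free(OwnedChannel *ch)
{
    free(ch->name);
    ch->name = NULL;
}
```

With the fixed array, `b = a;` is a complete copy but long labels are cut off at the limit. With the owned string the label can be any length, at the price of one allocation per labelled channel on every layout copy.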
>>>>>
>>>>> Until we have proper refcounting API we can make the AVBufferRef in
>>>>> AVChannelLayout a void *, and only allow channel_layout functions to
>>>>> dereference it as an AVBufferRef. This would mean adding some extra helper
>>>>> functions to channel layout, but overall it is not unsolvable.
>>>>>
>>>>> The real question is whether you want to use refcounting and add helpers
>>>>> to query / replace per-channel metadata, or whether you find the idea too
>>>>> heavyweight and would like to stick to flat structs.
>>>>
>>>> what is the advantage of refcounting for channel metadata ?
>>>> is it about the used memory, about the reduced need to copy ?
>>>
>>> Basically it is the ability to store per-channel metadata in an
>>> AVDictionary, because otherwise it would have to be copied, and
>>> AVDictionary is very inefficient at copying because of many mallocs.
>>>
>>>>
>>>> what kind of metadata and what size do you expect ?
>>>> bytes, kilobytes, megabytes, gigabytes per channel ?
>>>
>>> Usually, nothing, because most formats don't have support for per-channel
>>> metadata. In some cases it is going to be a couple of textual metadata
>>> key-value pairs, such as language, label, group, speaker, position, so 4-5
>>> dynamically allocated string pairs, plus the AVDictionary itself, multiplied
>>> by the number of channels in a layout.
>>>
>>>>
>>>> what is the overhead for dynamic allocation and ref counting?
>>>> that is at which point does it even make sense ?
>>>
>>> I don't have exact measurements. It is generally felt that copying
>>> AVDictionary per-channel is a huge overhead for something as lightweight as
>>> an audio frame, which is 2-4 kB per channel at most and only a couple of
>>> allocs, usually not dependent on the number of channels. That's why
>>> refcounting was proposed.
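
As a rough illustration of why refcounting makes the copy cheap, here is a minimal sketch. MetaRef and its helpers are hypothetical names invented for this example, not an existing FFmpeg API; the text field merely stands in for a per-channel dictionary, and the plain int counter is not thread-safe (a real implementation would need atomics).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shared metadata blob standing in for a per-channel dictionary. */
typedef struct MetaRef {
    int   refcount; /* plain int: not thread-safe, illustration only */
    char *text;     /* e.g. "language=eng" */
} MetaRef;

/* Allocate the blob once; this is the only place that duplicates the data. */
static MetaRef *meta_create(const char *text)
{
    size_t   len = strlen(text) + 1;
    MetaRef *m   = malloc(sizeof(*m));
    if (!m)
        return NULL;
    m->text = malloc(len);
    if (!m->text) {
        free(m);
        return NULL;
    }
    memcpy(m->text, text, len);
    m->refcount = 1;
    return m;
}

/* "Copying" is a counter increment: no allocation, no string duplication. */
static MetaRef *meta_ref(MetaRef *m)
{
    m->refcount++;
    return m;
}

/* Drop one reference and clear the caller's pointer; free on the last one. */
static void meta_unref(MetaRef **pm)
{
    MetaRef *m = *pm;
    *pm = NULL;
    if (m && --m->refcount == 0) {
        free(m->text);
        free(m);
    }
}
```

A frame copy would then call meta_ref() once per channel instead of duplicating every key-value pair, which is the saving the refcounting proposal is after.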
>>
>> I was thinking more at an AVStream / AVCodecParameters level.
>
>> How will a demuxer transport such metadata over an AVPacket into a decoder
>> outputting metadata-filled AVFrames?
>
> or is this never needed ?

I am not sure I understand. Usually metadata is passed from the demuxer to
the decoder by avcodec_parameters_to_context(); this is used for all
metadata which is in AVCodecParameters.

For per-packet metadata, ff_decode_frame_props() has some automatic packet
side data -> frame side data transfer.

AVStream side data may be transferred to AVPacket side data if
av_format_inject_global_side_data() is used, but it is not enabled by
default.

Regards,
Marton