[FFmpeg-devel] [PATCH 00/10 v4][RFC] AVCodecContext and AVCodecParameters side data
James Almer
jamrial at gmail.com
Mon Sep 25 23:18:57 EEST 2023
On 9/25/2023 5:02 PM, Paul B Mahol wrote:
> On 9/25/23, James Almer <jamrial at gmail.com> wrote:
>> On 9/14/2023 1:40 PM, James Almer wrote:
>>> On 9/14/2023 12:43 PM, Anton Khirnov wrote:
>>>> Quoting James Almer (2023-09-06 19:44:20)
>>>>> Changes since the previous version:
>>>>> - Removed the AVPacketSideDataSet from AVCodecContext, using instead the
>>>>> existing coded_side_data field, at Anton's request.
>>>>>
>>>>> I still prefer using a new field of type AVPacketSideDataSet, given
>>>>> that the
>>>>> user can currently only fill coded_side_data manually (See the
>>>>> implementation
>>>>> in the codecpar copy functions, which non lavf users would need to
>>>>> replicate),
>>>>> and more so considering coded_side_data is a field pretty much
>>>>> nothing but lavf
>>>>> uses.
>>>>>
>>>>> Opinions are very welcome and needed.
>>>>
>>>> To provide more context on the issue:
>>>>
>>>> AVPackets can have side data attached to them, in the form of
>>>> AVPacket.{side_data,side_data_elems} array+count.
>>>>
>>>> Some of this side data can occasionally be stored in global headers,
>>>> e.g. in containers or extradata. We have some limited support for this:
>>>> * AVStream.{side_data,nb_side_data} array+count allows demuxers to
>>>> export this to user, and muxers to accept it from the user
>>>> * AVCodecContext.{coded_side_data,nb_coded_side_data} allows decoders to
>>>> accept it from the user (in principle, in practice it is not used),
>>>> and encoders to export it to the user (this is used, but not very
>>>> much)
>>>>
>>>> The broad idea for this set is adding stream-global "coded/packet" side
>>>> data to AVCodecParameters, so that it can be naturally communicated from
>>>> demuxers to bitstream filters to decoders, and from encoders to
>>>> bitstream filters to muxers. Since AVStream already contains an
>>>> AVCodecParameters instance, the abovementioned AVStream.side_data
>>>> becomes redundant and is deprecated, the muxer/demuxer stream-global
>>>> side data would be passed through AVStream.codecpar.
>>>>
>>>> The original version proposed by James adds a new struct, that bundles
>>>> the side data array+count, and a set of helpers operating on this
>>>> struct. Then this new struct is added to AVCodecParameters and
>>>> AVCodecContext, which replaces the abovementioned AVStream and
>>>> AVCodecContext side data array+count. Notably AVPacket is left as-is,
>>>> because its side data is widely used and replacing array+count by this
>>>> new struct would be a very intrusive break for little gain.
>>>
>>> In the future, packet side data could be made ref counted. For this,
>>> we'd need to add a new AVBufferRef field to AVPacketSideData, which is
>>> currently tied to the ABI.
>>> Ideally, if this happens, sizeof(AVPacketSideData) would be made not a
>>> part of the ABI, putting it in line with the AVFrame counterpart. Doing
>>> this is pretty much not an option for the existing array+count fields in
>>> AVPacket and AVCodecContext, because everyone, from external API users
>>> to our own API users like the CLI last i checked, loop through it
>>> manually as nothing prevents them to. But if a new field of the new set
>>> struct is added as replacement (a struct where the AVPacketSideData
>>> field is an array of pointers, and whose helpers can be defined as "must
>>> use to handle this struct"), then this would not be an issue.
>>>
>>> I probably said it several times, including in the above paragraph, but
>>> one of my objectives is trying to sync frame and packet side data
>>> structs, fields and helpers, since right now both are considerably
>>> different.
>>> Jan has a patchset also introducing a side data set for frames, so this
>>> is the time to try and align both as much as possible (which i already
>>> talked to him about) and laying the ground for future changes.
>>>
>>>>
>>>> My objections to this approach are
>>>> * coded side data is now stored differently in AVPacket and in
>>>> everything else
>>>> * the case for replacing AVCodecContext.coded_side_data does not seem
>>>> sufficiently strong to me
>>>
>>> It's used only by a handful of encoders to export CPB side data, which
>>> probably only lavf looks at, and by no decoder at all. It's a field that
>>> would be completely painless to replace.
>>>
>>>>
>>>> My alternative suggestion was:
>>>> * avoid adding a new struct (which merely bundles array+count)
>>>> * add a set of helpers that accept array+count as parameters and operate
>>>> on them
>>>
>>> See above my comment about sizeof(AVPacketSideData) and users accessing
>>> the array it directly.
>>>
>>>> * use these helpers for all operations on side data across AVPacket,
>>>> AVCodecContext, and AVCodecParameters
>>>>
>>>> I have a moderately strong preference for this approach, but James
>>>> disagrees. More opinions are very much welcome.
>>
>> Ping. We really need opinions on what approach to use.
>>
>> Here's some visual representation of what is being discussed.
>> Currently, things look like this:
>>
>> #########
>> Side data definition
>> -----
>> struct AVPacketSideData {
>> uint8_t *data;
>> size_t size;
>> enum AVPacketSideDataType type;
>> };
>> -----
>>
>> Users
>> -----
>> struct AVCodecContext {
>> [...]
>> }
>>
>> struct AVCodecParameters {
>> [...]
>> }
>>
>> struct AVPacket {
>> [...]
>> AVPacketSideData *side_data;
>> int side_data_elems;
>> [...]
>> }
>> uint8_t* av_packet_new_side_data(AVPacket *pkt, enum
>> AVPacketSideDataType type,
>> size_t size);
>> int av_packet_add_side_data(AVPacket *pkt,
>> enum AVPacketSideDataType type,
>> uint8_t *data, size_t size);
>> uint8_t* av_packet_get_side_data(const AVPacket *pkt,
>> enum AVPacketSideDataType type,
>> size_t *size);
>>
>> struct AVStream {
>> [...]
>> AVCodecParameters *codec_par;
>> AVPacketSideData *side_data;
>> int nb_side_data;
>> [...]
>> };
>> uint8_t* av_stream_new_side_data(AVStream *stream,
>> enum AVPacketSideDataType type,
>> size_t size);
>> int av_stream_add_side_data(AVStream *st,
>> enum AVPacketSideDataType type,
>> uint8_t *data, size_t size);
>> uint8_t* av_stream_get_side_data(const AVStream *stream,
>> enum AVPacketSideDataType type,
>> size_t *size);
>> -----
>> #########
>>
>> The only way for global side data exported by demuxers to hit decoders
>> or bsfs is by injecting it into the first output packet. This means it's
>> not available at init().
>> The solution is obviously being able to store the side data in the
>> decoder context, which the caller would copy from the demuxer context.
>> My suggestion is to introduce a new struct to hold a set of side data,
>> and helpers that work with it, whereas Anton's suggestion is to just
>> storing the pointer to side data array and side data array size directly
>> without wrapping them.
>>
>> My suggestion would have things look like this:
>>
>> #########
>> Side data definition
>> -----
>> struct AVPacketSideData {
>> uint8_t *data;
>> size_t size;
>> enum AVPacketSideDataType type;
>> };
>>
>> struct AVPacketSideDataSet {
>> AVPacketSideData **sd;
>> int nb_sd;
>> };
>>
>> AVPacketSideData *av_packet_side_data_set_new(AVPacketSideDataSet *set,
>> enum AVPacketSideDataType type,
>> size_t size);
>> AVPacketSideData *av_packet_side_data_set_add(AVPacketSideDataSet *set,
>> enum AVPacketSideDataType type,
>> uint8_t *data, size_t size);
>> AVPacketSideData *av_packet_side_data_set_get(AVPacketSideDataSet *set,
>> enum AVPacketSideDataType type);
>> void av_packet_side_data_set_remove(AVPacketSideDataSet *set,
>> enum AVPacketSideDataType type);
>> void av_packet_side_data_set_uninit(AVPacketSideDataSet *set);
>> -----
>>
>> Users
>> -----
>> struct AVCodecContext {
>> [...]
>> AVPacketSideDataSet *side_data;
>> [...]
>> }
>>
>> struct AVCodecParameters {
>> [...]
>> AVPacketSideDataSet *side_data;
>> [...]
>> }
>>
>> AVPacket and its helpers untouched.
>>
>> struct AVStream {
>> [...]
>> AVCodecParameters *codec_par;
>> [...]
>> };
>> -----
>>
>> In which you'd do for example
>>
>> av_packet_side_data_set_new(avctx->side_data,
>> AV_PKT_DATA_DISPLAYMATRIX,
>> sizeof(int32_t) * 9);
>> av_packet_side_data_set_new(st->codec_par->side_data,
>> AV_PKT_DATA_DISPLAYMATRIX,
>> sizeof(int32_t) * 9);
>> av_packet_side_data_set_get(avctx->side_data,
>> AV_PKT_DATA_DISPLAYMATRIX);
>> av_packet_side_data_set_get(st->codec_par->side_data,
>> AV_PKT_DATA_DISPLAYMATRIX);
>> #########
>>
>> Whereas Anton's would have things look like this:
>>
>> #########
>> Side data definition
>> -----
>> struct AVPacketSideData {
>> uint8_t *data;
>> size_t size;
>> enum AVPacketSideDataType type;
>> };
>>
>> AVPacketSideData *av_packet_side_data_set_new(
>> AVPacketSideData **sd, // INOUT
>> size_t *sd_size, // INOUT
>> enum AVPacketSideDataType type,
>> size_t size);
>> AVPacketSideData *av_packet_side_data_set_add(
>> AVPacketSideData **sd, // INOUT
>> size_t *sd_size, // INOUT
>> enum AVPacketSideDataType type,
>> uint8_t *data, size_t size);
>> AVPacketSideData *av_packet_side_data_set_get(AVPacketSideData *sd,
>> size_t sd_size,
>> enum AVPacketSideDataType type);
>> void av_packet_side_data_set_remove(AVPacketSideData *sd,
>> size_t *sd_size,
>> enum AVPacketSideDataType type);
>> void av_packet_side_data_set_uninit(AVPacketSideData *sd);
>> -----
>>
>> Users
>> -----
>> struct AVCodecContext {
>> [...]
>> AVPacketSideData *coded_side_data;
>> int nb_coded_side_data;
>> [...]
>> }
>>
>> struct AVCodecParameters {
>> [...]
>> AVPacketSideData *side_data;
>> int nb_side_data;
>> [...]
>> }
>>
>> AVPacket and its helpers untouched.
>>
>> struct AVStream {
>> [...]
>> AVCodecParameters *codec_par;
>> [...]
>> };
>> -----
>>
>> In which you'd do for example
>>
>> av_packet_side_data_set_new(&avctx->coded_side_data,
>> &avctx->nb_coded_side_data,
>> AV_PKT_DATA_DISPLAYMATRIX,
>> av_packet_side_data_set_new(&st->codec_par->side_data,
>> &st->codec_par->nb_side_data,
>> AV_PKT_DATA_DISPLAYMATRIX,
>> sizeof(int32_t) * 9);
>> av_packet_side_data_set_get(avctx->coded_side_data,
>> avctx->nb_coded_side_data,
>> AV_PKT_DATA_DISPLAYMATRIX);
>> av_packet_side_data_set_get(st->codec_par->side_data,
>> st->codec_par->nb_side_data,
>> AV_PKT_DATA_DISPLAYMATRIX);
>> #########
>>
>> Does anyone have a preference?
>
> Addition of newer API to just reduce number of arguments when
> using it looks to me too much for little gain.
The difference also lies in the fact the new struct will hold an array
of pointers of AVPacketSideData, meaning sizeof(AVPacketSideData) can be
made non ABI fixed in the long run, whereas the side data in avctx and
codecpar are and will be an array of AVPacketSideData, meaning
sizeof(AVPacketSideData) must remain tied to the ABI.
This means adding new fields becomes much harder, and we lose an
opportunity to put AVPacketSideData in line with AVFrameSideData.
And as i stated, Jan will add his own set of side data, for
AVFrameSideData, and that one needs to be in a new set struct as the
alternative is having helpers with arguments like "AVFrameSideData
***side_data".
More information about the ffmpeg-devel
mailing list