[FFmpeg-devel] [PATCH v2] avcodec: add a get_encoder_buffer() callback to AVCodecContext

Fri Mar 12 22:11:54 EET 2021

James Almer:
> On 3/12/2021 4:46 PM, James Almer wrote:
>> On 3/12/2021 4:30 PM, Michael Niedermayer wrote:
>>> On Fri, Mar 12, 2021 at 02:03:52PM -0300, James Almer wrote:
>>>> On 3/12/2021 1:32 PM, Michael Niedermayer wrote:
>>>>> On Thu, Mar 11, 2021 at 02:18:36PM -0300, James Almer wrote:
>>>>>> On 3/11/2021 1:35 PM, Michael Niedermayer wrote:
>>>>>>> On Wed, Mar 10, 2021 at 05:59:11PM -0300, James Almer wrote:
>>>>>>>> On 3/10/2021 5:18 PM, Michael Niedermayer wrote:
>>>>>>>>> On Mon, Feb 22, 2021 at 07:27:34PM -0300, James Almer wrote:
>>>>>>>>>> On 2/21/2021 6:04 PM, James Almer wrote:
>>>>>>>>>>> On 2/21/2021 5:29 PM, Mark Thompson wrote:
>>>>>>>>>>>> On 21/02/2021 20:00, James Almer wrote:
>>>>>>>>>>>>> On 2/21/2021 4:13 PM, Mark Thompson wrote:
>>>>>>>>>>>>>> On 21/02/2021 17:35, James Almer wrote:
>>>>>>>>>>>>>>> This callback is functionally the same as get_buffer2()
>>>>>>>>>>>>>>> is for decoders, and
>>>>>>>>>>>>>>> implements for the new encode API the functionality of
>>>>>>>>>>>>>>> the old encode API had
>>>>>>>>>>>>>>> where the user could provide their own buffers.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Signed-off-by: James Almer <jamrial at gmail.com>
>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>> Used the names Lynne suggested this time, plus a line
>>>>>>>>>>>>>>> about how the callback
>>>>>>>>>>>>>>> must be thread safe.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>       libavcodec/avcodec.h | 45
>>>>>>>>>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>>>>>>>>>       libavcodec/codec.h   |  8 ++++---
>>>>>>>>>>>>>>>       libavcodec/encode.c  | 54
>>>>>>>>>>>>>>> +++++++++++++++++++++++++++++++++++++++++++-
>>>>>>>>>>>>>>>       libavcodec/encode.h  |  8 +++++++
>>>>>>>>>>>>>>>       libavcodec/options.c |  1 +
>>>>>>>>>>>>>>>       5 files changed, 112 insertions(+), 4 deletions(-)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
>>>>>>>>>>>>>>> index 7dbf083a24..e60eb16ce1 100644
>>>>>>>>>>>>>>> --- a/libavcodec/avcodec.h
>>>>>>>>>>>>>>> +++ b/libavcodec/avcodec.h
>>>>>>>>>>>>>>> @@ -513,6 +513,11 @@ typedef struct
>>>>>>>>>>>>>>> AVProducerReferenceTime {
>>>>>>>>>>>>>>>        */
>>>>>>>>>>>>>>>       #define AV_GET_BUFFER_FLAG_REF (1 << 0)
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * The encoder will keep a reference to the packet and
>>>>>>>>>>>>>>> may reuse it later.
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +#define AV_GET_ENCODER_BUFFER_FLAG_REF (1 << 0)
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>       struct AVCodecInternal;
>>>>>>>>>>>>>>>       /**
>>>>>>>>>>>>>>> @@ -2346,6 +2351,39 @@ typedef struct AVCodecContext {
>>>>>>>>>>>>>>>            * - encoding: set by user
>>>>>>>>>>>>>>>            */
>>>>>>>>>>>>>>>           int export_side_data;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +    /**
>>>>>>>>>>>>>>> +     * This callback is called at the beginning of each
>>>>>>>>>>>>>>> packet to get a data
>>>>>>>>>>>>>>> +     * buffer for it.
>>>>>>>>>>>>>>> +     *
>>>>>>>>>>>>>>> +     * The following field will be set in the packet
>>>>>>>>>>>>>>> before this callback is
>>>>>>>>>>>>>>> +     * called:
>>>>>>>>>>>>>>> +     * - size
>>>>>>>>>>>>>>> +     * This callback must use the above value to
>>>>>>>>>>>>>>> calculate the required buffer size,
>>>>>>>>>>>>>>> +     * which must padded by at least
>>>>>>>>>>>>>>> AV_INPUT_BUFFER_PADDING_SIZE bytes.
>>>>>>>>>>>>>>> +     *
>>>>>>>>>>>>>>> +     * This callback must fill the following fields in
>>>>>>>>>>>>>>> the packet:
>>>>>>>>>>>>>>> +     * - data
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is the data pointer allowed to be in write-only memory?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm not sure what the use case for this would be, so
>>>>>>>>>>>>> probably no?
>>>>>>>>>>>>
>>>>>>>>>>>> The two use-cases I see for this API are:
>>>>>>>>>>>>
>>>>>>>>>>>> * You want to avoid a copy when combining the output with
>>>>>>>>>>>> something
>>>>>>>>>>>> else.  E.g. you pass a pointer to the block of memory following
>>>>>>>>>>>> where you are going to put your header data (for something
>>>>>>>>>>>> you are
>>>>>>>>>>>> going to send over the network, say).
>>>>>>>>>>>>
>>>>>>>>>>>> * You want to avoid a copy when passing the output directly to
>>>>>>>>>>>> something external.  E.g. you pass a pointer to a memory-mapped
>>>>>>>>>>>> device buffer (such as a V4L2 buffer, say).
>>>>>>>>>>>>
>>>>>>>>>>>> In the second case, write-only memory on an external device
>>>>>>>>>>>> seems
>>>>>>>>>>>> possible, as does memory which is, say, readable but
>>>>>>>>>>>> uncached, so
>>>>>>>>>>>> reading it is a really bad idea.
>>>>>>>>>>>
>>>>>>>>>>> Allowing the second case would depend on how encoders behave.
>>>>>>>>>>> Some may
>>>>>>>>>>> attempt to read data already written to the output packet.
>>>>>>>>>>> It's not like
>>>>>>>>>>> all of them allocate the packet, do a memcpy from an internal
>>>>>>>>>>> buffer,
>>>>>>>>>>> then return.
>>>>>>>>>>> There is also the flag meant to signal that the encoder will
>>>>>>>>>>> keep a
>>>>>>>>>>> reference to the packet around, which more or less implies it
>>>>>>>>>>> will be
>>>>>>>>>>> read later in the encoding process.
>>>>>>>>>>>
>>>>>>>>>>> The doxy for avcodec_encode_video2(), which allowed the user
>>>>>>>>>>> to provide
>>>>>>>>>>> their own buffers in the output packet, does not mention any
>>>>>>>>>>> kind of
>>>>>>>>>>> requirement for the data pointer, so I don't think we can say
>>>>>>>>>>> it's an
>>>>>>>>>>> allowed scenario here either.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>> Does it have any alignment requirements?
>>>>>>>>>>>>>
>>>>>>>>>>>>> No, just padding. AVPacket doesn't require alignment for
>>>>>>>>>>>>> the payload.
>>>>>>>>>>>>
>>>>>>>>>>>> I think say that explicitly. 
>>>>>>>>>>>> avcodec_default_get_encoder_buffer()
>>>>>>>>>>>> does give you aligned memory, even though it isn't needed.
>>>>>>>>>>>
>>>>>>>>>>> Would saying "There's no alignment requirement for the data
>>>>>>>>>>> pointer" add
>>>>>>>>>>> anything of value to the doxy? If i don't mention any kind of
>>>>>>>>>>> alignment
>>>>>>>>>>> requirement, it's because there isn't any, and it's implicit.
>>>>>>>>>>> I listed the requirements the user needs to keep in mind,
>>>>>>>>>>> like the
>>>>>>>>>>> padding and the need for an AVBufferRef. But if you think
>>>>>>>>>>> it's worth
>>>>>>>>>>> adding, then sure.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>> +     * - buf must contain a pointer to an AVBufferRef
>>>>>>>>>>>>>>> structure. The packet's
>>>>>>>>>>>>>>> +     *   data pointer must be contained in it.
>>>>>>>>>>>>>>> +     *   See: av_buffer_create(), av_buffer_alloc(),
>>>>>>>>>>>>>>> and av_buffer_ref().
>>>>>>>>>>>>>>> +     *
>>>>>>>>>>>>>>> +     * If AV_CODEC_CAP_DR1 is not set then
>>>>>>>>>>>>>>> get_encoder_buffer() must call
>>>>>>>>>>>>>>> +     * avcodec_default_get_encoder_buffer() instead of
>>>>>>>>>>>>>>> providing a buffer allocated by
>>>>>>>>>>>>>>> +     * some other means.
>>>>>>>>>>>>>>> +     *
>>>>>>>>>>>>>>> +     * If AV_GET_ENCODER_BUFFER_FLAG_REF is set in
>>>>>>>>>>>>>>> flags then the packet may be reused
>>>>>>>>>>>>>>> +     * (read and/or written to if it is writable) later
>>>>>>>>>>>>>>> by libavcodec.
>>>>>>>>>>>>>>> +     *
>>>>>>>>>>>>>>> +     * This callback must be thread-safe, as when frame
>>>>>>>>>>>>>>> multithreading is used, it may
>>>>>>>>>>>>>>> +     * be called from multiple threads simultaneously.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Allowing simulatenous calls feels unexpectedly tricky.  Is
>>>>>>>>>>>>>> it really necessary?
>>>>>>>>>>>>>
>>>>>>>>>>>>> This was a suggestion by Lynne, i personally don't know. We
>>>>>>>>>>>>> support frame threading encoding (For intra-only codecs), but
>>>>>>>>>>>>> currently ff_alloc_packet2() does not seem to be thread safe,
>>>>>>>>>>>>> seeing it calls av_fast_padded_malloc(), yet it's called by
>>>>>>>>>>>>> frame threaded encoders.
>>>>>>>>>>>>> Should i remove this?
>>>>>>>>>>>>
>>>>>>>>>>>> I don't know, I was asking only because it sounds tricky. 
>>>>>>>>>>>> For cases
>>>>>>>>>>>> with a limited number of buffers available (like memory-mapped
>>>>>>>>>>>> devices) you are going to need locking anyway, so maybe
>>>>>>>>>>>> rentrancy
>>>>>>>>>>>> adds no additional inconvenience.
>>>>>>>>>>>>
>>>>>>>>>>>>>>> +     *
>>>>>>>>>>>>>>> +     * @see avcodec_default_get_encoder_buffer()
>>>>>>>>>>>>>>> +     *
>>>>>>>>>>>>>>> +     * - encoding: Set by libavcodec, user can override.
>>>>>>>>>>>>>>> +     * - decoding: unused
>>>>>>>>>>>>>>> +     */
>>>>>>>>>>>>>>> +    int (*get_encoder_buffer)(struct AVCodecContext *s,
>>>>>>>>>>>>>>> AVPacket *pkt, int flags);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can the encoder ask for arbitrarily many packets?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can the user return "not yet" somehow to this if they have a
>>>>>>>>>>>>>> fixed output buffer pool but no buffer is currently
>>>>>>>>>>>>>> available?
>>>>>>>>>>>>>
>>>>>>>>>>>>> No, as is it can't. Return values < 0 are considered errors.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I don't much like the idea of the user suspending the thread
>>>>>>>>>>>>>> in the callback until they have some available, which might
>>>>>>>>>>>>>> work in some cases but might also deadlock if an
>>>>>>>>>>>>>> avcodec_receive_packet() call is blocked by it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can we make what's in essence a malloc() call return something
>>>>>>>>>>>>> like EAGAIN, and this in turn be propagated back to
>>>>>>>>>>>>> encode_receive_packet_internal()?
>>>>>>>>>>>>
>>>>>>>>>>>> Maybe, or if it has many threads maybe it could wait for
>>>>>>>>>>>> something
>>>>>>>>>>>> else to finish first.
>>>>>>>>>>>>
>>>>>>>>>>>>> Couldn't this potentially end up in the forbidden scenario of
>>>>>>>>>>>>> avcodec_send_frame() and avcodec_receive_packet() both
>>>>>>>>>>>>> returning
>>>>>>>>>>>>> EAGAIN?
>>>>>>>>>>>>
>>>>>>>>>>>> Yes.  If the forbidden case happens then the encoder is
>>>>>>>>>>>> stuck anyway
>>>>>>>>>>>> and can't make any forward progress so we need to error out
>>>>>>>>>>>> properly, but the EAGAIN return isn't needed if there is
>>>>>>>>>>>> something
>>>>>>>>>>>> else to do on another thread.
>>>>>>>>>>>
>>>>>>>>>>> Ok, but I'm not familiar or knowledgeable enough with the
>>>>>>>>>>> frame thread
>>>>>>>>>>> encoder code to implement this.
>>>>>>>>>>
>>>>>>>>>> Looked at bit into this. AVCodec->encode2() based encoders
>>>>>>>>>> don't support
>>>>>>>>>> returning EAGAIN at all, as it completely breaks the frame
>>>>>>>>>> threading logic.
>>>>>>>>>> It would require a considerable rewrite in order to re-add a
>>>>>>>>>> task that
>>>>>>>>>> didn't fail but also didn't succeed.
>>>>>>>>>>
>>>>>>>>>> Non frame threading encoders could probably support it with
>>>>>>>>>> some minimal
>>>>>>>>>> changes, but i don't think suddenly letting an scenario that
>>>>>>>>>> was until now
>>>>>>>>>> guaranteed to never happen start happening
>>>>>>>>>> (avcodec_send_frame() and
>>>>>>>>>> avcodec_receive_packet() both returning EAGAIN) is a good
>>>>>>>>>> idea. It's an API
>>>>>>>>>> break.
>>>>>>>>>> Letting the user's custom get_encode_buffer() callback suspend
>>>>>>>>>> the thread is
>>>>>>>>>> IMO acceptable. In frame threading scenarios, the other
>>>>>>>>>> threads are still
>>>>>>>>>> working on their own packets (afaics none depends on the
>>>>>>>>>> others, since it's
>>>>>>>>>> intra only encoders only).
>>>>>>>>>
>>>>>>>>> I think it was not suggested in the thread so:
>>>>>>>>> if the users allocation fails the code can fallback to the
>>>>>>>>> default allocator
>>>>>>>>> That would lead to the relation:
>>>>>>>>> If a users allocator can fail (out of buffers) it must be able
>>>>>>>>> to handle
>>>>>>>>> that only some of the returned packets are from its own allocator
>>>>>>>>
>>>>>>>> In general, custom allocators are used when the caller doesn't
>>>>>>>> want to use
>>>>>>>> the default one. But yes, they could use
>>>>>>>> avcodec_default_get_encoder_buffer() as fallback, which is why
>>>>>>>> it was added
>>>>>>>> to begin with. Same applies to get_buffer2() custom
>>>>>>>> implementations, and so
>>>>>>>> far i don't think anybody had issues identifying what allocated
>>>>>>>> a packet
>>>>>>>> buffer.
>>>>>>>>
>>>>>>>> One of the additions to AVPacket people were talking about was a
>>>>>>>> user opaque
>>>>>>>> field that libav* would never touch or look at beyond
>>>>>>>> propagating them
>>>>>>>> around all the way to the output AVFrame, if any. This opaque
>>>>>>>> field could
>>>>>>>> perhaps store such allocator specific information the caller
>>>>>>>> could use to
>>>>>>>> identify packets allocated by their own allocator, or those by
>>>>>>>> avcodec_default_get_encoder_buffer().
>>>>>>>>
>>>>>>>>>
>>>>>>>>> About alignment, we should at least recommand that allocated
>>>>>>>>> packets are
>>>>>>>>> aligned not less than what out av_malloc() would align to.
>>>>>>>>> Is there a reason to align less ?
>>>>>>>>
>>>>>>>> There's no alignment requirement for AVPacket->data, and
>>>>>>>> av_new_packet()
>>>>>>>> uses av_buffer_realloc(), which does not guarantee any alignment
>>>>>>>> whatsoever
>>>>>>>> on platforms other than Windows. So basically, packet payload
>>>>>>>> buffers
>>>>>>>> allocated by our own helpers never had any alignment.
>>>>>>>
>>>>>>> for the purpose of exporting raw images, alignment would be "nice
>>>>>>> to have"
>>>>>>> because later filters may need it or need to memcpy
>>>>>>
>>>>>> Filters don't use AVPackets, they use AVFrames.
>>>>>
>>>>> demuxers return AVPackets, so do encoders.
>>>>> These can contain raw frames.
>>>>>
>>>>> also i see for example in rawdec:
>>>>> frame->buf[0] = av_buffer_ref(avpkt->buf);
>>>>
>>>> I ask again, where are you going with this? The alignment for
>>>> AVPacket data
>>>> buffers is defined: There is *none*.
>>>
>>> I simply stated that 'alignment would be "nice to have"'.
>>> and then showed some cases where it would be usefull.
>>
>> But don't those cases already happen, and without required or
>> guaranteed alignment?
>>
>>>
>>> I guess where iam going with this is, is the API you add extensible?
>>> That is if something is not supported now, can it be added later without
>>> adding a new API.
>>
>> I should, it shares a signature with get_buffer2(). That means the
>> packet to fill (Which fields can be read from it and set can be easily
>> redefined), avctx so the user can have access to avctx->opaque and so
>> we can eventually use something like a buffer pool in the default
>> allocator callback, and a flags parameter to tell the callback there
>> are requirements.
>>
>> Which makes me realize, maybe a flag to tell the callback "Alignment
>> is required" could solve your concerns?
> 
> Actually, thinking about it, it's the same situation as always requiring
> it. The mere existence of such a flag would require users of the old API
> moving onto the new to redefine their buffers, since now they *may* need
> to align them, when before they didn't. So not really an option.
> 

One could say that currently (i.e. up until the next + 1 bump),
alignment is advisable as it may improve performance, but that from next
+ 1 bump onwards the desired alignment is required.
Furthermore one can add public fields to each encoder containing the
maximal alignment that buffers for these encoders may required. This way
users may choose to use the default buffers instead of their own for the
(few) encoders that need bigger alignment. Said flag would be a bit like
the max_lowres value for decoders.
Notice that I am of course only speaking about requirements for packet
buffers for encoders, not for AVPackets in general.

- Andreas