[FFmpeg-devel] [PATCH v2] avcodec: add a get_encoder_buffer() callback to AVCodecContext

Wed Mar 10 22:59:11 EET 2021

On 3/10/2021 5:18 PM, Michael Niedermayer wrote:
> On Mon, Feb 22, 2021 at 07:27:34PM -0300, James Almer wrote:
>> On 2/21/2021 6:04 PM, James Almer wrote:
>>> On 2/21/2021 5:29 PM, Mark Thompson wrote:
>>>> On 21/02/2021 20:00, James Almer wrote:
>>>>> On 2/21/2021 4:13 PM, Mark Thompson wrote:
>>>>>> On 21/02/2021 17:35, James Almer wrote:
>>>>>>> This callback is functionally the same as get_buffer2()
>>>>>>> is for decoders, and
>>>>>>> implements for the new encode API the functionality of
>>>>>>> the old encode API had
>>>>>>> where the user could provide their own buffers.
>>>>>>>
>>>>>>> Signed-off-by: James Almer <jamrial at gmail.com>
>>>>>>> ---
>>>>>>> Used the names Lynne suggested this time, plus a line
>>>>>>> about how the callback
>>>>>>> must be thread safe.
>>>>>>>
>>>>>>>    libavcodec/avcodec.h | 45 ++++++++++++++++++++++++++++++++++++
>>>>>>>    libavcodec/codec.h   |  8 ++++---
>>>>>>>    libavcodec/encode.c  | 54
>>>>>>> +++++++++++++++++++++++++++++++++++++++++++-
>>>>>>>    libavcodec/encode.h  |  8 +++++++
>>>>>>>    libavcodec/options.c |  1 +
>>>>>>>    5 files changed, 112 insertions(+), 4 deletions(-)
>>>>>>>
>>>>>>> diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
>>>>>>> index 7dbf083a24..e60eb16ce1 100644
>>>>>>> --- a/libavcodec/avcodec.h
>>>>>>> +++ b/libavcodec/avcodec.h
>>>>>>> @@ -513,6 +513,11 @@ typedef struct AVProducerReferenceTime {
>>>>>>>     */
>>>>>>>    #define AV_GET_BUFFER_FLAG_REF (1 << 0)
>>>>>>> +/**
>>>>>>> + * The encoder will keep a reference to the packet and
>>>>>>> may reuse it later.
>>>>>>> + */
>>>>>>> +#define AV_GET_ENCODER_BUFFER_FLAG_REF (1 << 0)
>>>>>>> +
>>>>>>>    struct AVCodecInternal;
>>>>>>>    /**
>>>>>>> @@ -2346,6 +2351,39 @@ typedef struct AVCodecContext {
>>>>>>>         * - encoding: set by user
>>>>>>>         */
>>>>>>>        int export_side_data;
>>>>>>> +
>>>>>>> +    /**
>>>>>>> +     * This callback is called at the beginning of each
>>>>>>> packet to get a data
>>>>>>> +     * buffer for it.
>>>>>>> +     *
>>>>>>> +     * The following field will be set in the packet
>>>>>>> before this callback is
>>>>>>> +     * called:
>>>>>>> +     * - size
>>>>>>> +     * This callback must use the above value to
>>>>>>> calculate the required buffer size,
>>>>>>> +     * which must padded by at least
>>>>>>> AV_INPUT_BUFFER_PADDING_SIZE bytes.
>>>>>>> +     *
>>>>>>> +     * This callback must fill the following fields in the packet:
>>>>>>> +     * - data
>>>>>>
>>>>>> Is the data pointer allowed to be in write-only memory?
>>>>>
>>>>> I'm not sure what the use case for this would be, so probably no?
>>>>
>>>> The two use-cases I see for this API are:
>>>>
>>>> * You want to avoid a copy when combining the output with something
>>>> else.  E.g. you pass a pointer to the block of memory following
>>>> where you are going to put your header data (for something you are
>>>> going to send over the network, say).
>>>>
>>>> * You want to avoid a copy when passing the output directly to
>>>> something external.  E.g. you pass a pointer to a memory-mapped
>>>> device buffer (such as a V4L2 buffer, say).
>>>>
>>>> In the second case, write-only memory on an external device seems
>>>> possible, as does memory which is, say, readable but uncached, so
>>>> reading it is a really bad idea.
>>>
>>> Allowing the second case would depend on how encoders behave. Some may
>>> attempt to read data already written to the output packet. It's not like
>>> all of them allocate the packet, do a memcpy from an internal buffer,
>>> then return.
>>> There is also the flag meant to signal that the encoder will keep a
>>> reference to the packet around, which more or less implies it will be
>>> read later in the encoding process.
>>>
>>> The doxy for avcodec_encode_video2(), which allowed the user to provide
>>> their own buffers in the output packet, does not mention any kind of
>>> requirement for the data pointer, so I don't think we can say it's an
>>> allowed scenario here either.
>>>
>>>>
>>>>>> Does it have any alignment requirements?
>>>>>
>>>>> No, just padding. AVPacket doesn't require alignment for the payload.
>>>>
>>>> I think say that explicitly.  avcodec_default_get_encoder_buffer()
>>>> does give you aligned memory, even though it isn't needed.
>>>
>>> Would saying "There's no alignment requirement for the data pointer" add
>>> anything of value to the doxy? If i don't mention any kind of alignment
>>> requirement, it's because there isn't any, and it's implicit.
>>> I listed the requirements the user needs to keep in mind, like the
>>> padding and the need for an AVBufferRef. But if you think it's worth
>>> adding, then sure.
>>>
>>>>
>>>>>>> +     * - buf must contain a pointer to an AVBufferRef
>>>>>>> structure. The packet's
>>>>>>> +     *   data pointer must be contained in it.
>>>>>>> +     *   See: av_buffer_create(), av_buffer_alloc(),
>>>>>>> and av_buffer_ref().
>>>>>>> +     *
>>>>>>> +     * If AV_CODEC_CAP_DR1 is not set then
>>>>>>> get_encoder_buffer() must call
>>>>>>> +     * avcodec_default_get_encoder_buffer() instead of
>>>>>>> providing a buffer allocated by
>>>>>>> +     * some other means.
>>>>>>> +     *
>>>>>>> +     * If AV_GET_ENCODER_BUFFER_FLAG_REF is set in
>>>>>>> flags then the packet may be reused
>>>>>>> +     * (read and/or written to if it is writable) later
>>>>>>> by libavcodec.
>>>>>>> +     *
>>>>>>> +     * This callback must be thread-safe, as when frame
>>>>>>> multithreading is used, it may
>>>>>>> +     * be called from multiple threads simultaneously.
>>>>>>
>>>>>> Allowing simulatenous calls feels unexpectedly tricky.  Is
>>>>>> it really necessary?
>>>>>
>>>>> This was a suggestion by Lynne, i personally don't know. We
>>>>> support frame threading encoding (For intra-only codecs), but
>>>>> currently ff_alloc_packet2() does not seem to be thread safe,
>>>>> seeing it calls av_fast_padded_malloc(), yet it's called by
>>>>> frame threaded encoders.
>>>>> Should i remove this?
>>>>
>>>> I don't know, I was asking only because it sounds tricky.  For cases
>>>> with a limited number of buffers available (like memory-mapped
>>>> devices) you are going to need locking anyway, so maybe rentrancy
>>>> adds no additional inconvenience.
>>>>
>>>>>>> +     *
>>>>>>> +     * @see avcodec_default_get_encoder_buffer()
>>>>>>> +     *
>>>>>>> +     * - encoding: Set by libavcodec, user can override.
>>>>>>> +     * - decoding: unused
>>>>>>> +     */
>>>>>>> +    int (*get_encoder_buffer)(struct AVCodecContext *s,
>>>>>>> AVPacket *pkt, int flags);
>>>>>>
>>>>>> Can the encoder ask for arbitrarily many packets?
>>>>>>
>>>>>> Can the user return "not yet" somehow to this if they have a
>>>>>> fixed output buffer pool but no buffer is currently
>>>>>> available?
>>>>>
>>>>> No, as is it can't. Return values < 0 are considered errors.
>>>>>
>>>>>>
>>>>>> I don't much like the idea of the user suspending the thread
>>>>>> in the callback until they have some available, which might
>>>>>> work in some cases but might also deadlock if an
>>>>>> avcodec_receive_packet() call is blocked by it.
>>>>>
>>>>> Can we make what's in essence a malloc() call return something
>>>>> like EAGAIN, and this in turn be propagated back to
>>>>> encode_receive_packet_internal()?
>>>>
>>>> Maybe, or if it has many threads maybe it could wait for something
>>>> else to finish first.
>>>>
>>>>> Couldn't this potentially end up in the forbidden scenario of
>>>>> avcodec_send_frame() and avcodec_receive_packet() both returning
>>>>> EAGAIN?
>>>>
>>>> Yes.  If the forbidden case happens then the encoder is stuck anyway
>>>> and can't make any forward progress so we need to error out
>>>> properly, but the EAGAIN return isn't needed if there is something
>>>> else to do on another thread.
>>>
>>> Ok, but I'm not familiar or knowledgeable enough with the frame thread
>>> encoder code to implement this.
>>
>> Looked at bit into this. AVCodec->encode2() based encoders don't support
>> returning EAGAIN at all, as it completely breaks the frame threading logic.
>> It would require a considerable rewrite in order to re-add a task that
>> didn't fail but also didn't succeed.
>>
>> Non frame threading encoders could probably support it with some minimal
>> changes, but i don't think suddenly letting an scenario that was until now
>> guaranteed to never happen start happening (avcodec_send_frame() and
>> avcodec_receive_packet() both returning EAGAIN) is a good idea. It's an API
>> break.
>> Letting the user's custom get_encode_buffer() callback suspend the thread is
>> IMO acceptable. In frame threading scenarios, the other threads are still
>> working on their own packets (afaics none depends on the others, since it's
>> intra only encoders only).
> 
> I think it was not suggested in the thread so:
> if the users allocation fails the code can fallback to the default allocator
> That would lead to the relation:
> If a users allocator can fail (out of buffers) it must be able to handle
> that only some of the returned packets are from its own allocator

In general, custom allocators are used when the caller doesn't want to 
use the default one. But yes, they could use 
avcodec_default_get_encoder_buffer() as fallback, which is why it was 
added to begin with. Same applies to get_buffer2() custom 
implementations, and so far i don't think anybody had issues identifying 
what allocated a packet buffer.

One of the additions to AVPacket people were talking about was a user 
opaque field that libav* would never touch or look at beyond propagating 
them around all the way to the output AVFrame, if any. This opaque field 
could perhaps store such allocator specific information the caller could 
use to identify packets allocated by their own allocator, or those by 
avcodec_default_get_encoder_buffer().

> 
> About alignment, we should at least recommand that allocated packets are
> aligned not less than what out av_malloc() would align to.
> Is there a reason to align less ?

There's no alignment requirement for AVPacket->data, and av_new_packet() 
uses av_buffer_realloc(), which does not guarantee any alignment 
whatsoever on platforms other than Windows. So basically, packet payload 
buffers allocated by our own helpers never had any alignment.

> 
> About multithreading, if there are 2 single thread encoders, they could of
> course call the callback at the same time. These would then have seperate
> AVCodecContexts though.
> Maybe it should be documented more precissely what needs locking / when
> AVCodecContexts / opaques

Each call/thread will have their own separate AVCodecContext, yes. The 
callback should ensure it can be called from multiple threads at the 
same time not to access one or another avctx or avpkt fields, but in 
case whatever they use to provide buffers needs such considerations 
(Like when using a buffer pool).
Not sure what extra comment could be added beyond what i already added 
as suggested by Lynne. The user is already told to only look at 
avpkt->size and to fill avpkt->{data,buf}.

> 
> Thanks
> 
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
>