[FFmpeg-devel] [PATCH 1/3] avcodec/avpacket: add av_packet_resize()

Fri Mar 12 19:24:47 EET 2021

On 3/12/2021 2:09 PM, Marton Balint wrote:
> 
> 
> On Thu, 11 Mar 2021, James Almer wrote:
> 
>> On 3/11/2021 7:40 PM, Marton Balint wrote:
>>>
>>>
>>> On Thu, 11 Mar 2021, James Almer wrote:
>>>
>>>> On 3/11/2021 1:11 PM, Marton Balint wrote:
>>>>>
>>>>>
>>>>> On Thu, 11 Mar 2021, James Almer wrote:
>>>>>
>>>>>> This function acts as a replacement for both av_grow_packet() and
>>>>>> av_shrink_packet(), the latter which is now deprecated and will be 
>>>>>> removed as
>>>>>> it does not correctly handle non-writable packets.
>>>>>
>>>>> I don't think this is a good idea, av_shrink_packet cannot fail, 
>>>>> av_grow_packet can. By using the same function you are losing the 
>>>>> information if the end result should be checked or not.
>>>>
>>>> I'm not sure i follow. av_shrink_packet() is not being changed at 
>>>> all, just deprecated, scheduled for removal, and its use discouraged.
>>>
>>> But why are you complicating code with mandatory return value checking?
>>
>> Because unlike av_shrink_packet(), av_packet_resize() is more thorough 
>> when handling AVPackets and allows new usage scenarios that were not 
>> allowed with the former, thus it can fail.
>>
>>> Because as soon as a function returns an error, you have to check it, 
>>> and forward it upwards.
>>
>> Is error checking that much of a problem? I don't understand why it 
>> bothers people so much.
>>
>>>
>>>> Maybe i should have split this in two, one to add av_packet_resize() 
>>>> and one to deprecate av_shrink_packet(), to avoid confusion.
>>>>
>>>> In any case, the fact av_shrink_packet() cannot fail is the reason 
>>>> I'm getting rid of it. It's zeroing the padding without bothering to 
>>>> check if the packet is writable at all.
>>>
>>> Add an assert to it then.
>>
>> We only assert on internal bugs, not invalid arguments passed by 
>> callers to a public function. The asserts would need to be added above 
>> each av_shrink_packet() call.
> 
> There are a lot of cases when we assert for bad API usage. See 
> av_frame_ref() or av_frame_move_ref(). We assert if the dst is not empty.
> 
>> And that's only for our internal uses of the function. It does nothing 
>> to the issue of it being public API that can't properly handle 
>> AVPackets in their current form.
>>
>> > Shrinking a non-writable packet seems bad API usage anyways.
>>
>> I get a packet from a demuxer. It contains two independent portions 
>> (NALUs, OBUs, etc) i want to separate in order to feed them to 
>> something like a multi threaded decoder. And so I create a new 
>> reference to it, resulting in two packets pointing to the same data. I 
>> shrink one to only contain the first portion, and i add the required 
>> byte offset to the data pointer and subtract it to the size field on 
>> the second packet so it contains only the second portion.
>> The result if i use av_packet_resize() will be two valid, properly 
>> padded packets containing their respective expected data, but if i use 
>> av_shrink_packet(), the result will be the packet with the second 
>> portion featuring padding bytes worth of data zeroed at the start of 
>> its payload.
> 
> This looks like an unfortunate example, since you are:
> - adding a reference to the packet, so you don't have to duplicate data
> - and then want to add zero padding to the partial packet, so you will
> duplicate data.
> And I think the padding does not have to be zero for the partial packet.

The padding exists AFAIK because something, like an optimized bitstream 
reader that tries to process several bytes at the same time, may end up 
reading or writing pass the reported end of the buffer.
That means that if reading and it's not all zeroes, it could in theory 
have unpredictable results. Hence why everything always zeroes the 
padding, even av_shrink_packet().

And yes, what you describe is what some bitstream filters and the 
matroska demuxer do. They just create several packet references pointing 
to the same data buffer, but using different offsets for the data 
pointer. They all have "padding", but in many cases said padding is the 
beginning of the payload of another packet, and that's not ideal.

> 
>>
>> I'm sure there are other similar valid scenarios where attempting to 
>> shrink a non writable packet is not "bad API usage".
>>
>>>
>>> If you want to shrink a writable packet, then maybe you don't even 
>>> need zero padding, because the buffer possibly already contains some 
>>> defined value, and the main reason of zero padding is avoiding 
>>> reading uninitialized data...
>>
>> If i allocate a payload of size 1024, ultimately fill only 512 bytes, 
>> then resize it to that value (typical scenario in lavf demuxers), if i 
>> don't zero the bytes after the 512 offset then they will remain 
>> uninitialized.
> 
> I am not sure I see your point here, if the data after the padding is 
> uninitalized, that is not a problem. I made a typo above, and meant 
> non-writable packet in my comment. And my reasoning is that if a packet 
> is non-writable, it already contains initialized data with a zero 
> padding. If you shrink that by reducing pkt->size only, you will not 
> have uninitialized data, only the padding will not be zeroed. And that 
> is preferable to copying the data only for zeroing the padding, because 
> - as far as I know - the padding does not have to be zeroed, it only 
> have to be initialized.
> 
> I agree that it is not nice that av_shrink_packet() writes something to 
> the packet, people may not think about it and misuse it instead of 
> directly decreaseing pkt->size when they need a partial packet. That is 
> why I suggested the assert(). One might argue that it kind of breaks 
> behaviour, but I'd say it does not break it too much, and writing to a 
> non-writable packet was broken in the first place.
> 
> Alternatively we can change av_shrink_packet() to only zero the padding 
> if the packet is writable, which has the benefit that it will do what 
> people generally expect, no matter if you throw a writable or a 
> non-writable packet to it.
> 
> A third alternative is to introduce void av_shrink_packet2() in which 
> you explicitly document that a writable packet is needed and do the 
> assert there, and deprecate av_shrink_packet().

Not a fan of functions with a 2 suffix. And to be honest, I really don't 
care about what we do with av_shrink_packet().
Do you want to keep it around? Ok. Want to deprecate and remove it? 
Better. Want to add an assert to it? Not a fan and it may result in 
unexpected crashes for library users, but whatever.

I just want to add av_packet_resize() to have a single resize function 
that will properly handle AVPackets in their current and any potential 
future states.