[FFmpeg-devel] Re: [PATCH 1/3] avcodec/avpacket: add av_packet_resize()

Anton Khirnov anton at khirnov.net
Sun Mar 14 12:34:51 EET 2021


Quoting James Almer (2021-03-12 18:24:47)
> 
> The padding exists AFAIK because something, like an optimized bitstream 
> reader that tries to process several bytes at the same time, may end up 
> reading or writing pass the reported end of the buffer.
> That means that if reading and it's not all zeroes, it could in theory 
> have unpredictable results. Hence why everything always zeroes the 
> padding, even av_shrink_packet().

On that topic, I've been wondering about eliminating this requirement.
Does anyone know which code it is precisely that depends on the padding
being zeroed? Is this optimization really worth it?
It seems rather silly to jump through all these hoops for an
unmeasurable benefit in one decoder.

It would also give us zero-copy packet splitting.

> 
> And yes, what you describe is what some bitstream filters and the 
> matroska demuxer do. They just create several packet references pointing 
> to the same data buffer, but using different offsets for the data 
> pointer. They all have "padding", but in many cases said padding is the 
> beginning of the payload of another packet, and that's not ideal.
> 
> > 
> >>
> >> I'm sure there are other similar valid scenarios where attempting to 
> >> shrink a non writable packet is not "bad API usage".
> >>
> >>>
> >>> If you want to shrink a writable packet, then maybe you don't even 
> >>> need zero padding, because the buffer possibly already contains some 
> >>> defined value, and the main reason of zero padding is avoiding 
> >>> reading uninitialized data...
> >>
> >> If i allocate a payload of size 1024, ultimately fill only 512 bytes, 
> >> then resize it to that value (typical scenario in lavf demuxers), if i 
> >> don't zero the bytes after the 512 offset then they will remain 
> >> uninitialized.
> > 
> > I am not sure I see your point here, if the data after the padding is 
> > uninitalized, that is not a problem. I made a typo above, and meant 
> > non-writable packet in my comment. And my reasoning is that if a packet 
> > is non-writable, it already contains initialized data with a zero 
> > padding. If you shrink that by reducing pkt->size only, you will not 
> > have uninitialized data, only the padding will not be zeroed. And that 
> > is preferable to copying the data only for zeroing the padding, because 
> > - as far as I know - the padding does not have to be zeroed, it only 
> > have to be initialized.
> > 
> > I agree that it is not nice that av_shrink_packet() writes something to 
> > the packet, people may not think about it and misuse it instead of 
> > directly decreaseing pkt->size when they need a partial packet. That is 
> > why I suggested the assert(). One might argue that it kind of breaks 
> > behaviour, but I'd say it does not break it too much, and writing to a 
> > non-writable packet was broken in the first place.
> > 
> > Alternatively we can change av_shrink_packet() to only zero the padding 
> > if the packet is writable, which has the benefit that it will do what 
> > people generally expect, no matter if you throw a writable or a 
> > non-writable packet to it.
> > 
> > A third alternative is to introduce void av_shrink_packet2() in which 
> > you explicitly document that a writable packet is needed and do the 
> > assert there, and deprecate av_shrink_packet().
> 
> Not a fan of functions with a 2 suffix. And to be honest, I really don't 
> care about what we do with av_shrink_packet().
> Do you want to keep it around? Ok. Want to deprecate and remove it? 
> Better. Want to add an assert to it? Not a fan and it may result in 
> unexpected crashes for library users, but whatever.

I would suggest keeping av_shrink_packet() with a big scary warning that
it does not check for ownership and it is the caller's responsibility to
ensure that writing to the packet is safe.

Also, I'd like to emphasise that av_*_is_writable() is not gospel. It is
merely a convention that is used "by default", when you don't have more
accurate information.
Some bits of code may use other conventions and consider a buffer
writable even if av_buffer_is_writable() returns 0. For example this is
at the core of frame threading, where a reference to a frame currently
being decoded is propagated to other threads before decoding finishes.
The owner thread then writes to the frame because frame-mt conventions
allow it to, even though there are multiple references to the frame.

-- 
Anton Khirnov


More information about the ffmpeg-devel mailing list