[FFmpeg-devel] [PATCH 2/7] decode: add a method for attaching lavc-internal data to frames

Fri Oct 6 01:01:30 EEST 2017

On Thu, Oct 05, 2017 at 09:03:40PM +0100, Mark Thompson wrote:
> On 05/10/17 17:47, Michael Niedermayer wrote:
> > On Wed, Oct 04, 2017 at 02:04:54PM +0200, wm4 wrote:
> >> On Wed, 4 Oct 2017 13:37:31 +0200
> >> Tobias Rapp <t.rapp at noa-archive.com> wrote:
> >>
> >>> On 04.10.2017 11:34, wm4 wrote:
> >>>> On Wed, 4 Oct 2017 11:22:37 +0200
> >>>> Michael Niedermayer <michael at niedermayer.cc> wrote:
> >>>>   
> >>>>> On Wed, Oct 04, 2017 at 09:12:29AM +0200, wm4 wrote:  
> >>>>>> On Tue, 3 Oct 2017 21:40:58 +0200
> >>>>>> Michael Niedermayer <michael at niedermayer.cc> wrote:
> >>>>>>      
> >>>>>>> On Tue, Oct 03, 2017 at 03:15:13PM +0200, wm4 wrote:  
> >>>>>>>> From: Anton Khirnov <anton at khirnov.net>
> >>>>>>>>        
> >>>>>>>      
> >>>>>>>> Use the AVFrame.opaque_ref field. The original user's opaque_ref is
> >>>>>>>> wrapped in the lavc struct and then unwrapped before the frame is
> >>>>>>>> returned to the caller.  
> >>>>>>>
> >>>>>>> this is a ugly hack
> >>>>>>>
> >>>>>>> one and the same field should not be used to hold both the
> >>>>>>> users opaque_ref as well as a structure which is itself not a user
> >>>>>>> opaque_ref  
> >>>>>>
> >>>>>> While the AVFrame is within libavcodec, it's libavcodec's frame, not
> >>>>>> the user's. Thus your claim doesn't make too much sense. libavcodec
> >>>>>> fully controls the meaning for its own AVFrames' opaque_ref, but
> >>>>>> reconstruct the user's value when returning it.  
> >>>>>
> >>>>> i disagree  
> >>>>
> >>>> Well, you're wrong anyway.
> >>>>   
> >>>>> such hacks should not be added, we do have enough hacks already  
> >>>>
> >>>> It's not a hack.  
> >>>
> >>> Changing the semantics of a field during its lifetime, even when only 
> >>> done within private code, is at least unexpected behavior.
> >>
> >> That's not done.
> > 
> > The semantics are defined by the docs, which state:
> > "AVBufferRef for free use by the API user."
> > 
> > And before the patch this is true, all instances of this field are
> > controled by the user application and are consistent.
> > 
> > after the patch the AVFrames used by a codec have their opaque_ref
> > replaced by a wraped structure relative to what the outside of this
> > codec has.
> > 
> > 
> >> Conceptually the AVFrame with the changed behavior is
> >> a new reference. Internally, AVFrame.opaque_ref will always have the
> >> same semantics, pointing to the FrameDecodeData struct. Only at points
> >> where the AVFrame ref is converted to/from the user struct this is
> >> changed.
> >>
> >>>> This is done strictly when returning a valid AVFrame, so there is no
> >>>> room for mistakes.  
> >>>
> >>> The room for mistake might not increase for external developers but it 
> >>> increases for internal developers (maintenance cost).
> >>
> >> Like where? There are only 2 places where the code needs to deal with
> >> it, and these are in shared code, not individual decoders.
> > 
> > just greping for AVFrame in the headers shows callbacks, a direct
> > pointer to a AVFrame and the API functions that interface the codec
> > 
> > Just thinking of a codec that instanciates another codec and how
> > exactly the callback which may originate from the user or the outer
> > codec would unwrap the potential nested wraping.
> > I really dont think we want this in FFmpeg
> > And this is just one example ...
> 

> I don't understand this discussion.

yes, i realize this, but iam not sure why

its pretty simple and clear but you seem to skip over parts of what
i said.

> 
> As far as I can tell, the sequence is this:
> 
> * libavcodec allocates an AVFrame structure.
> * libavcodec calls get_buffer2 with that AVFrame structure; the user fills its fields.
> * libavcodec extracts the buffer references and metadata from the AVFrame, and maybe frees it (some codecs reuse a single AVFrame for the lifetime of the codec, others allocate them each time).
> ~ decoding happens, the buffers are written to ~
> * The user allocates a new AVFrame structure.
> * The user calls receive_frame with that new AVFrame structure; libavcodec its fields with references to the decoded data and associated metadata.
> * The user can then read the buffers of the frame, along with its metadata.
> 
> Why would it matter what happens in the middle?  The AVFrame structure at the end is not the AVFrame structure at the start, and the user can't assume anything about it at all - if they try to dereference a pointer to the AVFrame supplied by libavcodec for get_buffer2 after get_buffer2 has returned then they are definitely invoking undefined behaviour.

First this does not consider nested codecs.
A decoder can instanciate a internal decoder, if that now uses the
user callback it would have to unwrap twice for it if it used a
callback from the outer decoder it has to unwrap once.
This of course also depends on how it instanciates the decoder, that
is at which API level.

Next, if you look at the API and search for AVFrame you will find
that there are more callbacks than get_buffer2() that use AVFrames.
there is for example also
 void (*draw_horiz_band)(struct AVCodecContext *s,
                            const AVFrame *src, int offset[AV_NUM_DATA_POINTERS],
                            int y, int type, int height);

Which too has a AVFrame, which too must be unwraped and in a way specific
on the nesting depth of the decoder.

and then there is
AVFrame *coded_frame;
in AVCodecContext

Which can be accessed by the user as well as codec.

Iam pretty sure if i look more ill find more examples.

And future API changes would all have to consider this wraping and
unwraping.

And then all above is just about decoders, there are encoders too
some share code with decoders and at the same time encoders take
AVFrames from the user. Does the shared code expect wraped opaque_ref
or not?
All this adds alot of complexity to maintaining the code and it would
add many bugs

The opaque_ref wraping is a really bad design. Iam not sure why
people defend it.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If you drop bombs on a foreign country and kill a hundred thousand
innocent people, expect your government to call the consequence
"unprovoked inhuman terrorist attacks" and use it to justify dropping
more bombs and killing more people. The technology changed, the idea is old.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: Digital signature
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20171006/1606c0ca/attachment.sig>