[FFmpeg-devel] [PATCH V4 2/4] libavfilter/buffersink.c: unref private_ref when frame leaves libavfilter

Thu Mar 4 10:48:14 EET 2021

> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of Guo,
> Yejun
> Sent: March 1, 2021 23:47
> To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH V4 2/4] libavfilter/buffersink.c: unref
> private_ref when frame leaves libavfilter
> 
> 
> 
> > -----Original Message-----
> > From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> > Nicolas George
> > Sent: March 1, 2021 23:07
> > To: FFmpeg development discussions and patches
> > <ffmpeg-devel at ffmpeg.org>
> > Subject: Re: [FFmpeg-devel] [PATCH V4 2/4] libavfilter/buffersink.c:
> > unref private_ref when frame leaves libavfilter
> >
> > Guo, Yejun (12021-03-01):
> > > Actually, I think private_ref in libavfilter can only be used for an
> > > exclusive usage at a time.
> >
> > Exactly. If we use it for this, then we cannot use it for anything else in
> > libavfilter.
> > This use seems too specific to warrant dedicating such a unique field
> > to it, even though we do not have a better use in sight.
> >
> > > As Paul mentioned, I think AVFrame.metadata is a better choice.
> >
> > If you can express it as a string or set of strings with a clear
> > syntax that can easily be parsed, then possibly, yes.
> 
> Hmm, it is not easy to express the bounding boxes as strings in
> AVDictionaryEntry.value: a bounding box has several data members, and they
> are binary data that is very likely to contain '\0' in the middle.
> So we probably cannot use AVFrame.metadata.
> 
> So, as for where to put the bounding boxes (the object detection result
> generated by vf_dnn_detect), I now see several possible methods, each with
> pros and cons:
> 
> 1. Add into side data
> The result should eventually live in side data, since it might be used by new
> encoders in the future, but this method changes the API.
> 
> 1.1 We just add a new enum value for side data, and keep the .h file (for the
> structs) internal at first.
> There was a comment that this is not allowed. (I personally prefer this one.)
> 
> 1.2 We add an enum value for side data, and also export the .h file as part
> of the FFmpeg API.
> The risk is that we might change the structs in the .h file later, which
> breaks the API. We would need versioning for the structs, just like film
> grain, as explained at
> http://ffmpeg.org/pipermail/ffmpeg-devel/2021-February/276586.html .
> 
> 2. Use private_ref
> Use private_ref for bounding boxes at first, and then change to side data when
> it is required.
> The disadvantage is that, in the meantime, we cannot use it for anything else
> in libavfilter.
> 
> Any comments or other suggestions? Thanks.
> 

Hi, I think option 2 might be the best choice for now.

The background is this: to support video analytics in libavfilter, we need a
place in AVFrame to put the analytics result (bounding boxes in this example of
object detection), so that the data can be transferred between filters. (It is
possible that the data will need to be transferred between filters and codecs in
the future, since new codecs are going to support AI labels.) The options above
describe the possible places for the data.

Option 1.1 is not allowed, because the .h file (the structs for the bounding
box) behind the side data would not be exported.

Option 1.2 needs versioning to avoid the risk of breaking the API, and code
such as BoundingBoxV1, BoundingBoxV2, etc. is not elegant.
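
To make the versioning concern concrete, here is a minimal sketch of one
possible alternative to BoundingBoxV1/BoundingBoxV2-style type names: a header
that records a version and the per-entry size, so that a reader steps through
entries by the size the producer wrote rather than by its own sizeof. All names
here (BBoxHeader, BBox, bbox_alloc, bbox_get) are hypothetical and are not part
of any FFmpeg API; this is only an illustration of the idea, not a proposal for
the final layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-object payload; new fields would be appended at the end. */
typedef struct BBox {
    int32_t x, y, w, h;
    int32_t label_id;
    float   confidence;
} BBox;

/* Hypothetical header placed in front of the array of entries. */
typedef struct BBoxHeader {
    uint32_t version;    /* bumped whenever BBox gains fields */
    uint32_t nb_bboxes;  /* number of entries following the header */
    size_t   bbox_size;  /* size of one entry as written by the producer */
} BBoxHeader;

/* Allocate a zeroed header followed by nb_bboxes entries in one block. */
static BBoxHeader *bbox_alloc(uint32_t nb_bboxes)
{
    BBoxHeader *hdr = calloc(1, sizeof(*hdr) + (size_t)nb_bboxes * sizeof(BBox));
    if (!hdr)
        return NULL;
    hdr->version   = 1;
    hdr->nb_bboxes = nb_bboxes;
    hdr->bbox_size = sizeof(BBox);
    return hdr;
}

/* Locate entry idx using the recorded stride, not the reader's sizeof(BBox). */
static BBox *bbox_get(BBoxHeader *hdr, uint32_t idx)
{
    assert(idx < hdr->nb_bboxes);
    return (BBox *)((uint8_t *)(hdr + 1) + idx * hdr->bbox_size);
}
```

The cost is that every consumer has to go through an accessor like bbox_get
and check the version before touching newer fields, which is the kind of
bookkeeping option 2 postpones.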

For option 2, private_ref in libavfilter can be used to transfer data between
filters, and we do not see any other use for it now besides the bounding boxes.

So, imho, we can use private_ref for now and move to side data once the
structs are mature enough that no versioning is required. At that point
private_ref becomes available for other uses again.
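
For reference, the reason AVFrame.metadata was ruled out in the quoted message
can be shown with a small standalone sketch. AVDictionary values are
NUL-terminated C strings, so any consumer that treats the value as a string
stops at the first zero byte. The BBox struct and the visible_as_string helper
below are hypothetical and only illustrate that truncation; they do not use
any FFmpeg function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bounding box, not an FFmpeg struct. */
typedef struct BBox {
    int32_t x, y, w, h;
} BBox;

/*
 * Return how many bytes of the raw struct a C-string API would see
 * if the struct bytes were passed where a string value is expected.
 */
static size_t visible_as_string(const BBox *box)
{
    char buf[sizeof(*box) + 1];

    assert(box != NULL);
    memcpy(buf, box, sizeof(*box));
    buf[sizeof(*box)] = '\0';   /* terminator a string API would rely on */
    return strlen(buf);         /* stops at the first embedded zero byte */
}
```

With small coordinate values most bytes of each int32_t are zero, so the
visible length is far shorter than sizeof(BBox). Storing the boxes in metadata
would therefore require a text encoding with a clear, parseable syntax, as
Nicolas noted.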