[FFmpeg-devel] [PATCH 2/2] libavfi/dnn: add LibTorch as one of DNN backend

Wed May 25 06:20:04 EEST 2022

> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of Soft
> Works
> Sent: Tuesday, May 24, 2022 10:24 PM
> To: FFmpeg development discussions and patches <ffmpeg-
> devel at ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 2/2] libavfi/dnn: add LibTorch as one of
> DNN backend
> 
> 
> 
> > -----Original Message-----
> > From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of Fu,
> > Ting
> > Sent: Tuesday, May 24, 2022 4:03 PM
> > To: FFmpeg development discussions and patches
> > <ffmpeg-devel at ffmpeg.org>
> > Subject: Re: [FFmpeg-devel] [PATCH 2/2] libavfi/dnn: add LibTorch as
> > one of DNN backend
> >
> > Hi Jean-Baptiste,
> >
> > I am trying to add this backend since we got some users who have
> > interest in doing PyTorch model(BasicVSR model) inference with FFmpeg.
> > And as we all know, the PyTorch is one of the most popular AI
> > inference engines and it has large number of models. So, I think if
> > LibTorch is one of FFmpeg DNN backend, would help the PyTorch users a
> lot.
> >
> > PS, ONNX is not in my plan. I am going to improve the LibTorch backend
> > performance and make it compatible with more models in next steps.
> >
> > Thank you.
> > Ting FU
> 
> Hi Ting,
> 
> I've never looked at the DNN part in ffmpeg, so just out of curiosity:
> 
> Is this working 1-way or 2-way? What I mean is whether this is just about
> feeding images to the AI engines or does the ffmpeg filter get some data in
> return for each frame that is processed?

Hi Softworkz,

Since the DNN is a part of FFmpeg libavfilter, so it can work with other filters. Other filters can get the output(metadata or just frames) from DNN.

> 
> So for example, in case of object identification/tracking, is it possible to get
> identified rectangles back from the inference result, attach it to an AVFrame
> so that a downstream filter could paint those rectangles on each video frame?
> 

Yes, for your example object identification, we preserved the output in structure AVFrameSideData of AVFrame. So, the following filters can use such data.
And for now, the AVFrameSideData we saved contains bounding box, the object position info, and the object category and confidence.

Thank you.
Ting FU

> Thanks,
> softworkz
> 
> 
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email ffmpeg-devel-request at ffmpeg.org
> with subject "unsubscribe".