[FFmpeg-devel] [PATCH 1/4] libavfilter/dnn: add layer maximum for native mode.

Guo, Yejun yejun.guo at intel.com
Sat Sep 21 10:04:21 EEST 2019



> -----Original Message-----
> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces at ffmpeg.org] On Behalf Of
> Pedro Arthur
> Sent: Friday, September 20, 2019 11:14 PM
> To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 1/4] libavfilter/dnn: add layer maximum for
> native mode.
> 
> On Fri, Sep 20, 2019 at 11:50, Guo, Yejun <yejun.guo at intel.com>
> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: ffmpeg-devel [mailto:ffmpeg-devel-bounces at ffmpeg.org] On Behalf Of
> > > Pedro Arthur
> > > Sent: Friday, September 20, 2019 10:17 PM
> > > To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org>
> > > Subject: Re: [FFmpeg-devel] [PATCH 1/4] libavfilter/dnn: add layer maximum for
> > > native mode.
> > >
> > > Hi,
> > >
> > > On Fri, Sep 20, 2019 at 01:00, Guo, Yejun <yejun.guo at intel.com>
> > > wrote:
> > > >
> > > > The reason to add this layer is that it is used by srcnn in vf_sr.
> > > > This layer is currently ignored in native mode. After this patch,
> > > > we can add multiple outputs support for native mode.
> > > >
> > > I did not quite understand the commit message. Where does srcnn need
> > > a max layer?
> >
> > see
> > https://github.com/HighVoltageRocknRoll/sr/blob/master/models/model_srcnn.py#L39 ,
> > the maximum layer is the last layer of the model.
> I see. Indeed, if I'm not missing something, this max layer is
> superfluous, as the relu activation already does this, right?
> What we have to guarantee is that the output is in the range [0, 1],
> that means we should have had a layer min(y, 1) instead of the max or
> guarantee the conversion from float to integer properly saturates y >
> 1.

yes, I think so.
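
To make the point above concrete, here is a minimal sketch, not the actual libavfilter code. It assumes the max layer compares against the constant 0 (so it repeats what relu already did) and that the output is converted to 8 bits; maximum() and float_to_u8_saturate() are hypothetical helper names used only for illustration.

```c
#include <stdint.h>

/* ReLU activation: negative values become 0. */
static float relu(float x)
{
    return x > 0.0f ? x : 0.0f;
}

/* Element-wise maximum against a constant, as a max layer would compute.
 * Assumption: the constant used by the model is 0. */
static float maximum(float x, float c)
{
    return x > c ? x : c;
}

/* Hypothetical helper: convert a sample nominally in [0, 1] to 8 bits,
 * saturating out-of-range values instead of letting them wrap. */
static uint8_t float_to_u8_saturate(float y)
{
    if (!(y > 0.0f))    /* y <= 0, and also NaN */
        return 0;
    if (y >= 1.0f)      /* this is the min(y, 1) part */
        return 255;
    return (uint8_t)(y * 255.0f + 0.5f);
}
```

With these definitions, maximum(relu(x), 0.0f) equals relu(x) for any x, so the only clamp that still matters is the upper one, which the saturating conversion handles.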

> 
> >
> > > What is the relation between max layer and supporting multiple outputs?
> >
> > thanks, I did not describe it explicitly, will add more detail as below.
> >
> > The direct relation is between the max layer and the model output name;
> > multiple outputs can be supported once output name matching is supported.
> >
> > Suppose the output name of srcnn is 'y'. Since the max layer is the last
> > layer, the output name of the max layer is 'y'. And suppose the input name
> > of the max layer is 'z'; the network looks like:
> > ... -> 'z' -> (max layer) -> 'y'
> >
> > In the current implementation, the max layer is ignored in native mode,
> > which means that 'y' is also discarded in native mode. The output name of
> > the native model becomes 'z', so we cannot find the correct output
> > operand with name 'y'.
> >
> > The reason the current implementation works is that we just treat the
> > last operand as the model output, ignoring name matching.
> >
> > To support multiple outputs, we have to recognize output operands by name.
> > To support searching for an output by name, we must add 'y' back to srcnn
> > (that is, handle the max layer), so that vf_sr works in both tf mode and
> > native mode.
> >
> >
> thanks, in any case the patch is useful, I should push it soon.

thanks.
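
For reference, the name-matching argument quoted above can be sketched as follows. This is only an illustration under assumptions: the Operand struct and find_operand_by_name() are hypothetical and simplified, not the types or functions of the native backend.

```c
#include <string.h>

/* Hypothetical, simplified operand: only the name matters here. */
typedef struct Operand {
    const char *name;
} Operand;

/* Return the index of the operand called `name`, or -1 if none matches. */
static int find_operand_by_name(const Operand *ops, int nb_ops,
                                const char *name)
{
    for (int i = 0; i < nb_ops; i++)
        if (!strcmp(ops[i].name, name))
            return i;
    return -1;
}
```

If the max layer is dropped, the operand list ends at 'z' and a lookup for 'y' fails. Once the max layer is handled, 'y' exists again and the same output name works in both tf mode and native mode.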


