[FFmpeg-devel] [PATCH 4/4] avfilter: add a generic filter for rgb proccessing with dnn networks

Guo, Yejun yejun.guo at intel.com
Wed Oct 16 14:18:48 EEST 2019



> -----Original Message-----
> From: Paul B Mahol [mailto:onemda at gmail.com]
> Sent: Wednesday, October 16, 2019 5:17 PM
> To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org>
> Cc: Guo, Yejun <yejun.guo at intel.com>
> Subject: Re: [FFmpeg-devel] [PATCH 4/4] avfilter: add a generic filter for rgb
> processing with dnn networks
> 
> There should be only one dnn_processing filter. Not one that does only
> rgb packed formats.

Got it, I'll change it to dnn_processing and implement the rgb formats first.

There is another possible case, where multiple AVFrames are queued in the filter because the dnn network needs more than one AVFrame as input. Could that be a separate filter, or must it also be integrated into dnn_processing? Thanks.

Btw, the remaining 3 patches in this patch set can still be reviewed; the comment on this patch does not affect them. Thanks.

> 
> On 10/16/19, Guo, Yejun <yejun.guo at intel.com> wrote:
> > This filter accepts all the dnn networks which do image processing
> > on RGB-based formats. Currently, frames in rgb24 and bgr24 format
> > are supported. Other formats such as gray and YUV can be supported
> > in separate filters. The dnn network can accept RGB data in float32
> > or uint8 format, and the dnn network can change the frame size.
> >
> > Let's take an example with the following python script. This script
> > halves the value of the first channel of the pixel.
> > import tensorflow as tf
> > import numpy as np
> > import scipy.misc
> > in_img = scipy.misc.imread('in.bmp')
> > in_img = in_img.astype(np.float32)/255.0
> > in_data = in_img[np.newaxis, :]
> > filter_data = np.array([0.5, 0, 0, 0, 1., 0, 0, 0, 1.]).reshape(1,1,3,3).astype(np.float32)
> > filter = tf.Variable(filter_data)
> > x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
> > y = tf.nn.conv2d(x, filter, strides=[1, 1, 1, 1], padding='VALID', name='dnn_out')
> > sess=tf.Session()
> > sess.run(tf.global_variables_initializer())
> > output = sess.run(y, feed_dict={x: in_data})
> > graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
> > tf.train.write_graph(graph_def, '.', 'halve_first_channel.pb', as_text=False)
> > output = output * 255.0
> > output = output.astype(np.uint8)
> > scipy.misc.imsave("out.bmp", np.squeeze(output))
> >
> > - generate halve_first_channel.pb with the above script
> > - generate halve_first_channel.model with tools/python/convert.py
> > - try with following commands
> >   ./ffmpeg -i input.jpg -vf dnn_rgb_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=native -y out.native.png
> >   ./ffmpeg -i input.jpg -vf dnn_rgb_processing=model=halve_first_channel.pb:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=tensorflow -y out.tf.png
> >
> > Signed-off-by: Guo, Yejun <yejun.guo at intel.com>
> > ---
> >  configure                           |   1 +
> >  doc/filters.texi                    |  46 ++++++
> >  libavfilter/Makefile                |   2 +
> >  libavfilter/allfilters.c            |   1 +
> >  libavfilter/dnn_filter_utils.c      |  81 +++++++++++
> >  libavfilter/dnn_filter_utils.h      |  35 +++++
> >  libavfilter/vf_dnn_rgb_processing.c | 276
> > ++++++++++++++++++++++++++++++++++++
> >  7 files changed, 442 insertions(+)
> >  create mode 100644 libavfilter/dnn_filter_utils.c
> >  create mode 100644 libavfilter/dnn_filter_utils.h
> >  create mode 100644 libavfilter/vf_dnn_rgb_processing.c
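A side note on the quoted script: the 1x1 conv2d it builds is just a per-pixel channel mix, so the "halve the first channel" transform can be checked with plain NumPy. The sketch below is illustrative only and not part of the patch; the variable names and the dummy 2x2 image are made up for the example.

```python
import numpy as np

# Illustrative sketch (not part of the patch). A conv2d with a (1,1,3,3)
# kernel maps output channel o to sum_i x[i] * W[0,0,i,o], i.e. a per-pixel
# 3x3 channel-mix matrix. The script's kernel is diag(0.5, 1, 1).
kernel = np.array([0.5, 0,  0,
                   0,   1., 0,
                   0,   0,  1.]).reshape(3, 3).astype(np.float32)

img = np.full((2, 2, 3), 255, dtype=np.uint8)  # dummy all-white 2x2 RGB image
x = img.astype(np.float32) / 255.0             # normalize, as the script does
y = x @ kernel                                 # per-pixel channel mix
out = (y * 255.0).astype(np.uint8)

print(out[0, 0])  # first channel halved: [127 255 255]
```

Running this on `in.bmp` loaded as an HxWx3 uint8 array should produce the same pixels as the halve_first_channel.pb graph, up to the uint8 truncation shown above.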


