[FFmpeg-devel] [PATCH 4/4] avfilter: add a generic filter for rgb processing with dnn networks

Wed Oct 16 14:30:21 EEST 2019

On 10/16/19, Guo, Yejun <yejun.guo at intel.com> wrote:
>
>
>> -----Original Message-----
>> From: Paul B Mahol [mailto:onemda at gmail.com]
>> Sent: Wednesday, October 16, 2019 5:17 PM
>> To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org>
>> Cc: Guo, Yejun <yejun.guo at intel.com>
>> Subject: Re: [FFmpeg-devel] [PATCH 4/4] avfilter: add a generic filter for
>> rgb processing with dnn networks
>>
>> There should be only one dnn_processing filter. Not one that does only
>> rgb packed formats.
>
> Got it, I'll change it to dnn_processing and implement the rgb formats
> first.
>
> There is another possible case where multiple AVFrames are queued in the
> filter because the dnn network needs more than one AVFrame as input. Could
> that be a separate filter, or must it also be integrated into dnn_processing?
> Thanks.

Same filter, unless it needs multiple input/output pads; then it needs a
different name.

>
> Btw, the other 3 patches in this patch set can still be reviewed; the
> comment on this patch does not affect them. Thanks.
>
>>
>> On 10/16/19, Guo, Yejun <yejun.guo at intel.com> wrote:
>> > This filter accepts all the dnn networks which do image processing
>> > on RGB-based formats. Currently, frames with formats rgb24 and bgr24
>> > are supported. Other formats such as gray and YUV can be supported
>> > in separate filters. The dnn network can accept RGB data in float32
>> > or uint8 format, and it can change the frame size.
>> >
>> > Let's take an example with the following python script. This script
>> > halves the value of the first channel of each pixel.
>> > import tensorflow as tf
>> > import numpy as np
>> > import scipy.misc
>> > in_img = scipy.misc.imread('in.bmp')
>> > in_img = in_img.astype(np.float32)/255.0
>> > in_data = in_img[np.newaxis, :]
>> > filter_data = np.array([0.5, 0, 0, 0, 1., 0, 0, 0,
>> > 1.]).reshape(1,1,3,3).astype(np.float32)
>> > filter = tf.Variable(filter_data)
>> > x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
>> > y = tf.nn.conv2d(x, filter, strides=[1, 1, 1, 1], padding='VALID',
>> > name='dnn_out')
>> > sess=tf.Session()
>> > sess.run(tf.global_variables_initializer())
>> > output = sess.run(y, feed_dict={x: in_data})
>> > graph_def = tf.graph_util.convert_variables_to_constants(sess,
>> > sess.graph_def, ['dnn_out'])
>> > tf.train.write_graph(graph_def, '.', 'halve_first_channel.pb',
>> > as_text=False)
>> > output = output * 255.0
>> > output = output.astype(np.uint8)
>> > scipy.misc.imsave("out.bmp", np.squeeze(output))
>> >
>> > - generate halve_first_channel.pb with the above script
>> > - generate halve_first_channel.model with tools/python/convert.py
>> > - try with the following commands
>> >   ./ffmpeg -i input.jpg -vf dnn_rgb_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=native -y out.native.png
>> >   ./ffmpeg -i input.jpg -vf dnn_rgb_processing=model=halve_first_channel.pb:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=tensorflow -y out.tf.png
>> >
>> > Signed-off-by: Guo, Yejun <yejun.guo at intel.com>
>> > ---
>> >  configure                           |   1 +
>> >  doc/filters.texi                    |  46 ++++++
>> >  libavfilter/Makefile                |   2 +
>> >  libavfilter/allfilters.c            |   1 +
>> >  libavfilter/dnn_filter_utils.c      |  81 +++++++++++
>> >  libavfilter/dnn_filter_utils.h      |  35 +++++
>> >  libavfilter/vf_dnn_rgb_processing.c | 276 ++++++++++++++++++++++++++++++++++++
>> >  7 files changed, 442 insertions(+)
>> >  create mode 100644 libavfilter/dnn_filter_utils.c
>> >  create mode 100644 libavfilter/dnn_filter_utils.h
>> >  create mode 100644 libavfilter/vf_dnn_rgb_processing.c
>
>
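
For anyone checking the output of the example model in the quoted commit
message, a minimal NumPy sketch of what it computes may help. It is not part
of the patch and does not use TensorFlow or FFmpeg. It assumes the 1x1
convolution with the quoted filter_data reduces to a per-pixel multiplication
by a 3x3 channel matrix. The helper name conv1x1 and the 1x2 test image are
made up for illustration, and the result is rounded to the nearest uint8,
whereas the quoted script truncates with astype.

```python
import numpy as np

def conv1x1(image, kernel):
    """Apply a 1x1 convolution to an H x W x C_in image.

    kernel has shape (C_in, C_out): the quoted filter_data with its two
    leading 1x1 spatial dimensions dropped. Hypothetical helper.
    """
    return image @ kernel

# Same nine values as filter_data in the quoted script: diag(0.5, 1, 1).
kernel = np.array([0.5, 0, 0, 0, 1., 0, 0, 0, 1.],
                  dtype=np.float32).reshape(3, 3)

# Made-up 1x2 RGB image standing in for in.bmp.
rgb = np.array([[[200, 100, 50], [10, 20, 30]]], dtype=np.uint8)

x = rgb.astype(np.float32) / 255.0   # uint8 -> float32 in [0, 1]
y = conv1x1(x, kernel)               # first channel is halved
# Round before converting back; this differs from the quoted script,
# which truncates.
out = np.clip(np.rint(y * 255.0), 0, 255).astype(np.uint8)
print(out)
```

Only the first channel of each pixel changes; the other two pass through
unmodified, which is what the native and tensorflow backends should both
reproduce for this model.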