[FFmpeg-devel] [PATCH V3] avfilter/vf_dnn_processing: add a generic filter for image processing with dnn networks

Pedro Arthur bygrandao at gmail.com
Thu Nov 7 20:50:14 EET 2019


Em qui., 7 de nov. de 2019 às 13:17, Guo, Yejun <yejun.guo at intel.com>
escreveu:

>
> > > From: Pedro Arthur [mailto:bygrandao at gmail.com]
> > > Sent: Thursday, November 07, 2019 1:18 AM
> > > To: FFmpeg development discussions and patches
> > <ffmpeg-devel at ffmpeg.org>
> > > Cc: Guo, Yejun <yejun.guo at intel.com>
> > > Subject: Re: [FFmpeg-devel] [PATCH V3] avfilter/vf_dnn_processing: add
> a
> > > generic filter for image processing with dnn networks
> > >
> > > Hi,
> > >
> > > Em qui., 31 de out. de 2019 às 05:39, Guo, Yejun <yejun.guo at intel.com>
> > > escreveu:
> > > This filter accepts all dnn networks which do image processing.
> > > Currently, frames in the rgb24 and bgr24 formats are supported; other
> > > formats such as gray and YUV will be supported next. The dnn network
> > > can accept data in float32 or uint8 format, and it can change the
> > > frame size.
> > >
> > > The following is a python script which halves the value of the first
> > > channel of each pixel. It demos how to set up and execute a dnn model
> > > with python+tensorflow, and it generates the .pb file which will be
> > > used by ffmpeg.
> > >
> > > import tensorflow as tf
> > > import numpy as np
> > > import scipy.misc
> > > in_img = scipy.misc.imread('in.bmp')
> > > in_img = in_img.astype(np.float32)/255.0
> > > in_data = in_img[np.newaxis, :]
> > > filter_data = np.array([0.5, 0, 0, 0, 1., 0, 0, 0, 1.]).reshape(1,1,3,3).astype(np.float32)
> > > filter = tf.Variable(filter_data)
> > > x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
> > > y = tf.nn.conv2d(x, filter, strides=[1, 1, 1, 1], padding='VALID', name='dnn_out')
> > > sess=tf.Session()
> > > sess.run(tf.global_variables_initializer())
> > > output = sess.run(y, feed_dict={x: in_data})
> > > graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
> > > tf.train.write_graph(graph_def, '.', 'halve_first_channel.pb', as_text=False)
> > > output = output * 255.0
> > > output = output.astype(np.uint8)
> > > scipy.misc.imsave("out.bmp", np.squeeze(output))
> > >
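The 1x1 convolution in the script above amounts to a per-pixel channel mix: each output channel is a dot product of the input channels with one column of the (in_channels, out_channels) kernel, so the diagonal kernel with [0.5, 1, 1] simply halves channel 0. A small NumPy check of that claim, with no TensorFlow needed (the image here is synthetic, not the 'in.bmp' from the script):

```python
import numpy as np

# Hypothetical 4x4 RGB test image with values in [0, 1].
rng = np.random.default_rng(0)
img = rng.random((4, 4, 3)).astype(np.float32)

# The (1, 1, 3, 3) TF kernel, viewed as a (in_channels, out_channels)
# matrix: row = input channel, column = output channel. A diagonal
# kernel just scales each channel independently.
kernel = np.array([0.5, 0, 0,
                   0, 1., 0,
                   0, 0, 1.], dtype=np.float32).reshape(3, 3)

# A 1x1 conv2d is a per-pixel matrix multiply over the channel axis.
out = img @ kernel

assert np.allclose(out[..., 0], img[..., 0] * 0.5)  # first channel halved
assert np.allclose(out[..., 1:], img[..., 1:])      # other channels intact
```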
> > > To do the same thing with ffmpeg:
> > > - generate halve_first_channel.pb with the above script
> > > - generate halve_first_channel.model with tools/python/convert.py
> > > - try the following commands:
> > >   ./ffmpeg -i input.jpg -vf dnn_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=native -y out.native.png
> > >   ./ffmpeg -i input.jpg -vf dnn_processing=model=halve_first_channel.pb:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=tensorflow -y out.tf.png
> > It would be great if you could turn the above steps into a fate test;
> > that way one can automatically ensure the filter keeps working properly.
>
> sure, I'll add a fate test to test this filter with
> halve_first_channel.model. There will be no test for the tensorflow part,
> since fate tests must not require external dependencies.
>
> furthermore, more well-known models can be added to this fate test once we
> support them by adding more layers to the native mode, and once we optimize
> the conv2d layer, which is currently very slow.
>
> > > +};
> > > +
> > > +AVFilter ff_vf_dnn_processing = {
> > > +    .name          = "dnn_processing",
> > > +    .description   = NULL_IF_CONFIG_SMALL("Apply DNN processing filter to the input."),
> > > +    .priv_size     = sizeof(DnnProcessingContext),
> > > +    .init          = init,
> > > +    .uninit        = uninit,
> > > +    .query_formats = query_formats,
> > > +    .inputs        = dnn_processing_inputs,
> > > +    .outputs       = dnn_processing_outputs,
> > > +    .priv_class    = &dnn_processing_class,
> > > +};
> > > --
> > > 2.7.4
> > rest LGTM.
>
> thanks, could we push this patch first?
>
patch pushed, thanks.
I slightly edited the commit message, changing "scipy.misc" to "imageio", as
the former is deprecated and not present in newer versions.
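For reference, the I/O parts of the script map to imageio roughly like this (a minimal sketch: the model step is elided, and the 'in.bmp' here is generated synthetically just to keep the snippet self-contained):

```python
import numpy as np
import imageio

# Write a small synthetic image so the snippet runs stand-alone.
imageio.imwrite('in.bmp', np.zeros((8, 8, 3), dtype=np.uint8))

# scipy.misc.imread -> imageio.imread
in_img = imageio.imread('in.bmp')
in_img = in_img.astype(np.float32) / 255.0
# ... feed in_img[np.newaxis, :] to the model as in the script above ...
output = (in_img * 255.0).astype(np.uint8)  # stand-in for the model output
# scipy.misc.imsave -> imageio.imwrite
imageio.imwrite('out.bmp', np.squeeze(output))
```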


> I plan to add two more changes to this filter next:
> - add gray8 and gray32 support
> - add y_from_yuv support; in other words, the network handles only the Y
> channel, and the uv parts are unchanged (or just scaled), just like what
> vf_sr does.
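That planned y_from_yuv behavior can be sketched in NumPy as below (a hypothetical process_y_only helper, not code from the patch; model stands for any callable mapping a float32 Y plane to a processed Y plane):

```python
import numpy as np

def process_y_only(y, u, v, model):
    # Only the Y plane goes through the network; U and V pass through
    # untouched, mirroring what vf_sr does for YUV input.
    y_in = y.astype(np.float32) / 255.0
    y_out = model(y_in)
    y_out = np.clip(np.rint(y_out * 255.0), 0, 255).astype(np.uint8)
    return y_out, u, v

# Example with an identity "model": Y round-trips, U/V are untouched.
y = np.full((4, 4), 128, dtype=np.uint8)
u = np.full((2, 2), 64, dtype=np.uint8)   # 4:2:0-sized chroma planes
v = np.full((2, 2), 192, dtype=np.uint8)
y2, u2, v2 = process_y_only(y, u, v, lambda t: t)
```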
>
> I currently have no plan to add specific yuv formats, since I do not see a
> well-known network which handles all of the Y, U and V channels.
>
>
> > BTW do you have already concrete use cases (or plans) for this filter?
>
> not yet. the idea of this filter is that it is generic for image
> processing and so should be very useful; my basic target is to at least
> cover the features provided by vf_sr and vf_derain.
>
> actually, I do have a use case plan for a general video analytics filter;
> the side data type might be a big challenge, and I'm still thinking about
> it. I chose this image processing filter first because it is simpler, and
> the community can become familiar with dnn-based filters step by step.
>

