[FFmpeg-devel] [PATCH v2 4/4] avfilter/vf_drawbox: support draw specific face by facedetect metadata

Wed May 13 20:16:47 EEST 2020

On Wed, 13 May 2020, lance.lmwang at gmail.com wrote:

> From: Limin Wang <lance.lmwang at gmail.com>
>
> Signed-off-by: Limin Wang <lance.lmwang at gmail.com>
> ---
> doc/filters.texi         | 10 ++++++++++
> libavfilter/vf_drawbox.c | 27 +++++++++++++++++++++++++++
> 2 files changed, 37 insertions(+)
>
> diff --git a/doc/filters.texi b/doc/filters.texi
> index 76e12ef..bf9043c 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -9337,6 +9337,10 @@ See below for the list of accepted constants.
> Applicable if the input has alpha. With value @code{1}, the pixels of the painted box
> will overwrite the video's color and alpha pixels.
> Default is @code{0}, which composites the box onto the input, leaving the video's alpha intact.
> +
> +@item face
> +Draw the box by the facedetect metadata for the specific face.

@item face @var{integer}
Draw the box onto the position of the nth face as detected by the 
ocv filter's facedetect mode. If no face detection metadata exists
then the filter will use the specified box parameters instead.

> +
> @end table
> 
> The parameters for @var{x}, @var{y}, @var{w} and @var{h} and @var{t} are expressions containing the
> @@ -9405,6 +9409,12 @@ Draw a 2-pixel red 2.40:1 mask:
> @example
> drawbox=x=-t:y=0.5*(ih-iw/2.4)-t:w=iw+t*2:h=iw/2.4+t*2:t=2:c=red
> @end example
> +
> +@item
> +draw the box with red color for the first face by metadata if its postion is detected:

Draw a red box onto the position of the first face as detected by the ocv 
filter's facedetect method.

> +@example
> +ocv=filter_name=facedetect:filter_params=facedetect=./haarcascade_frontalface_alt.xml,drawbox=face=0:color=red
> +@end example
> @end itemize
> 
> @subsection Commands
> diff --git a/libavfilter/vf_drawbox.c b/libavfilter/vf_drawbox.c
> index 21d520e..239a149 100644
> --- a/libavfilter/vf_drawbox.c
> +++ b/libavfilter/vf_drawbox.c
> @@ -81,6 +81,7 @@ typedef struct DrawBoxContext {
>     char *t_expr;          ///< expression for thickness
>     int have_alpha;
>     int replace;
> +    int face;
> } DrawBoxContext;
> 
> static const int NUM_EXPR_EVALS = 5;
> @@ -220,6 +221,31 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame)
>     int plane, x, y, xb = s->x, yb = s->y;
>     unsigned char *row[4];
> 
> +    if (s->face >= 0) {
> +        AVDictionaryEntry *ex, *ey, *ew, *eh;
> +        char key2[128];
> +        AVDictionary *metadata = frame->metadata;
> +
> +        snprintf(key2, sizeof(key2), "lavfi.facedetect.%d.%s", s->face, "x");
> +        ex = av_dict_get(metadata, key2, NULL, AV_DICT_MATCH_CASE);
> +
> +        snprintf(key2, sizeof(key2), "lavfi.facedetect.%d.%s", s->face, "y");
> +        ey = av_dict_get(metadata, key2, NULL, AV_DICT_MATCH_CASE);
> +
> +        snprintf(key2, sizeof(key2), "lavfi.facedetect.%d.%s", s->face, "w");
> +        ew = av_dict_get(metadata, key2, NULL, AV_DICT_MATCH_CASE);
> +
> +        snprintf(key2, sizeof(key2), "lavfi.facedetect.%d.%s", s->face, "h");
> +        eh = av_dict_get(metadata, key2, NULL, AV_DICT_MATCH_CASE);
> +
> +        if (ex && ey && ew && eh) {
> +            xb = s->x = strtol(ex->value, NULL, 10);
> +            yb = s->y = strtol(ey->value, NULL, 10);
> +            s->w = strtol(ew->value, NULL, 10);
> +            s->h = strtol(eh->value, NULL, 10);
> +        }
> +    }
> +
>     if (s->have_alpha && s->replace) {
>         for (y = FFMAX(yb, 0); y < frame->height && y < (yb + s->h); y++) {
>             row[0] = frame->data[0] + y * frame->linesize[0];
> @@ -323,6 +349,7 @@ static const AVOption drawbox_options[] = {
>     { "thickness", "set the box thickness",                        OFFSET(t_expr),    AV_OPT_TYPE_STRING, { .str="3" },       0, 0, FLAGS },
>     { "t",         "set the box thickness",                        OFFSET(t_expr),    AV_OPT_TYPE_STRING, { .str="3" },       0, 0, FLAGS },
>     { "replace",   "replace color & alpha",                        OFFSET(replace),   AV_OPT_TYPE_BOOL,   { .i64=0 },         0,        1,        FLAGS },
> +    { "face",      "set which face to draw with metadata",         OFFSET(face),      AV_OPT_TYPE_INT,    { .i64=-1 },        -1,     256,        FLAGS },
>     { NULL }
> };
>

It is a bit more work, but have you considered making the face position
available to the filter as variables in the expressions? You could also
add a new mode to the filter that selects when the
x/y/width/height/thickness parameters are evaluated: at initialization
time or for each frame. Several filters already have such a selector
(e.g. see the eval parameter of vf_eq).
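
A rough, self-contained sketch of that idea follows, in plain C with
stand-ins for the libavutil pieces so it compiles on its own. Every name
in it (EvalMode, VAR_FACE_*, MetaEntry, ExprFn, meta_get, load_face_vars,
eval_x, x_minus_margin) is hypothetical and not existing FFmpeg API; only
the "lavfi.facedetect.<n>.<field>" key layout is taken from the patch.
A real implementation would presumably use av_dict_get() and the
expression evaluator instead of these stand-ins.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical; this is not existing FFmpeg API. */
enum EvalMode { EVAL_MODE_INIT, EVAL_MODE_FRAME };
enum { VAR_FACE_X, VAR_FACE_Y, VAR_FACE_W, VAR_FACE_H, VAR_NB };

typedef struct MetaEntry { const char *key, *value; } MetaEntry;

/* Stand-in for a parsed expression such as "face_x-10". */
typedef double (*ExprFn)(const double *vars);

/* Linear lookup standing in for av_dict_get() on the frame metadata. */
static const char *meta_get(const MetaEntry *meta, size_t n, const char *key)
{
    for (size_t i = 0; i < n; i++)
        if (!strcmp(meta[i].key, key))
            return meta[i].value;
    return NULL;
}

/* Fill vars[] from lavfi.facedetect.<face>.{x,y,w,h}.
 * Returns 0 on success, -1 if a key is missing or is not an integer;
 * vars[] is left untouched on failure. */
static int load_face_vars(const MetaEntry *meta, size_t n, int face,
                          double *vars)
{
    static const char *const fields[VAR_NB] = { "x", "y", "w", "h" };
    double tmp[VAR_NB];
    char key[128];

    for (int i = 0; i < VAR_NB; i++) {
        const char *val;
        char *end;
        long v;

        snprintf(key, sizeof(key), "lavfi.facedetect.%d.%s", face, fields[i]);
        val = meta_get(meta, n, key);
        if (!val)
            return -1;
        v = strtol(val, &end, 10);
        if (end == val || *end != '\0')
            return -1;
        tmp[i] = (double)v;
    }
    memcpy(vars, tmp, sizeof(tmp));
    return 0;
}

/* Per-frame hook: the x expression is re-evaluated only in
 * EVAL_MODE_FRAME and only when the face metadata is complete;
 * otherwise the value computed at init time is kept. */
static int eval_x(enum EvalMode mode, ExprFn x_expr,
                  const MetaEntry *meta, size_t n, int face, int init_x)
{
    double vars[VAR_NB];

    if (mode != EVAL_MODE_FRAME || load_face_vars(meta, n, face, vars) < 0)
        return init_x;
    return (int)x_expr(vars);
}

/* Example expression: the equivalent of x=face_x-10. */
static double x_minus_margin(const double *vars)
{
    return vars[VAR_FACE_X] - 10.0;
}
```

With this shape the filter context is never overwritten from the
metadata, so a frame without a detected face simply falls back to the
configured box, and the user can still add margins around the face in
the expression.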

I don't oppose the patch as is, but it is something worth considering.

Regards,
Marton