[FFmpeg-devel] [PATCH] lavfi: add thumb video filter.
Stefano Sabatini
stefasab at gmail.com
Tue Dec 20 15:11:13 CET 2011
On date Tuesday 2011-12-20 10:30:06 +0100, Clément Bœsch encoded:
> On Tue, Dec 06, 2011 at 03:34:07AM +0100, Michael Niedermayer wrote:
> > On Mon, Dec 05, 2011 at 05:41:10PM +0100, Clément Bœsch wrote:
> > [...]
> > > +AVFilter avfilter_vf_thumbnail = {
> > > + .name = "thumbnail",
> > > + .description = NULL_IF_CONFIG_SMALL("Thumbnail selection filter"),
> > > + .priv_size = sizeof(ThumbContext),
> > > + .init = init,
> > > + .uninit = uninit,
> > > + .query_formats = query_formats,
> > > + .inputs = (const AVFilterPad[]) {
> > > + { .name = "default",
> > > + .type = AVMEDIA_TYPE_VIDEO,
> > > + .get_video_buffer = avfilter_null_get_video_buffer,
> > > + .start_frame = null_start_frame,
> > > + .draw_slice = draw_slice,
> > > + .end_frame = end_frame,
> > > + },{ .name = NULL }
> > > + },
> > > + .outputs = (const AVFilterPad[]) {
> > > + { .name = "default",
> > > + .type = AVMEDIA_TYPE_VIDEO,
> > > + .rej_perms = AV_PERM_REUSE2,
> > > + },{ .name = NULL }
> >
> > you need to implement request and poll_frame() as the defaults are
> > wrong for this filter
> >
> > request should call the sources request in a loop until a frame is
> > output from thumbnail
> >
> > poll_frame() is a bit more tricky, if the sources filters poll returns
> > 0, it should return 0 too
> > otherwise it has to call request frame from the source filter until
> > either its poll_frame returns 0 or the next input frame would cause a
> > frame to be output in which case it should return 1
> > see vf_yadif
> >
>
> OK, new patch attached.
>
> --
> Clément B.
> From 81aa7d67a077b9a874e43925a449f0787a33c4ec Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Cl=C3=A9ment=20B=C5=93sch?= <clement.boesch at smartjog.com>
> Date: Mon, 24 Oct 2011 17:11:10 +0200
> Subject: [PATCH] lavfi: add thumbnail video filter.
>
> ---
> Changelog | 1 +
> doc/filters.texi | 12 ++
> libavfilter/Makefile | 1 +
> libavfilter/allfilters.c | 1 +
> libavfilter/avfilter.h | 2 +-
> libavfilter/vf_thumbnail.c | 242 ++++++++++++++++++++++++++++++++++++++++++++
> 6 files changed, 258 insertions(+), 1 deletions(-)
> create mode 100644 libavfilter/vf_thumbnail.c
>
> diff --git a/Changelog b/Changelog
> index ad7fa8d..590752b 100644
> --- a/Changelog
> +++ b/Changelog
> @@ -139,6 +139,7 @@ easier to use. The changes are:
> - SBaGen (SBG) binaural beats script demuxer
> - OpenMG Audio muxer
> - Simple segmenting muxer
> +- Thumbnails support (see thumbnail video filter)
>
>
> version 0.8:
> diff --git a/doc/filters.texi b/doc/filters.texi
> index 699e0c1..3f50ebf 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -2400,6 +2400,18 @@ For example:
> will create two separate outputs from the same input, one cropped and
> one padded.
>
> +@section thumbnail
> +Select potential thumbnail frames.
"thumbnail" is not a very clear term; maybe you could say "Select the
most representative frame in a given sequence of consecutive frames".
> +
> +It accepts as argument the threshold of frames to analyze (default is 100). The
> +filter will pick one of these frames.
Please explain what "threshold" means here; it is not clear at all
from this description.
From my reading of the code, this filter reads N frames and outputs
the one whose histogram is nearest to the average histogram of those
N frames. A name like "batch_size" or "nb_frames" may be less confusing.
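To make my reading of the selection step concrete, here is a minimal standalone sketch, not the patch code. NB_BINS and pick_representative() are hypothetical names, the histograms are shrunk to 4 bins for illustration, and I compare sums of squared errors rather than the RMSE, which ranks the frames the same way.

```c
#include <assert.h>
#include <stddef.h>

#define NB_BINS 4 /* hypothetical tiny histogram; the patch uses 3*256 bins */

/* Return the index of the histogram nearest to the batch average.
 * hists holds nb_frames histograms of NB_BINS bins each. */
static int pick_representative(const int hists[][NB_BINS], int nb_frames)
{
    double avg[NB_BINS] = {0};
    double best_err = 0.0;
    int i, j, best = 0;

    assert(hists != NULL && nb_frames > 0);

    /* average histogram of the batch: sum first, divide once per bin */
    for (i = 0; i < nb_frames; i++)
        for (j = 0; j < NB_BINS; j++)
            avg[j] += hists[i][j];
    for (j = 0; j < NB_BINS; j++)
        avg[j] /= nb_frames;

    /* nearest frame: the sum of squared errors orders the frames exactly
     * like the RMSE does, so the square root is skipped in this sketch */
    for (i = 0; i < nb_frames; i++) {
        double err = 0.0;
        for (j = 0; j < NB_BINS; j++) {
            double d = avg[j] - hists[i][j];
            err += d * d;
        }
        if (i == 0 || err < best_err) {
            best_err = err;
            best = i;
        }
    }
    return best;
}
```

With a batch made of one frame concentrated in the first bin, one concentrated in the last bin, and one spread across all bins, the spread frame is the one picked, since it lies closest to the average.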
> A bigger value will result in a slower
> +analysis and higher memory usage, but is likely to be more efficient.
Again, if I understand the code correctly, the specified N also affects
the number of output frames, because it sets the number of frames in each
analyzed sequence and one frame is output per sequence, so this statement
looks quite misleading.
> +
> +Example of thumbnail creation:
> +@example
> +ffmpeg -i in.avi -vf thumbnail,scale=300:200 -frames:v 1 out.png
> +@end example
A pure libavfilter example may also be useful.
[...]
> diff --git a/libavfilter/vf_thumbnail.c b/libavfilter/vf_thumbnail.c
> new file mode 100644
> index 0000000..9be871e
> --- /dev/null
> +++ b/libavfilter/vf_thumbnail.c
> @@ -0,0 +1,242 @@
[...]
> +/**
> + * @file
> + * Potential thumbnail lookup filter to reduce the risk of an inappropriate
> + * selection (such as a black frame) we could get with an absolute seek.
> + *
> + * Algorithm by Vadim Zaliva <lord at crocodile.org>.
> + * @url http://notbrainsurgery.livejournal.com/29773.html
> + */
> +
> +#include <math.h>
> +#include "libavcodec/avcodec.h"
Why are these includes needed?
> +#include "libavutil/imgutils.h"
> +#include "libavutil/internal.h"
> +#include "libswscale/swscale.h"
> +#include "avfilter.h"
> +
> +#define HIST_SZ (3*256)
Nit++: possibly HISTOGRAM_SIZE, or at least HIST_SIZE
> +#define DEF_FRAMES_THRESHOLD 100
> +
> +struct thumb_frame {
> + AVFilterBufferRef *buf; ///< cached frame
> + int histogram[HIST_SZ]; ///< RGB color distribution histogram of the frame
> +};
> +
> +typedef struct {
> + int n; ///< current frame
> + int n_frames; ///< threshold of frames for analysis
> + struct thumb_frame *frames; ///< the n_frames frames
> +} ThumbContext;
> +
> +static av_cold int init(AVFilterContext *ctx, const char *args, void *opaque)
> +{
> + ThumbContext *thumb = ctx->priv;
> +
> + if (args)
> + thumb->n_frames = strtol(args, NULL, 10);
> + if (thumb->n_frames < 2) {
> + if (args)
> + av_log(ctx, AV_LOG_WARNING,
> + "Invalid frame threshold specified, fallback to "
> + AV_STRINGIFY(DEF_FRAMES_THRESHOLD) "\n");
Uh? Why AV_STRINGIFY here rather than a simple %d?
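Something along these lines is what I have in mind. This is only a standalone sketch: snprintf() stands in for av_log() so that it compiles outside the tree, and format_fallback_warning() is a hypothetical helper, not something in the patch.

```c
#include <assert.h>
#include <stdio.h>

#define DEF_FRAMES_THRESHOLD 100

/* Format the fallback warning with a plain %d conversion instead of
 * pasting the stringified macro into the format string.
 * snprintf() is a stand-in for av_log() in this sketch.
 * Returns the snprintf() result. */
static int format_fallback_warning(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size,
                    "Invalid frame threshold specified, fallback to %d\n",
                    DEF_FRAMES_THRESHOLD);
}
```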
> + thumb->n_frames = DEF_FRAMES_THRESHOLD;
> + }
> + thumb->frames = av_calloc(thumb->n_frames, sizeof(*thumb->frames));
> + if (!thumb->frames) {
> + av_log(ctx, AV_LOG_ERROR,
> + "Allocation failure, try to lower the frames threshold\n");
> + return AVERROR(ENOMEM);
> + }
> + av_log(ctx, AV_LOG_INFO, "Select thumbnail with threshold of %d frames\n",
> + thumb->n_frames);
simpler/less cluttered: threshold:%d
> + return 0;
> +}
> +
> +static void draw_slice(AVFilterLink *inlink, int y, int h, int slice_dir)
> +{
> + int i, j;
> + AVFilterContext *ctx = inlink->dst;
> + ThumbContext *thumb = ctx->priv;
> + int *hist = thumb->frames[thumb->n].histogram;
> + AVFilterBufferRef *picref = inlink->cur_buf;
> + const uint8_t *p = picref->data[0] + y * picref->linesize[0];
> +
> + // update current frame RGB histogram
> + for (j = 0; j < h; j++) {
> + for (i = 0; i < inlink->w; i++) {
> + hist[0*256 + p[i*3 ]]++;
> + hist[1*256 + p[i*3 + 1]]++;
> + hist[2*256 + p[i*3 + 2]]++;
> + }
> + p += picref->linesize[0];
> + }
> +}
> +
> +/**
> + * @brief compute Root-mean-square deviation to estimate "closeness"
> + * @param hist color distribution histogram
> + * @param median average color distribution histogram
> + * @return root mean squared error
> + */
> +static float frame_rmse(const int *hist, const float *median)
> +{
> + int i;
> + float err, mean_sq_err = 0;
> + for (i = 0; i < HIST_SZ; i++) {
> + err = median[i] - (float)hist[i];
> + mean_sq_err += err*err / HIST_SZ;
> + }
You can factor the division by HIST_SZ out of the loop, and gain speed
(and precision).
> + return sqrtf(mean_sq_err);
> +}
> +
> +static void end_frame(AVFilterLink *inlink)
> +{
> + int i, j, best_frame = 0;
> + float avg[HIST_SZ] = {0}, rmse, min_rmse = -1;
avg -> please use a more meaningful name, or declare it in a local scope
> + AVFilterLink *outlink = inlink->dst->outputs[0];
> + ThumbContext *thumb = inlink->dst->priv;
> + AVFilterContext *ctx = inlink->dst;
> +
> + // keep a reference of each frame
> + thumb->frames[thumb->n].buf = inlink->cur_buf;
> +
> + // no selection until the buffer of N frames is filled up
> + if (thumb->n < thumb->n_frames - 1) {
> + thumb->n++;
> + return;
> + }
> +
> + // average histogram of the N frames
> + for (j = 0; j < FF_ARRAY_ELEMS(avg); j++)
> + for (i = 0; i < thumb->n_frames; i++)
> + avg[j] += (float)thumb->frames[i].histogram[j] / thumb->n_frames;
Again, you can factor out the division by thumb->n_frames.
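For instance, sum each bin over the frames first and divide once per bin. This is a standalone sketch under my own assumptions: average_histogram() and the array-of-pointers layout are hypothetical, and the patch stores the histograms inside struct thumb_frame instead.

```c
#include <assert.h>

#define HIST_SIZE (3 * 256)

/* Average nb_frames histograms bin by bin.
 * hists is an array of nb_frames pointers, each to HIST_SIZE bins
 * (hypothetical layout chosen to keep the sketch self-contained). */
static void average_histogram(const int *const *hists, int nb_frames,
                              float *avg_hist)
{
    int i, j;

    assert(hists != 0 && avg_hist != 0 && nb_frames > 0);
    for (j = 0; j < HIST_SIZE; j++) {
        double sum = 0.0;
        for (i = 0; i < nb_frames; i++)
            sum += hists[i][j];
        /* single division per bin, after the accumulation */
        avg_hist[j] = (float)(sum / nb_frames);
    }
}
```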
> + // find the frame closer to the average using RMSE
> + for (i = 0; i < thumb->n_frames; i++) {
> + rmse = frame_rmse(thumb->frames[i].histogram, avg);
> + if (i == 0 || rmse < min_rmse)
> + best_frame = i, min_rmse = rmse;
> + }
> +
> + // free and reset everything (except the best frame buffer)
> + for (i = 0; i < thumb->n_frames; i++) {
> + memset(thumb->frames[i].histogram, 0, sizeof(thumb->frames[i].histogram));
Is this memset required?
> + if (i == best_frame)
> + continue;
> + avfilter_unref_buffer(thumb->frames[i].buf);
> + thumb->frames[i].buf = NULL;
> + }
> + thumb->n = 0;
> +
> + // raise the chosen one
> + av_log(ctx, AV_LOG_INFO, "frame id #%d selected\n", best_frame);
> + avfilter_start_frame(outlink, thumb->frames[best_frame].buf);
> + thumb->frames[best_frame].buf = NULL;
> + avfilter_draw_slice(outlink, 0, inlink->h, 1);
> + avfilter_end_frame(outlink);
> +}
> +
> +static av_cold void uninit(AVFilterContext *ctx)
> +{
> + int i;
> + ThumbContext *thumb = ctx->priv;
> + for (i = 0; i < thumb->n_frames && thumb->frames[i].buf; i++) {
> + avfilter_unref_buffer(thumb->frames[i].buf);
> + thumb->frames[i].buf = NULL;
> + }
> + av_freep(&thumb->frames);
> +}
> +
> +static void null_start_frame(AVFilterLink *link, AVFilterBufferRef *picref) { }
> +
> +static int request_frame(AVFilterLink *link)
> +{
> + ThumbContext *thumb = link->src->priv;
> +
> + /* loop until a frame thumbnail is available (when a frame is queued,
> + * thumb->n is reset to zero) */
> + while (thumb->n) {
> + int ret = avfilter_request_frame(link->src->inputs[0]);
> + if (ret < 0)
> + return ret;
> + }
> + return 0;
> +}
> +
> +static int poll_frame(AVFilterLink *link)
> +{
> + ThumbContext *thumb = link->src->priv;
> + AVFilterLink *inlink = link->src->inputs[0];
> + int ret, available_frames = avfilter_poll_frame(inlink);
> +
> + /* If the input link is not able to provide any frame, we can't do anything
> + * at the moment and thus have zero thumbnail available. */
> + if (!available_frames)
> + return 0;
> +
> + /* Since at least one frame is available and the next frame will allow us
> + * to compute a thumbnail, we can return 1 frame. */
> + if (thumb->n == thumb->n_frames - 1)
> + return 1;
> +
> + /* we have some frame(s) available in the input link, but not yet enough to
> + * output a thumbnail, so we request more */
> + ret = avfilter_request_frame(inlink);
> + return ret < 0 ? ret : 0;
> +}
> +
> +static int query_formats(AVFilterContext *ctx)
> +{
> + static const enum PixelFormat pix_fmts[] = {
> + PIX_FMT_RGB24, PIX_FMT_BGR24,
> + PIX_FMT_NONE
> + };
> + avfilter_set_common_pixel_formats(ctx, avfilter_make_format_list(pix_fmts));
> + return 0;
note: this can be easily extended to support more pixel formats
> +}
> +
> +AVFilter avfilter_vf_thumbnail = {
> + .name = "thumbnail",
> + .description = NULL_IF_CONFIG_SMALL("Thumbnail selection filter"),
Nit: the description should be a complete sentence describing what the
filter *does*, rather than a long name.
--
FFmpeg = Fostering & Formidable Murdering Philosofic Epic Glue