[FFmpeg-devel] [RFC] libavfilter audio API and related issues
Stefano Sabatini
stefano.sabatini-lala
Sun May 23 19:12:36 CEST 2010
On date Saturday 2010-05-22 22:37:18 -0700, S.N. Hemanth Meenakshisundaram encoded:
[...]
> Hi,
>
> I started off trying to make the ffplay changes required for audio
> filtering just to get an idea of what all will be required of an
> audio filter API. Attached is a rudimentary draft of the changes. It
> is merely to better understand the required design and based on this
> I have the following questions and observations about the design:
>
> 1. ffplay currently gets only a single sample back for every
> audio_decode_frame call (even if encoded packet decodes to multiple
> samples). Should we be putting each sample individually through the
> filter chain or would it be better to collect a number of samples
> and then filter them together?
The second option looks more efficient, so yes, you could give it a try.
> 2. Can sample rate, audio format etc change between samples? If not,
> can we move those parameters to the AVFilterLink structure as Bobby
> suggested earlier? The AVFilterLink structure also needs to be
> generalized.
Sample rate and audio format can be considered constant, as they are
for video. This should be changed eventually, but for the moment I
believe it is OK to assume this.
> 3. The number of channels can also be stored in the filter link
> right? That way, we will know how many of the data[8] pointers are
> valid.
We need some way to describe the layout of every single channel. Maybe
we could store CH_LAYOUT_* (check libavcodec/avcodec.h) in the
AVFilterLink; this should provide more information than the mere
number of channels.
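For instance (just a sketch: the struct and field names below are hypothetical, not a committed API), the link could carry the layout mask, and the channel count would then follow from it:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch of audio fields on AVFilterLink; the
 * channel_layout field would hold a CH_LAYOUT_* bitmask as in
 * libavcodec/avcodec.h.  All names here are illustrative only. */
#define CH_LAYOUT_STEREO 0x3 /* front-left | front-right */

typedef struct AudioLinkSketch {
    int     sample_rate;    /* constant over the life of the link */
    int     sample_fmt;     /* a SAMPLE_FMT_* value */
    int64_t channel_layout; /* describes every single channel */
} AudioLinkSketch;

/* The mere number of channels is recoverable as the popcount of the
 * layout mask, so it would not need to be stored separately. */
static int channels_from_layout(int64_t layout)
{
    int n = 0;
    while (layout) {
        n += layout & 1;
        layout >>= 1;
    }
    return n;
}
```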
> 4. Do we require linesize[8] for audio. I guess linesize here would
> represent the length of data in each channel. Isn't this already
> captured by sample format? Can different channels ever have
> different datasizes for a sample?
If we know the sample format and the number of samples/duration, then
the linesize information is indeed redundant. If it bothers you, you
can set it to 0.
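Just to make the redundancy explicit: the per-channel linesize follows from the per-sample size (something like what av_get_bits_per_sample_format() in libavcodec reports, divided by 8) and the sample count. A sketch, with a made-up helper name:

```c
#include <assert.h>

/* Bytes in one channel plane, derived from the per-sample size and
 * the number of samples.  If this is always derivable, a stored
 * linesize[] carries no extra information for audio. */
static int audio_plane_size(int bytes_per_sample, int nb_samples)
{
    return bytes_per_sample * nb_samples;
}
```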
> 5. Is it necessary to have a separate num_samples value in the
> BufferRef or Buffer (in case we filter multiple samples at a time)?
> Can we instead capture it as part of a more useful 'datasize'
> variable that can be quickly be used for copying the data between
> filters?
>
> Also if we are converting AVFilterPic structure to a more generic
> AVFilterBuffer that is referred to by an AVFilterPicRef and
> AVFilterBufferRef, should the video specific items like PixFormat be
> removed and kept confined to PicRef and BufferRef?
Yes that was the idea.
AVFilterBuffer => contains the data common to A/V/T (data+linesize)
AVFilterPicRef => references an AVFilterBuffer, and contains data
specific to video
AVFilterSamplesRef => references an AVFilterBuffer, and contains data
specific to audio
According to this scheme samples_nb would be moved to
AVFilterSamplesRef.
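Roughly, in struct form (the field sets below are indicative only, not a proposed final layout):

```c
#include <assert.h>
#include <stdint.h>

/* Data common to audio and video: just storage (data+linesize). */
typedef struct BufferSketch {
    uint8_t *data[8];
    int      linesize[8];
    unsigned refcount;
} BufferSketch;

/* Video-specific reference (what AVFilterPicRef would keep). */
typedef struct PicRefSketch {
    BufferSketch *buf;
    int w, h, pix_fmt;
    int64_t pts;
} PicRefSketch;

/* Audio-specific reference; samples_nb moves here. */
typedef struct SamplesRefSketch {
    BufferSketch *buf;
    int samples_nb, sample_rate, sample_fmt;
    int64_t pts;
} SamplesRefSketch;
```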
> Regards,
>
> --- ffplay.c 2010-05-22 22:18:09.573923072 -0700
> +++ ../ffplay.af 2010-05-22 22:16:49.277933683 -0700
> @@ -105,6 +105,7 @@
>
> #if CONFIG_AVFILTER
> AVFilterPicRef *picref;
> + AVFilterBufferRef *bufref;
> #endif
> } VideoPicture;
>
> @@ -209,6 +210,8 @@
>
> #if CONFIG_AVFILTER
> AVFilterContext *out_video_filter; ///<the last filter in the video chain
> + AVFilterContext *out_audio_filter; ///<the last filter in the audio chain
> + AVFilterGraph *agraph;
> #endif
>
> float skip_frames;
> @@ -265,6 +268,7 @@
> static int rdftspeed=20;
> #if CONFIG_AVFILTER
> static char *vfilters = NULL;
> +static char *afilters = NULL;
> #endif
>
> /* current context */
> @@ -1752,6 +1756,138 @@
> { .name = NULL }},
> .outputs = (AVFilterPad[]) {{ .name = NULL }},
> };
> +
> +typedef struct {
> + VideoState *is;
> +} AudioFilterPriv;
> +
> +static int input_audio_init(AVFilterContext *ctx, const char *args, void *opaque)
> +{
> + AudioFilterPriv *priv = ctx->priv;
> + AVCodecContext *codec;
> + if(!opaque) return -1;
> +
> + priv->is = opaque;
> + codec = priv->is->audio_st->codec;
> + codec->opaque = ctx;
> +
> + return 0;
> +}
> +
> +static void input_audio_uninit(AVFilterContext *ctx)
> +{
> +}
> +
> +static int input_request_samples(AVFilterLink *link)
> +{
> + AudioFilterPriv *priv = link->src->priv;
> + AVFilterBufferRef *bufref;
> + int64_t pts = 0;
> + int buf_size;
> +
> + buf_size = audio_decode_frame(priv->is, &pts);
> + if (buf_size <= 0)
> + return -1;
> +
> + bufref = avfilter_get_audio_samples(link, AV_PERM_WRITE, link->sample_rate, link->sample_fmt);
> + memcpy(bufref->data[0], priv->is->audio_buf, buf_size);
> +
> + bufref->pts = pts;
> + bufref->datasize = buf_size;
> + avfilter_filter_samples(link, bufref);
> +
> + return 0;
> +}
> +
> +static int input_query_audio_formats(AVFilterContext *ctx)
> +{
> + AudioFilterPriv *priv = ctx->priv;
> + enum SampleFormat sample_fmts[] = {
> + priv->is->audio_st->codec->sample_fmt, SAMPLE_FMT_NONE
> + };
> +
> + avfilter_set_common_formats(ctx, avfilter_make_format_list(sample_fmts));
> + return 0;
> +}
> +
> +static int input_config_audio_props(AVFilterLink *link)
> +{
> + AudioFilterPriv *priv = link->src->priv;
> + AVCodecContext *c = priv->is->audio_st->codec;
> +
> + link->sample_rate = c->sample_rate;
> + link->channels = c->channels;
> + link->sample_fmt = c->sample_fmt;
> +
> + return 0;
> +}
> +
> +static AVFilter input_filter =
> +{
> + .name = "ffplay_audio_input",
> +
> + .priv_size = sizeof(AudioFilterPriv),
> +
> + .init = input_audio_init,
> + .uninit = input_audio_uninit,
> +
> + .query_formats = input_query_audio_formats,
> +
> + .inputs = (AVFilterPad[]) {{ .name = NULL }},
> + .outputs = (AVFilterPad[]) {{ .name = "default",
> + .type = AVMEDIA_TYPE_AUDIO,
> + .request_samples = input_request_samples,
> + .config_props = input_config_audio_props, },
> + { .name = NULL }},
> +};
> +
> +static void output_filter_samples(AVFilterLink *link)
> +{
> +}
> +
> +static int output_query_audio_formats(AVFilterContext *ctx)
> +{
> + enum SampleFormat sample_fmts[] = { SAMPLE_FMT_S16, SAMPLE_FMT_NONE };
> +
> + avfilter_set_common_formats(ctx, avfilter_make_format_list(sample_fmts));
> + return 0;
> +}
> +
> +static int get_filtered_audio_frame(AVFilterContext *ctx, VideoState *is, int64_t *pts)
> +{
> + AVFilterBufferRef *bufref;
> +
> + if(avfilter_request_samples(ctx->inputs[0]))
> + return -1;
> + if(!(bufref = ctx->inputs[0]->cur_buf))
> + return -1;
> + ctx->inputs[0]->cur_buf = NULL;
> +
> + *pts = bufref->pts;
> +
> + memcpy(is->audio_buf1, bufref->data, bufref->datasize);
> + is->audio_buf = is->audio_buf1;
> +
> + return bufref->datasize;
> +}
> +
> +static AVFilter output_audio_filter =
> +{
> + .name = "ffplay_audio_output",
> +
> + .query_formats = output_query_audio_formats,
> +
> + .inputs = (AVFilterPad[]) {{ .name = "default",
> + .type = AVMEDIA_TYPE_AUDIO,
> + .filter_samples = output_filter_samples,
> + .min_perms = AV_PERM_READ, },
> + { .name = NULL }},
> + .outputs = (AVFilterPad[]) {{ .name = NULL }},
> +};
> #endif /* CONFIG_AVFILTER */
>
> static int video_thread(void *arg)
> @@ -2175,6 +2311,9 @@
> AVCodecContext *avctx;
> AVCodec *codec;
> SDL_AudioSpec wanted_spec, spec;
> +#if CONFIG_AVFILTER
> + AVFilterContext *afilt_src = NULL, *afilt_out = NULL;
> +#endif
>
> if (stream_index < 0 || stream_index >= ic->nb_streams)
> return -1;
> @@ -2227,6 +2366,45 @@
> is->audio_src_fmt= SAMPLE_FMT_S16;
> }
>
> +#if CONFIG_AVFILTER
> + is->agraph = av_mallocz(sizeof(AVFilterGraph));
> + if(!(afilt_src = avfilter_open(&input_audio_filter, "asrc"))) goto the_end;
> + if(!(afilt_out = avfilter_open(&output_audio_filter, "aout"))) goto the_end;
> +
> + if(avfilter_init_filter(afilt_src, NULL, is)) goto the_end;
> + if(avfilter_init_filter(afilt_out, NULL, NULL)) goto the_end;
> +
> +
> + if(afilters) {
> + AVFilterInOut *outputs = av_malloc(sizeof(AVFilterInOut));
> + AVFilterInOut *inputs = av_malloc(sizeof(AVFilterInOut));
> +
> + outputs->name = av_strdup("ain");
> + outputs->filter = afilt_src;
> + outputs->pad_idx = 0;
> + outputs->next = NULL;
> +
> + inputs->name = av_strdup("aout");
> + inputs->filter = afilt_out;
> + inputs->pad_idx = 0;
> + inputs->next = NULL;
> +
> + if (avfilter_graph_parse(is->agraph, afilters, inputs, outputs, NULL) < 0)
> + goto the_end;
> + av_freep(&afilters);
> + } else {
> + if(avfilter_link(afilt_src, 0, afilt_out, 0) < 0) goto the_end;
> + }
> + avfilter_graph_add_filter(is->agraph, afilt_src);
> + avfilter_graph_add_filter(is->agraph, afilt_out);
> +
> + if(avfilter_graph_check_validity(is->agraph, NULL)) goto the_end;
> + if(avfilter_graph_config_formats(is->agraph, NULL)) goto the_end;
> + if(avfilter_graph_config_links(is->agraph, NULL)) goto the_end;
> +
> + is->out_audio_filter = afilt_out;
> +#endif
> +
> ic->streams[stream_index]->discard = AVDISCARD_DEFAULT;
> switch(avctx->codec_type) {
> case AVMEDIA_TYPE_AUDIO:
> @@ -2287,6 +2465,10 @@
> if (is->reformat_ctx)
> av_audio_convert_free(is->reformat_ctx);
> is->reformat_ctx = NULL;
> +#if CONFIG_AVFILTER
> + avfilter_graph_destroy(is->agraph);
> + av_freep(&(is->agraph));
> +#endif
> break;
> case AVMEDIA_TYPE_VIDEO:
> packet_queue_abort(&is->videoq);
> @@ -3046,6 +3228,7 @@
> { "window_title", OPT_STRING | HAS_ARG, {(void*)&window_title}, "set window title", "window title" },
> #if CONFIG_AVFILTER
> { "vf", OPT_STRING | HAS_ARG, {(void*)&vfilters}, "video filters", "filter list" },
> + { "af", OPT_STRING | HAS_ARG, {(void*)&afilters}, "audio filters", "filter list" },
> #endif
> { "rdftspeed", OPT_INT | HAS_ARG| OPT_AUDIO | OPT_EXPERT, {(void*)&rdftspeed}, "rdft speed", "msecs" },
> { "default", OPT_FUNC2 | HAS_ARG | OPT_AUDIO | OPT_VIDEO | OPT_EXPERT, {(void*)opt_default}, "generic catch all option", "" },
Looks fine at first glance.
So let's try to sketch a plan:
* Implement AVFilterBuffer, and use it in place of AVFilterPic.
Make AVFilterPicRef reference such a struct, and create an
AVFilterSamples containing the audio data.
* Make a first sketch of the API.
* Integrate it into ffplay. This step is more or less already
implemented ;-).
As for the use of the SVN soc repository: it shouldn't be too bad to
let you work in the current libavfilter soc tree. Audio is quite
independent from video, so I don't expect major breakages, and the
stability of the tree shouldn't be affected too much; so that
shouldn't be a major issue even for those who are currently using the
libavfilter tree.
Regards.
--
FFmpeg = Friendly and Fancy Murdering Powerful Exploitable Game