[FFmpeg-devel] [RFC] libavfilter audio API and related issues

Fri Jul 2 00:14:46 CEST 2010

On date Friday 2010-06-25 03:52:45 -0700, S.N. Hemanth Meenakshisundaram encoded:
> Hi All,
> 
> I moved to Git and this is a first git test patch.
> 
> The changes in this patch are:
> 
> 1. Fixes to audio filter framework code based on comments from last patch.
> 
> 2. SampleFormat and ChannelLayout definitions moved to libavutil/audiofmt.h
> 
> 3. Some channel layout and sample format utility functions copied to
> libavutil/audiodesc.h and audiodesc.c. Copies of these functions
> (with different names) still exist in libavcodec/utils.c and
> libavcodec/audioconvert.c, I haven't yet removed them to prevent
> breakage of existing code.
> 
> 4. Libavfilter audio changes now dependent only on lavu and not on lavc.
> 
> 5. Incomplete version of af_resample.c which does sample format
> conversion and will also do channel layout conversion.
> 
> Before that, I have a few questions on how to proceed:
> 
> 1. lavc has a resample.c for the above operations (in conjunction
> with audioconvert.c) and also a resample2.c which does sample rate
> conversion. Is it ok to name this filter af_resample or should I
> name this one af_reformat and reserve the af_resample name for the
> sample rate conversion filter?

Yes, I prefer the second scheme:
af_reformat
af_resample

It is consistent with the scheme used by the video scale / format
filters; maybe even af_format -> aformat will do.

> 2. The current resample.c only accepts two channel input data - why?
> Depending on the file and codec, isn't it possible for input data to
> be more than two channels? Can I attempt adding multichannel input
> support or is there a reason it is not supported?

Multichannel audio support is welcome; I believe it is missing only
for historical reasons (the same reasons for which most codecs assume
that the sample format is always signed 16-bit).

> 3. The current output sample format is always forced to S16. The
> channel layout conversion filters after this all assume that the
> individual samples are shorts. This needs to be fixed right? For
> example, I was thinking of changing stereo_to_mono function so it
> accepts an extra stride parameter and uses that to do its job
> irrespective of whether inputs are shorts (S16) or otherwise. Is
> this ok?

Yes and yes, and I would like to hear from someone who has worked
extensively with the audio API about the potential problems/warts to
keep in mind for such an implementation.
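
As a rough illustration of the stride idea (a sketch only, not code
from the patch or from lavc; every name below is hypothetical): a byte
stride is enough to locate each frame regardless of sample size, but
the averaging step still has to know the sample type, so this sketch
also passes a format tag. It assumes interleaved stereo input and
handles only S16 and float.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical format tag for this sketch; not the libavutil enum. */
enum SketchSampleFormat { SKETCH_FMT_S16, SKETCH_FMT_FLT };

/* Bytes per sample for the two formats handled here. */
static size_t sketch_sample_size(enum SketchSampleFormat fmt)
{
    return fmt == SKETCH_FMT_S16 ? sizeof(int16_t) : sizeof(float);
}

/*
 * Downmix interleaved stereo to mono by averaging left and right.
 * in_stride is the distance in bytes between consecutive input frames
 * (2 * sample size for tightly packed stereo).
 */
static void sketch_stereo_to_mono(void *out, const void *in, int nb_frames,
                                  size_t in_stride,
                                  enum SketchSampleFormat fmt)
{
    const uint8_t *src = in;
    int i;

    assert(nb_frames >= 0);
    assert(in_stride >= 2 * sketch_sample_size(fmt));

    for (i = 0; i < nb_frames; i++, src += in_stride) {
        if (fmt == SKETCH_FMT_S16) {
            int16_t l, r;
            /* memcpy avoids assuming anything about input alignment */
            memcpy(&l, src, sizeof(l));
            memcpy(&r, src + sizeof(l), sizeof(r));
            /* the sum is computed in int, so it cannot overflow */
            ((int16_t *)out)[i] = (int16_t)((l + r) / 2);
        } else {
            float l, r;
            memcpy(&l, src, sizeof(l));
            memcpy(&r, src + sizeof(l), sizeof(r));
            ((float *)out)[i] = (l + r) * 0.5f;
        }
    }
}
```

For packed S16 stereo the call would look like
sketch_stereo_to_mono(out, in, n, 2 * sizeof(int16_t), SKETCH_FMT_S16).
Whether the format is passed explicitly like this or resolved once into
a per-format function pointer is a design choice left open here.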

Summarizing the existing APIs:
* audioconvert.h/audioconvert.c:
    struct AVAudioConvert
    generic sample and channel layout API
    av_audio_convert* API, supports up to 6 channels

* avcodec.h/resample.c
    struct ReSampleContext
    audio_resample_init() - deprecated
    av_audio_resample_init()
    audio_resample()
    audio_resample_close()

* avcodec.h/resample2.c
    struct AVResampleContext
    av_resample_init()
    av_resample_close()
    av_resample_compensate()
    av_resample()

Can someone say what the main differences between the various APIs
are, and what the plan for their support is (i.e. which of them are
going to be deprecated in the future)?

This API seems to need some (major?) rework anyway. Some time ago I
proposed creating a separate library for it (libavresample - lavre?),
which would play a role similar to that of libswscale.

> I still need to fix nits etc in af_resample. Will do those when
> finishing the channel layout conversion.

For the moment I believe that we need to focus on getting a first
simple version of the audio framework in place; then we will hopefully
work through the current limitations of the audio resampling framework.

Regards.
-- 
FFmpeg = Free and Frenzy Miracolous Puristic Elastic God