[FFmpeg-devel] suggestion for expanding audio bitdepth support in libav/ffmpeg

Mon Jan 21 10:37:57 CET 2008

madshi wrote:
> I've been told by different people that sooner or later ffmpeg is supposed
> to get support for multiple audio bitdepths with all the bells and whistles.
> But it seems to be a very big change, so it keeps getting pushed back,
> or rather nobody dares to take on the job. Please correct me if I'm
> wrong on this.
> 
> Here comes a suggestion on a first step to the final goal which should be
> much easier to realize without breaking anything:
> 
> --------
> 
> CURRENT STATE:
> As far as I understand things, currently ffmpeg calls the audio decoders
> and expects 16-bit integer samples from them. Other formats are not
> supported by ffmpeg.
> 
> ULTIMATE SOLUTION:
> ffmpeg can handle any audio format and every decoder can output what
> it wants.
> 
> INTERMEDIATE STEP:
> We let the decoders output whatever they want but add an intermediate
> step between the decoders and ffmpeg which converts every audio
> format to 16-bit integer. This way ffmpeg can stay as it is. The
> decoders can also stay as they are in the beginning, without breaking
> anything. But this would allow us to change one decoder after the other
> to output its most suitable sample format and bitdepth without having
> any effect on ffmpeg. Furthermore, if there is a central audio conversion
> routine to 16-bit integer, it would also be rather easy to add proper
> TPDF dithering, which is missing right now, I believe.
> 
> --------
> 
> What do you guys think?
> 
> I have to add that I'm not up to the task of doing the work. I just wanted
> to throw my thoughts at you.
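
To make the quoted intermediate step concrete, here is a minimal sketch
of a central float-to-16-bit conversion routine with TPDF dithering. It
is only an illustration under stated assumptions: the names
float_to_s16_tpdf and dither_uniform are hypothetical and not part of
ffmpeg, input is assumed to be float samples in [-1.0, 1.0], and the
random source is a simple linear congruential generator chosen for
brevity rather than quality.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: uniform value in [0, 1) from a small LCG.
 * The caller owns the generator state. */
static float dither_uniform(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    /* Use the top 24 bits so the value is exactly representable. */
    return (float)(*state >> 8) * (1.0f / 16777216.0f);
}

/* Hypothetical converter: float samples (assumed in [-1.0, 1.0]) to
 * signed 16-bit, adding triangular (TPDF) dither of +/- 1 LSB. */
static void float_to_s16_tpdf(int16_t *dst, const float *src, size_t n,
                              uint32_t *rng_state)
{
    for (size_t i = 0; i < n; i++) {
        /* The difference of two uniform values has a triangular
         * distribution on (-1, 1), measured in 16-bit LSBs. */
        float u1 = dither_uniform(rng_state);
        float u2 = dither_uniform(rng_state);
        float v  = src[i] * 32768.0f + (u1 - u2);

        /* Clamp before converting so out-of-range input cannot overflow. */
        if (v > 32767.0f)
            v = 32767.0f;
        if (v < -32768.0f)
            v = -32768.0f;

        /* Round to nearest, halves away from zero. */
        dst[i] = (int16_t)(v >= 0.0f ? v + 0.5f : v - 0.5f);
    }
}
```

A decoder that produces float output would call this once per buffer
before handing samples to code that still expects 16-bit integers; the
generator state is kept by the caller so the noise does not restart on
every buffer.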

I would say the first step should be to wrap sound in some struct called 
e.g. AVSnippet, similar to AVFrame (or maybe let an AVFrame carry audio 
and/or video), instead of stupidly moving sound around as a buffer 
without any indication of number of channels, packed or planar layout, 
bitdepth, sampling rate, duration...
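
As a rough illustration of what such a wrapper might carry, here is a
minimal sketch. Everything in it is an assumption made for the example:
the field names, the SnippetSampleFormat enum, the channel limit and
the two helper functions are hypothetical and do not exist in ffmpeg.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample formats, for illustration only. */
enum SnippetSampleFormat {
    SNIPPET_FMT_S16,  /* signed 16-bit integer */
    SNIPPET_FMT_S32,  /* signed 32-bit integer */
    SNIPPET_FMT_FLT   /* 32-bit float */
};

#define SNIPPET_MAX_CHANNELS 8  /* arbitrary limit for the sketch */

/* Hypothetical audio counterpart of AVFrame: the buffer plus the
 * metadata needed to interpret it. */
typedef struct AVSnippet {
    uint8_t *data[SNIPPET_MAX_CHANNELS]; /* one plane per channel if planar,
                                            otherwise only data[0] is used */
    enum SnippetSampleFormat format;
    int channels;
    int planar;       /* 0 = packed (interleaved), 1 = planar */
    int sample_rate;  /* in Hz */
    int nb_samples;   /* samples per channel */
} AVSnippet;

static int snippet_bytes_per_sample(enum SnippetSampleFormat fmt)
{
    switch (fmt) {
    case SNIPPET_FMT_S16: return 2;
    case SNIPPET_FMT_S32: return 4;
    case SNIPPET_FMT_FLT: return 4;
    }
    return 0;
}

/* Size in bytes of one data plane. */
static size_t snippet_plane_size(const AVSnippet *s)
{
    size_t n = (size_t)s->nb_samples *
               (size_t)snippet_bytes_per_sample(s->format);
    return s->planar ? n : n * (size_t)s->channels;
}

/* Duration in microseconds, derived from sample count and rate. */
static int64_t snippet_duration_us(const AVSnippet *s)
{
    if (s->sample_rate <= 0)
        return 0;
    return (int64_t)s->nb_samples * 1000000 / s->sample_rate;
}
```

With this, 480 packed stereo 16-bit samples at 48000 Hz occupy a single
1920-byte plane and last 10000 microseconds, and neither number has to
be passed around separately from the buffer.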

Next, the special cases for PCM that are all over the place should be 
handled like any other codec or format.

When these two goals are reached, *then* it should be much easier to add 
support for other internal sound 'pixel formats'.

Of course, the *real* challenge is to achieve all that without adding 
even 1% overhead, which some would deem unacceptable...

Cheers,
-- 
Michel Bardiaux
R&D Director
T +32 [0] 2 790 29 41
F +32 [0] 2 790 29 02
E mailto:mbardiaux at mediaxim.be

Mediaxim NV/SA
Vorstlaan 191 Boulevard du Souverain
Brussel 1160 Bruxelles
http://www.mediaxim.com/