[FFmpeg-devel] Audio conversion and floating-point codecs

Tue May 18 00:05:38 CEST 2010

On Sat, May 15, 2010 at 12:17 PM, Måns Rullgård <mans at mansr.com> wrote:

> There is a long-standing desire from some to make the floating-point
> decoders output float samples instead of converting to int16
> internally, and I agree with the reasons for this.  However, making
> this change hastily will make decoding orders of magnitude slower on
> many CPUs.  The reason is that when a decoder outputs float samples,
> the fast asm code for float-to-int conversion is not used.
>
> In order to change the output format of these decoders without
> impacting performance, we must first make a few improvements to the
> avcodec API and to the generic audio format conversion code.
>
> What we have
> ------------
>
> - Very fast float-to-int16 conversion code in dsputil.  These
>  functions require input scaled to -32k..32k.
>

Adding scaling to this code wouldn't slow it down much.  Here's a scalar
float-to-int conversion I used for wmapro:

const __m128 kFloatScaler = _mm_set1_ps( 2147483648.0f );

static void FloatToIntSaturate(float* p) {
  __m128 a = _mm_set1_ps(*p);
  a = _mm_mul_ss(a, kFloatScaler);
  *reinterpret_cast<int32*>(p) = _mm_cvtss_si32(a);
}
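
To illustrate the point about folding the scale into the conversion, here is
a minimal portable sketch, not the dsputil code.  The name FloatToInt16Scaled
and its scale parameter are hypothetical; the idea is only that one multiply
per sample lets the same loop accept either -1..1 or -32k..32k input.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an FFmpeg function): float to int16 with a
// caller-supplied scale. Use scale = 32768.0f for input in -1..1, or
// scale = 1.0f for input that is already scaled to -32k..32k.
static void FloatToInt16Scaled(const float* src, int16_t* dst,
                               std::size_t n, float scale) {
    assert(src != nullptr && dst != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        float v = src[i] * scale;            // the only extra work: one multiply
        if (v > 32767.0f)  v = 32767.0f;     // clamp to the int16 range
        if (v < -32768.0f) v = -32768.0f;
        dst[i] = static_cast<int16_t>(std::lrintf(v));  // round to nearest
    }
}
```

A SIMD version would presumably do the same multiply on a whole vector before
the existing convert-and-pack step.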

>
> - The codecs in question all scale the output to the correct range as
>  part of transforms or filters.  The scaling is thus effectively free.
>
> - Generic sample format conversion code (audioconvert.c).  This code
>  requires float input in the range -1..1.  It does not use any asm
>  and is thus excruciatingly slow.  Decoding wmapro on Cortex-A8
>  spends more than 50% of the total time here.
>

Short term, I would prefer that wmapro output int16 and that FFmpeg do the
conversion efficiently on ARM.
If you pass float through as-is to Pulse or kmixer, performance is poor and/or
you get stutter.
If you leave conversion to applications, they'll tend to do it poorly,
especially on less common CPUs.

> What we need
> ------------
>
> - The libavcodec API needs to be amended such that a specific scaling
>  can be requested of the decoders.  This should probably be done
>  similarly to how channel down-mixing is already handled.
>

It's better to require a consistent range of -1 to 1.

>
> - The decoders should output planar audio instead of interleaved for
>  multichannel streams.  This probably means introducing
>  avcodec_decode_audio4() with an AVFrame output.
>

Planar requires interleaving before it can be played.  Is there a compelling
advantage?
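
For reference, the interleaving step I mean is roughly the following.  This
is only a sketch; InterleavePlanar is a hypothetical name and the
vector-of-planes layout is an assumption, not the proposed AVFrame API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: turn one buffer per channel (planar) into a single
// buffer of frames (interleaved), e.g. LLL... RRR... becomes LRLRLR...
static std::vector<float> InterleavePlanar(
        const std::vector<std::vector<float>>& planes) {
    assert(!planes.empty());
    const std::size_t channels = planes.size();
    const std::size_t frames = planes[0].size();
    std::vector<float> out(channels * frames);
    for (std::size_t c = 0; c < channels; ++c) {
        assert(planes[c].size() == frames);  // all planes must be equal length
        for (std::size_t i = 0; i < frames; ++i)
            out[i * channels + c] = planes[c][i];
    }
    return out;
}
```

It is an extra pass over every sample, which is the cost I am asking about.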

> - A better audio conversion system needs to be implemented.  Ideally,
>  this should be able to reuse existing asm code.  To this end, the
>  desired input range should be exported after configuration, allowing
>  it to be passed back to decoders.
>
> - The conversion system should be designed to allow up/down-mixing of
>  channels within the same API.  This is a feature currently missing
>  from FFmpeg, making playback of multi-channel AAC or Vorbis on a
>  stereo output difficult.  Implementing this is not a prerequisite
>  for switching the output format of the decoders.
>

Agreed.
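
For the down-mixing case in the quoted paragraph, a minimal sketch of 5.1 to
stereo could look like this.  The function name, the FL FR C LFE SL SR channel
order and the -3 dB coefficients are assumptions for illustration, not
anything FFmpeg defines; the LFE channel is simply dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed gain of -3 dB (1/sqrt(2)) for the centre and surround channels.
static const float kMinus3dB = 0.70710678f;

// Hypothetical helper: interleaved 5.1 input in the assumed order
// FL FR C LFE SL SR, interleaved stereo output. LFE is discarded.
static std::vector<float> Downmix51ToStereo(const std::vector<float>& in) {
    assert(in.size() % 6 == 0);
    const std::size_t frames = in.size() / 6;
    std::vector<float> out(frames * 2);
    for (std::size_t i = 0; i < frames; ++i) {
        const float* f = &in[i * 6];
        out[2 * i]     = f[0] + kMinus3dB * f[2] + kMinus3dB * f[4];  // left
        out[2 * i + 1] = f[1] + kMinus3dB * f[2] + kMinus3dB * f[5];  // right
    }
    return out;
}
```

Note that the sum can exceed -1..1, so a real implementation would need to
normalise or clip before converting to int16.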

> - All the pieces above need to be tied together in ffmpeg.c.
>
> None of the above is especially difficult to do, but it is important
> that it is done properly, or performance will suffer.
>

Also note that at some point, float video channels would be good.