[FFmpeg-devel] Audio conversion and floating-point codecs
Måns Rullgård
Tue May 18 00:23:31 CEST 2010
Frank Barchard <fbarchard at google.com> writes:
> On Sat, May 15, 2010 at 12:17 PM, Måns Rullgård <mans at mansr.com> wrote:
>
>> There is a long-standing desire from some to make the floating-point
>> decoders output float samples instead of converting to int16
>> internally, and I agree with the reasons for this. However, making
>> this change hastily will make decoding orders of magnitude slower on
>> many CPUs. The reason is that when a decoder outputs float samples,
>> the fast asm code for float-to-int conversion is not used.
>>
>> In order to change the output format of these decoders without
>> impacting performance, we must first make a few improvements to the
>> avcodec API and to the generic audio format conversion code.
>>
>> What we have
>> ------------
>>
>> - Very fast float-to-int16 conversion code in dsputil. These
>> functions require input scaled to -32k..32k.
>>
>
> Adding scaling to this code wouldn't slow it down much.
I know what I'm talking about, ...
> Here's a scalar float to int I used for wmapro conversion:
>
> const __m128 kFloatScaler = _mm_set1_ps( 2147483648.0f );
>
> static void FloatToIntSaturate(float* p) {
>   __m128 a = _mm_set1_ps(*p);
>   a = _mm_mul_ss(a, kFloatScaler);
>   *reinterpret_cast<int32*>(p) = _mm_cvtss_si32(a);
> }
... whereas about whoever wrote that POS, I'm not so sure.
>> - The codecs in question all scale the output to the correct range as
>> part of transforms or filters. The scaling is thus effectively free.
>>
>> - Generic sample format conversion code (audioconvert.c). This code
>> requires float input in the range -1..1. It does not use any asm
>> and is thus excruciatingly slow. Decoding wmapro on Cortex-A8
>> spends more than 50% of the total time here.
>>
>
> Short term, I would prefer wmapro output int16 and have ffmpeg do it
> efficiently on arm.
Yes, that is the plan.
> If you pass float through as-is to Pulse or kmixer, performance is poor
> and/or you get stutter.
> If you leave conversion to applications, they'll tend to do it poorly,
> especially on less common CPUs.
I know that, which is why we need a decent conversion API.
>> What we need
>> ------------
>>
>> - The libavcodec API needs to be amended such that a specific scaling
>> can be requested of the decoders. This should probably be done
>> similarly to how channel down-mixing is already handled.
>
> It's better to require a consistent -1 to 1.
No, that would be slower.
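
To illustrate the cost argument, here is a minimal sketch, assuming a
decoder whose final stage is a window multiply. The function names and
the windowing setup are hypothetical and are not taken from any FFmpeg
decoder. If the caller can request a scale, it is folded into the window
coefficients once at init time, so the per-sample work is the same
whatever range is requested. A fixed -1..1 output would instead need an
extra multiply pass before an int16 conversion that expects -32k..32k.

```c
#include <stddef.h>

/* Hypothetical init step: fold the requested output scale into the
 * window coefficients once, outside the per-frame path. */
static void init_scaled_window(float *scaled, const float *window,
                               size_t n, float scale)
{
    for (size_t i = 0; i < n; i++)
        scaled[i] = window[i] * scale;
}

/* Hypothetical per-frame step: one multiply per sample, regardless of
 * which scale was folded into the window. */
static void apply_window(float *out, const float *in,
                         const float *window, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * window[i];
}
```

With scale = 32768.0f the windowed output already lands in the range the
int16 conversion wants, with no additional pass over the samples.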
>> - The decoders should output planar audio instead of interleaved for
>> multichannel streams. This probably means introducing
>> avcodec_decode_audio4() with an AVFrame output.
>
> Planar requires interleaving before it can be played. Is there a
> compelling advantage?
1. The user might want it like that.
2. The existing float2int16 asm does interleaving more or less for free.
A separate interleaving pass would definitely be slower.
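
As a rough scalar illustration of point 2, the sketch below converts
planar float input to interleaved int16 in a single pass. It is not the
dsputil asm, and the function names are made up for this example. It
assumes the input is already scaled to the int16 range, as the existing
conversion functions require. The interleaving is nothing more than the
choice of output index, so it adds no second pass over the data.

```c
#include <stdint.h>

/* Hypothetical helper: round to nearest and saturate to int16.
 * Assumes the input is already scaled to roughly -32768..32767. */
static int16_t float_to_s16(float v)
{
    if (v != v)        return 0;        /* NaN: pick a defined result */
    if (v >=  32767.0f) return  32767;
    if (v <= -32768.0f) return -32768;
    return (int16_t)(v + (v >= 0.0f ? 0.5f : -0.5f));
}

/* Hypothetical one-pass convert + interleave:
 * src[c][i] is sample i of channel c, dst is channel-interleaved. */
static void planar_float_to_interleaved_s16(int16_t *dst,
                                            const float *const *src,
                                            int nb_samples, int channels)
{
    for (int i = 0; i < nb_samples; i++)
        for (int c = 0; c < channels; c++)
            dst[i * channels + c] = float_to_s16(src[c][i]);
}
```

Doing the conversion to a planar int16 buffer first and interleaving
afterwards would read and write every sample twice.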
> Also note that at some point, float video channels would be good.
ROTFL
--
Måns Rullgård
mans at mansr.com