[FFmpeg-devel] [PATCH 3/3] Use DSPContext.vector_fmul() and DSPContext.vector_fmul_reverse() in floating-point version of apply_window(). 46% faster in function apply_window().
Justin Ruggles
justin.ruggles
Sun Jan 2 04:30:32 CET 2011
On 01/01/2011 10:09 PM, Michael Niedermayer wrote:
> On Fri, Dec 31, 2010 at 03:11:40PM -0500, Justin Ruggles wrote:
>> diff --git libavcodec/ac3enc_float.c libavcodec/ac3enc_float.c
>> index 6a061d6..addc84f 100644
>> --- libavcodec/ac3enc_float.c
>> +++ libavcodec/ac3enc_float.c
>> @@ -77,16 +77,13 @@ static void mdct512(AC3MDCTContext *mdct, float *out, float *in)
>>  /**
>>   * Apply KBD window to input samples prior to MDCT.
>>   */
>> -static void apply_window(float *output, const float *input,
>> +static void apply_window(DSPContext *dsp, float *output, const float *input,
>>                           const float *window, int n)
>>  {
>> -    int i;
>>      int n2 = n >> 1;
>> -
>> -    for (i = 0; i < n2; i++) {
>> -        output[i]     = input[i]     * window[i];
>> -        output[n-i-1] = input[n-i-1] * window[i];
>> -    }
>> +    memcpy(output, input, n2 * sizeof(*input));
>> +    dsp->vector_fmul(output, window, n2);
>> +    dsp->vector_fmul_reverse(output+n2, input+n2, window, n2);
>
> The memcpy is ugly
Yeah, I know... I'll see if I can implement a new version of
vector_fmul that takes an input separate from the output, and then
compare the speed.
-Justin
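
For illustration only, an out-of-place multiply of the kind discussed above
might look roughly like the following plain-C sketch. The names
vector_fmul_oop, vector_fmul_reverse_ref and apply_window_sketch are
hypothetical and are not DSPContext functions; the reverse helper assumes
the semantics dst[i] = src0[i] * src1[len-1-i], which is what the patch
relies on to reproduce the original loop. With a separate source pointer the
memcpy is no longer needed.

```c
#include <assert.h>

/* Hypothetical out-of-place multiply: dst[i] = src0[i] * src1[i]. */
static void vector_fmul_oop(float *dst, const float *src0,
                            const float *src1, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = src0[i] * src1[i];
}

/* Scalar stand-in (assumed semantics): dst[i] = src0[i] * src1[len-1-i]. */
static void vector_fmul_reverse_ref(float *dst, const float *src0,
                                    const float *src1, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = src0[i] * src1[len - 1 - i];
}

/* Window n samples using a half-length window, without copying the input. */
static void apply_window_sketch(float *output, const float *input,
                                const float *window, int n)
{
    int n2 = n >> 1;

    assert((n & 1) == 0); /* the two halves must cover all n samples */

    /* First half: window applied in forward order. */
    vector_fmul_oop(output, input, window, n2);
    /* Second half: same window applied in reverse order. */
    vector_fmul_reverse_ref(output + n2, input + n2, window, n2);
}
```

Whether this is actually faster than memcpy plus the in-place vector_fmul
would have to be measured with the real SIMD implementations.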