[FFmpeg-devel] [PATCH + RFC] Faster ff_celp_lp_synthesis_filterf() (and failed SSE SIMD version)

Vitor Sessak vitor1001
Wed Dec 16 18:10:39 CET 2009


Michael Niedermayer wrote:
> On Mon, Dec 14, 2009 at 10:21:47PM +0100, Vitor Sessak wrote:
>> Vitor Sessak wrote:
>>> Michael Niedermayer wrote:
>>>> On Sun, Dec 13, 2009 at 08:55:08PM +0100, Vitor Sessak wrote:
>>>> [...]
>>>>> +            old_out3 = old_out2;
>>>>> +            old_out2 = old_out1;
>>>>> +            old_out1 = old_out0;
>>>>> +            old_out0 = out[-i-1];
>>>>> +
>>>>> +            val = filter_coeffs[i];
>>>>> +
>>>>> +            out0 -= val * old_out0;
>>>>> +            out1 -= val * old_out1;
>>>>> +            out2 -= val * old_out2;
>>>>> +            out3 -= val * old_out3;
>>>> old_out3 = out[-i-1];
>>>>
>>>> val = filter_coeffs[i];
>>>> out0 -= val * old_out3;
>>>> out1 -= val * old_out0;
>>>> out2 -= val * old_out1;
>>>> out3 -= val * old_out2;
>>>>
>>>> and similarly you can get rid of the other copies if you unroll it more
>>> Indeed, done. New patch attached.
>>> BTW, in my SSE code, there was a line of code missing:
>>>> DECLARE_ASM_CONST(16, uint32_t, mask[4]) = {0xFFFFFFFF, 0xFFFFFFFF,
>>>>                                             0xFFFFFFFF, 0x00000000};
>>>>
>> Err, this time without reinventing FFSWAP()...
>>
>> -Vitor
> 
> Do you want to be maintainer of celp_filters*?

Yes, done and patch committed.

-Vitor
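
For readers following the quoted exchange: the first version slides a
four-value window with three copies plus one load per tap, while the
suggested rewrite loads each new sample into the variable whose value has
just gone dead and rotates which variable feeds which accumulator. Below is
a minimal sketch of that idea, not the committed code. The names
taps_shift(), taps_rotate() and rotation_matches_shift() are hypothetical,
the history is read from a separate buffer (so the dependencies between the
four outputs of a real synthesis filter are not handled), and the number of
taps is assumed to be a multiple of 4.

```c
/* Sketch only: hypothetical helpers, not taken from the patch.
 * hist points at the element playing the role of out[0];
 * hist[-len] .. hist[2] must be readable. */

/* Shifting version: three copies and one load per tap. */
void taps_shift(float acc[4], const float *hist,
                const float *coeffs, int len)
{
    float w0 = hist[0], w1 = hist[1], w2 = hist[2], w3;
    int i;

    for (i = 0; i < len; i++) {
        w3 = w2;
        w2 = w1;
        w1 = w0;
        w0 = hist[-i - 1];

        acc[0] -= coeffs[i] * w0;
        acc[1] -= coeffs[i] * w1;
        acc[2] -= coeffs[i] * w2;
        acc[3] -= coeffs[i] * w3;
    }
}

/* Rotating version: one load per tap, no copies.
 * Assumes len is a multiple of 4 (an assumption of this sketch). */
void taps_rotate(float acc[4], const float *hist,
                 const float *coeffs, int len)
{
    float w0 = hist[0], w1 = hist[1], w2 = hist[2], w3;
    float val;
    int i;

    for (i = 0; i + 3 < len; i += 4) {
        w3  = hist[-i - 1];          /* w3 was dead, reuse it */
        val = coeffs[i];
        acc[0] -= val * w3;  acc[1] -= val * w0;
        acc[2] -= val * w1;  acc[3] -= val * w2;

        w2  = hist[-i - 2];          /* now w2 is the dead one */
        val = coeffs[i + 1];
        acc[0] -= val * w2;  acc[1] -= val * w3;
        acc[2] -= val * w0;  acc[3] -= val * w1;

        w1  = hist[-i - 3];
        val = coeffs[i + 2];
        acc[0] -= val * w1;  acc[1] -= val * w2;
        acc[2] -= val * w3;  acc[3] -= val * w0;

        w0  = hist[-i - 4];          /* roles are back where they started */
        val = coeffs[i + 3];
        acc[0] -= val * w0;  acc[1] -= val * w1;
        acc[2] -= val * w2;  acc[3] -= val * w3;
    }
}

/* Returns 1 if both versions agree on small integer-valued data
 * (chosen so that every float operation is exact). */
int rotation_matches_shift(void)
{
    enum { LEN = 8 };
    float buf[LEN + 3], coeffs[LEN];
    float a[4] = { 0 }, b[4] = { 0 };
    int i;

    for (i = 0; i < LEN + 3; i++)
        buf[i] = (float)(i % 5 + 1);
    for (i = 0; i < LEN; i++)
        coeffs[i] = (float)(i + 1);

    taps_shift(a, buf + LEN, coeffs, LEN);
    taps_rotate(b, buf + LEN, coeffs, LEN);

    for (i = 0; i < 4; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

Unrolling by the window size (4 here) is what brings every variable back
to its original role at the end of each iteration, so no swap is needed.
With a smaller unroll factor the roles end up permuted and have to be put
back at the end of the loop body.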


