[FFmpeg-devel] [PATCH] SSE dct32() [Was: r23095 - in trunk/libavcodec: ...]
Sat Jun 5 07:35:29 CEST 2010
Moving discussion to -devel...
On 05/31/2010 09:59 PM, Vitor Sessak wrote:
> On 05/14/2010 05:52 PM, Michael Niedermayer wrote:
>> On Fri, May 14, 2010 at 08:39:48AM +0200, Vitor Sessak wrote:
>>> Michael Niedermayer wrote:
>>>> On Tue, May 11, 2010 at 03:56:45PM -0400, Alex Converse wrote:
>>>>> On Tue, May 11, 2010 at 3:52 PM, michael <subversion at mplayerhq.hu> wrote:
>>>>>> Author: michael
>>>>>> Date: Tue May 11 21:52:42 2010
>>>>>> New Revision: 23095
>>>>>> float based mp1/mp2/mp3 decoders.
>>>> btw, any volunteers to try to hook it up to our split radix dct and/or
>>>> simd optimize it?
>>> Without rdft or dct simd, our split radix code is slower. Ugly hack to
>>> test it attached.
>> if dct32() is faster then it should be used by our generic dct code.
>> at least for the plain C case
> I've tried writing an SSE dct32(). It is much faster than the current C
> code. The only problem is that the current code in mpegaudiodec.c expects
> two arguments, one input (which is destroyed) and one output. OTOH,
> ff_dct_calc() does everything in-place, which does not fit well with the
> current mpegaudiodec.c code. Can you (or anyone else who knows
> mpegaudiodec.c well) fix it?
I tried making mpegaudiodec.c use the same buffer for dct
input and output, and it is not trivial. It is much easier (and causes no
measurable slowdown) to make ff_dct_calc() take both an input and an
output pointer, as in the attached patch.
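
To illustrate the interface idea only (this is not the code from the patch), here is a minimal sketch. The names sketch_stage(), sketch_stage_inplace() and SKETCH_N are made up for this example, and a single add/subtract butterfly pass stands in for the real transform. It assumes the out-of-place function may be called with out == in, so in-place callers can keep working by passing the same pointer twice; unlike the current dct32() in mpegaudiodec.c, the sketch leaves its input untouched when the buffers are distinct.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_N 32  /* hypothetical block size, chosen to match dct32() */

/* Hypothetical out-of-place interface: reads `in`, writes `out`.
 * One butterfly pass stands in for the real transform. The buffers must
 * be either fully disjoint or exactly the same; partial overlap is not
 * supported by this sketch. */
static void sketch_stage(float *out, const float *in)
{
    assert(out != NULL && in != NULL);
    for (int i = 0; i < SKETCH_N / 2; i++) {
        /* Load both operands before storing, so out == in stays safe. */
        float a = in[i];
        float b = in[SKETCH_N - 1 - i];
        out[i]                = a + b;
        out[SKETCH_N - 1 - i] = a - b;
    }
}

/* In-place callers simply pass the same buffer for both arguments. */
static void sketch_stage_inplace(float *data)
{
    sketch_stage(data, data);
}
```

With this shape, a caller that wants separate buffers calls sketch_stage(out, in), while a caller that works in-place calls sketch_stage_inplace(buf); both go through the same routine.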
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 6538 bytes
Desc: not available