[FFmpeg-devel] [PATCH] SSE dct32() [Was: r23095 - in trunk/libavcodec: ...]
Måns Rullgård
Sun Jun 20 13:33:39 CEST 2010
Vitor Sessak <vitor1001 at gmail.com> writes:
> On 06/20/2010 12:15 PM, Måns Rullgård wrote:
>> Vitor Sessak <vitor1001 at gmail.com> writes:
>>
>>>>> I don't remember seeing a big difference _for the dct32 code_ between in ==
>>>>> out and in != out.
>>>>
>>>> Now I am confused; I thought the 3% you quoted was about in == out vs
>>>> in != out?
>>>
>>> No, the 3% slowdown was when converting our general code (using FFT)
>>> to have in != out.
>>
>> And that was due to missed optimisations caused by gcc not knowing
>> that those pointers don't alias each other. Marking them restrict is
>> not good either, since we actually want to pass the same value
>> sometimes.
>
> That and one extra used register.

So what do we do? I see the following options:

1. Change the mp3 decoder to work with an in-place transform.
2. Copy the block before doing the in-place transform.
3. Apply magic to remove the slowdown from splitting in/out.

Did I miss anything?
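
To make option 2 concrete, here is a rough sketch, not a proposed patch. transform_inplace() is a hypothetical stand-in (it only reverses the block) and is not the real dct32(); the block size of 32 and both function names are assumptions for illustration. The wrapper keeps the in/out interface without marking the pointers restrict, so passing the same pointer for both remains legal.

```c
#include <string.h>

#define N 32  /* assumed block size, for illustration only */

/* Hypothetical stand-in for an in-place transform; NOT the real dct32().
 * It simply reverses the block so the effect is easy to verify. */
static void transform_inplace(float *buf)
{
    for (int i = 0; i < N / 2; i++) {
        float t = buf[i];
        buf[i] = buf[N - 1 - i];
        buf[N - 1 - i] = t;
    }
}

/* Option 2: keep separate in/out arguments, but copy the input first and
 * then run the in-place transform on the copy. The pointers are
 * deliberately not restrict-qualified, so out == in is allowed; in that
 * case the copy is skipped. */
static void transform_copy(float *out, const float *in)
{
    if (out != in)
        memmove(out, in, N * sizeof(*out));
    transform_inplace(out);
}
```

The cost is one extra copy of N floats whenever in != out; whether that is cheaper than the 3% lost by splitting in/out would have to be benchmarked.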
--
Måns Rullgård
mans at mansr.com