[FFmpeg-devel] [PATCH 2/2] SSE optimized mp3 windowing
Vitor Sessak
vitor1001
Thu Jun 17 22:56:36 CEST 2010
On 06/17/2010 09:56 PM, Loren Merritt wrote:
> On Thu, 17 Jun 2010, Vitor Sessak wrote:
>
>> + "movaps (%0,%5), %%xmm1 \n\t"
>> + "movaps (%2,%5), %%xmm2 \n\t"
>> + "movaps (%1,%5), %%xmm3 \n\t"
>
> One of these can be a memory arg to mulps.
Already addressed in my latest patch in the same thread.
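
For readers following along, here is a minimal sketch of what "memory arg" means. It is not the code from the patch: the helper name mul4_mem_operand and its parameters are made up for illustration, and it assumes GCC or Clang on x86 with SSE. Instead of loading both factors with movaps and multiplying register by register, one factor is used directly as the memory operand of mulps, which saves an instruction and a register.

```c
#include <xmmintrin.h>

/* Hypothetical helper, not taken from the patch.
 * Both pointers must be 16-byte aligned: movaps and a non-VEX mulps
 * with a memory operand both require alignment. */
static inline __m128 mul4_mem_operand(const float *win, const float *buf)
{
    __m128 acc = _mm_load_ps(win);          /* one explicit aligned load */

    /* AT&T syntax: mulps <mem>, <xmm> computes xmm = xmm * mem.
     * The "m" constraint lets the compiler pass buf as a memory operand,
     * so no second movaps is needed. */
    __asm__ ("mulps %1, %0"
             : "+x"(acc)
             : "m"(*(const __m128 *)buf));
    return acc;
}
```
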
>> + "mulps %%xmm2, %%xmm1 \n\t"
>> + "subps %%xmm1, %%xmm0 \n\t"
>> + "mulps %%xmm2, %%xmm3 \n\t"
>> + "subps %%xmm3, %%xmm4 \n\t"
>> [repeated lots of times]
>
> Looks like a place for a macro.
Good point. Used macros also for the other block.
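
As a rough illustration of the macro idea (again a sketch, not the macro from the attached patch), the repeated mulps/subps pairs can be wrapped so that each unrolled step is one line. The names MULT_SUB_STEP and mult_sub_unrolled, the argument layout, and the use of intrinsics instead of inline asm are all assumptions made for the example.

```c
#include <xmmintrin.h>

/* Hypothetical macro: one unrolled step of the quoted pattern.
 * Two accumulators share one loaded factor, as xmm2 does above:
 *   acc0 -= a[off..off+3] * c[off..off+3]
 *   acc1 -= b[off..off+3] * c[off..off+3] */
#define MULT_SUB_STEP(acc0, acc1, a, b, c, off)                          \
    do {                                                                 \
        __m128 shared_ = _mm_loadu_ps((c) + (off));                      \
        (acc0) = _mm_sub_ps((acc0),                                      \
                            _mm_mul_ps(_mm_loadu_ps((a) + (off)), shared_)); \
        (acc1) = _mm_sub_ps((acc1),                                      \
                            _mm_mul_ps(_mm_loadu_ps((b) + (off)), shared_)); \
    } while (0)

/* Two unrolled steps over 8 floats per input; results go to out0/out1. */
static void mult_sub_unrolled(float out0[4], float out1[4],
                              const float *a, const float *b, const float *c)
{
    __m128 acc0 = _mm_setzero_ps();
    __m128 acc1 = _mm_setzero_ps();

    MULT_SUB_STEP(acc0, acc1, a, b, c, 0);
    MULT_SUB_STEP(acc0, acc1, a, b, c, 4);

    _mm_storeu_ps(out0, acc0);
    _mm_storeu_ps(out1, acc1);
}
```
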
>> + if (incr == 1) {
>
> Does output really need to be interleaved?
It's a known TODO to allow codecs to output some kind of
SAMPLE_FMT_PLANAR_FLOAT.
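
To make the interleaving question concrete, here is a small sketch. It assumes that incr is the stride between consecutive output samples of one channel (so incr == 1 means contiguous output); the function name store_strided is hypothetical. With a stride of 1 the samples are contiguous and can be written with full-width vector stores, while interleaved output forces one scalar store per sample.

```c
#include <stddef.h>

/* Hypothetical helper: write n samples of one channel with stride incr.
 * incr == 1 gives contiguous (planar-style) output; incr == 2 would
 * interleave this channel with one other channel. */
static void store_strided(float *out, const float *samples,
                          size_t n, size_t incr)
{
    for (size_t i = 0; i < n; i++)
        out[i * incr] = samples[i];
}
```

Calling it once per channel with incr set to the channel count, and out offset by the channel index, produces the usual interleaved layout.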
>> + "movups 52(%4), %%xmm0 \n\t"
>> + "shufps $0x1b, %%xmm0, %%xmm0 \n\t"
>> + "movaps (%1), %%xmm1 \n\t"
>
> memory arg
Fixed
>> + "subps %%xmm1, %%xmm0 \n\t"
>> + "movaps %%xmm0, (%0) \n\t"
>> +
>> + "movups 4(%3), %%xmm0 \n\t"
>> + "movaps 48(%2), %%xmm1 \n\t"
>> + "shufps $0x1b, %%xmm0, %%xmm0 \n\t"
>> + "addps %%xmm1, %%xmm0 \n\t"
>> + "movaps %%xmm0, 112(%0) \n\t"
>
> Why do you alternate between two schedules?
No good reason, fixed.
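
For anyone puzzled by the shufps $0x1b in the quoted blocks: with the same register as source and destination, that immediate reverses the order of the four floats. The sketch below shows this with intrinsics, plus the "reversed load minus aligned load" pattern from the first quoted block. The function names are made up for the example and are not from the patch.

```c
#include <xmmintrin.h>

/* Reverse four floats: dst = {src[3], src[2], src[1], src[0]}.
 * 0x1b is binary 00 01 10 11, i.e. element selectors 3,2,1,0
 * read from the low lane upward. */
static void reverse4(float dst[4], const float src[4])
{
    __m128 v = _mm_loadu_ps(src);            /* movups */
    v = _mm_shuffle_ps(v, v, 0x1b);          /* shufps $0x1b, xmm, xmm */
    _mm_storeu_ps(dst, v);
}

/* Hypothetical analogue of the movups/shufps/subps sequence:
 * dst = reverse(a) - b, element by element. */
static void reversed_minus(float dst[4], const float a[4], const float b[4])
{
    __m128 va = _mm_loadu_ps(a);
    va = _mm_shuffle_ps(va, va, 0x1b);
    va = _mm_sub_ps(va, _mm_loadu_ps(b));    /* subps: va = va - b */
    _mm_storeu_ps(dst, va);
}
```
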
New patch attached.
-Vitor
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mp3_dspfy5.diff
Type: text/x-patch
Size: 8050 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100617/2709f79c/attachment.bin>