[FFmpeg-devel] [PATCH 2/2] SSE optimized mp3 windowing
Loren Merritt
lorenm
Thu Jun 17 21:56:16 CEST 2010
On Thu, 17 Jun 2010, Vitor Sessak wrote:
> + "movaps (%0,%5), %%xmm1 \n\t"
> + "movaps (%2,%5), %%xmm2 \n\t"
> + "movaps (%1,%5), %%xmm3 \n\t"
One of these can be a memory arg to mulps.
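A rough sketch of what that could look like, not taken from the patch. The window vector feeds both products, so it stays in a register, but its last use can be destructive, which lets the second input come straight from memory. The name `win_mac2`, its parameters, the load/store of the sums, and the plain C fallback are all made up for illustration; every pointer is assumed to be 16-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sum0[i] -= in0[i] * win[i];  sum1[i] -= in1[i] * win[i]
 * for i = 0..3. All pointers are assumed 16-byte aligned (movaps and the
 * memory operand of mulps both require it). */
static void win_mac2(float *sum0, float *sum1,
                     const float *in0, const float *in1, const float *win)
{
    assert(((uintptr_t)sum0 & 15) == 0 && ((uintptr_t)sum1 & 15) == 0);
    assert(((uintptr_t)in0 & 15) == 0 && ((uintptr_t)in1 & 15) == 0);
    assert(((uintptr_t)win & 15) == 0);
#if defined(__GNUC__) && defined(__SSE__)
    __asm__ volatile(
        "movaps (%4), %%xmm2      \n\t"  /* window, used by both products   */
        "movaps (%2), %%xmm1      \n\t"  /* first input                     */
        "mulps  %%xmm2, %%xmm1    \n\t"  /* xmm1 = in0 * win                */
        "mulps  (%3), %%xmm2      \n\t"  /* xmm2 = win * in1, no extra load */
        "movaps (%0), %%xmm0      \n\t"
        "subps  %%xmm1, %%xmm0    \n\t"
        "movaps %%xmm0, (%0)      \n\t"
        "movaps (%1), %%xmm4      \n\t"
        "subps  %%xmm2, %%xmm4    \n\t"
        "movaps %%xmm4, (%1)      \n\t"
        :
        : "r"(sum0), "r"(sum1), "r"(in0), "r"(in1), "r"(win)
        : "xmm0", "xmm1", "xmm2", "xmm4", "memory");
#else
    /* Portable fallback so the sketch still builds without SSE. */
    for (int i = 0; i < 4; i++) {
        sum0[i] -= in0[i] * win[i];
        sum1[i] -= in1[i] * win[i];
    }
#endif
}
```

This saves one movaps and one register per pair of products compared with loading all three vectors first.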
> + "mulps %%xmm2, %%xmm1 \n\t"
> + "subps %%xmm1, %%xmm0 \n\t"
> + "mulps %%xmm2, %%xmm3 \n\t"
> + "subps %%xmm3, %%xmm4 \n\t"
> [repeated lots of times]
Looks like a place for a macro.
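Something along these lines, as a sketch only. The macro name `MAC_STEP`, the function `mac16`, the operand numbering, and the offsets are hypothetical and do not match the patch; the point is just that the preprocessor can paste the byte offset into the asm string so the repeated block is written once. Pointers are assumed 16-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

/* One multiply-and-subtract step at byte offset `off`:
 *   xmm0 -= in[off .. off+15] * win[off .. off+15]
 * #off stringizes the offset, and adjacent string literals are concatenated,
 * so MAC_STEP(16) yields "movaps 16(%1), %%xmm1 ..." and so on. */
#define MAC_STEP(off)                        \
    "movaps " #off "(%1), %%xmm1      \n\t"  \
    "mulps  " #off "(%2), %%xmm1      \n\t"  \
    "subps  %%xmm1, %%xmm0            \n\t"

/* Hypothetical helper: acc[i] -= sum over k = 0..3 of in[4k+i] * win[4k+i]. */
static void mac16(float *acc, const float *in, const float *win)
{
    assert(((uintptr_t)acc & 15) == 0);
    assert(((uintptr_t)in & 15) == 0 && ((uintptr_t)win & 15) == 0);
#if defined(__GNUC__) && defined(__SSE__)
    __asm__ volatile(
        "movaps (%0), %%xmm0              \n\t"
        MAC_STEP(0)
        MAC_STEP(16)
        MAC_STEP(32)
        MAC_STEP(48)
        "movaps %%xmm0, (%0)              \n\t"
        :
        : "r"(acc), "r"(in), "r"(win)
        : "xmm0", "xmm1", "memory");
#else
    /* Portable fallback so the sketch still builds without SSE. */
    for (int k = 0; k < 4; k++)
        for (int i = 0; i < 4; i++)
            acc[i] -= in[4 * k + i] * win[4 * k + i];
#endif
}
```

The generated code is the same as the hand-unrolled version; only the source gets shorter and harder to mistype.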
> + if (incr == 1) {
Does output really need to be interleaved?
> + "movups 52(%4), %%xmm0 \n\t"
> + "shufps $0x1b, %%xmm0, %%xmm0 \n\t"
> + "movaps (%1), %%xmm1 \n\t"
This load can be a memory arg to subps as well.
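For illustration, a minimal sketch of the same pattern with the aligned load folded into subps. `shufps $0x1b` with the same register as source and destination reverses the four floats. The name `rev_sub4` and its parameters are hypothetical; `src` may be unaligned (hence movups), while `sub` and `out` are assumed 16-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: out[i] = src[3 - i] - sub[i] for i = 0..3. */
static void rev_sub4(float *out, const float *src, const float *sub)
{
    assert(((uintptr_t)out & 15) == 0 && ((uintptr_t)sub & 15) == 0);
#if defined(__GNUC__) && defined(__SSE__)
    __asm__ volatile(
        "movups (%1), %%xmm0              \n\t"  /* unaligned load            */
        "shufps $0x1b, %%xmm0, %%xmm0     \n\t"  /* reverse element order     */
        "subps  (%2), %%xmm0              \n\t"  /* subtract straight from memory */
        "movaps %%xmm0, (%0)              \n\t"
        :
        : "r"(out), "r"(src), "r"(sub)
        : "xmm0", "memory");
#else
    /* Portable fallback so the sketch still builds without SSE. */
    for (int i = 0; i < 4; i++)
        out[i] = src[3 - i] - sub[i];
#endif
}
```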
> + "subps %%xmm1, %%xmm0 \n\t"
> + "movaps %%xmm0, (%0) \n\t"
> +
> + "movups 4(%3), %%xmm0 \n\t"
> + "movaps 48(%2), %%xmm1 \n\t"
> + "shufps $0x1b, %%xmm0, %%xmm0 \n\t"
> + "addps %%xmm1, %%xmm0 \n\t"
> + "movaps %%xmm0, 112(%0) \n\t"
Why do you alternate between two schedules?
--Loren Merritt