[FFmpeg-devel] [PATCH 2/2] SSE optimized mp3 windowing

Loren Merritt lorenm
Thu Jun 17 21:56:16 CEST 2010


On Thu, 17 Jun 2010, Vitor Sessak wrote:

> + "movaps     (%0,%5), %%xmm1           \n\t"
> + "movaps     (%2,%5), %%xmm2           \n\t"
> + "movaps     (%1,%5), %%xmm3           \n\t"

One of these can be a memory arg to mulps.
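To illustrate, here is a minimal sketch of the memory-operand form. It is not taken from the patch: the function name `mul_sub4` and the operand numbering are made up for the example, and it assumes GCC-style inline asm on x86 with all three pointers 16-byte aligned (non-VEX mulps faults on an unaligned memory operand).

```c
/* Hypothetical helper, not from the patch:
 * acc[i] -= a[i] * b[i] for i = 0..3.
 * All three pointers are assumed to be 16-byte aligned. */
void mul_sub4(float *acc, const float *a, const float *b)
{
    __asm__ volatile(
        "movaps   (%1), %%xmm1    \n\t" /* xmm1 = a[0..3]                      */
        "mulps    (%2), %%xmm1    \n\t" /* xmm1 *= b[0..3], read from memory,  */
                                        /* so no separate movaps and no extra  */
                                        /* register for b                      */
        "movaps   (%0), %%xmm0    \n\t" /* xmm0 = acc[0..3]                    */
        "subps  %%xmm1, %%xmm0    \n\t" /* xmm0 -= xmm1                        */
        "movaps %%xmm0, (%0)      \n\t" /* store the result back to acc        */
        :
        : "r"(acc), "r"(a), "r"(b)
        : "xmm0", "xmm1", "memory"
    );
}
```

The same idea should apply to any load whose register is consumed only once by the following arithmetic instruction.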

> + "mulps       %%xmm2, %%xmm1           \n\t"
> + "subps       %%xmm1, %%xmm0           \n\t"
> + "mulps       %%xmm2, %%xmm3           \n\t"
> + "subps       %%xmm3, %%xmm4           \n\t"
> [repeated lots of times]

Looks like a place for a macro.
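Something along these lines, as a rough sketch only. The macro name `WIN_STEP`, the function `window_accumulate`, the operand numbering and the use of constant byte offsets (the patch uses an index register instead) are all assumptions made for the example. It keeps the instruction sequence quoted above and assumes GCC-style inline asm with 16-byte aligned pointers.

```c
/* Hypothetical macro: one load/multiply/subtract step at byte offset `off`.
 * %2 = input, %3 = first window, %4 = second window.
 * xmm0 and xmm4 are the two running accumulators. */
#define WIN_STEP(off)                                                   \
    "movaps " #off "(%3), %%xmm1   \n\t" /* first window coefficients  */ \
    "movaps " #off "(%2), %%xmm2   \n\t" /* input shared by both sums  */ \
    "movaps " #off "(%4), %%xmm3   \n\t" /* second window coefficients */ \
    "mulps        %%xmm2, %%xmm1   \n\t"                                \
    "subps        %%xmm1, %%xmm0   \n\t"                                \
    "mulps        %%xmm2, %%xmm3   \n\t"                                \
    "subps        %%xmm3, %%xmm4   \n\t"

/* Hypothetical driver: sum1[j] = -sum over k of in[4k+j] * w1[4k+j],
 * and likewise sum2 with w2, for k = 0..3 and j = 0..3. */
void window_accumulate(float *sum1, float *sum2, const float *in,
                       const float *w1, const float *w2)
{
    __asm__ volatile(
        "xorps  %%xmm0, %%xmm0     \n\t" /* clear first accumulator  */
        "xorps  %%xmm4, %%xmm4     \n\t" /* clear second accumulator */
        WIN_STEP(0)
        WIN_STEP(16)
        WIN_STEP(32)
        WIN_STEP(48)
        "movaps %%xmm0, (%0)       \n\t"
        "movaps %%xmm4, (%1)       \n\t"
        :
        : "r"(sum1), "r"(sum2), "r"(in), "r"(w1), "r"(w2)
        : "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "memory"
    );
}
```

With the step in one place, a change to the schedule or to the operand form only has to be made once.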

> + if (incr == 1) {

Does output really need to be interleaved?

> + "movups   52(%4), %%xmm0           \n\t"
> + "shufps    $0x1b, %%xmm0, %%xmm0   \n\t"
> + "movaps     (%1), %%xmm1           \n\t"

This load can be a memory arg to subps.

> + "subps    %%xmm1, %%xmm0           \n\t"
> + "movaps   %%xmm0, (%0)             \n\t"
> +
> + "movups    4(%3), %%xmm0           \n\t"
> + "movaps   48(%2), %%xmm1           \n\t"
> + "shufps    $0x1b, %%xmm0, %%xmm0   \n\t"
> + "addps    %%xmm1, %%xmm0           \n\t"
> + "movaps   %%xmm0, 112(%0)          \n\t"

Why do you alternate between two schedules?

--Loren Merritt


