[FFmpeg-devel] [PATCH] SSE RDFT
Måns Rullgård
Sat Mar 20 22:34:06 CET 2010
Jason Garrett-Glaser <darkshikari at gmail.com> writes:
> On Sun, Mar 14, 2010 at 3:23 PM, Alex Converse <alex.converse at gmail.com> wrote:
>> I'm sure I've made some embarrassingly amateurish mistakes here.
>> Feedback is more than welcome.
>>
>> --Alex
>
> In the interests of getting away from discussions about yasm and into
> actually reviewing the asm...
>
> +///sign mask of RDFT sine terms
>
> Three / ?
>
> Looking at the asm overall, it looks like there's a huge amount of
> moving stuff around and very little actual calculation. Is there no
> better way to organize it?
>
> + "movlps (%4,%0,4), %%xmm4 \n\t"
> + "unpcklps %%xmm4, %%xmm4 \n\t"
> + "movlps (%5,%0,4), %%xmm3 \n\t"
> + "unpcklps %%xmm3, %%xmm3 \n\t"
>
> This looks like a candidate for movsldup in an SSE3 version.
Well?
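
For reference while reviewing the movsldup suggestion quoted above, here is a lane-level sketch in plain C. It is a scalar model of the instructions, not code from the patch, and every name in it (vec4, model_*, dup_pair_sse1, rdft_dup_demo) is hypothetical. As modeled, the movlps/unpcklps pair turns two floats a, b into [a, a, b, b], while movsldup on four floats a, b, c, d gives [a, a, c, c] and movshdup gives [b, b, d, d]. So a direct one-for-one swap would not be lane-identical; an SSE3 version would presumably have to arrange the surrounding code around the even/odd split, which is an assumption here and not something the thread spells out.

```c
#include <assert.h>

/* Scalar model of a 4-lane SSE register (illustrative only). */
typedef struct { float lane[4]; } vec4;

/* movlps mem, xmm: load two floats into lanes 0-1; lanes 2-3 are kept. */
static vec4 model_movlps(vec4 dst, const float *src)
{
    dst.lane[0] = src[0];
    dst.lane[1] = src[1];
    return dst;
}

/* unpcklps src, dst: interleave the low halves -> [d0, s0, d1, s1]. */
static vec4 model_unpcklps(vec4 dst, vec4 src)
{
    vec4 r = {{ dst.lane[0], src.lane[0], dst.lane[1], src.lane[1] }};
    return r;
}

/* movsldup (SSE3): duplicate the even lanes -> [s0, s0, s2, s2]. */
static vec4 model_movsldup(const float *src)
{
    vec4 r = {{ src[0], src[0], src[2], src[2] }};
    return r;
}

/* movshdup (SSE3): duplicate the odd lanes -> [s1, s1, s3, s3]. */
static vec4 model_movshdup(const float *src)
{
    vec4 r = {{ src[1], src[1], src[3], src[3] }};
    return r;
}

/* The quoted two-instruction sequence: movlps, then unpcklps x, x. */
static vec4 dup_pair_sse1(const float *src)
{
    vec4 x = {{ 0.0f, 0.0f, 0.0f, 0.0f }};
    x = model_movlps(x, src);
    return model_unpcklps(x, x);
}

/* Compare the lane layouts on a small made-up table. */
void rdft_dup_demo(void)
{
    const float tab[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
    vec4 pair = dup_pair_sse1(tab);   /* [1, 1, 2, 2] */
    vec4 even = model_movsldup(tab);  /* [1, 1, 3, 3] */
    vec4 odd  = model_movshdup(tab);  /* [2, 2, 4, 4] */

    assert(pair.lane[0] == 1.0f && pair.lane[1] == 1.0f);
    assert(pair.lane[2] == 2.0f && pair.lane[3] == 2.0f);
    assert(even.lane[0] == 1.0f && even.lane[2] == 3.0f);
    assert(odd.lane[0]  == 2.0f && odd.lane[2]  == 4.0f);
}
```

Calling rdft_dup_demo() from a test main is enough to check the three layouts against each other.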
--
Måns Rullgård
mans at mansr.com