[FFmpeg-devel] [PATCH] SSE3/4 implementation of flac_encode_residual_lpc
Michael Niedermayer
michaelni
Thu Jun 18 13:51:00 CEST 2009
On Sat, May 30, 2009 at 09:30:28PM +0000, Loren Merritt wrote:
> On Sat, 30 May 2009, Bobby Bingham wrote:
>> On Fri, 29 May 2009, Loren Merritt wrote:
>>
>>> For the remainder, this logic should be doable
>>> with just 1 paddd and 1 por per vector. Merge several vectors before
>>> branching.
>>
>> I'm afraid I don't quite see what you mean by using 1 paddd and 1 por.
>> The attached patch does have a slight improvement in this piece of
>> code, but I doubt it's what you meant.
>
> The C version is:
> (unsigned)(x+0x8000) >= 0x10000
> And to merge several entries before the branch:
> (unsigned)((x[0]+0x8000) | (x[1]+0x8000) | ...) >= 0x10000
> Or since sse doesn't have an uint32 compare:
> (((x[0]+0x8000) | (x[1]+0x8000) | ...) >> 16) != 0
>
> This won't be much if any faster than yours when testing one vector at a
> time.
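
For anyone following along, here is a minimal plain-C sketch of the check quoted above. It is not the SSE code from the patch; the function names are made up for illustration, and it assumes 32-bit inputs. Adding 0x8000 maps the range [-0x8000, 0x7fff] onto [0, 0xffff], so any bit set above bit 15 means the value was out of range. The OR accumulator stands in for the por and the add for the paddd, so only one test is needed per group of entries.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: nonzero if x lies outside [-0x8000, 0x7fff].
 * The add is done in unsigned arithmetic so that wraparound is well defined. */
static int out_of_int16_range(int32_t x)
{
    return (uint32_t)x + 0x8000u >= 0x10000u;
}

/* Hypothetical helper: merged form of the same check.
 * OR together the biased values and test the high 16 bits once,
 * instead of branching on every entry. */
static int any_out_of_int16_range(const int32_t *x, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc |= (uint32_t)x[i] + 0x8000u;
    return (acc >> 16) != 0;
}
```

The shift in the merged form replaces the unsigned compare, which is the part SSE lacks for 32-bit lanes.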
What's the status of this patch?
Waiting for changes?
OK to commit?
Want me to review it?
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
I have never wished to cater to the crowd; for what I know they do not
approve, and what they approve I do not know. -- Epicurus