[Ffmpeg-devel] [PATCH] another vorbis optimization
Loren Merritt
lorenm
Fri Aug 18 19:23:09 CEST 2006
On Fri, 18 Aug 2006, Rich Felker wrote:
> On Fri, Aug 11, 2006 at 02:17:29AM +0200, Luca Barbato wrote:
>>
>> Loren Merritt wrote:
>>> One branch (perfectly predictable) vs saving 224 integer additions when
>>> non-simd. Well, if you value simplicity of the code over speed in this
>>> less-common case (and I guess I do too, else I would've mmxed this
>>> loop), then I can remove it.
>>
>> At least on ppc the branchless version is slightly faster (better
>> average with lower deviation)
>
> This should be true on any sane cpu. Only exception might be shit with
> slow bit arithmetic, P4 anyone??
Really? One branch should be slower than 224 adds?
Granted, there is a tradeoff only without the sse/3dnow/altivec version
of float2int. If an optimized version is used, then both choices use the
same amount of arithmetic, and the only difference is branch vs no branch.
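
To make the tradeoff concrete, here is a rough sketch. It is not the code
from the patch: the function names, the bias parameter, and the loop shape
are all made up for illustration. It only shows the general pattern of
paying one predictable branch per call to skip a run of integer additions
that would have no effect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical branchless variant: always performs len additions,
 * even when bias is zero and the additions change nothing. */
static void add_bias_always(int32_t *buf, size_t len, int32_t bias)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        buf[i] += bias;
}

/* Hypothetical branching variant: one test per call, which skips
 * all len additions when they would be no-ops. */
static void add_bias_if_nonzero(int32_t *buf, size_t len, int32_t bias)
{
    assert(buf != NULL || len == 0);
    if (bias != 0) {
        for (size_t i = 0; i < len; i++)
            buf[i] += bias;
    }
}
```

Both functions leave the buffer in the same state. Which one is faster
depends on how predictable the branch is and how cheap the adds are on the
cpu in question, so it has to be measured.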
--Loren Merritt