[FFmpeg-devel] [PATCH] Altivec vector_fmul_scalar
Tue Jan 6 14:19:58 CET 2009
David Conrad wrote:
> ~7-9% faster vorbis, aac, and ac3.
> I have no clue why it's not bitexact to the C version; I tried not
> using the add of madd, and even enabling denormal handling to match
> the C version. The differences are only a very occasional +/- 1 however.
Actually, the more you use madd, the better the precision, since
intermediate computations are done with greater precision.
I somehow thought that this code
> + t0 = vec_madd(s0, wj, vadd_bias);
> + t1 = vec_madd(s1, wi, zero);
> + t0 = vec_sub(t0, t1);
could be reduced to 2 instructions using vec_nmsub(), but I guess
I'll test your code and commit tonight if it works as expected.