[FFmpeg-devel] [PATCH] unscaled float 2 int conversion
Mon May 19 18:22:11 CEST 2008
Michael Niedermayer wrote:
>> And can you rerun the benchmarks on your P3 but not prescale the float
>> buffer. I.e. change to this:
>> tmpa[i] = in[i]* (1.0/32768) + 385;
>> The reason I'm wondering is that sometimes it's not trivial to get the
>> scaling for free and then you would have to do it during the loop to add
>> the bias. I suspect that it is slower on platforms where it matters.
> 228651 dezicycles in conv_cast, 16256 runs, 128 skips
> 108574 dezicycles in conv_lrint, 16321 runs, 63 skips
> 63418 dezicycles in conv_x87_asm, 16329 runs, 55 skips
> 51975 dezicycles in conv_x87_asm_ex, 16349 runs, 35 skips
> 54081 dezicycles in conv_bias, 16351 runs, 33 skips
> that is with hand tuned conv_x87_asm_ex and gcc generated conv_bias
> if i just hand tune the fmul/fadd loop a little with the integer code left
> as gcc generated it i get
> 46308 dezicycles in conv_bias, 16336 runs, 48 skips
This result puzzled me somewhat until I found out that the benchmark
tests had this kind of code for all methods instead of only for the bias
method:

    for(i=0; i<SIZE; i++)
        tmpa[i] = in[i];

In the case when you can't get scaling for free, that loop should be
omitted. It would be truly bizarre if it was always faster to do
float2int by using the bias trick.
So pretty please, can you retry without that line as well?
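
For reference, here is a minimal sketch of the two cases being compared, assuming IEEE 754 single precision, the default round-to-nearest mode, and input samples already within the int16 range (no clipping is done). The function names and BIAS_BITS are made up for illustration and are not the benchmark's code. With the 385 bias the float lands in [384, 386), where one ulp is 2^-15, so the integer sample shows up directly in the low mantissa bits.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bit pattern of 385.0f (hypothetical name, not from the benchmark). */
#define BIAS_BITS 0x43C08000

/* Reinterpret an already scaled and biased float as an int16 sample.
 * Assumes the value lies in [384, 386); nothing is clipped here. */
static int16_t biased_to_int16(float biased)
{
    uint32_t bits;
    memcpy(&bits, &biased, sizeof bits);   /* well-defined type pun */
    return (int16_t)((int32_t)bits - BIAS_BITS);
}

/* Case 1: scaling and bias came "for free" from an earlier stage,
 * so the conversion loop only touches the integer bits. */
static void conv_bias_prescaled(int16_t *out, const float *biased, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = biased_to_int16(biased[i]);
}

/* Case 2: scaling is not free, so the fmul/fadd happens inside the
 * conversion loop and its cost belongs to the bias method alone. */
static void conv_bias_inloop(int16_t *out, const float *in, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = biased_to_int16(in[i] * (1.0f / 32768) + 385.0f);
}
```

In case 2 the rounding to the nearest integer is done by the float addition itself, which is why no lrint or cast appears. A fair benchmark of that case would time conv_bias_inloop against the other methods run directly on in[], without a separate copy pass for any of them.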