[FFmpeg-devel] [PATCH] unscaled float 2 int conversion

Michael Niedermayer michaelni
Thu Jul 31 20:27:39 CEST 2008


On Thu, Jul 31, 2008 at 08:45:23PM +0300, Ivan Kalvachev wrote:
> On 5/19/08, Michael Niedermayer <michaelni at gmx.at> wrote:
> > On Mon, May 19, 2008 at 06:22:11PM +0200, Benjamin Larsson wrote:
> >> Michael Niedermayer wrote:
> >> >> And can you rerun the benchmarks on your P3 but not prescale the float
> >> >> buffer, i.e. change to this:
> >> >>
> >> >> tmpa[i] = in[i]* (1.0/32768) + 385;
> >> >>
> >> >> The reason I'm wondering is that sometimes it's not trivial to get the
> >> >> scaling for free and then you would have to do it during the loop to
> >> >> add the bias. I suspect that it is slower on platforms where it matters.
> >> >>
> >> >
> >> > 228651 dezicycles in conv_cast, 16256 runs, 128 skips
> >> > 108574 dezicycles in conv_lrint, 16321 runs, 63 skips
> >> > 63418 dezicycles in conv_x87_asm, 16329 runs, 55 skips
> >> > 51975 dezicycles in conv_x87_asm_ex, 16349 runs, 35 skips
> >> > 54081 dezicycles in conv_bias, 16351 runs, 33 skips
> >> >
> >> > that is with hand tuned conv_x87_asm_ex and gcc generated conv_bias
> >> > if i just hand tune the fmul/fadd loop a little with the integer code
> >> > left as gcc generated it i get
> >> > 46308 dezicycles in conv_bias, 16336 runs, 48 skips
> >> >
> >>
> >>
> >> This result puzzled me somewhat until I found out that the benchmark
> >> tests had this kind of code for all methods instead of only for the bias
> >> code:
> >>
> >> for(i=0; i<SIZE; i++)
> >>   tmpa[i] = in[i];
> >>
> >> In the case when you can't get scaling for free that loop should be
> >> omitted. It would be truly bizarre if it was always faster to do
> >> float2int by using the bias trick.
> >>
> >> So pretty please can you retry without that line also ?
> >
> > 114005 dezicycles in conv_lrint, 16352 runs, 32 skips
> > 42600 dezicycles in conv_x87_asm, 16355 runs, 29 skips
> > 31168 dezicycles in conv_x87_asm_ex, 16357 runs, 27 skips
> >
> > So to summarize
> > if you can scale the floats for free -> bias is fastest
> > if you cannot scale the floats for free -> bias is fastest
> > if you do not have any previous loop accessing the floats, thus you need an
> > additional pass to scale the floats -> conv_x87_asm_ex is maybe faster
> >
> > It's just "maybe" because the conv_bias code is not fully hand tuned and
> > you assume that the floats would be between -32768 .. 32767 instead of
> > -1.0 .. 1.0, which is an assumption which might not hold.
> >
> > all only on P3/P2/PPro with no SIMD of course
> 
> After some rant on irc, I got to look at the float_to_int16() function.
> 
> One of the rants is the range of the function [-1;1],
> so samples have to be rescaled before used.
> 
> The function can work just fine with input range [-32768;32767]
> if a bias of 385*32768 is used instead of 385; this changes only
> the exponent and keeps the fraction bits at the same place.
> The only modification to the function is the constant used for clipping:
> 
> @@ -3948,7 +3948,7 @@
>  static av_always_inline int float_to_int16_one(const float *src){
>      int_fast32_t tmp = *(const int32_t*)src;
>      if(tmp & 0xf0000){
> -        tmp = (0x43c0ffff - tmp)>>31;
> +        tmp = (0x4b40ffff - tmp)>>31;
> 
> I hope, this somehow helps.

It's interesting how we missed this for so long ...
This way all the non-constant scale factors can be dropped. You don't happen
to also have an idea on how to get rid of the +385*C ? :)
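
For anyone following along, here is a minimal self-contained sketch of the
bias trick with the new constant. It is not the dsputil code: the name
float_to_int16_bias is made up for illustration, the bias is added inside
the function instead of coming for free from a previous pass, memcpy
replaces the pointer cast, and the clip is a plain comparison rather than
the >>31 shift. It assumes IEEE 754 single precision, the default
round-to-nearest mode, and inputs whose magnitude stays well below 2^20.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert one float in [-32768;32767] to int16
 * with clipping, using a bias of 385*32768. */
static int16_t float_to_int16_bias(float x)
{
    /* 385*32768 = 12615680 has bit pattern 0x4b408000. At this exponent
     * one float ulp is exactly 1, so after the add the low 16 mantissa
     * bits hold round(x) + 0x8000. */
    float biased = x + 12615680.0f;
    int32_t bits;

    /* Read the float's bit pattern without violating aliasing rules. */
    memcpy(&bits, &biased, sizeof bits);

    /* In range, the pattern is 0x4b40xxxx and bits 16..19 are zero.
     * Anything else is (moderately) out of range and gets clipped. */
    if (bits & 0xf0000)
        bits = bits > 0x4b40ffff ? 0xffff : 0;

    /* Remove the 0x8000 offset carried in the low 16 bits. */
    return (int16_t)((bits & 0xffff) - 0x8000);
}
```

With this sketch, float_to_int16_bias(100.0f) gives 100, while 40000.0f and
-40000.0f clip to 32767 and -32768.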

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

When the tyrant has disposed of foreign enemies by conquest or treaty, and
there is nothing more to fear from them, then he is always stirring up
some war or other, in order that the people may require a leader. -- Plato