[FFmpeg-devel] [PATCH] Fast half-float to float conversion

Wed Jul 1 12:52:08 CEST 2009

On 2009-07-01 09:23, Reimar Döffinger wrote:
> On Wed, Jul 01, 2009 at 09:05:36AM +0200, Jimmy Christensen wrote:
>> Looks more correct now, but still a very dark image. I multiplied with 2
>> and it seems to match the table lookup method. Is this correct?
>
> Maybe, though adding a *2 is the wrong way.
>
>> float av_int2halflt(int16_t v){
>>        uint16_t nosign = v+v;
>>        if (nosign>= 0xfc00) {
>>            if (nosign == 0xfc00) return v>>15 ? -1.0/0.0 : 1.0/0.0;
>>            else return 0.0/0.0;
>>        }
>>        if (nosign<   0x0400) return 0; // denormal or 0
>>        return ldexp((v&0x3ff) + (1<<10) * (v>>15|1), (v>>10&0x1f)-26)*2;
>
> You forgot a (), this will be wrong for negative values.
> And instead of *2 the -26 should be -25 (this is related to the 11 ->  10
> change, the value is 0x1f/2+10, rounded down).
> The denormal or 0 case probably should instead of "return 0" use
> "return ldexp((v&0x3ff)*(v>>15|1), 1-25);" to handle denormals.
>
>> I would love to do such optimizations if I was able to. Right now I'm
>> just looking for some reference code which can be used to do the job
>> which is simple and not too bloated coding wise. Implementing the table
>> based is not a good solution for smaller systems, but never had these
>> systems in mind when proposing the code for half-float -> float.
>
> Even the paper you linked to has the code to do it without table:
> f = ((h&0x8000)<<16) | (((h&0x7c00)+0x1C000)<<13) | ((h&0x03FF)<<13)
> you'd only have to add handling of 0, denormals, NaN and Inf as far as
> you need.

I modified it for the purpose I need it for: converting from half-float
to a 16-bit unsigned int.

uint16_t av_halflt2uint(uint16_t v){
      uint16_t nosign = v+v; // shifts out the sign bit, exponent is now in bits 15..11
      if (v>>15)
          return 0; // negatives are not interesting so clamp them to 0
      if (nosign >= 0x7800)
          return 65535; // anything >= 1.0 (including Inf and NaN) is clamped to 65535
      if (nosign < 0x0800)
          return ldexp((v&0x3ff), 1-25)*65535; // denormal or 0
      return ldexp((v&0x3ff) + (1<<10), (v>>10&0x1f)-25)*65535;
}

It's used in the decoder like this:

*ptr_x++ = av_halflt2uint(AV_RL16(buf));

I did some speed tests and got the following (full encode to mov):

With table lookup: 58 fps
With the above reference code: 26 fps
Same images as 32-bit float with normal float to unsigned int: 56 fps
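
The gap to the table lookup is presumably down to the ldexp() call and the double multiply per sample, though that is an assumption and not something I have profiled. Since the result is only a 16-bit integer, the same scaling can be done with integer arithmetic alone. A minimal sketch, with the hypothetical name half_to_u16_int, the same clamping as described above (negatives to 0, anything >= 1.0 to 65535) and truncation instead of rounding:

```c
#include <stdint.h>

/* Hypothetical integer-only variant: half-float bits -> 0..65535.
 * value = m * 2^(e-25), so value * 65535 = (m * 65535) >> (25 - e). */
static uint16_t half_to_u16_int(uint16_t v)
{
    uint32_t e = (v >> 10) & 0x1f; /* 5-bit exponent field */
    uint32_t m = v & 0x3ff;        /* 10-bit mantissa field */

    if (v >> 15)
        return 0;      /* negative (and -0): clamp to 0 */
    if (e >= 15)
        return 65535;  /* >= 1.0, Inf and NaN: clamp to 65535 */
    if (e == 0)
        e = 1;         /* denormal or 0: no implicit bit, exponent as for e == 1 */
    else
        m |= 0x400;    /* normal: add the implicit leading 1 */

    /* m * 65535 < 2^27, so the product fits in 32 bits;
     * the shift count is between 11 and 24 */
    return (uint16_t)((m * 65535u) >> (25 - e));
}
```

Whether this actually gets close to the table lookup would have to be measured; it only removes the floating point work from the inner loop.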

-- 
Best Regards
Jimmy Christensen
Developer
Ghost A/S