[FFmpeg-devel] [PATCH] Fix 4XM decoding on big-endian and unaligned reads

Thu Nov 11 22:05:36 CET 2010

On 11/11/2010 09:45 PM, Reimar Döffinger wrote:
> On Thu, Nov 11, 2010 at 09:31:51PM +0100, Vitor Sessak wrote:
>> Index: libavcodec/4xm.c
>> ===================================================================
>> --- libavcodec/4xm.c	(revision 25719)
>> +++ libavcodec/4xm.c	(working copy)
>> @@ -260,6 +260,21 @@
>>       }
>>   }
>>
>> +#if HAVE_BIGENDIAN
>> +#define LE_CENTRIC_MUL(dst, src, scale, dc) \
>> +    { \
>> +        unsigned tmpval = ((src)[1] << 16) + (src)[0];  \
>> +        tmpval = tmpval * (scale) + (dc);               \
>> +        (dst)[0] = tmpval & 0xFFFF;                     \
>> +        (dst)[1] = tmpval >> 16;                        \
>> +    }
>> +#else
>> +#define LE_CENTRIC_MUL(dst, src, scale, dc) \
>> +    { \
>> +        *((uint32_t *) (dst)) = AV_RL32(src) * (scale) + (dc); \
>> +    }
>> +#endif
>
>
> Hmm.. Isn't this the same as
> uint32_t tmp = AV_RN32(src);
> #if HAVE_BIGENDIAN
> tmp = (tmp << 16) | (tmp >> 16);
> #endif
> tmp = tmp * scale + dc;
> #if HAVE_BIGENDIAN
> tmp = (tmp << 16) | (tmp >> 16);
> #endif
> AV_WN32A(dst, tmp);
>
> Note that the two things under #if should compile
> into a single rotate instruction.
> Which one is faster overall depends on whether unaligned
> accesses are fast or not...

That looks pretty much equivalent to my patch, and I have no particular
preference for either. I also have no idea which one is faster.
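
For reference, here is a minimal standalone sketch of why the two forms
should agree. It is not the code from either proposal: the function
names are hypothetical, a runtime endianness probe stands in for
HAVE_BIGENDIAN, and memcpy() stands in for the AV_RN32/AV_WN32-style
accessors, so it says nothing about which variant is faster.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: swap the 16-bit halves of a word (a rotate by 16). */
static uint32_t swap_halves(uint32_t v)
{
    return (v << 16) | (v >> 16);
}

/* Runtime stand-in for HAVE_BIGENDIAN (assumption, for illustration only). */
static int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Element-wise form: builds the 32-bit value from two 16-bit elements,
 * so the result does not depend on host byte order. */
static void mul_elementwise(uint16_t dst[2], const uint16_t src[2],
                            uint32_t scale, uint32_t dc)
{
    uint32_t tmp = ((uint32_t)src[1] << 16) + src[0];
    tmp = tmp * scale + dc;
    dst[0] = tmp & 0xFFFF;
    dst[1] = tmp >> 16;
}

/* Word-wise form: native 32-bit load and store, with the halves swapped
 * before and after the arithmetic on big-endian hosts only. */
static void mul_wordwise(uint16_t dst[2], const uint16_t src[2],
                         uint32_t scale, uint32_t dc)
{
    uint32_t tmp;
    memcpy(&tmp, src, sizeof(tmp));   /* unaligned-safe native read */
    if (host_is_big_endian())
        tmp = swap_halves(tmp);
    tmp = tmp * scale + dc;
    if (host_is_big_endian())
        tmp = swap_halves(tmp);
    memcpy(dst, &tmp, sizeof(tmp));   /* native write */
}

/* Compare both forms on one input; asserts that they match. */
static void self_check(void)
{
    const uint16_t src[2] = { 0x1234, 0xABCD };
    uint16_t a[2], b[2];
    mul_elementwise(a, src, 3, 7);
    mul_wordwise(b, src, 3, 7);
    assert(a[0] == b[0] && a[1] == b[1]);
}
```

On a little-endian host the swaps are skipped and the word-wise form
reduces to a plain 32-bit multiply-add; on a big-endian host the two
swaps put src[1] in the high half before the arithmetic and restore the
element order afterwards, which is what the element-wise form does
explicitly.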

-Vitor