[FFmpeg-devel] [PATCH] Fix 4XM decoding on big-endian and unaligned reads
Reimar Döffinger
Reimar.Doeffinger
Thu Nov 11 22:14:38 CET 2010
On Thu, Nov 11, 2010 at 10:05:36PM +0100, Vitor Sessak wrote:
> On 11/11/2010 09:45 PM, Reimar Döffinger wrote:
> >On Thu, Nov 11, 2010 at 09:31:51PM +0100, Vitor Sessak wrote:
> >>Index: libavcodec/4xm.c
> >>===================================================================
> >>--- libavcodec/4xm.c (revision 25719)
> >>+++ libavcodec/4xm.c (working copy)
> >>@@ -260,6 +260,21 @@
> >> }
> >> }
> >>
> >>+#if HAVE_BIGENDIAN
> >>+#define LE_CENTRIC_MUL(dst, src, scale, dc) \
> >>+ { \
> >>+ unsigned tmpval = ((src)[1] << 16) + (src)[0]; \
> >>+ tmpval = tmpval * (scale) + (dc); \
> >>+ (dst)[0] = tmpval & 0xFFFF; \
> >>+ (dst)[1] = tmpval >> 16; \
> >>+ }
> >>+#else
> >>+#define LE_CENTRIC_MUL(dst, src, scale, dc) \
> >>+ { \
> >>+ *((uint32_t *) (dst)) = AV_RL32(src) * (scale) + (dc); \
> >>+ }
> >>+#endif
> >
> >
> >Hmm.. Isn't this the same as
> >uint32_t tmp = AV_RN32(src);
> >#if HAVE_BIGENDIAN
> >tmp = (tmp << 16) | (tmp >> 16);
> >#endif
> >tmp = tmp * scale + dc;
> >#if HAVE_BIGENDIAN
> >tmp = (tmp << 16) | (tmp >> 16);
> >#endif
> >AV_RN32A(dst, tmp);
> >
> >Note that the two things under #if should compile
> >into a single rotate instruction.
> >Which one is faster overall depends on whether unaligned
> >accesses are fast or not...
>
> Looks pretty equivalent to my patch to me, and I have no particular
> preference for either. I also have no idea which one is faster.
First, the last line should of course be AV_WN32A, not AV_RN32A.
Second, the write part is definitely faster.
For the read part it depends on whether fast unaligned reads are available,
but it seems easy enough to add a special case if someone actually
cares about the performance when they aren't available.
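
To make the equivalence discussed above concrete, here is a minimal
standalone sketch, not the actual 4xm.c code. All function names are
hypothetical; memcpy stands in for AV_RN32/AV_WN32A, and a runtime check
stands in for the build-time HAVE_BIGENDIAN, so that both variants can be
compiled and compared on any host.

```c
#include <stdint.h>
#include <string.h>

/* Swap the two 16-bit halves of a 32-bit word (a rotate by 16). */
static uint32_t swap_halves(uint32_t v)
{
    return (v << 16) | (v >> 16);
}

/* Hypothetical runtime stand-in for the build-time HAVE_BIGENDIAN. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 0;
}

/* Variant 1: assemble the 32-bit value from two native 16-bit elements,
 * then split the result again. Independent of host byte order. */
static void mul_by_halfwords(uint16_t *dst, const uint16_t *src,
                             uint32_t scale, uint32_t dc)
{
    uint32_t v = ((uint32_t)src[1] << 16) + src[0];
    v = v * scale + dc;
    dst[0] = v & 0xFFFF;
    dst[1] = v >> 16;
}

/* Variant 2: one native 32-bit read, swap halves on big-endian hosts,
 * multiply, swap back, one native 32-bit write. */
static void mul_by_rotate(uint16_t *dst, const uint16_t *src,
                          uint32_t scale, uint32_t dc)
{
    uint32_t v;
    memcpy(&v, src, sizeof(v));      /* stands in for AV_RN32 */
    if (host_is_big_endian())
        v = swap_halves(v);
    v = v * scale + dc;
    if (host_is_big_endian())
        v = swap_halves(v);
    memcpy(dst, &v, sizeof(v));      /* stands in for AV_WN32A */
}
```

Both functions should leave the same two 16-bit values in dst for any
input. The second variant touches memory once per direction, which is
where the difference in speed would come from.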