[FFmpeg-devel] [PATCH] Fix for DECLARE_ALIGNED_8 macro on ARM (also prevent gcc 3.4.x ICE)

Måns Rullgård mans
Sat Feb 16 20:19:20 CET 2008


Michael Niedermayer <michaelni at gmx.at> writes:

> On Sat, Feb 16, 2008 at 02:39:19PM +0000, M?ns Rullg?rd wrote:
>> M?ns Rullg?rd <mans at mansr.com> writes:
>> 
>> > Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:
>> >
>> >> Hello All,
>> >>
>> >> Currently DECLARE_ALIGNED_8 macro is quite misleading on ARM (it tries to
>> >> enforce 4 byte alignment contarary to what a developer reading the 
>> >> ffmpeg sources could naturally expect):
>> >>
>> >> ...
>> >> /* This is to use 4 bytes read to the IDCT pointers for some 'zero'
>> >>    line optimizations */
>> >> #define DECLARE_ALIGNED_8(t, v) DECLARE_ALIGNED(4, t, v)
>> >> #define STRIDE_ALIGN 4
>> >> ...
>> >>
>> >> And it really causes problems with gcc 3.4.x:
>> >> http://lists.mplayerhq.hu/pipermail/ffmpeg-user/2005-August/000930.html
>> >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22177
>> >>
>> >> The offending line is the following:
>> >>     DECLARE_ALIGNED_8 (uint64_t, aligned_bak[stride]);
>> >>
>> >> Apparently gcc got confused when asked to align uint64_t array at 4 byte
>> >> boundary and died with ICE. Earlier I just thought 'WTF?' and removed
>> >> DECLARE_ALIGNED_8 around that array (gcc should be clever enough to align 
>> >> that array properly without any help anyway). Now it is clear that
>> >> DECLARE_ALIGNED_8 macro definition triggered this problem.
>> >>
>> >> Modern ARM cores have instructions for reading/writing 64-bit data
>> >> and require strict 8 byte alignment for using them. According to ARM
>> >> manual, using LDRD instruction with the data that is not aligned at
>> >> 8 byte boundary is UNPREDICTABLE and IMPLEMENTATION
>> >> DEFINED. Unfortunately, both ARM cores that I have available (ARM9E
>> >> and ARM11) require just 4 byte alignment for these instructions
>> >> (ARM11 is faster when used with 8 byte aligned data though). So if
>> >> there were any other problems related to the use of this macro, I
>> >> could not observe them in my tests.
>> >>
>> >> Anyway, I think it is better to fix this macro even though the
>> >> immediate visible effect is only the "out of the box" support for
>> >> legacy compiler (gcc 3.4.x).
>> >
>> > I agree.  Since some ARM variants might require strict alignment for
>> > 64-bit loads, we must provide this.
>> >
>> >> The comment in the code was removed by patch as I don't see direct
>> >> relation of this macro and IDCT, 16 byte alignment is supposedly
>> >> guaranteed for IDCT according to the comments from dsputil.h, anyway
>> >> feel free to prove me wrong:
>> >>     void (*idct)(DCTELEM *block/* align 16*/);
>> >>
>> >> Index: libavcodec/dsputil.h
>> >> ===================================================================
>> >> --- libavcodec/dsputil.h	(revision 12120)
>> >> +++ libavcodec/dsputil.h	(working copy)
>> >> @@ -535,10 +535,8 @@
>> >>  
>> >>  #elif defined(ARCH_ARMV4L)
>> >>  
>> >> -/* This is to use 4 bytes read to the IDCT pointers for some 'zero'
>> >> -   line optimizations */
>> >> -#define DECLARE_ALIGNED_8(t, v) DECLARE_ALIGNED(4, t, v)
>> >> -#define STRIDE_ALIGN 4
>> >> +#define DECLARE_ALIGNED_8(t, v) DECLARE_ALIGNED(8, t, v)
>> >> +#define STRIDE_ALIGN 8
>> >
>> > Looks good to me.
>> 
>> Scratch that; that section of the file is a mess.  I propose the
>> attached patch to clean it up instead.
>
> ok

Applied.

-- 
M?ns Rullg?rd
mans at mansr.com




More information about the ffmpeg-devel mailing list