[Libav-user] gcc auto-vectorisation

Tue Feb 26 11:05:42 CET 2013

On Feb 26, 2013, at 02:21, Claudio Freire wrote:

> I wouldn't assume. Even if they are in effect aligned, if the compiler
> doesn't know it (ie, if malloc doesn't mark them as such),
> vectorization will still assume out-of-alignment access.

I may be wrong, but if that were the case (and glue code were added to cope with possibly misaligned data), auto-vectorisation should not be able to provoke a crash on win32 because of ... incorrect alignment. And yet such crashes do happen.

> Architecture-mandated and SSE/2/3/MMX/Whatever alignment requirements
> tend to be different.

Of course, but as far as I understand they do not differ in this case, because Apple makes such intensive use of SIMD throughout its APIs/SDKs.
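
For what it's worth, a run-time check of what malloc actually hands back could look like the sketch below. The helper name is_aligned is made up for illustration, and 16 bytes is assumed as the alignment SSE loads/stores want.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if p is a multiple of `alignment`,
 * which must be a non-zero power of two, and 0 otherwise. */
static int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Typical use (16 is an assumption, the usual SSE requirement):
 *
 *     void *buf = malloc(1024);
 *     if (buf && !is_aligned(buf, 16))
 *         puts("malloc result is not 16-byte aligned here");
 *     free(buf);
 */
```

Note that this only tells you what the allocator did at run time; it says nothing about what the compiler was entitled to assume when it generated the vector code.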
> 
> You can write a very simple test case to check it out.

Done. More precisely, I was comparing a hand-coded SIMD version of some functions I'd found against a straightforward scalar version when I discovered that gcc-4.7 has auto-vectorisation on by default (at least on OS X): the scalar version was almost 2.5x faster than the SIMD version. That's what set the whole thing rolling, raising the question of whether there might be gains (albeit undoubtedly smaller) to be had from letting the compiler do its thing on the ffmpeg sources.
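
To give an idea of the kind of scalar code meant here, below is a minimal sketch of a loop the vectoriser can usually handle. It is not one of the functions I actually tested; the name scale_add and its signature are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: dst[i] = a * x[i] + y[i].
 * `restrict` promises that the arrays do not overlap; without it the
 * compiler has to prove that itself or emit a run-time overlap check. */
void scale_add(float *restrict dst, const float *restrict x,
               const float *restrict y, float a, size_t n)
{
    /* Debug-build sanity check only; compiled out with -DNDEBUG. */
    assert(n == 0 || (dst != NULL && x != NULL && y != NULL));

    for (size_t i = 0; i < n; i++)
        dst[i] = a * x[i] + y[i];
}
```

With GCC, -ftree-vectorize is the switch in question, and it is implied by -O3. On the 4.x series, -ftree-vectorizer-verbose=2 should report which loops were vectorised (later releases use -fopt-info-vec instead), which makes it easy to see whether the compiler did anything with a given function.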

R