[FFmpeg-devel] [PATCH 6/6] lossless audio dsp: unroll

Christophe Gisquet christophe.gisquet at gmail.com
Mon Apr 18 19:52:11 CEST 2016

2016-04-18 19:15 GMT+02:00 James Almer <jamrial at gmail.com>:
> On 4/18/2016 10:07 AM, Christophe Gisquet wrote:
>> The loops are guaranteed to be at least multiples of 8, so this
>> unrolling is safe and allows better use of the execution ports.
>> For int32 version: 72 -> 57c.
> What compiler are you using, and what cpu at configure time?

gcc 5.1, Win64, haswell. I don't use the mingw64 compiler.

> We're currently enabling tree vectorization for gcc 4.9 or newer on x86,
> and at least with gcc 5.3.0 on mingw-w64 the resulting code now seems worse.
> I didn't bench it, but after this patch it's not being vectorized anymore.

The code I benchmarked at 72c was vectorized, and it is still
vectorized here after the patch. The new output actually looks better
than the previous vectorized version.

The 16_c version is no longer vectorized, but the vectorized code for
it is really a mess here.
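
For readers following the thread, here is a minimal sketch of the kind of
unrolling being discussed. It is not the code from the patch: the function
names madd_dot_simple and madd_dot_unrolled are hypothetical, and the
unroll factor of two is only an assumption chosen for illustration. The
unrolled variant relies on the same precondition the commit message
mentions (a non-zero length that is a multiple of 8, so in particular even).

```c
#include <stdint.h>

/* Hypothetical reference version: one element per iteration. */
static int32_t madd_dot_simple(int16_t *v1, const int16_t *v2,
                               const int16_t *v3, int order, int mul)
{
    int32_t res = 0;
    for (int i = 0; i < order; i++) {
        res   += v1[i] * v2[i];   /* dot product uses the old v1 value */
        v1[i] += mul * v3[i];     /* then v1 is updated in place */
    }
    return res;
}

/* Hypothetical unrolled version: two elements per iteration.
 * Assumes order > 0 and order is even (callers are assumed to pass
 * multiples of 8), so no tail handling is needed. */
static int32_t madd_dot_unrolled(int16_t *v1, const int16_t *v2,
                                 const int16_t *v3, int order, int mul)
{
    int32_t res = 0;
    do {
        res   += v1[0] * v2[0];
        v1[0] += mul * v3[0];
        res   += v1[1] * v2[1];
        v1[1] += mul * v3[1];
        v1 += 2;
        v2 += 2;
        v3 += 2;
    } while (order -= 2);
    return res;
}
```

Whether gcc vectorizes either form depends on the version and flags, as
the exchange above shows. Building with -O3 -fopt-info-vec makes gcc
report which loops it vectorized, which is one way to compare the two.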

