[FFmpeg-devel] [PATCH] Optimize QTRLE encoding

Thu Feb 14 20:24:33 CET 2013

On Tue, Feb 12, 2013 at 09:34:44PM -0500, Malcolm Bechard wrote:
> On Tue, Feb 12, 2013 at 2:47 PM, Alexis Ballier <alexis.ballier at gmail.com> wrote:
> 
> > +
> > +    pixel_size = s->pixel_size;
> > +
> >
> > this (and the related changes s->pixel_size -> pixel_size): does it have
> > any performance impact on its own? The compiler should be able to
> > optimize it, but a hint may not hurt.
> >
> > IMHO this should be a separate patch so that it eases reviewing the
> > non-trivial part of the patch
> >
> 
> I don't think we can assume all compilers will optimize that. The fact that
> this code runs 2x-3x faster when compiled with VS2010 vs. gcc shows how
> much optimization can be missed (although I'm hoping it's something I'm
> doing wrong with the gcc compile).
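
As a rough illustration of why the local copy can matter: when a loop
stores through a byte pointer, the compiler may not be able to prove that
the store leaves s->pixel_size untouched, so it may reload the field on
every iteration. The sketch below is not the actual qtrleenc code;
DemoContext and both function names are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the encoder context (not the real struct). */
typedef struct {
    int pixel_size; /* bytes per pixel */
} DemoContext;

/* Reads s->pixel_size inside the loop. dst is a byte pointer, which is
 * allowed to alias *s, so the compiler may have to reload the field after
 * each store unless it can prove the two do not overlap. */
static size_t copy_pixels_field(const DemoContext *s, uint8_t *dst,
                                const uint8_t *src, int n)
{
    size_t written = 0;
    for (int i = 0; i < n; i++)
        for (int b = 0; b < s->pixel_size; b++)
            dst[written++] = src[i * s->pixel_size + b];
    return written;
}

/* Same loop with the field cached in a local. The local cannot be
 * modified through dst, so it can stay in a register. */
static size_t copy_pixels_local(const DemoContext *s, uint8_t *dst,
                                const uint8_t *src, int n)
{
    const int pixel_size = s->pixel_size;
    size_t written = 0;
    for (int i = 0; i < n; i++)
        for (int b = 0; b < pixel_size; b++)
            dst[written++] = src[i * pixel_size + b];
    return written;
}
```

Both versions produce the same output; whether the second is measurably
faster depends on the compiler and flags, so it still needs to be
benchmarked.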

I would expect auto-vectorization to be on for the VS compiler.
For gcc it is disabled in FFmpeg's build: for the most important codecs,
everything that makes sense to do with SIMD is already written in
assembler, so all auto-vectorization would do there is break things and
make the code slower.
That is of course not necessarily the case for QTRLE, though.
Also, if you compile for 32 bit, you should check whether SSE/SSE2 is
enabled (I think it will not be when compiling with gcc).
Lastly, the proper way to figure this out is of course profiling
and/or looking at the generated assembler code.
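
One quick way to check the SSE/SSE2 point is to look at the compiler's
predefined macros. The sketch below assumes the usual ones: __SSE__ and
__SSE2__ for gcc/clang, _M_IX86_FP and _M_X64 for MSVC. The function name
simd_level is made up for the example.

```c
#include <stdio.h>

/* Reports which SSE level the compiler was allowed to target for this
 * translation unit, based on predefined macros only. This says nothing
 * about what the CPU supports at run time. */
static const char *simd_level(void)
{
#if defined(__SSE2__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    return "SSE2";
#elif defined(__SSE__) || (defined(_M_IX86_FP) && _M_IX86_FP == 1)
    return "SSE";
#else
    return "none";
#endif
}

/* Prints the level so builds with different flags can be compared. */
static void print_simd_level(void)
{
    printf("compile-time SIMD level: %s\n", simd_level());
}
```

Building this once with plain "gcc -m32" and once with "gcc -m32 -msse2"
should show whether the default 32-bit build has SSE2 enabled; "gcc -S"
on the encoder source then shows what code was actually generated.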