[FFmpeg-devel] [PATCH 02/12] mips/float_dsp: replace assembly with C implementations

Wed Mar 4 12:31:03 CET 2015

On Wed, 2015-03-04 at 11:08 +0000, Nedeljko Babic wrote:
> >The assembly versions have a few problems
> >- They only work with mips32r2 enabled
> >- They don't work on 64-bit
> >- They're massive and complex
> >
> >So replace them with C implementations which solve these problems and let GCC
> >magically optimize for different platforms. All the functions are manually
> >unrolled 4 times (like the assembly code). With the addition of a few restrict
> >keywords, the functions produce almost identical assembly to the original
> >versions when compiled with gcc -O3.
> >
> >Since this code now uses no fpu assembly, drop the HAVE_MIPSFPU guard as well.
> 
> All improvements to the C code should go into the generic C code so that all
> architectures can benefit from them.
> 
> The purpose of this code was to provide optimizations for a specific architecture.
> This way the optimizations for the mips32r2 architecture are available even without
> tweaking the configure line, and even with older compilers.

That's ok until you try to run it on an old MIPS processor and the
default FFmpeg options cause lots of SIGILLs, but that's another
discussion (and maybe nobody cares :/).

> By putting these optimizations under HAVE_MIPS32R2, the problem with building for
> mips64 should be resolved, and this can be optimized for mips64 later if needed.

I was thinking about just dropping this patch for the time being and
porting a few bits to mips64, as in the other files (the code only uses
the MIPS IV parts of mips32r2). The C code only matched the assembly's
performance once I unrolled the loops and used av_restrict. Strictly
speaking, you're not even supposed to use restrict if the arrays could
be exactly equal (which the contracts for some of these functions
permit).
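
For reference, a minimal sketch of the kind of loop I mean. This is not the
code from the patch: the function name is made up, it uses plain C99 restrict
rather than av_restrict, and it assumes len is a multiple of 4.

```c
#include <stddef.h>

/*
 * Hypothetical example, not the actual float_dsp code.
 * Element-wise multiply, manually unrolled 4 times.
 * Assumptions: len is a multiple of 4, and dst does not overlap
 * src0 or src1 (that is what restrict promises to the compiler).
 */
void vector_fmul_unrolled(float *restrict dst,
                          const float *restrict src0,
                          const float *restrict src1,
                          size_t len)
{
    for (size_t i = 0; i < len; i += 4) {
        /* Four independent products per iteration; with restrict the
         * compiler may issue all the loads before any of the stores. */
        dst[i + 0] = src0[i + 0] * src1[i + 0];
        dst[i + 1] = src0[i + 1] * src1[i + 1];
        dst[i + 2] = src0[i + 2] * src1[i + 2];
        dst[i + 3] = src0[i + 3] * src1[i + 3];
    }
}
```

Calling this with dst == src0 (an in-place multiply) breaks the restrict
promise, which is the problem mentioned above for functions whose contract
allows exactly equal arrays.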

James