[MPlayer-dev-eng] [PATCH] vf_eq2 extensions

Hampa Hug hampa at hampa.ch
Fri Jan 31 20:11:17 CET 2003


D Richard Felker III wrote:

> On Fri, Jan 31, 2003 at 06:17:20PM +0100, Michael Niedermayer wrote:
> > I doubt that evaluating a polynomial is faster than a single L1 cache read from a 
> > 256-byte LUT
> 
> Nope, it's not. But with MMX, evaluating 4 polynomials is just as fast
> as evaluating one. :) And loading a single byte from memory, then
> immediately using it as a 32bit offset into a lookup table, is VERY
> SLOW. x86 CPUs don't like mixing register sizes these days. I spent a
> lot of time trying to improve performance on stuff just like this in
> another program, and I never could get it to work as fast as I wanted.
> 
> BTW, as a reference, eq2 uses well over twice the CPU time of eq on my
> system, last I checked.

On my system (UltraSPARC), eq2 is about twice as fast as eq (when
performing the same task). I doubt that there is a single solution
that is always fastest.

Could you explain the register size mixing a bit more? Are you saying
that the 'movzx' instruction is slow?
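
To make the question concrete, a minimal sketch of the kind of table
lookup being discussed might look like the following. This is not the
actual eq2 code; apply_lut and its parameters are hypothetical names
used only for illustration. The byte read from src is widened to a
full-width index before the table access, which is where an x86
compiler would typically emit a movzx (an assumption about typical
code generation, not something measured here).

```c
#include <stddef.h>

/* Hypothetical helper, not taken from the eq2 source: map every byte
 * of src through a 256-entry lookup table and store the result in dst. */
static void apply_lut(unsigned char *dst, const unsigned char *src,
                      size_t n, const unsigned char lut[256])
{
    size_t i;

    for (i = 0; i < n; i++) {
        /* 8-bit load widened to a full-size index (zero extension). */
        unsigned idx = src[i];

        /* The widened value is used immediately as the table offset. */
        dst[i] = lut[idx];
    }
}
```

The table holds one precomputed output value per possible input byte,
so the per-pixel work is one load, one widening step and one indexed
load, whatever function the table encodes.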

Hampa

