[FFmpeg-devel] [PATCH] MMX VP3 Loop Filter

David Conrad lessen42
Sun Oct 12 02:40:25 CEST 2008

On Oct 11, 2008, at 5:14 AM, Jason Garrett-Glaser wrote:

> On Sat, Oct 11, 2008 at 1:53 AM, David Conrad <lessen42 at gmail.com>  
> wrote:
>> filter_limit *= 0x02020202;
>> "movd     "#flim", %%mm5 \n\t" \
>> "punpcklbw  %%mm5, %%mm5 \n\t" \
> Which is faster, this, or SPLATB in the form of punpcklbw + pshufw +
> psllw (psllw because the filter_limit values are guaranteed to be <
> 128, so a word left shift is equivalent to a byte left shift)?
> The SPLATB would avoid the integer multiply, and perhaps also as
> importantly avoid the register->mm move, since you'll be able to load
> it directly off the stack.

I couldn't measure any difference between these, but I'm
precalculating the *2 and loading it from memory now anyways.
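
For reference, a scalar C sketch of why the two approaches produce the
same register contents when the limit is below 128. This is only a
model, not the patch's code: the helper names (splat2_mul,
unpack_lo_self, shl_words, splat_byte, verify_all) are made up for
illustration, and the loops merely imitate what punpcklbw and psllw do
to the lanes.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply approach: put 2*v into each of the four bytes.
 * Assumes v < 128, so 2*v fits in a byte and nothing carries across bytes. */
static uint32_t splat2_mul(uint32_t v)
{
    return v * 0x02020202u;
}

/* Scalar model of "punpcklbw mm, mm": interleave the low four bytes with
 * themselves, so each byte appears twice in the 64-bit result. */
static uint64_t unpack_lo_self(uint32_t x)
{
    uint64_t r = 0;
    for (int i = 0; i < 4; i++) {
        uint64_t b = (x >> (8 * i)) & 0xff;
        r |= b << (16 * i);
        r |= b << (16 * i + 8);
    }
    return r;
}

/* Scalar model of "psllw $1": shift each 16-bit lane left by one bit. */
static uint64_t shl_words(uint64_t x)
{
    uint64_t r = 0;
    for (int i = 0; i < 4; i++) {
        uint16_t w = (uint16_t)(x >> (16 * i));
        w = (uint16_t)(w << 1);
        r |= (uint64_t)w << (16 * i);
    }
    return r;
}

/* Reference result: the same byte in all eight positions. */
static uint64_t splat_byte(uint8_t v)
{
    return v * 0x0101010101010101ULL;
}

/* Check every legal limit value (0..127) against both approaches. */
static void verify_all(void)
{
    for (uint32_t v = 0; v < 128; v++) {
        uint64_t want = splat_byte((uint8_t)(2 * v));
        /* multiply, then movd + punpcklbw */
        assert(unpack_lo_self(splat2_mul(v)) == want);
        /* splat first, then a word shift standing in for a byte shift */
        assert(shl_words(splat_byte((uint8_t)v)) == want);
    }
}
```

Calling verify_all() trips no assertion for the whole 0..127 range.
With v >= 128 the doubled value no longer fits in a byte, so the word
shift would leak a bit into the neighbouring byte and the equivalence
breaks.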
