[FFmpeg-devel] Anybody has a Core 2? [PATCH] Small SSSE3 optimization
Tue May 8 18:42:00 CEST 2007
2007/5/9, Loren Merritt <lorenm at u.washington.edu>:
> On Tue, 8 May 2007, Mike Melanson wrote:
> > Zuxy Meng wrote:
> >> The attached patch makes use of the SSSE3 instruction pabsw to calculate
> >> the absolute value of packed words. Just for fun. And I don't have an
> >> SSSE3-capable CPU, so hopefully someone with a Core 2 can help test it to
> >> ensure it doesn't break anything (better with benchmarks of course:-)
> You'd get even more speedup by also using SSE2/xmmregs.
Absolutely (plus 8 more xmmregs on x86-64!). But I found
MMX->SSE2 quite tricky (especially in ffmpeg, with such complex
algorithms and so many macro definitions) and I'm lazy:-(
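
For anyone reading without the attachment: the patch itself is not reproduced
in this message, so what follows is only a rough sketch in plain C of what
pabsw computes for each 16-bit word, next to the usual compare/xor/subtract
sequence that MMX code needs when pabsw is not available. The function names
are made up for illustration and are not taken from the patch or from ffmpeg,
and the scalar code only models the lane arithmetic, not the actual asm.

```c
#include <stdint.h>

/* Hypothetical model of ONE 16-bit lane of the common MMX idiom:
 *   pxor    mask, mask    ; mask = 0
 *   pcmpgtw mask, x       ; mask = 0xFFFF where 0 > x, else 0
 *   pxor    x, mask       ; flip all bits of negative lanes
 *   psubw   x, mask       ; subtract -1 (i.e. add 1) on negative lanes
 */
static int16_t abs16_mmx_style(int16_t x)
{
    uint16_t ux   = (uint16_t)x;
    uint16_t mask = (x < 0) ? 0xFFFFu : 0u;
    /* Wraps modulo 2^16 like psubw; -32768 therefore stays 0x8000.
     * The final conversion of 0x8000 is implementation-defined in C,
     * and yields -32768 on the usual two's-complement compilers. */
    return (int16_t)(uint16_t)((ux ^ mask) - mask);
}

/* Hypothetical model of one lane of pabsw: a single instruction,
 * no saturation, so -32768 also comes back as 0x8000. */
static int16_t abs16_pabsw(int16_t x)
{
    uint16_t ux = (uint16_t)x;
    return (int16_t)(x < 0 ? (uint16_t)(0u - ux) : ux);
}

/* Returns 1 if both models agree on every possible 16-bit input. */
static int lanes_agree(void)
{
    for (int v = -32768; v <= 32767; v++)
        if (abs16_mmx_style((int16_t)v) != abs16_pabsw((int16_t)v))
            return 0;
    return 1;
}
```

An MMX register holds 4 such words and an xmm register holds 8, so the
SSE2/xmmregs route would handle twice as many lanes per instruction, on top
of replacing the four-instruction sequence with one.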
> And both additions would be better written as macros, no need for code duplication.
I've considered that, but to achieve the best performance I made the
function body of the SSSE3 version a little bit different than the
Beauty is truth,
While truth is beauty.
PGP KeyID: E8555ED6