[FFmpeg-devel] [PATCH] some SIMD write-combining for h264
Sat Jan 16 06:35:58 CET 2010
On Fri, Jan 15, 2010 at 11:11:23PM -0500, Alexander Strange wrote:
> This adds intreadwrite macros for 64/128-bit memory operations and uses them in h264.
> Unlike the other macros, these assume correct alignment, and the patch only defines the ones there was an immediate use for.
> This only has x86 versions, but others should be easy. The 64-bit operations can be done with double copies on most systems, I guess.
> Decoding a 30s file on Core 2 Merom with --cpu=core2 (minimum of 5 runs):
> x86-32: 12.72s before, 12.51s after (1.7%)
> x86-64: 10.24s before, 10.20s after (.4%)
> Tested on x86-32, x86-64, x86-32 with --arch=c.
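For readers without the patch at hand, the kind of fixed-size operation described above can be sketched in portable C. This is only an illustration under assumptions: the helper names (copy64_aligned, copy128_aligned, zero128_aligned) are hypothetical and are not the macros from the patch, and a fixed-size memcpy stands in for the x86 SIMD moves, since compilers typically lower it to plain loads and stores.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers, NOT the macros from the patch.
 * Callers are expected to pass suitably aligned pointers, mirroring the
 * "assume correct alignment" contract described in the mail. */

/* Copy 8 bytes in one fixed-size operation. */
static inline void copy64_aligned(void *dst, const void *src)
{
    memcpy(dst, src, 8);
}

/* Copy 16 bytes in one fixed-size operation. */
static inline void copy128_aligned(void *dst, const void *src)
{
    memcpy(dst, src, 16);
}

/* Clear 16 bytes in one fixed-size operation. */
static inline void zero128_aligned(void *dst)
{
    memset(dst, 0, 16);
}
```

The point of such helpers is write-combining: one wide store replaces several narrow ones, for example when copying or clearing a row of small per-block values.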
As your code uses MMX, you need to at least mention the EMMS/float issue in the
docs, and probably add an emms_c() call before draw_horiz_band().
dunno if these are all
Also, what sets __MMX__? We have our own defines for that.
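A rough sketch of the EMMS concern, under assumptions: MMX registers alias the x87 floating-point stack, so floating-point code that runs after MMX code without an intervening EMMS can misbehave. The helpers copy64_mmx and leave_mmx below are hypothetical; leave_mmx plays the role that emms_c() plays above. The guard relies on GCC and Clang predefining __MMX__ when MMX code generation is enabled, which is a compiler-side macro and not one of FFmpeg's own defines.

```c
#include <stdint.h>
#if defined(__MMX__)
#include <mmintrin.h>
#endif

/* Hypothetical helper: 64-bit copy that the compiler may implement
 * with an MMX register when MMX is enabled. */
static inline void copy64_mmx(uint64_t *dst, const uint64_t *src)
{
#if defined(__MMX__)
    *(__m64 *)dst = *(const __m64 *)src;
#else
    *dst = *src;              /* plain fallback without MMX */
#endif
}

/* Hypothetical helper: leave MMX state so x87 code is safe again. */
static inline void leave_mmx(void)
{
#if defined(__MMX__)
    _mm_empty();              /* emits the EMMS instruction */
#endif
}

/* MMX-style copy followed by floating-point work. */
static double copy_then_average(uint64_t *dst, const uint64_t *src,
                                int a, int b)
{
    copy64_mmx(dst, src);
    leave_mmx();              /* must come before any float math */
    return (a + b) / 2.0;
}
```

In a decoder the same ordering applies at the boundary to user code: the EMMS has to happen before a callback that may use floating point is invoked.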
1.7% takes us from 50% to 48.3% behind CoreAVC, if we assume we are 50% behind.
Michael
GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Rewriting code that is poorly written but fully understood is good.
Rewriting code that one doesn't understand is a sign that one is less smart
than the original author, trying to rewrite it will not make it better.