[FFmpeg-devel] [PATCH] H.264: x264 SSE2 iDCT functions
Fri Jan 2 20:03:48 CET 2009
idct_add16: 293 -> 282 clocks
idct_add16intra: 343 -> 257 clocks
"300" sample (contains almost no i16x16 blocks so I didn't test add16intra):
idct_add16: 518 -> 433 clocks
The larger gain here is most likely due to the higher bitrate.
idct_DC was omitted from idct_add16 because the extra branching logic
turned out to make it significantly slower (the branching becomes much
more complicated, and the DC path is less likely to be taken, as *both*
4x4 DCT blocks have to be DC-only for it to work).
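
To illustrate the branching problem, here is a rough C sketch of the
decision a paired routine would have to make. It is not taken from the
patch: the names (is_dc_only, choose_pair_path, the PAIR_* values), the
use of a per-block non-zero coefficient count, and the skip case for two
empty blocks are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: treat a 4x4 block as "DC-only" when it has exactly
 * one non-zero coefficient and that coefficient sits at index 0. */
static int is_dc_only(const int16_t *block, int nnz)
{
    return nnz == 1 && block[0] != 0;
}

/* Hypothetical result codes for a routine that handles two blocks at once. */
enum pair_path { PAIR_SKIP, PAIR_DC, PAIR_FULL };

/* Pick the code path for a pair of 4x4 blocks. The DC shortcut is only
 * usable when BOTH blocks are DC-only; any other non-empty combination
 * falls back to the full iDCT for the pair. */
static enum pair_path choose_pair_path(const int16_t *blk_a, int nnz_a,
                                       const int16_t *blk_b, int nnz_b)
{
    if (nnz_a == 0 && nnz_b == 0)
        return PAIR_SKIP;               /* nothing to add (assumed case) */
    if (is_dc_only(blk_a, nnz_a) && is_dc_only(blk_b, nnz_b))
        return PAIR_DC;                 /* both blocks qualify */
    return PAIR_FULL;                   /* mixed pair: full transform */
}
```

A pair where only one block is DC-only still ends up in PAIR_FULL, so
the shortcut fires less often than it would with one block per call,
while the test itself costs more.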
x264 iDCT code was modified to add a stride parameter, required for ffh264.
x86util.asm was included from x264 in full for simplicity's sake and
ease of use for adding future x264 assembly that uses it.
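
For reference, this is roughly what the stride parameter means, shown as
a plain C sketch of a 4x4 H.264 inverse transform that adds its result
into a frame buffer. It is a scalar illustration only, not the SSE2 code
from the patch: the function names and the row-major coefficient layout
are assumptions, and the real assembly may store coefficients
differently.

```c
#include <stdint.h>

static uint8_t clip_uint8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* 1-D H.264 inverse core transform on four values spaced `step` apart.
 * Relies on arithmetic right shift for negative values. */
static void idct4_1d(int *v, int step)
{
    const int e0 = v[0] + v[2 * step];
    const int e1 = v[0] - v[2 * step];
    const int e2 = (v[1 * step] >> 1) - v[3 * step];
    const int e3 = v[1 * step] + (v[3 * step] >> 1);

    v[0]        = e0 + e3;
    v[1 * step] = e1 + e2;
    v[2 * step] = e1 - e2;
    v[3 * step] = e0 - e3;
}

/* Hypothetical scalar routine: inverse-transform one 4x4 block and add it
 * to dst. `stride` is the distance in bytes between the starts of two
 * consecutive pixel rows in the destination frame. */
static void idct4x4_add_sketch(uint8_t *dst, const int16_t *coeffs, int stride)
{
    int tmp[16];
    int r, c;

    for (r = 0; r < 16; r++)
        tmp[r] = coeffs[r];
    for (r = 0; r < 4; r++)
        idct4_1d(tmp + 4 * r, 1);       /* transform each row */
    for (c = 0; c < 4; c++)
        idct4_1d(tmp + c, 4);           /* transform each column */

    for (r = 0; r < 4; r++)
        for (c = 0; c < 4; c++)
            dst[r * stride + c] =
                clip_uint8(dst[r * stride + c] + ((tmp[4 * r + c] + 32) >> 6));
}
```

Without the stride argument a routine like this could only write into a
buffer with a fixed row pitch, whereas the decoder needs to write
straight into frames of arbitrary line size.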