[FFmpeg-devel] [PATCH] H.264: x264 SSE2 iDCT functions
Jason Garrett-Glaser
darkshikari
Fri Jan 2 20:03:48 CET 2009
$subject
Benchmarks:
Cathedral:
idct_add16: 293 -> 282 clocks
idct_add16intra: 343 -> 257 clocks
"300" sample (contains almost no i16x16 blocks so I didn't test add16intra):
idct_add16: 518 -> 433 clocks
The larger gain here is most likely due to this sample's higher bitrate.
idct_DC was omitted from idct_add16 because the extra branching logic
turned out to make it significantly slower (the branch becomes much
more complicated and is taken less often, as *both* 4x4 DCT blocks have
to be DC-only for it to work).
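
To illustrate the point about the DC shortcut, here is a minimal plain-C
sketch. It is not code from the patch: the helper names (is_dc_only,
idct4x4_dc_add, pair_can_use_dc_shortcut) are hypothetical, and it assumes
the vectorized routine handles two 4x4 blocks per call, as the paragraph
above implies. It assumes the coefficient at index 0 is the DC term.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if all 15 AC coefficients are zero. */
static int is_dc_only(const int16_t block[16])
{
    for (int i = 1; i < 16; i++)
        if (block[i])
            return 0;
    return 1;
}

static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* DC-only shortcut: every pixel of the 4x4 block receives the same
 * rounded offset, so no transform is needed.
 * Assumes arithmetic right shift for negative values. */
static void idct4x4_dc_add(uint8_t *dst, int stride, const int16_t block[16])
{
    int dc = (block[0] + 32) >> 6;
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            dst[y * stride + x] = clip_u8(dst[y * stride + x] + dc);
}

/* When two 4x4 blocks are processed together, the shortcut only applies
 * if BOTH are DC-only, so the test is costlier and succeeds less often
 * than the one-block test above. */
static int pair_can_use_dc_shortcut(const int16_t a[16], const int16_t b[16])
{
    return is_dc_only(a) && is_dc_only(b);
}
```

With one block per call the shortcut fires whenever that block is DC-only;
with a pair, a single block carrying AC coefficients forces the full
transform for both.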
The x264 iDCT code was modified to add a stride parameter, which ffh264
requires.
x86util.asm was included from x264 in full, for simplicity's sake and to
make it easier to add future x264 assembly that uses it.
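
For readers unfamiliar with what the stride parameter does, here is a
rough plain-C sketch of a 4x4 H.264 inverse transform that adds its
result into a picture buffer with an arbitrary row stride. It is only an
illustration, not the x264 or ffh264 code: the function names are
hypothetical, and the row-major coefficient layout (block[4*y + x]) is an
assumption, so it is not claimed to be bit-exact with either decoder.

```c
#include <stdint.h>

static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* 1-D H.264 inverse core transform on four values spaced `step` apart.
 * Assumes arithmetic right shift for negative values. */
static void idct4_1d(int *v, int step)
{
    int a0 = v[0], a1 = v[step], a2 = v[2 * step], a3 = v[3 * step];
    int z0 = a0 + a2;
    int z1 = a0 - a2;
    int z2 = (a1 >> 1) - a3;
    int z3 = a1 + (a3 >> 1);
    v[0]        = z0 + z3;
    v[step]     = z1 + z2;
    v[2 * step] = z1 - z2;
    v[3 * step] = z0 - z3;
}

/* Transform one 4x4 block and add it to dst. `stride` is the distance in
 * bytes between vertically adjacent pixels, so pixel (x, y) lives at
 * dst[y * stride + x]. */
static void idct4x4_add(uint8_t *dst, int stride, const int16_t block[16])
{
    int tmp[16];
    for (int i = 0; i < 16; i++)
        tmp[i] = block[i];
    for (int y = 0; y < 4; y++)
        idct4_1d(tmp + 4 * y, 1);   /* rows */
    for (int x = 0; x < 4; x++)
        idct4_1d(tmp + x, 4);       /* columns */
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            dst[y * stride + x] =
                clip_u8(dst[y * stride + x] + ((tmp[4 * y + x] + 32) >> 6));
}
```

Passing the stride explicitly lets the same routine write into any plane
width, rather than assuming a fixed distance between rows.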
Dark Shikari
-------------- next part --------------
A non-text attachment was scrubbed...
Name: x264_idct.diff
Type: text/x-diff
Size: 13224 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090102/89b5063f/attachment.diff>