[Ffmpeg-devel] [PATCH] Some MMX optimizations for Chinese AVS

Fri Jul 28 19:57:41 CEST 2006

Hi,

attached is a patch that implements the inverse transform and a few sub-pel
motion compensation functions in MMX assembler. Because AVS is so similar to
H.264, I took a lot of ideas from the H.264 MMX code. While I have checked
that the code is functionally correct, I am not sure how to verify the
efficiency of the MMX implementation. Is there some way to find bottlenecks
in it, such as "execution of instruction foo stalls due to a dependency on
bar", which would suggest reordering some instructions? Or can I only try
some changes, benchmark the whole function/block and retry? I am afraid it
is the latter, since the MMX implementations probably also differ quite a
lot between processor models.
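
To make the "change, benchmark, retry" loop concrete, here is a minimal
sketch of what such a harness could look like in plain C. Everything in it
is an assumption for illustration: the block_fn signature, the two butterfly
routines (placeholders, not the AVS transform from the patch) and the helper
names are all hypothetical. It first checks that the candidate produces the
same output as the reference and only then times each one over many calls.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical signature of a routine that works in place on an 8x8 block. */
typedef void (*block_fn)(int16_t *block);

/* Placeholder reference: one add/sub butterfly pass over each row. */
static void butterfly_ref(int16_t *block)
{
    for (int row = 0; row < 8; row++) {
        int16_t *r = block + 8 * row;
        for (int i = 0; i < 4; i++) {
            int a = r[i], b = r[7 - i];
            r[i]     = (int16_t)(a + b);
            r[7 - i] = (int16_t)(a - b);
        }
    }
}

/* Placeholder candidate: same arithmetic with the inner loop unrolled. */
static void butterfly_unrolled(int16_t *block)
{
    for (int row = 0; row < 8; row++) {
        int16_t *r = block + 8 * row;
        int a0 = r[0], a1 = r[1], a2 = r[2], a3 = r[3];
        int b0 = r[7], b1 = r[6], b2 = r[5], b3 = r[4];
        r[0] = (int16_t)(a0 + b0); r[7] = (int16_t)(a0 - b0);
        r[1] = (int16_t)(a1 + b1); r[6] = (int16_t)(a1 - b1);
        r[2] = (int16_t)(a2 + b2); r[5] = (int16_t)(a2 - b2);
        r[3] = (int16_t)(a3 + b3); r[4] = (int16_t)(a3 - b3);
    }
}

/* Returns 1 if both routines produce identical output for this input. */
static int outputs_match(block_fn ref, block_fn opt, const int16_t *input)
{
    int16_t a[64], b[64];
    memcpy(a, input, sizeof(a));
    memcpy(b, input, sizeof(b));
    ref(a);
    opt(b);
    return memcmp(a, b, sizeof(a)) == 0;
}

/* Average CPU time per call in seconds, measured with standard clock().
 * The input is copied afresh on every iteration, so the copy cost is
 * included equally for every routine being compared. */
static double seconds_per_call(block_fn fn, const int16_t *input,
                               int iterations)
{
    int16_t block[64];
    volatile int16_t sink = 0; /* keeps the compiler from dropping the work */

    assert(iterations > 0);
    clock_t start = clock();
    for (int i = 0; i < iterations; i++) {
        memcpy(block, input, sizeof(block));
        fn(block);
        sink ^= block[0];
    }
    clock_t stop = clock();
    (void)sink;
    return (double)(stop - start) / CLOCKS_PER_SEC / iterations;
}
```

One would call outputs_match() on a few test blocks first, then compare
seconds_per_call() for both routines with a large iteration count. Because
clock() is coarse, a cycle counter would probably be a better choice for
functions this small, and the numbers are only meaningful for the machine
they were measured on.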

Because the motion compensation is implemented only in part, I am not sure
if this is worth a commit. On the other hand, committing it would make it
easier for somebody with ideas for improvement to build on it, and it
already gives an overall speedup of more than 20% on my machine.

Regards
Stefan Gehrer
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cavsmmx.patch
Type: application/octet-stream
Size: 26382 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20060728/18a269cb/attachment.obj>