[Ffmpeg-devel] [PATCH] Some MMX optimizations for Chinese AVS
Fri Jul 28 20:05:36 CEST 2006
Stefan Gehrer wrote:
> attached is a patch which implements inverse transform and a few sub-pel
> motion compensation functions in MMX assembler. Because of its similarity,
> I took a lot of ideas from the H.264 MMX code. While I checked that the
> stuff is functionally correct, I am not sure how to prove the efficiency
> of the MMX implementation. Is there some way to find bottlenecks in it?
> Like "execution of instruction foo stalls, due to dependency on bar",
> which would suggest reordering some instructions? Or can I only try some
> changes, benchmark the whole function/block and retry? I am afraid the
> latter, since the MMX implementations probably also differ considerably
> between processor models.
> Because the motion compensation is implemented only in part, I am not sure
> if this is worth a commit. On the other hand, committing it would make it
> easier for others to suggest improvements, and it already gives an overall
> speedup of more than 20% on my machine.
Search the tree for uses of the existing START_TIMER() and STOP_TIMER()
macros. Placed around a tight piece of code, they report how long that
code takes to run, which gives you numbers for comparing one version of
a function against another.
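
For illustration, here is a minimal sketch of the bracketing pattern such
timer macros follow. SKETCH_START_TIMER, SKETCH_STOP_TIMER and the
sketch_avg_pixels8() kernel are hypothetical stand-ins written for this
example, not the definitions in the tree. They read a POSIX monotonic clock
rather than a CPU cycle counter, so check the real macros for their exact
form before relying on this.

```c
#define _POSIX_C_SOURCE 199309L
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Assumption: a nanosecond clock stands in for a cycle counter here. */
static uint64_t sketch_read_time(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical macro: records the start time in a local variable. */
#define SKETCH_START_TIMER \
    uint64_t sketch_tstart = sketch_read_time();

/* Hypothetical macro: accumulates the elapsed time per call site and
 * prints the running average whenever the run count is a power of two. */
#define SKETCH_STOP_TIMER(id)                                          \
    do {                                                               \
        static uint64_t sketch_tsum;                                   \
        static unsigned sketch_tcount;                                 \
        sketch_tsum += sketch_read_time() - sketch_tstart;             \
        sketch_tcount++;                                               \
        if ((sketch_tcount & (sketch_tcount - 1)) == 0)                \
            fprintf(stderr, "%" PRIu64 " ns avg in %s, %u runs\n",     \
                    sketch_tsum / sketch_tcount, id, sketch_tcount);   \
    } while (0)

/* Hypothetical scalar kernel: rounded average of two 8-pixel rows. */
static void sketch_avg_pixels8(uint8_t *dst, const uint8_t *a,
                               const uint8_t *b)
{
    for (int i = 0; i < 8; i++)
        dst[i] = (uint8_t)((a[i] + b[i] + 1) >> 1);
}

/* The code under test sits between the two macros. */
static void sketch_avg_pixels8_timed(uint8_t *dst, const uint8_t *a,
                                     const uint8_t *b)
{
    SKETCH_START_TIMER
    sketch_avg_pixels8(dst, a, b);
    SKETCH_STOP_TIMER("avg_pixels8");
}
```

Call the timed wrapper many times from the decoder's normal path and
compare the averages printed for the C and MMX versions. A single call is
too noisy to tell you much.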