[Ffmpeg-devel] [PATCH] Some MMX optimizations for Chinese AVS

Uoti Urpala uoti.urpala
Fri Jul 28 20:33:16 CEST 2006

On Fri, 2006-07-28 at 11:05 -0700, Mike Melanson wrote:
> Stefan Gehrer wrote:
> > stuff is functionally correct, I am not sure how to prove the efficiency
> > of the MMX implementation. Is there some way to find bottlenecks in it?
> > Like "execution of instruction foo stalls, due to dependency on bar", 
> > which would suggest reordering some instructions? Or can I only try some
> > changes, benchmark the whole function/block and retry? I am afraid the

> Search for uses of the existing START_TIMER() and STOP_TIMER() macros in
> the tree. They give you a good all-around picture of how tight pieces of
> code are performing. They will give you useful numbers for your purposes.

OProfile should be able to give you detailed timing information without
changing the sources. I'm not sure whether it would be accurate enough
to identify problems at the level of individual stalling MMX
instructions, though. With out-of-order execution, the time attributed
to a single instruction can be a somewhat fuzzy value.
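
For the "benchmark the whole function/block and retry" approach, here is a
minimal sketch of the kind of measurement the START_TIMER()/STOP_TIMER()
macros mentioned above are used for. It is not FFmpeg's implementation:
BlockTimer, read_ns(), timer_start(), timer_stop(), timer_report() and
sum_abs_diff() are hypothetical names made up for this illustration, and
the sketch assumes a POSIX clock_gettime() in nanoseconds instead of a CPU
cycle counter.

```c
#define _POSIX_C_SOURCE 199309L
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper (assumption, not an FFmpeg function):
 * monotonic timestamp in nanoseconds. */
static uint64_t read_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Accumulates the time spent in one block over many runs, so that
 * timer overhead and one-off noise average out. */
typedef struct BlockTimer {
    uint64_t start;  /* timestamp taken by the last timer_start() */
    uint64_t total;  /* accumulated nanoseconds over all runs */
    uint64_t runs;   /* number of completed start/stop pairs */
} BlockTimer;

static void timer_start(BlockTimer *t)
{
    t->start = read_ns();
}

static void timer_stop(BlockTimer *t)
{
    t->total += read_ns() - t->start;
    t->runs++;
}

/* Prints the average time per run under the given label. */
static void timer_report(const BlockTimer *t, const char *id)
{
    if (t->runs)
        printf("%" PRIu64 " ns avg in %s, %" PRIu64 " runs\n",
               t->total / t->runs, id, t->runs);
}

/* Stand-in for the code under test: plain C sum of absolute
 * differences over n bytes. */
static unsigned sum_abs_diff(const uint8_t *a, const uint8_t *b, int n)
{
    unsigned sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[i] > b[i] ? (unsigned)(a[i] - b[i]) : (unsigned)(b[i] - a[i]);
    return sum;
}
```

Call timer_start() and timer_stop() around the block in question inside
the decoding loop, then timer_report() once at the end. This only tells
you whether a reordering made the whole block faster or slower, not which
instruction stalls, so it has the same limitation discussed above.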
