[FFmpeg-devel] [HACK] 50% faster H.264 decoding
Ronald S. Bultje
Fri Aug 20 02:05:08 CEST 2010
On Thu, Aug 19, 2010 at 7:30 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
> I'll also benchmark mc4 (if that doesn't improve, the whole patch is
> pointless ;-) ) and mc8 (should stay the same, otherwise again the
> patch is pointless)...
After looking through these even weirder-looking numbers, I decided that
something was wrong and noticed that one of my patches had HAVE_MMX
commented out, which caused the odd numbers from the C versions... Don't
believe my previous mc2 numbers either; they're now about identical
before vs. after. ;-) For reference, here are the mc4/mc8 numbers as well.
811 dezicycles in w=2, 65527 runs, 9 skips
796 dezicycles in w=2, 131061 runs, 11 skips
827 dezicycles in w=2, 65530 runs, 6 skips
807 dezicycles in w=2, 131065 runs, 7 skips
So that's about the same.
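
For anyone reading these timer lines: "dezicycles" are tenths of a CPU cycle, averaged over the accepted runs, and "skips" are measurements thrown out as outliers. In every line in this mail, runs + skips is a power of two (e.g. 65527 + 9 = 65536), which fits a timer that only prints at power-of-two call counts. Below is a minimal sketch of such an accumulator. The names (TimerStats, timer_add, timer_report) are made up for illustration, and the outlier thresholds are assumptions; this is not the actual timer code.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical accumulator for an averaging cycle timer. */
typedef struct {
    uint64_t sum;   /* total cycles over accepted runs */
    int runs;       /* accepted measurements */
    int skips;      /* measurements rejected as outliers */
} TimerStats;

/* Record one measurement. Returns 1 when runs + skips has reached a
 * power of two, i.e. when a report line would be printed. */
static int timer_add(TimerStats *t, uint64_t cycles)
{
    /* Assumed outlier rule: accept the first two samples, anything below
     * 8x the running average, and anything below 2000 cycles. */
    if (t->runs < 2 || cycles < 8 * t->sum / t->runs || cycles < 2000) {
        t->sum += cycles;
        t->runs++;
    } else {
        t->skips++;
    }
    int total = t->runs + t->skips;
    return (total & (total - 1)) == 0;
}

/* Average cost in tenths of a cycle over the accepted runs. */
static uint64_t timer_dezicycles(const TimerStats *t)
{
    return t->sum * 10 / t->runs;
}

/* Print a line in the same shape as the ones quoted in this mail. */
static void timer_report(const TimerStats *t, const char *id)
{
    printf("%" PRIu64 " dezicycles in %s, %d runs, %d skips\n",
           timer_dezicycles(t), id, t->runs, t->skips);
}
```

With this reading, "811 dezicycles" means roughly 81 cycles per call on average, so differences of 10-20 dezicycles between runs are only a cycle or two.
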
504 dezicycles in w=2, 262140 runs, 4 skips
501 dezicycles in w=2, 524275 runs, 13 skips
497 dezicycles in w=2, 1048553 runs, 23 skips
503 dezicycles in w=2, 131066 runs, 6 skips
499 dezicycles in w=2, 262135 runs, 9 skips
496 dezicycles in w=2, 524272 runs, 16 skips
499 dezicycles in w=2, 1048543 runs, 33 skips
Also about the same, which is probably because x=0, y=0 doesn't occur
very often statistically in a random distribution (only 1 in 64).
That's a little disappointing; I would've expected to see them more
often, but then again I'm testing reference stream samples for now...
I should probably also write 1D versions (x=0 or y=0, but not both),
which occur more often (14 in 64), and then re-measure as I did just
now, to be able to see any benefit at all. Does that make sense?
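
The 1-in-64 and 14-in-64 figures can be checked by enumerating the fractional positions. The sketch below assumes x and y each take the values 0..7 (which is what gives the 64 combinations) and that every combination is equally likely; the struct and function names are hypothetical, purely for illustration.

```c
/* Tally of fractional (x, y) positions on an assumed 8x8 grid. */
typedef struct {
    int copy;   /* x == 0 && y == 0: no interpolation needed */
    int one_d;  /* exactly one of x, y is zero: 1D filter suffices */
    int two_d;  /* both nonzero: full 2D filter */
} PositionCounts;

static PositionCounts count_positions(void)
{
    PositionCounts c = {0, 0, 0};
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++) {
            if (x == 0 && y == 0)
                c.copy++;
            else if (x == 0 || y == 0)
                c.one_d++;
            else
                c.two_d++;
        }
    }
    return c;
}
```

Under those assumptions count_positions() gives copy = 1, one_d = 14 and two_d = 49, so the copy and 1D special cases together would cover 15 of 64 positions; real streams need not follow a uniform distribution.
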