[Ffmpeg-devel] benchmark of different CABAC routines
Michael Niedermayer
michaelni
Wed Oct 11 21:41:57 CEST 2006
Hi
On Wed, Oct 11, 2006 at 09:07:14PM +0200, Michael Niedermayer wrote:
[...]
> >
> > The modified non-branchless C version has
> >
> > uint8_t tmp = s + 2;
> > if (tmp < 126)
> > s = tmp;
> > *state = s;
> >
> > instead of
> >
> > *state= ff_h264_mps_state[s];
> >
> > Writing it that way instead of the "s += 2; if (s < 128) *state = s"
> > which was there earlier (and was slower) makes gcc use cmov instead of a
> > branch and is faster.
>
> Interesting, I will experiment with this a little ...
I've used the following patch:
@@ -396,7 +396,13 @@
"shl %%cl, %%edx \n\t"
"shl %%cl, %%ebx \n\t"
#endif
+#if 1
+ "leal 2(%%eax), %%ecx \n\t"
+ "cmpl $124, %%eax \n\t"
+ "cmovae %%eax, %%ecx \n\t"
+#else
"movzbl "MANGLE(ff_h264_mps_state)"(%%eax), %%ecx \n\t"
+#endif
"movb %%cl, (%1) \n\t"
//eax:state ebx:low, edx:range, esi:RangeLPS
"test %%bx, %%bx \n\t"
P3:
branched asm:
4110 dezicycles in decode_residual, 2094505 runs, 2647 skips
4126 dezicycles in decode_residual, 2094479 runs, 2673 skips
branched asm + patch:
4172 dezicycles in decode_residual, 2094355 runs, 2797 skips
4177 dezicycles in decode_residual, 2094341 runs, 2811 skips
athlon:
branched asm:
4067 dezicycles in decode_residual, 2096725 runs, 427 skips
4088 dezicycles in decode_residual, 2096733 runs, 419 skips
4089 dezicycles in decode_residual, 2096753 runs, 399 skips
branched asm + patch:
4066 dezicycles in decode_residual, 2096708 runs, 444 skips
4092 dezicycles in decode_residual, 2096747 runs, 405 skips
4065 dezicycles in decode_residual, 2096759 runs, 393 skips
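To compare the runs above more easily, here is a small sketch that averages the dezicycle counts per configuration. The array names and the helper mean_dezicycles() are made up for illustration; only the values are copied from the runs listed above.

```c
#include <stddef.h>

/* Dezicycle counts copied from the benchmark runs above.
 * The array names are hypothetical, chosen only for this sketch. */
static const int p3_branched[]     = { 4110, 4126 };
static const int p3_patched[]      = { 4172, 4177 };
static const int athlon_branched[] = { 4067, 4088, 4089 };
static const int athlon_patched[]  = { 4066, 4092, 4065 };

/* Arithmetic mean of n run results; returns 0.0 for an empty set. */
static double mean_dezicycles(const int *runs, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += runs[i];
    return n ? (double)sum / (double)n : 0.0;
}
```

Calling mean_dezicycles(p3_branched, 2) and so on gives 4118 vs 4174.5 on the P3 (the patched version is about 1.4% slower) and roughly 4081.3 vs 4074.3 on the Athlon, a difference of about 0.2%, which is smaller than the spread between individual runs.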
So as far as I can see, there's no speed gain from this.
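For reference, a rough C rendering of the three instructions added by the patch (lea, cmp, cmovae), next to the quoted C version. This is only a sketch: the function names are hypothetical, and the state range 0..127 is an assumption based on the "s < 128" check mentioned in the quoted mail, not something taken from the actual ff_h264_mps_state table.

```c
#include <stdint.h>

/* What the patched asm computes, written in C (hypothetical name):
 *   leal   2(%eax), %ecx   -> next = s + 2
 *   cmpl   $124, %eax      -> compare s with 124
 *   cmovae %eax, %ecx      -> if s >= 124 (unsigned), keep s
 */
static uint8_t mps_next_cmov(uint8_t s)
{
    uint8_t next = (uint8_t)(s + 2);
    return s >= 124 ? s : next;
}

/* The C version quoted earlier in this mail (hypothetical name). */
static uint8_t mps_next_quoted(uint8_t s)
{
    uint8_t tmp = (uint8_t)(s + 2);
    if (tmp < 126)
        s = tmp;
    return s;
}

/* Returns 1 if both variants agree for every state in the
 * assumed range 0..127, and 0 otherwise. */
static int mps_variants_agree(void)
{
    for (int s = 0; s < 128; s++)
        if (mps_next_cmov((uint8_t)s) != mps_next_quoted((uint8_t)s))
            return 0;
    return 1;
}
```

Both variants step the state by 2 until it reaches 124 or 125 and then leave it unchanged, so over the assumed range they differ only in how the compiler or the hand-written asm implements the saturation.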
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is