[FFmpeg-devel] [PATCH] VC-1 MMX DSP functions
Michael Niedermayer
michaelni
Wed Oct 3 21:47:01 CEST 2007
Hi
On Wed, Oct 03, 2007 at 08:16:39PM +0200, Reimar Döffinger wrote:
> Hello,
> On Tue, Oct 02, 2007 at 11:19:42PM +0200, Michael Niedermayer wrote:
> [...]
> > > + ASMALIGN(3)
> > > + "1: \n\t"
> >
> > how much speed is gained by the align?
>
> Inconclusive in my tests on AMD64 on a 64 bit OS:
> without:
> 3012 dezicycles in vc1_put_ver_16b_shift2_mmx, 1048397 runs, 179 skips
> 1249 dezicycles in vc1_put_hor_16b_shift2_mmx, 1048505 runs, 71 skips
>
> 3011 dezicycles in vc1_put_ver_16b_shift2_mmx, 1048397 runs, 179 skips
> 1232 dezicycles in vc1_put_hor_16b_shift2_mmx, 1048517 runs, 59 skips
>
> 3011 dezicycles in vc1_put_ver_16b_shift2_mmx, 1048514 runs, 62 skips
> 1232 dezicycles in vc1_put_hor_16b_shift2_mmx, 1048548 runs, 28 skips
>
> with:
> 3038 dezicycles in vc1_put_ver_16b_shift2_mmx, 1048340 runs, 236 skips
> 1259 dezicycles in vc1_put_hor_16b_shift2_mmx, 1048487 runs, 89 skips
>
> 3027 dezicycles in vc1_put_ver_16b_shift2_mmx, 1048415 runs, 161 skips
> 1259 dezicycles in vc1_put_hor_16b_shift2_mmx, 1048515 runs, 61 skips
>
> 3030 dezicycles in vc1_put_ver_16b_shift2_mmx, 1048384 runs, 192 skips
> 1258 dezicycles in vc1_put_hor_16b_shift2_mmx, 1048516 runs, 60 skips
I wouldn't call that "inconclusive" but rather slower, and it's what I
expected, as that's how every code alignment I remember on x86 behaved.
Maybe we should try to misalign all branch targets :)
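
To put a number on "slower", the three runs per function can be averaged and compared. This is a minimal sketch in C, assuming the dezicycle figures quoted above are copied verbatim. The helper names `mean` and `percent_change` are illustrative only and not part of FFmpeg.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of n timing samples, in the same unit as the inputs
 * (dezicycles here). n must be non-zero. */
static double mean(const int *samples, size_t n)
{
    double sum = 0.0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        sum += samples[i];
    return sum / (double)n;
}

/* Relative change from baseline to candidate, in percent.
 * A positive result means the candidate is slower. */
static double percent_change(double baseline, double candidate)
{
    return (candidate - baseline) / baseline * 100.0;
}

/* Figures copied from the quoted runs: without and with ASMALIGN(3). */
static const int ver_without[] = { 3012, 3011, 3011 };
static const int ver_with[]    = { 3038, 3027, 3030 };
static const int hor_without[] = { 1249, 1232, 1232 };
static const int hor_with[]    = { 1259, 1259, 1258 };
```

Calling `percent_change(mean(ver_without, 3), mean(ver_with, 3))` gives roughly 0.7 for vc1_put_ver_16b_shift2_mmx, and the same call on the hor arrays gives roughly 1.7 for vc1_put_hor_16b_shift2_mmx, so the aligned version is slower in both cases. Three runs each is a small sample, so this only summarizes the numbers above.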
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Those who are too smart to engage in politics are punished by being
governed by those who are dumber. -- Plato