[FFmpeg-devel] [PATCH 2/3] x86/float_dsp: unroll loop in vector_fmac_scalar
michaelni at gmx.at
Wed Apr 16 19:11:40 CEST 2014
On Wed, Apr 16, 2014 at 06:35:56PM +0200, Christophe Gisquet wrote:
> On 16 Apr 2014 18:12, "James Almer" <jamrial at gmail.com> wrote:
> > Athlon 64 7750+ mingw-w64. Went from 274 cycles to 257 when I benched with
> > the dts-es sample I uploaded for the fate test.
> > Also, does aac even use vector_fmac_scalar? A grep on libavcodec shows
> > results only in dcadec.c.
> I must have been mistaken about which batch I modified which code in. So
> what I am remembering must have been for something else, then.
> > The difference in the resulting code is in the order of instructions,
> > due to the unrolling of the loop. The mulps now have enough room to finish
> > before the addps are executed, and so do the addps before the mova to memory.
> I would have expected this to be handled by out-of-order execution. But I
> guess the mulps have too long a latency for that to hide the dependency. I
> can't help benchmark this at the moment, but there should be no harm in your
> changes then.
> OK from my side then.
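
For readers without the patch at hand, the scheduling effect discussed above can be sketched in plain C. This is only an illustration, not the x86 assembly from the patch: the function names fmac_scalar_ref and fmac_scalar_unrolled are hypothetical, the unroll factor of two is an assumption, and a C compiler is free to reorder these statements, whereas hand-written assembly fixes the order. The operation itself is dst[i] += src[i] * mul.

```c
#include <assert.h>

/* Straightforward loop: each add must wait for the multiply
 * that feeds it (hypothetical reference, not FFmpeg code). */
static void fmac_scalar_ref(float *dst, const float *src, float mul, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] += src[i] * mul;
}

/* Unrolled by two (assumed factor): both multiplies are issued first,
 * then both adds, then both stores. Each multiply gets extra time to
 * complete before the add that depends on it is reached. */
static void fmac_scalar_unrolled(float *dst, const float *src, float mul, int len)
{
    assert(len % 2 == 0); /* assumption of this sketch */

    for (int i = 0; i < len; i += 2) {
        float p0 = src[i]     * mul;  /* independent multiply 0 */
        float p1 = src[i + 1] * mul;  /* independent multiply 1 */
        float s0 = dst[i]     + p0;   /* add 0 depends only on p0 */
        float s1 = dst[i + 1] + p1;   /* add 1 depends only on p1 */
        dst[i]     = s0;              /* stores come last */
        dst[i + 1] = s1;
    }
}
```

Both functions compute the same result; only the order in which independent operations are issued differs, which is the kind of change the cycle counts above are measuring.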
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Observe your enemies, for they first find out your faults. -- Antisthenes