[Ffmpeg-devel] [PATCH] another vorbis optimization
Michael Niedermayer
michaelni
Tue Aug 8 12:14:56 CEST 2006
Hi
On Mon, Aug 07, 2006 at 11:32:34PM -0700, Loren Merritt wrote:
> Another 6% faster vorbis decoding. But I am unsure as to the cleanest way
> to integrate it with run-time cpu detection.
Hmm, currently we add the BIAS during the windowing. Maybe it could
be done prior to the IMDCT and only to a few coefficients. I am not sure though, as my
knowledge of the MDCT windowing isn't too good.
Someone would simply have to feed a constant (=BIAS) vector through
the windowing + MDCT to see whether the resulting vector is (approximately) sparse
or not.
Alternatively, you could optimize the windowing code in SSE(2) too; then there
would be no extra special case :)
[...]
> + ::"r"(15<<23)
> + );
> + for(i=0;i<len;i+=4) {
> + asm volatile(
> + "movdqa %1, %%xmm0 \n\t"
> + "paddd %%xmm7, %%xmm0 \n\t"
I think the paddd can be avoided by simply multiplying the windows by 1<<15.
> + "cvtps2dq %%xmm0, %%xmm0 \n\t"
If that is replaced by cvtps2pi and the code below is changed accordingly, then
the code should run on SSE1 CPUs. If it's slower, a separate SSE1 variant
could be added too. That's of course just an idea; I am happy with SSE2 code
too, just my CPU here isn't :)
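
To illustrate the paddd remark above: adding 15<<23 to the bit pattern of a normal, finite IEEE 754 single-precision float raises its exponent by 15, i.e. multiplies it by 2^15. Folding that factor into the window table once gives the same product without the per-sample integer add. The following is a scalar sketch only; the function names are hypothetical and not from the patch, and zero, denormals and overflow are deliberately not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scale a normal, finite float by 2^15 by adding 15 to its exponent field.
 * This mirrors what a paddd with 15<<23 does to each lane. */
static float scale_by_exponent_add(float x)
{
    uint32_t bits;

    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);   /* reinterpret without aliasing issues */
    bits += (uint32_t)15 << 23;       /* exponent field starts at bit 23 */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Alternative: fold the factor 1<<15 into the window table once, so that
 * sample * window[i] is already scaled when it reaches the conversion. */
static void prescale_window(float *window, size_t len)
{
    for (size_t i = 0; i < len; i++)
        window[i] *= 32768.0f;
}
```

Because multiplying by a power of two is exact as long as nothing overflows or underflows, both routes give bit-identical results for ordinary sample values.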
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is