[FFmpeg-devel] [PATCH] Dsputilize some functions from APE decode 1/2 - Altivec implementation
Kostya
kostya.shishkov
Thu Jul 10 19:09:17 CEST 2008
On Thu, Jul 10, 2008 at 08:43:12AM -0600, Loren Merritt wrote:
> On Thu, 10 Jul 2008, Kostya wrote:
>
> > On Tue, Jul 08, 2008 at 03:18:12PM -0600, Loren Merritt wrote:
> >> Entirely untested (I don't have a ppc), but this looks like it should be
> >> faster. Your other functions would benefit from similar.
> >> For that matter, a whole lot of dsp functions put lvsl inside the loop
> >> when it should be constant (assuming stride%16==0).
> >>
> >> --Loren Merritt
> >
> > It does not work as intended: it modifies only the start of the output array.
> > Thanks for trying anyway.
>
> That's what I get for trying to write code without a compiler.
> It should be
> + pv1 += 2;
> + pv2 += 2;
>
> --Loren Merritt
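
On the lvsl remark quoted above: lvsl builds its permute mask from the low four bits of the address only, so if the stride is a multiple of 16, every row start has the same misalignment and the mask can be computed once before the loop. A minimal sketch in portable C that checks this property; it uses no AltiVec intrinsics, and misalign() and rows_share_misalignment() are hypothetical helper names, not dsputil functions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Low four bits of the address: the only input lvsl uses for its mask. */
static unsigned misalign(const void *p)
{
    return (unsigned)((uintptr_t)p & 15);
}

/* Returns 1 if every row start has the same misalignment as row 0,
 * i.e. a mask computed once for row 0 would be valid for all rows. */
static int rows_share_misalignment(const uint8_t *src, size_t stride, int rows)
{
    assert(rows > 0);
    unsigned first = misalign(src);
    for (int y = 1; y < rows; y++)
        if (misalign(src + (size_t)y * stride) != first)
            return 0;
    return 1;
}
```

With a stride of 32 the check holds for any starting offset; with a stride of 24 the misalignment changes from row to row, and the mask would have to be recomputed inside the loop.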

Now it works fine, but on my G4 under Mac OS X it gives these numbers
(clocks for 10 million cycles on arrays of length 256):

unoptimized gcc 3.3
  Mine:  6726
  Yours: 7220

gcc-3.3
  Mine:  960
  Yours: 1468

unoptimized gcc 4.0.1
  Mine:  6935
  Yours: 7682

gcc-4.0 -O3
  Mine:  1113
  Yours: 1498

I guess it was MMX that was designed with black magic in mind.
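
A comparison like the one above can be set up with a small harness. The sketch below is not the code that produced those numbers. It assumes the kernel under test is an int16 scalar product (scalarproduct_ref() is a hypothetical plain C reference) and it measures with the standard clock() rather than a cycle counter:

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define LEN 256  /* array length, matching the runs quoted above */

/* Hypothetical reference kernel: plain C dot product of two int16 arrays.
 * Inputs are assumed small enough that the 32-bit sum does not overflow. */
static int32_t scalarproduct_ref(const int16_t *v1, const int16_t *v2, int len)
{
    int32_t sum = 0;
    for (int i = 0; i < len; i++)
        sum += v1[i] * v2[i];
    return sum;
}

typedef int32_t (*kernel_fn)(const int16_t *, const int16_t *, int);

/* Calls fn `iters` times and returns the elapsed clock() ticks.
 * Each result is added to *sink (volatile, unsigned so wraparound is
 * well defined) to discourage the compiler from discarding the calls. */
static clock_t time_kernel(kernel_fn fn, const int16_t *v1, const int16_t *v2,
                           long iters, volatile uint32_t *sink)
{
    assert(iters > 0);
    clock_t start = clock();
    for (long i = 0; i < iters; i++)
        *sink += (uint32_t)fn(v1, v2, LEN);
    return clock() - start;
}
```

Passing each candidate implementation through the same time_kernel() call, with the same inputs and iteration count, keeps the comparison fair. Checking that both leave the same value in the sink also catches a kernel that is fast only because it is wrong.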