[FFmpeg-devel] [PATCH] UltraSPARC VIS optimized yuv2rgb
Balatoni Denes
dbalatoni
Mon Jul 16 23:55:09 CEST 2007
Hi!
On Monday 16 July 2007 at 23:11, Diego Biurrun wrote:
> >
> > --- libswscale.old/yuv2rgb.c.orig 1970-01-01 01:00:00.000000000 +0100
> > +++ libswscale/yuv2rgb.c.orig 2007-07-16 22:06:51.000000000 +0200
>
> Huh?
Sorry, my bad.
> > --- libswscale.old/yuv2rgb_vis.c 1970-01-01 01:00:00.000000000 +0100
> > +++ libswscale/yuv2rgb_vis.c 2007-07-16 22:16:16.000000000 +0200
> > @@ -0,0 +1,208 @@
> > +
> > +static short int __attribute__((aligned(8))) sparc_coeffs[4*10]=
> > +{
> > + 596, 596, 596, 596, // 16*1.164*32
> > + 8266, 8266, 8266, 8266, //128*2.018*32
> > + 1602, 1602, 1602, 1602, //128*0.391*32
> > + 3330, 3330, 3330, 3330, //128*0.813*32
> > + 6537, 6537, 6537, 6537, //128*1.596*32
> > + 9535, 9535, 9535, 9535, //1.164*32*256
> > + 6660, 6660, 6660, 6660, //0.813*32*256
> > + 13074,13074,13074,13074, //1.596*32*256
> > + 16531,16531,16531,16531, //2.018*32*256
> > + 3203, 3203, 3203, 3203, //0.391*32*256
> > +};
>
> This could be vertically aligned.
Well, it was, then you said to indent it by four. So it is, and I won't change it
now :|
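For readers following the table: the comments in sparc_coeffs suggest that each entry is a colorspace constant multiplied by a fixed-point scale and rounded to the nearest integer. A minimal sketch of that derivation follows; fixed_coeff is a hypothetical helper for illustration, not something in the patch, and it assumes non-negative inputs.

```c
#include <assert.h>

/* Hypothetical helper (not part of the patch): scale a non-negative
 * floating-point constant and round it to the nearest integer, which
 * matches the values listed in the sparc_coeffs comments. */
static int fixed_coeff(double value, double scale)
{
    assert(value >= 0.0 && scale > 0.0);
    return (int)(value * scale + 0.5);   /* round half up */
}

/* Second row of the table: 128 * 2.018 * 32. */
static int example_blue_bias(void)
{
    return fixed_coeff(128 * 2.018, 32);
}

/* Sixth row of the table: 1.164 * 32 * 256. */
static int example_luma_gain(void)
{
    return fixed_coeff(1.164, 32 * 256);
}
```

Calling fixed_coeff(16 * 1.164, 32) gives 596, the first row; the other rows follow the same pattern with the factors shown in their comments.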
> > +#define YUV2RGB_KERNEL \
> > + /* ^^^^ f0=Y f3=u f5=v */ \
> > + "fmul8x16 %%f3, %%f48, %%f6 \n\t" \
> > + "fmul8x16 %%f19, %%f48, %%f22 \n\t" \
> > + "fmul8x16 %%f5, %%f44, %%f8 \n\t" \
> > + "fmul8x16 %%f21, %%f44, %%f24 \n\t" \
> > + "fmul8x16 %%f0, %%f42, %%f0 \n\t" \
> > + "fmul8x16 %%f16, %%f42, %%f16 \n\t" \
> > + "fmul8x16 %%f3, %%f50, %%f2 \n\t" \
> > + "fmul8x16 %%f19, %%f50, %%f18 \n\t" \
> > + "fmul8x16 %%f5, %%f46, %%f4 \n\t" \
> > + "fmul8x16 %%f21, %%f46, %%f20 \n\t" \
>
> You could align all the operands, this would make the code more
> readable.
I will not; this way it can be deinterleaved later, once it has been proven
that the interleaved version is not faster (which I am afraid is the case,
btw).
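For anyone reading the kernel without a VIS manual at hand, here is a rough scalar model of what one fmul8x16 does, as I understand the instruction: each of the four unsigned 8-bit values in the first operand is multiplied by the corresponding signed 16-bit value in the second, and the rounded upper 16 bits of the 24-bit product are kept. The function names are hypothetical, and the excerpt above does not show which coefficient row is loaded into which register, so this is only a sketch of the arithmetic, not of the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of fmul8x16 (assumed semantics): unsigned 8-bit pixel times
 * signed 16-bit coefficient, rounded, upper 16 bits of the product kept.
 * Relies on >> being an arithmetic shift for negative values, which is
 * implementation-defined in C but holds on common compilers. */
static int16_t fmul8x16_lane(uint8_t pixel, int16_t coeff)
{
    int32_t product = (int32_t)pixel * coeff;   /* fits in 24 bits */
    return (int16_t)((product + 0x80) >> 8);    /* round, drop low byte */
}

/* Four lanes at once, mirroring the 4 x 16-bit layout of a VIS double
 * register. Hypothetical helper for illustration only. */
static void fmul8x16_emul(int16_t dst[4], const uint8_t pixels[4],
                          const int16_t coeffs[4])
{
    assert(dst && pixels && coeffs);
    for (int i = 0; i < 4; i++)
        dst[i] = fmul8x16_lane(pixels[i], coeffs[i]);
}
```

With a coefficient of 256 the lane returns the pixel unchanged, so a coefficient c behaves like a multiplication by c/256.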
> Diego
bye
Denes