[FFmpeg-devel] [PATCH] SPARC VIS simple_idct try#7

Michael Niedermayer michaelni
Thu Aug 30 05:03:59 CEST 2007


Hi

On Thu, Aug 30, 2007 at 02:13:20AM +0200, Balatoni Denes wrote:
> Hi Michael!
> 
> Just a question (and a half question):
> 
> On Thursday 30 August 2007 at 01:25, Michael Niedermayer wrote:
> > > +    /* 3. column */\
> > > +        "3:                             \n\t"\
> > > +        "for %%f8, %%f10, %%f60         \n\t"\
> > > +        "fcmpd %%fcc0, %%f62, %%f60     \n\t"\
> >
> > the for and fcmpd can similarly be moved up; you have to switch to fcc1
> > though, to avoid a conflict with the ones above.
> > This applies to the other for/fcmpd pairs as well.
> 
> Why do I have to switch to fcc1? There is plenty of space to place the fcmpds
> without conflict.

Well, the previous instruction is a branch that uses fcc0, so if the fcmpd is
moved before it, the compare overwrites fcc0 and breaks the code. Using fcc1
appeared to me to be a solution.


> Also checking for equality is %fcc0.

I am no SPARC expert; I only know what the docs say, and they say:
fbe{,a}{,pt|,pn}    %fccn, label

That looks like the branch-on-equal instruction can use any of the 4 fcc
fields, and the compare can set any of them as well.


> 
> >
> > [...]
> >
> > > +        TRANSPOSE
> > > +        IDCT4ROWS
> > > +        SCALEROWS
> > > +        PUTPIXELSCLAMPED("0")
> > > +        LOAD("%2+64")
> > > +        TRANSPOSE
> > > +        IDCT4ROWS
> > > +        SCALEROWS
> > > +        PUTPIXELSCLAMPED("4")
> >
> > the SCALEROWS is unneeded, the fpack16 can do the downshift and a single
> > addition to the 0,0 coefficient before the idct or first column after the
> > transpose can compensate for the rounding difference
> 
> Indeed, I missed this. However that one add has to be after multiplication - 
> because while in the C simple idct all coefficients are multiplied by 
> 1/sqrt(2), here they are not (correct me if I am wrong, but this is slightly 
> more accurate imho).

Maybe, maybe not; I don't know. It would be interesting to test.

For example, 2*2 is a better choice than (int)sqrt(8)*(int)sqrt(8).

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The educated differ from the uneducated as much as the living from the
dead. -- Aristotle 