[FFmpeg-devel] [PATCH] move h264 loopfilter strength code to yasm

Michael Niedermayer michaelni
Sat Sep 25 03:42:24 CEST 2010


On Sat, Sep 25, 2010 at 03:40:01AM +0200, Michael Niedermayer wrote:
> On Fri, Sep 24, 2010 at 07:33:11PM -0400, Ronald S. Bultje wrote:
> > Hi,
> > 
> > On Sep 24, 2010, at 5:33 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> > > On Fri, Sep 24, 2010 at 05:10:35PM -0400, Ronald S. Bultje wrote:
> > > [...]
> > >> -                        "psubw          (%2), %%mm1 \n"
> > >> -                        "psubw         8(%2), %%mm2 \n"
> > >> -                        "psubw       160(%2), %%mm3 \n"
> > >> -                        "psubw       168(%2), %%mm4 \n"
> > >> +                        "psubw          (%3), %%mm1 \n"
> > >> +                        "psubw         8(%3), %%mm2 \n"
> > >> +                        "psubw       160(%3), %%mm3 \n"
> > >> +                        "psubw       168(%3), %%mm4 \n"
> > >>                         "packsswb      %%mm2, %%mm1 \n"
> > >>                         "packsswb      %%mm4, %%mm3 \n"
> > >>                         "paddb         %%mm6, %%mm1 \n"
> > >> @@ -111,26 +111,28 @@
> > >>                         "por           %%mm1, %%mm0 \n"
> > >>                         "pshufw $0x4E, %%mm0, %%mm1 \n"
> > >>                         "pminub        %%mm1, %%mm0 \n"
> > >> -                        ::"r"(d_idx),
> > >> -                          "r"(ref[0]+b_idx),
> > >> -                          "r"(mv[0]+b_idx)
> > >> +                        ::"r"(ref[0]+b_idx),
> > >> +                          "r"(ref[0]+b_idx+d_idx),
> > >> +                          "r"(mv[0]+b_idx),
> > >> +                          "r"(mv[0]+b_idx+d_idx)
> > > 
> > > this doesn't look correct
> > > 
> > > and patches should ideally be tested before submitting, i tend to review
> > > based on the assumption that the code has been tested
> > > (e.g. i don't need to check that operands and constraints match, because
> > > it wouldn't work if they did not match)
> > 
> > Yeah, I over-enthusiastically screwed up here, sorry. First patch should still be ok, I'll ask on gcc-list how to write a constant without the $. Without that, it'll be hard to get the last 10 cycles off, I'm afraid...
> 
>  try %a0 and %c0 with an "i" constraint, they produce a constant without the $
>  %n0 will produce a negated one
> 
> now where was that sawn-off shotgun ...
> 
> gcc rule #1: the documentation is the source and the binary, forget the manual

and to preempt mans: yes, i also doubt that will work in icc
maybe someone could test?
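
for anyone who wants to try it, here is a minimal sketch of the %c / %n
modifiers used as a displacement. it is not taken from the patch: the
function names load_plus_8 / load_minus_8 and the offset 8 are made up
for illustration, and it assumes gcc (or a compatible compiler) on x86
with AT&T syntax. other targets just take the plain C fallback.

```c
#include <stdint.h>

/* Hypothetical helper: load the int32_t located 8 bytes after base. */
static int32_t load_plus_8(const int32_t *base)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    int32_t out;
    __asm__ volatile(
        /* %c1 prints the "i" operand as a bare 8 (no $ prefix),
         * so the assembler sees a displacement: movl 8(reg), reg */
        "movl %c1(%2), %0 \n"
        : "=r"(out)
        : "i"(8), "r"(base)
        : "memory");
    return out;
#else
    return base[2];   /* plain C fallback, same result */
#endif
}

/* Hypothetical helper: load the int32_t located 8 bytes before base. */
static int32_t load_minus_8(const int32_t *base)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    int32_t out;
    __asm__ volatile(
        /* %n1 prints the negated constant, i.e. -8 */
        "movl %n1(%2), %0 \n"
        : "=r"(out)
        : "i"(8), "r"(base)
        : "memory");
    return out;
#else
    return base[-2];  /* plain C fallback, same result */
#endif
}
```

with int32_t arr[5] = {10, 20, 30, 40, 50}, both load_plus_8(arr) and
load_minus_8(arr + 4) should read arr[2]. a plain %1 with "i" would
print $8 instead, which is not valid as a displacement.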

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If a bugfix only changes things apparently unrelated to the bug with no
further explanation, that is a good sign that the bugfix is wrong.