[FFmpeg-devel] [PATCH] move h264 loopfilter strength code to yasm

Ronald S. Bultje rsbultje
Fri Sep 24 21:20:49 CEST 2010


Hi,

On Fri, Sep 24, 2010 at 12:26 PM, Daniel Verkamp <daniel at drv.nu> wrote:
> On Fri, Sep 24, 2010 at 9:04 AM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
>> So removing pand (which doesn't do anything in the one case, and can
>> be replaced by a pxor in the other). With the attached patch #2, I get
>> this:
>> /var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//cc8uAjPS.s:315:bad
>> register name `%%mm0'
>> /var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//cc8uAjPS.s:520:bad
>> register name `%%mm0'
>>
>> What does that mean?
>
> If you omit all of the optional colon-separated arguments to asm, the
> % symbols before register names in the asm no longer need to be
> escaped with a second % (I suppose since there can be no substitution
> when there are no operand constraints).  You can add an empty : or
> just drop the doubled % to avoid this.
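
The escaping rule quoted above can be sketched roughly as follows. This
is only an illustration, assuming GCC-style inline asm on x86; the
function names are hypothetical and not taken from the patch, and
non-x86 targets fall back to plain C.

```c
/* Sketch only: the function names are hypothetical, not from the patch. */

/* Basic asm: no colon-separated operands, so the template is passed
 * through as-is and registers are written with a single '%'. */
static void basic_asm_noop(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ volatile ("movw %ax, %ax");   /* copies ax onto itself */
#endif
}

/* Extended asm: at least one ':' is present, so '%0' means operand 0
 * and a literal register name has to be spelled with '%%'. */
static int extended_asm_add_one(int x)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ ("movl %0, %%eax \n\t"
             "addl $1, %%eax \n\t"
             "movl %%eax, %0 \n\t"
             : "+r"(x)          /* operand 0: read and written */
             :                  /* no separate inputs */
             : "eax", "cc");    /* registers the template overwrites */
#else
    x += 1;                     /* plain C fallback on other targets */
#endif
    return x;
}
```

Writing `%%mm0` in the first form, or `%mm0` in the second, is the kind
of mismatch that makes the assembler reject the register name.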

OK, that fixes it. Oddly, it's the same speed, even though the
instruction count is lower. OK, so on to the next one. The attached
patch is meant to be part of a larger patch that reduces the insane
number of registers used for temporaries that could be addressed
directly: instead of doing (%0) where %0="m"(var[idx1]), use (%0,%1)
with %0="r"(var) and %1="r"(idx1). This works and is not slower
(eventually it should be faster once it saves a few registers; this is
work in progress).
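
A minimal sketch of the two addressing styles, assuming GCC-style
inline asm on x86 (plain C is used on other targets). The helper names
and the byte load are made up for illustration; the first variant
passes a precomputed address in one register, in the spirit of
"r"(nnz+b_idx), and the second lets the addressing mode add base and
index.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_INLINE_ASM 1
#else
#define HAVE_X86_INLINE_ASM 0
#endif

/* Hypothetical helper: the address is computed in C and handed over
 * in a single register. */
static uint8_t load_precomputed(const uint8_t *var, ptrdiff_t idx)
{
    uint32_t out;
#if HAVE_X86_INLINE_ASM
    __asm__ ("movzbl (%1), %0"
             : "=r"(out)
             : "r"(var + idx)
             : "memory");       /* the asm reads memory not listed as an operand */
#else
    out = var[idx];
#endif
    return (uint8_t)out;
}

/* Hypothetical helper: base and index stay in separate registers and
 * the (base,index) addressing mode adds them.  The index is pointer
 * sized so both registers have the same width on x86-64. */
static uint8_t load_base_index(const uint8_t *var, ptrdiff_t idx)
{
    uint32_t out;
#if HAVE_X86_INLINE_ASM
    __asm__ ("movzbl (%1,%2), %0"
             : "=r"(out)
             : "r"(var), "r"(idx)
             : "memory");
#else
    out = var[idx];
#endif
    return (uint8_t)out;
}
```

Both helpers should return the same byte for the same arguments. The
second form only pays off if the same base register can be shared by
several loads.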

The second patch ("test") tries to use d_idx as a global (which it is,
in effect). Why doesn't this work?

-                "por  (%0,%1), %%mm1 \n" // nnz[b] || nnz[bn]
+                "por  %1(%0), %%mm1 \n" // nnz[b] || nnz[bn]
                 ::"r"(nnz+b_idx),
-                  "r"(d_idx)
+                  "g"(d_idx)

/var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccVhAktc.s:341:junk
`(%rax)' after expression
/var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccVhAktc.s:341:suffix
or operands invalid for `por'
/var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccVhAktc.s:393:junk
`(%rax)' after register
/var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccVhAktc.s:430:junk
`(%rax)' after expression
/var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccVhAktc.s:430:suffix
or operands invalid for `por'
/var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccVhAktc.s:466:junk
`(%rax)' after expression
/var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccVhAktc.s:466:suffix
or operands invalid for `por'
/var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccVhAktc.s:502:junk
`(%rax)' after expression
/var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccVhAktc.s:502:suffix
or operands invalid for `por'
/var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccVhAktc.s:622:junk
`(%rax)' after expression
/var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccVhAktc.s:622:suffix
or operands invalid for `por'
/var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccVhAktc.s:712:junk
`(%rcx)' after expression
/var/folders/Rz/RzQTCSLsFPWQeOEO5EXsJE+++TI/-Tmp-//ccVhAktc.s:712:suffix
or operands invalid for `por'

Ronald
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fix-lfstrength-inline-asm-moreloop.patch
Type: application/octet-stream
Size: 2786 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100924/c8e3846d/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test
Type: application/octet-stream
Size: 696 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100924/c8e3846d/attachment-0001.obj>


