[FFmpeg-devel] [PATCH] move h264 loopfilter strength code to yasm
Ronald S. Bultje
Fri Sep 24 13:17:06 CEST 2010
On Thu, Sep 23, 2010 at 11:18 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Thu, Sep 23, 2010 at 06:13:30PM -0400, Ronald S. Bultje wrote:
>> $subj. This could likely be done in inline asm as well but I still
>> can't write that.
> can i help you to learn it?
- we need a way to clobber xmm registers. This turns out to be very
difficult and I haven't really looked at it very seriously. Help get
something like Reimar's patch committed without breaking any of the
fate systems. You are still the maintainer and there is a patch, so
there's no reason why this can't be finished.
- write a smallish tutorial, e.g. "how to write a copy_pixels() in asm
using mmx/sse" plus maybe a list of useful macros like TRANSPOSE (so I
can grep for them and see how it changes register order). I didn't
learn yasm on my own; I looked around for people to teach me, and
Jason did. And hey, it works, I sort of get this stuff now. Inline asm
has some tricks that yasm does not have, like the three colons at the
end of each block and the constraint modifiers on the operands. I
think I can sort of read them. Write a few examples on how to use this
properly and efficiently, and how not to shoot yourself in the foot.
Or a good-practices-with-inline-asm guide. Once I have something to
start with, it shouldn't be too hard. After starting with yasm (this
was 1-2
months ago), I tried adding one extra argument to a function (I think
SSE2 MC) to read one extra register using "r"(src+stride*3) or
something like that, and _it just didn't work_. It wouldn't compile,
giving weird compile errors about invalid constraints or something
like that. That's extremely frustrating, especially when nobody on IRC
understands the error (or how to write such code) either.
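
For illustration only, here is a minimal sketch of the kind of example
asked for above: a small copy_pixels-style routine in GCC extended
inline asm. The function name copy_pixels16x4 and its signature are
hypothetical, not existing FFmpeg code. It shows the three
colon-separated sections (outputs, inputs, clobbers), an extra input
passed as "r"(src + stride * 3), and xmm registers named in the clobber
list. The asm path is only compiled for x86-64 with a GCC-compatible
compiler, which is an assumption of this sketch; everywhere else a
plain memcpy fallback is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not an FFmpeg function: copy 4 rows of 16 bytes. */
static void copy_pixels16x4(uint8_t *dst, const uint8_t *src, ptrdiff_t stride)
{
    assert(dst != NULL && src != NULL);
#if defined(__GNUC__) && defined(__x86_64__) && !defined(__ILP32__)
    __asm__ volatile(
        /* Load rows 0..3; row 3 uses its own precomputed pointer. */
        "movdqu (%[s0]),           %%xmm0 \n\t"
        "movdqu (%[s0],%[st]),     %%xmm1 \n\t"
        "movdqu (%[s0],%[st],2),   %%xmm2 \n\t"
        "movdqu (%[s3]),           %%xmm3 \n\t"
        /* Store rows 0..3 (unaligned stores, so no alignment demands). */
        "movdqu %%xmm0, (%[d0])           \n\t"
        "movdqu %%xmm1, (%[d0],%[st])     \n\t"
        "movdqu %%xmm2, (%[d0],%[st],2)   \n\t"
        "movdqu %%xmm3, (%[d3])           \n\t"
        : /* first colon: outputs (none, results go through memory) */
        : /* second colon: inputs, each forced into a register by "r" */
          [s0] "r"(src), [s3] "r"(src + stride * 3),
          [d0] "r"(dst), [d3] "r"(dst + stride * 3),
          [st] "r"(stride)
        : /* third colon: clobbers, memory plus the xmm registers used */
          "memory", "xmm0", "xmm1", "xmm2", "xmm3"
    );
#else
    /* Portable fallback so the sketch builds on other targets. */
    for (int y = 0; y < 4; y++)
        memcpy(dst + y * stride, src + y * stride, 16);
#endif
}
```

The named operands (%[s0], %[st], ...) are used instead of %0, %1 so
that adding one more input does not renumber the others. All loads
happen before any store, so the "memory" clobber is what tells the
compiler that dst is written and src is read behind its back.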
If you get started with these two, I can send a patch that does the
same as the original but without moving it away from inline asm.