[FFmpeg-devel] [PATCH] update doc/optimization.txt

Michael Niedermayer michaelni
Tue Sep 21 20:43:56 CEST 2010

On Tue, Sep 21, 2010 at 12:20:00PM -0400, Ronald S. Bultje wrote:
> Hi,
> On Tue, Sep 21, 2010 at 11:29 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> > On Tue, Sep 21, 2010 at 10:32:38AM -0400, Ronald S. Bultje wrote:
> >> On Tue, Sep 21, 2010 at 10:29 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> >> > On Tue, Sep 21, 2010 at 10:15:00AM -0400, Ronald S. Bultje wrote:
> >> >> On Tue, Sep 21, 2010 at 10:11 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> >> >> > also id like to repeat what i said earlier, that iam all in favor of yasm
> >> >> > when an optimization needs it
> >> >>
> >> >> That sounds promising. So the doc update to mention yasm is OK then? ;-).
> >> >
> >> > its saying something else ;)
> >>
> >> It doesn't say that people should convert inline asm to yasm, or that
> >> there is a preference for one or the other. What does it say that it
> >> shouldn't (or the other way around)?
> >
> > it doesnt tell people which to use and why
> > there are arguments for either ...
> Attached any better? (It's still mostly undefined.)
> Ronald

>  optimization.txt |   27 ++++++++++++++++++++++++---
>  1 file changed, 24 insertions(+), 3 deletions(-)
> 3aebbea14722c8709b2e1dd54cd60ffecef85044  doc.diff
> Index: doc/optimization.txt
> ===================================================================
> --- doc/optimization.txt	(revision 25135)
> +++ doc/optimization.txt	(working copy)
> @@ -157,17 +157,38 @@
>  __asm__(
>      "1: ....
>      ...
> -    "jump_instruciton ....
> +    "jump_instruction ....
>  Do not use C loops:
>  do{

Commit this or I'll go crazy seeing it every time in unrelated patches ;)

>      __asm__(
>          ...
>  }while()
> -Use __asm__() instead of intrinsics. The latter requires a good optimizing compiler
> -which gcc is not.
> +Do not use multiple inline asm blocks in a single C function. The compiler is
> +not required to maintain register values between asm blocks, and depending on
> +this behaviour can break with any future version of gcc.
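
A minimal sketch of the point in the hunk above, assuming GCC or Clang on x86 (the function name add_then_double is hypothetical and not taken from FFmpeg). If two asm blocks are unavoidable, the value should cross the boundary through a C variable named in the operand lists, never through a register that the first block merely happened to leave set:

```c
/* Sketch only: GCC/Clang extended asm, x86 or x86-64 assumed. */
static int add_then_double(int a, int b)
{
    int sum;

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* Block 1: sum = a + b. The result leaves the block as an output operand. */
    __asm__("add %2, %0" : "=r"(sum) : "0"(a), "r"(b) : "cc");

    /* Block 2: sum += sum. The value enters as a "+r" operand, so the
     * compiler is free to move or spill it between the two blocks. */
    __asm__("add %0, %0" : "+r"(sum) : : "cc");
#else
    /* Portable fallback so the sketch still builds elsewhere. */
    sum = a + b;
    sum += sum;
#endif
    return sum;
}
```

Merging both instructions into one asm block, as the patch text recommends, removes the question entirely.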

> +It also breaks on
> +64-bit Windows.

This is not strictly correct, so I suggest you drop it.
I am fine with it if you document the win64 problems correctly, but that seems
like a lot of work and borders on being off topic.

> +For x86, use yasm or __asm__(), do not use intrinsics. The latter requires a
> +good optimizing compiler which gcc is not.
> +Inline asm vs. external asm
> +---------------------------
> +Both inline asm (__asm__("..") in a .c file, handled by a compiler such as gcc)
> +and external asm (.s or .asm files, handled by an assembler such as yasm/nasm)
> +are accepted in FFmpeg. Which one to use differs per specific case.
> +
> +- if your code is intended to be inlined in a C function, inline asm is always
> +   better, because external asm cannot be inlined
> +- if your code calls external functions, yasm is always better

So far I can accept it.

> + , because of the
> +   red zone on the stack in x86-64 which means undefined behaviour if gcc is
> +   not aware of a function call within the inline asm code block

I think you are confusing some things here:
1. No modern kernel runs its interrupt handlers on the user application's
   stack; such a kernel would be exploitable.
2. Running signal handlers on the user app's stack is a very stupid idea, but
   it may still be done; I don't know.
3. You can change the stack pointer any way you like, so you can make sure
   that a function you call, and any signal handlers, have whatever stack
   they need. This may be a bit messy (if you want to access things that are
   addressed through the stack pointer), but it is not something that can't
   be done.
4. The compiler may choose to optimize the stack pointer so that it doesn't
   point to memory you can just use. gcc might limit this to functions that
   don't call other functions, and it might limit it to +-128 bytes (the red
   zone) around the "correct" stack pointer.
   I used the stack pointer as a general-purpose register a decade ago on
   Windows and it worked fine; I have also done this on Linux and it worked
   fine too, but I don't think I ever tested it with signal handlers.
   Either way, it's not too hard to add/sub 128 from the stack pointer to
   move it to a clear area before the call.
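
The add/sub 128 idea in point 4 could look roughly like the sketch below. It assumes GCC or Clang on x86-64 with the System V ABI (Linux, BSD, macOS); the names calls, callee and call_past_red_zone are hypothetical and only serve the illustration. Besides skipping the red zone it also realigns the stack pointer, since the ABI expects 16-byte alignment at a call, and it lists the caller-saved registers as clobbers because the compiler cannot see the call:

```c
/* Sketch only: GCC/Clang, x86-64, System V ABI assumed. */
static int calls;                       /* hypothetical demo counter */

__attribute__((noinline)) static void callee(void)
{
    calls++;
}

static int call_past_red_zone(int x)
{
    volatile int local = x;             /* may be placed in the red zone */
    void (*fn)(void) = callee;

#if defined(__x86_64__) && defined(__GNUC__) && !defined(_WIN32)
    __asm__ volatile(
        "mov  %%rsp, %%rbx \n\t"        /* remember the original stack pointer */
        "sub  $128, %%rsp  \n\t"        /* step over the 128-byte red zone */
        "and  $-16, %%rsp  \n\t"        /* 16-byte alignment required at a call */
        "call *%[fn]       \n\t"        /* return address lands below the red zone */
        "mov  %%rbx, %%rsp \n\t"        /* restore the stack pointer */
        :
        : [fn] "r"(fn)
        : "rbx", "rax", "rcx", "rdx", "rsi", "rdi",
          "r8", "r9", "r10", "r11",
          "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7",
          "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15",
          "memory", "cc");
#else
    fn();                               /* plain call where the sketch does not apply */
#endif
    return local;                       /* still x if the red zone was left intact */
}
```

The function pointer is passed with an "r" constraint on purpose: a memory operand could be addressed relative to the stack pointer, which the asm has just changed.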

That said, of course yasm is the right choice here; I am just saying it _can_
be done in inline asm too.

Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The greatest way to live with honor in this world is to be what we pretend
to be. -- Socrates