[FFmpeg-devel] [PATCH] update doc/optimization.txt
Ronald S. Bultje
Wed Sep 22 14:37:07 CEST 2010
On Tue, Sep 21, 2010 at 2:43 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Tue, Sep 21, 2010 at 12:20:00PM -0400, Ronald S. Bultje wrote:
>>  optimization.txt |   27 ++++++++++++++++++++++++---
>>  1 file changed, 24 insertions(+), 3 deletions(-)
>> 3aebbea14722c8709b2e1dd54cd60ffecef85044  doc.diff
>> Index: doc/optimization.txt
>> --- doc/optimization.txt      (revision 25135)
>> +++ doc/optimization.txt      (working copy)
>> @@ -157,17 +157,38 @@
>>      "1: ....
>>      ...
>> -    "jump_instruciton ....
>> +    "jump_instruction ....
>>  Do not use C loops:
> commit this or I'll go crazy seeing it every time in unrelated patches ;)
OK, committed this piece.
>>      __asm__(
>>          ...
>> -Use __asm__() instead of intrinsics. The latter requires a good optimizing compiler
>> -which gcc is not.
>> +Do not use multiple inline asm blocks in a single C function. The compiler is
>> +not required to maintain register values between asm blocks, and depending on
>> +this behaviour can break with any future version of gcc.
>> +It also breaks on
>> +64-bit Windows.
> this is not strictly correct so I suggest you drop it
> but I am fine if you correctly document win64 problems, though it seems a lot of work
> and bordering on being off topic
OK, removed then.
>> +For x86, use yasm or __asm__(), do not use intrinsics. The latter requires a
>> +good optimizing compiler which gcc is not.
>> +Inline asm vs. external asm
>> +Both inline asm (__asm__("..") in a .c file, handled by a compiler such as gcc)
>> +and external asm (.s or .asm files, handled by an assembler such as yasm/nasm)
>> +are accepted in FFmpeg. Which one to use differs per specific case.
>> +- if your code is intended to be inlined in a C function, inline asm is always
>> +   better, because external asm cannot be inlined
>> +- if your code calls external functions, yasm is always better
> so far i can accept it
>> + , because of the
>> +   red zone on the stack in x86-64 which means undefined behaviour if gcc is
>> +   not aware of a function call within the inline asm code block
> I think you confuse some things here
> 1. no modern kernel runs its interrupt handlers on the user application's
>    stack; such a kernel would be exploitable
> 2. running signal handlers on the user app stack is a very stupid idea but it
>    maybe is still done, I don't know
> 3. you can change the stack pointer any way you like, so you can make sure
>    that a function you will call and any signal handlers have whatever stack
>    they need. This may be a bit messy (if you want to access things that may
>    be accessed through the stack pointer), but it's not a thing that can't be
>    done
> 4. the compiler may choose to optimize the stack pointer so that it doesn't
>    point to things you can just use; gcc might limit this to functions that
>    don't call other functions and it might limit that to +-128 bytes, aka the
>    red zone, of the "correct" stack pointer.
>    I've used the stack pointer as a general purpose register a decade ago in
>    Windows and it worked fine, I've also done this on Linux and it worked fine
>    too, but I don't think I've ever tested this with signal handlers.
>    Either way it's not too hard to add/sub 128 from the stack pointer to move
>    it to a clear area before the call
> that said, of course yasm is the choice here, I am just saying it _can_ be done
> in inline asm too
I can just state it without reason and leave it as-is (no problem for
me), unless you have suggestions on how to re-word it.
What about the last of the three bulletpoints?
- in many cases, both can be used and it just depends on the preference of the
person writing the asm. For new asm, the choice is up to you. For existing
asm, you'll likely want to maintain whatever form it is currently in unless
there is a good reason to change it.
You should love that piece ;-). It basically says I won't convert much
more inline asm to external asm (yasm) unless I have a good
reason. I might even have to understand inline asm at some point.
Also, should I mention (in general tips) that functions that use huge
structs (e.g. MpegEncContext) are A) "discouragable" in general and B)
better written in inline asm than yasm because of the difficulty of
predicting struct offsets?