[FFmpeg-devel] [PATCH] update doc/optimization.txt
Ronald S. Bultje
Tue Sep 21 16:07:40 CEST 2010
On Tue, Sep 21, 2010 at 9:55 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Tue, Sep 21, 2010 at 09:48:43AM -0400, Ronald S. Bultje wrote:
>> On Tue, Sep 21, 2010 at 9:30 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
>> > On Tue, Sep 21, 2010 at 05:37:40AM -0700, Jason Garrett-Glaser wrote:
>> >> > interesting strawman argument
>> >> > no one was talking about cases that cannot easily be done in inline asm
>> >> > not that calling from inline would be impossible or anything, but I surely
>> >> > agree that for these 0.1% of asm yasm is likely the better choice
>> >> You mean this 99%. It's only 0.1% because you don't think about
>> >> optimizations that can't be done under your current system.
>> > please elaborate on what other optimizations are possible in yasm that cannot
>> > be done in inline asm.
>> The biggest one is that I can create a "double-width" version in SSE*
>> (usually SSE2) and a "single-width" version in MMX* (usually MMX2) of
>> a function (e.g. subpel MC, weighted prediction, intra prediction, or
>> something) in a single go. I don't need to write the function twice.
>> Optimizing one will optimize both. This is incredibly handy if you're
>> writing new asm code.
> that's just a source difference through macro/preprocessor use, not an
> optimization that yasm can do that inline cannot.
> And actually it's unlikely that this is optimal. SSE2 and later cpus are
> unlikely to have the same optimal instruction sequence that pre-SSE2 cpus had,
> so 2 functions do make sense.
> and it wouldn't be impossible to merge them with the C preprocessor, but that's
> not something I would consider a good idea, just theoretically possible
I think the point here is that this "C preprocessor" does not yet exist.
For yasm, we're talking about x86inc.asm. It exists. It's fantastic. I
seriously can't believe anyone would write asm without it. SWAP should
get the Nobel Prize for computing science (it makes code faster
and/or easier to understand).
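
To make the "write it once, get both widths" idea concrete, here is a
minimal sketch in plain C. It is only an analogy: it is not x86inc.asm
syntax and not FFmpeg code, and it uses no SIMD at all. The macro
DEFINE_AVG_ROW and the functions avg_row_w8 / avg_row_w16 are hypothetical
names made up for illustration; the 8- and 16-byte steps merely stand in
for MMX-width and SSE2-width registers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical template: one function body, parameterized by the number
 * of bytes handled per step. Each instantiation computes the rounded
 * average of two pixel rows, WIDTH bytes at a time. Bytes past the last
 * full WIDTH-sized step are left untouched.
 */
#define DEFINE_AVG_ROW(SUFFIX, WIDTH)                                  \
void avg_row_##SUFFIX(uint8_t *dst, const uint8_t *a,                  \
                      const uint8_t *b, size_t len)                    \
{                                                                      \
    for (size_t i = 0; i + (WIDTH) <= len; i += (WIDTH))               \
        for (size_t j = 0; j < (WIDTH); j++)                           \
            dst[i + j] = (uint8_t)((a[i + j] + b[i + j] + 1) >> 1);    \
}

/* Stand-in for the "single-width" (8-byte, MMX-sized) variant. */
DEFINE_AVG_ROW(w8, 8)

/* Stand-in for the "double-width" (16-byte, SSE2-sized) variant. */
DEFINE_AVG_ROW(w16, 16)
```

Any change to the body of the template shows up in both avg_row_w8 and
avg_row_w16, which is the property being argued for above. Whether the
same instruction sequence is actually optimal at both widths is a separate
question, as the quoted reply points out.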