[FFmpeg-devel] [PATCH] update doc/optimization.txt
Tue Sep 21 16:08:51 CEST 2010
Michael Niedermayer <michaelni at gmx.at> writes:
> On Tue, Sep 21, 2010 at 09:48:43AM -0400, Ronald S. Bultje wrote:
>> On Tue, Sep 21, 2010 at 9:30 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
>> > On Tue, Sep 21, 2010 at 05:37:40AM -0700, Jason Garrett-Glaser wrote:
>> >> > interesting strawman argument
>> >> > no one was talking about cases that cannot easily be done in inline asm
>> >> > not that calling from inline would be impossible or anything but i surely
>> >> > agree that for these 0.1% of asm yasm is likely the better choice
>> >> You mean this 99%. It's only 0.1% because you don't think about
>> >> optimizations that can't be done under your current system.
>> > please elaborate on what other optimizations are possible in yasm that cannot
>> > be done in inline asm.
>> The biggest one is that I can create a "double-width" version in SSE*
>> (usually SSE2) and a "single-width" version in MMX* (usually MMX2) of
>> a function (e.g. subpel MC, weighted prediction, intra prediction, or
>> something) in a single go. I don't need to write the function twice.
>> Optimizing one will optimize both. This is incredibly handy if you're
>> writing new asm code.
> that's just a source difference through macro/preprocessor use, not an
> optimization that yasm can do that inline cannot.
> And actually it's unlikely that this is optimal. SSE2 and later CPUs are
> unlikely to have the same optimal instruction sequence that pre-SSE2 CPUs had,
> so 2 functions do make sense.
All SSE2 CPUs execute out of order, so the exact sequence doesn't matter.
mans at mansr.com
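
For readers following the thread, the "write it once, get both widths" idea
quoted above can be sketched in plain C with the preprocessor. This is only an
illustration of the single-source technique under discussion, not FFmpeg or
x264 code: the macro DEFINE_AVG_ROW, the function names avg_row_w8 and
avg_row_w16, and the choice of a rounded byte average as the operation are all
hypothetical. The two widths stand in for an 8-byte MMX register and a 16-byte
SSE2 register.

```c
#include <stdint.h>

/*
 * Hypothetical template: one function body, instantiated once per
 * vector width (in bytes). Changing the body changes every instance.
 * The operation is a rounded average of two byte rows.
 */
#define DEFINE_AVG_ROW(suffix, WIDTH)                             \
static void avg_row_##suffix(uint8_t *dst, const uint8_t *a,      \
                             const uint8_t *b)                    \
{                                                                 \
    for (int i = 0; i < (WIDTH); i++)                             \
        dst[i] = (uint8_t)((a[i] + b[i] + 1) >> 1);               \
}

DEFINE_AVG_ROW(w8, 8)    /* "single-width" instance, 8 bytes   */
DEFINE_AVG_ROW(w16, 16)  /* "double-width" instance, 16 bytes  */
```

Calling avg_row_w8(dst, a, b) processes 8 bytes and avg_row_w16(dst, a, b)
processes 16, both from the same source body. Whether one shared instruction
sequence is actually optimal for both CPU generations is exactly the point
being argued in the thread; the sketch only shows the source-sharing
mechanism.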