[FFmpeg-devel] [HACK] 50% faster H.264 decoding

Mon Aug 23 02:31:24 CEST 2010

On 08/23/2010 01:26 AM, Jason Garrett-Glaser wrote:
>> ASM optimizations are single target and could be outright wrong when
>> recycled (e.g x86 vs amd64).
> 
> You have no idea how the yasm abstraction layer works and should stop
> posting about things you aren't familiar with.

Point me to something and I'll be glad to learn. I'm not sure you are
that familiar with link-time optimization either.

Having "abstraction layers" (macros to rename registers and alias
instructions?) still doesn't change the fact that asm is closer to the
CPU, and when you write it you expect it to behave in a way that is
right for the target CPU. If instruction costs, load delays and such
change in a different implementation of the same arch, it isn't just a
matter of recompiling with a compiler that is aware of it: you have to
rewrite at least part of your asm in order to have it perform equally
well.
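
To make that concrete, here is a minimal sketch. The function name
sad16 is made up for illustration and this is not FFmpeg code. A plain
C routine like this carries no scheduling decisions of its own, so
rebuilding it with a different tuning option (for example GCC's -mtune)
lets the compiler choose instruction order again for the new core. A
hand-written asm version of the same loop would keep whatever order its
author picked until someone rewrites it.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical example: sum of absolute differences over one 16-byte row.
 * Nothing here fixes instruction order, unrolling or register use; those
 * choices are left to the compiler and can change per target. */
static unsigned sad16(const uint8_t *a, const uint8_t *b)
{
    unsigned sum = 0;

    for (int i = 0; i < 16; i++)
        sum += (unsigned)abs(a[i] - b[i]); /* uint8_t promotes to int, so no wraparound */

    return sum;
}
```

Whether the compiler's result actually matches tuned asm is a separate
question; the sketch only shows where the per-CPU decisions live.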

That's fine if you don't have any other option that works comparably
well, since we all know that the C intrinsics for amd64 and NEON aren't
reliable at all nowadays and even inline asm has its share of unsolved
bugs. But that doesn't mean that standalone assembly is the one true
way, or that it's a nice feature that using yasm prevents link-time
optimization on toolchains that provide it.

lu

-- 

Luca Barbato
Gentoo/linux
http://dev.gentoo.org/~lu_zero