[Ffmpeg-devel] [PATCH] SSE counterpart of ff_imdct_calc_3dn2
Michael Niedermayer
michaelni
Thu Aug 24 12:23:42 CEST 2006
Hi
On Thu, Aug 24, 2006 at 06:39:52AM +0200, Luca Barbato wrote:
> Rich Felker wrote:
>
> >
> > And I still insist that this statement is fundamentally false. Better
> > than what? Whatever code gcc generates with the intrinsics, you can
> > always generate the same or better code if you just write it yourself.
>
> I don't know how slow each possible op is for each cpu, gcc should for
> most of the documented ones....
gcc does not know it (exactly) either, because:
1. the docs tend to be somewhat "over-optimistic" in their values.
IIRC there are some cases where the claimed throughput of some instructions
cannot be achieved because some stages of the pipeline simply can't handle
it, so the gcc devels IMO should benchmark the instructions themselves, not
blindly believe what the docs say
2. "modern" cpus are complex and considering every part is not possible. It's
not possible because you don't know how long a read or write will take,
don't know whether a branch will be predicted or not, and don't know at what
address various things will be. Yes, that matters: put 3 variables exactly
4096 bytes apart and access them in a loop and you will have 100% cache
misses on several cpus, and if not, try 5 such variables or a larger
power-of-2 spacing. These are all things the author will likely know more
about than gcc, as she wrote the code and understands it; gcc doesn't
3. different "revisions" of cpus need different amounts of time to execute
stuff, the various P4s for example; gcc does not know about that though
4. the docs only mention some of the information; there's a lot missing. For
example, no doc I've seen from amd/intel explains how the cpus reorder
instructions
furthermore, if you don't know approximately how fast each op is, then your
code will suck, and it won't make a difference whether gcc reorders things
or not
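The address-spacing effect from point 2 can be tried with a rough sketch
like the one below. It is not from any existing code: the function name
touch_strided and its parameters are made up for illustration, and whether
a slowdown shows up at all depends on the cache size and associativity of
the cpu at hand, so the timing is only something to compare between runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: touch `count` one-byte "variables" placed `stride`
 * bytes apart, `rounds` times over. Returns the number of increments
 * performed (0 if the allocation fails). If `seconds` is non-NULL, the CPU
 * time spent in the loop, as reported by clock(), is stored there. */
static uint64_t touch_strided(size_t count, size_t stride,
                              unsigned rounds, double *seconds)
{
    assert(count > 0 && stride > 0);

    /* volatile keeps the compiler from optimizing the accesses away */
    volatile unsigned char *buf = calloc(count, stride);
    uint64_t touched = 0;

    if (!buf)
        return 0;

    clock_t start = clock();
    for (unsigned r = 0; r < rounds; r++) {
        for (size_t i = 0; i < count; i++) {
            buf[i * stride]++;   /* one access per "variable" */
            touched++;
        }
    }
    clock_t end = clock();

    if (seconds)
        *seconds = (double)(end - start) / CLOCKS_PER_SEC;

    free((void *)buf);
    return touched;
}
```

Calling it once with a stride of 4096 and once with a stride that is not a
power of 2 (4096 + 64, say), with the same count and a large number of
rounds, then comparing the two times, gives a first impression; raising
count or using a larger power-of-2 stride follows the suggestion above.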
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is