[MPlayer-dev-eng] [RFC] disable fastmemcpy on x86-64 by default
Attila Kinali
attila at kinali.ch
Sun May 27 23:38:57 CEST 2007
On Sun, 27 May 2007 23:11:45 +0200
Michael Niedermayer <michaelni at gmx.at> wrote:
> > Interesting are benchmark 2 and 5, which both are faster with
> > the patch.
I missed benchmark 4 here, which is also slightly faster.
Interesting to note: benchmarks 2 and 5 are faster in the VC(!),
benchmark 4 in the VO. Which suggests that benchmarks 2 and 5 use dr.
(I couldn't see anything in the -v log, and I'm too tired to check the code.)
> hmm, there's something odd ...
> where is this code using any memcpy at all?
> doesn't mga vo always use mem2agpcpy() ?
> it seems the patch disabled this and uses plain memcpy() for it
Yes, from libvo/fast_memcpy.h:
---schnipp---
#ifdef USE_FASTMEMCPY
[...]
#else /* USE_FASTMEMCPY */
#define mem2agpcpy(a,b,c) memcpy(a,b,c)
#endif
---schnapp---
And due to the patch, USE_FASTMEMCPY isn't set anymore.
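
To make the fallback concrete, here is a minimal sketch of what the
preprocessor ends up doing once USE_FASTMEMCPY is no longer defined. The
prototype in the #ifdef branch and the copy_demo() helper are my own
assumptions for illustration; they are not copied from libvo/fast_memcpy.h.

```c
#include <stddef.h>
#include <string.h>

/* USE_FASTMEMCPY is deliberately left undefined here, as after the patch. */
#ifdef USE_FASTMEMCPY
/* Assumed prototype, only so this branch is well-formed. */
void *mem2agpcpy(void *to, const void *from, size_t len);
#else /* USE_FASTMEMCPY */
#define mem2agpcpy(a,b,c) memcpy(a,b,c)
#endif

/* Hypothetical helper: copies one short "line" through mem2agpcpy(),
 * which the preprocessor has turned into a plain memcpy() call.
 * Returns 1 if destination and source match afterwards. */
int copy_demo(void)
{
    char src[16] = "one video line";
    char dst[16] = {0};

    mem2agpcpy(dst, src, sizeof(src));
    return strcmp(dst, src) == 0;
}
```

So every mem2agpcpy() call site in the vo silently becomes a libc memcpy()
call, with no AGP-specific copy path left.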
> and MIN_LEN is 2k and mem2agpcpy is just done per line which is
> less than 2k so it practically falls back to rep movsb
I tried a series of files with shorter line sizes than the ones
listed; none of them showed any speedup. But I didn't check whether
any of them used dr.
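
For reference, the MIN_LEN behaviour described in the quote can be sketched
roughly like this. It is only a model of the control flow under the quoted
assumption that MIN_LEN is 2k: threshold_copy(), last_path and the enum are
hypothetical names, the byte loop stands in for the rep movsb path, and
memcpy() stands in for the optimized block copy.

```c
#include <stddef.h>
#include <string.h>

#define MIN_LEN 2048            /* "2k", taken from the quoted text */

enum copy_path { PATH_SMALL, PATH_BLOCK };

/* Records which path the last call took, for inspection only. */
static enum copy_path last_path;

/* Hypothetical model of a copy routine with a size threshold. */
static void *threshold_copy(void *to, const void *from, size_t len)
{
    if (len < MIN_LEN) {
        /* Short copy: plain byte loop, standing in for rep movsb. */
        unsigned char *d = to;
        const unsigned char *s = from;

        while (len--)
            *d++ = *s++;
        last_path = PATH_SMALL;
        return to;
    }
    /* Long copy: memcpy() stands in for the optimized block loop. */
    last_path = PATH_BLOCK;
    return memcpy(to, from, len);
}
```

If a vo copies one line at a time and a line is, say, 720 bytes, every
call takes the short path and the optimized loop is never reached.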
> this is an impressive series of bugs ...
>
> if my hypothesis is correct then reimar will have some work to do ;)
BTW: is there any reason why fast_memcpy & co. are not defined inline?
I even think that in the case w/o runtime CPU detection, we could
just do a #define fast_memcpy fast_memcpy_XXX (where XXX is one of MMX, SSE, ...)
and get rid not only of a call, but of a stack frame too.
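
Roughly what I have in mind, as a sketch only. The guard macros
SKETCH_RUNTIME_CPUDETECT and SKETCH_HAVE_MMX are made-up names, the
dispatcher prototype is an assumption, and the per-CPU variants are thin
memcpy() wrappers standing in for the real code:

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins for the per-CPU variants; the real ones would contain the
 * MMX/SSE loops. Defined static inline so the compiler can drop the call. */
static inline void *fast_memcpy_MMX(void *to, const void *from, size_t len)
{
    return memcpy(to, from, len);
}

static inline void *fast_memcpy_C(void *to, const void *from, size_t len)
{
    return memcpy(to, from, len);
}

#if defined(SKETCH_RUNTIME_CPUDETECT)
/* Runtime detection: a real function has to pick the variant
 * (assumed prototype, defined elsewhere). */
void *fast_memcpy(void *to, const void *from, size_t len);
#elif defined(SKETCH_HAVE_MMX)
/* CPU known at compile time: alias the name directly. */
#define fast_memcpy fast_memcpy_MMX
#else
#define fast_memcpy fast_memcpy_C
#endif
```

With neither guard defined, fast_memcpy(dst, src, n) expands to
fast_memcpy_C(dst, src, n), which the compiler is then free to inline.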
Attila Kinali
--
Linux is... when you can solve even simple things with a cryptic
postfix language
-- Daniel Hottinger