[Ffmpeg-devel] fastmemcpy in ffmpeg
Gunnar von Boehn
gunnar
Mon Sep 25 15:55:57 CEST 2006
Hi Ulrich,
Ulrich von Zadow wrote:
> Gunnar von Boehn wrote:
>
>>Hi
>>
>>Diego Biurrun wrote:
>>
>>>On Mon, Sep 25, 2006 at 10:47:40AM +0200, Michel Bardiaux wrote:
>>>
>>>
>>>>Silvano Galliani (kysucix) wrote:
>>>>
>>>>
>>>>>Is there some plan to include and use fastmemcpy implementation from
>>>>>mplayer?
>>
>>I've once collected and benchmarked a number of memcopy routines both
>>for x86 and PowerPC.
>>
>>http://www.greyhound-data.com/gunnar/glibc/
>
>
> I only found ppc data on the site, which, while very interesting, is
> only half of what you promised ;-). Did I just miss a link?
The charts of the detailed benchmarks only show PPC CPUs, that is
right. But I did some tests on x86 (Intel/AMD) CPUs as well.
This overview chart shows the difference between the Linux memcpy and
an optimized version on different CPUs, including x86.
http://www.greyhound-data.com/gunnar/glibc/membench_memcpy.gif
A simple but very effective way to double the memcopy speed
is to prefetch (stream) the source in while copying. Just
adding one prefetch instruction to the normal Linux memcpy can already
speed it up a lot, by about 50%.
These few very simple rules work on nearly all CPU architectures.
The following two points are needed for best burst write speed:
- unroll the copy loop so that it writes whole cache lines
- align the copy so that the writes are aligned to cache line boundaries
Prefetch the source some cache lines ahead to prevent stalls on source
data. The best prefetch distance depends on the CPU; typically very
good numbers are 3-7 lines ahead.
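To make these rules concrete, here is a minimal sketch in C. It is not one of the benchmarked routines. The name copy_prefetch, the 64-byte cache line and the prefetch distance of 4 lines are assumptions for illustration, and __builtin_prefetch is a GCC/Clang builtin. The fixed-size memcpy in the loop stands in for a hand-unrolled cache line copy, since compilers normally expand it inline.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values, not taken from the benchmarks: tune them per CPU. */
#define CACHE_LINE     64   /* assumed cache line size in bytes */
#define PREFETCH_LINES 4    /* inside the 3-7 line range mentioned above */

/* Hypothetical helper: copies n bytes from src to dst (no overlap). */
void *copy_prefetch(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: copy bytes until the destination is cache-line aligned. */
    size_t head = (size_t)(-(uintptr_t)d & (CACHE_LINE - 1));
    if (head > n)
        head = n;
    memcpy(d, s, head);
    d += head; s += head; n -= head;

    /* Body: write one whole, aligned cache line per iteration. */
    while (n >= CACHE_LINE) {
        /* Prefetch the source a few lines ahead, but only while the
           prefetched address still lies inside the source buffer. */
        if (n >= (PREFETCH_LINES + 1) * CACHE_LINE)
            __builtin_prefetch(s + PREFETCH_LINES * CACHE_LINE, 0, 0);
        memcpy(d, s, CACHE_LINE);   /* fixed size, expanded inline */
        d += CACHE_LINE; s += CACHE_LINE; n -= CACHE_LINE;
    }

    /* Tail: whatever is left after the last full line. */
    memcpy(d, s, n);
    return dst;
}
```

It is called like memcpy, e.g. copy_prefetch(out, in, len). Whether it beats the libc memcpy on a given machine has to be measured.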
On x86 it's good to use MMX registers/instructions for prefetching and
copying the data. By using MMX instructions you can memcpy without
totally losing (read: overwriting) your data cache during a memcopy.
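As a rough illustration of copying without overwriting the data cache, here is a sketch that uses non-temporal (streaming) stores. It rests on two assumptions: that the cache-preserving effect comes from stores that bypass the cache, and that SSE2 intrinsics (_mm_stream_si128) are an acceptable stand-in for the MMX instructions discussed here. The name copy_stream is hypothetical, and on non-SSE2 targets the sketch simply falls back to memcpy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: copies n bytes from src to dst (no overlap),
   using SSE2 streaming stores as a stand-in for the MMX approach. */
void *copy_stream(void *dst, const void *src, size_t n)
{
#if defined(__SSE2__)
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Streaming stores require a 16-byte aligned destination. */
    size_t head = (size_t)(-(uintptr_t)d & 15);
    if (head > n)
        head = n;
    memcpy(d, s, head);
    d += head; s += head; n -= head;

    while (n >= 16) {
        /* Unaligned load from the source, non-temporal store to dst. */
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_stream_si128((__m128i *)d, v);
        d += 16; s += 16; n -= 16;
    }
    _mm_sfence();      /* order the streamed stores before returning */

    memcpy(d, s, n);   /* tail */
    return dst;
#else
    return memcpy(dst, src, n);   /* fallback without SSE2 */
#endif
}
```

Streaming stores tend to pay off only for large copies whose destination is not read again soon; for small copies the normal cached path is usually the better choice.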
My focus was on PPC but some x86 routines should be included in the
source of the Linux benchmarks. I can extract these routines from the
sources for you if needed.
For copies bigger than 128 bytes, a CPU-optimized routine will usually
be about twice as fast as the normal glibc Linux version.
Cheers
Gunnar