[Ffmpeg-devel] gcc4 support & MMX fixups (from Debian)
Måns Rullgård
mru
Wed Feb 1 21:59:26 CET 2006
Michael Niedermayer <michaelni at gmx.at> writes:
> Hi
>
> On Wed, Feb 01, 2006 at 01:56:21AM +0100, Paweł Sikora wrote:
>> On Wednesday, 1 February 2006 01:39, Aurelien Jacobs wrote:
>> > Paweł Sikora <pluto at pld-linux.org> wrote:
>>
>> > > hmmm, the 4.1/4.0 fixed_transpose4x4 are equal but the benchmarks differ.
>> > > maybe orig_transpose4x4 has different prologue?
>> >
>> > seems so.
>> >
>> > > [ 4.1 / -O2 ]
>> > > orig_transpose4x4:
>> > > leal (%rdx,%rdx), %r9d
>> > > leal (%rcx,%rcx), %eax
>> > > movslq %edx,%r11
>> > > movslq %ecx,%r8
>> > > movslq %r9d,%r10
>> > > addl %edx, %r9d
>> > > movslq %eax,%rdx
>> > > addl %ecx, %eax
>> > > movslq %r9d,%r9
>> > > cltq
>>
>> > [ 4.0 / -O2 ]
>> > orig_transpose4x4:
>> > leal (%rdx,%rdx), %r8d
>> > movslq %edx,%r10
>> > leaq (%rcx,%rcx,2), %rax
>> > movslq %r8d,%r9
>> > addl %edx, %r8d
>> > movslq %r8d,%r8
>>
>> yeah, the 4.1 gives worse code and my first benchmark can be sent
>> to /dev/null. moreover the second fix (s/int/long/) simplifies the x86-64
>> prologue and gives a measurable gain.
>
> maybe we should typedef int int64_t; on x86-64? arrays where space matters
> should be of the intXX_t type or similar anyway
Did you by any chance mean #define int int64_t? The typedef you
suggest is illegal, and will break things.
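
To illustrate the s/int/long/ change quoted above, here is a minimal
sketch in plain C. It is not the code from the thread: the real
orig_transpose4x4 is MMX inline assembly, and the names transpose4x4_int,
transpose4x4_long and transposes_agree are hypothetical. It assumes an LP64
target such as x86-64 Linux, where int is 32 bits while long and pointers
are 64 bits, so offsets computed from int strides may need widening (the
movslq/cltq instructions in the listings above) before they can be used in
an address.

```c
#include <assert.h>   /* for the assert-based check shown in the usage note */
#include <stdint.h>
#include <string.h>

/* Hypothetical plain-C stand-ins; the real orig_transpose4x4 is MMX inline
 * assembly and is not reproduced here. */

/* int strides: on x86-64 each offset derived from a 32-bit stride may need
 * a sign extension before it can be used in a 64-bit address. */
static void transpose4x4_int(uint8_t *dst, const uint8_t *src,
                             int dst_stride, int src_stride)
{
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            dst[j * dst_stride + i] = src[i * src_stride + j];
}

/* long strides (the s/int/long/ change): on LP64 targets the offsets are
 * already pointer-width, so no widening is required. */
static void transpose4x4_long(uint8_t *dst, const uint8_t *src,
                              long dst_stride, long src_stride)
{
    for (long i = 0; i < 4; i++)
        for (long j = 0; j < 4; j++)
            dst[j * dst_stride + i] = src[i * src_stride + j];
}

/* Returns 1 when both variants produce the same, correct transpose. */
static int transposes_agree(void)
{
    uint8_t src[4 * 8], a[4 * 8], b[4 * 8];

    for (int k = 0; k < 32; k++)
        src[k] = (uint8_t)k;
    memset(a, 0, sizeof a);
    memset(b, 0, sizeof b);

    transpose4x4_int(a, src, 8, 8);
    transpose4x4_long(b, src, 8, 8);

    /* Source element (row 1, col 2) must land at (row 2, col 1). */
    return memcmp(a, b, sizeof a) == 0 && a[2 * 8 + 1] == src[1 * 8 + 2];
}
```

Calling assert(transposes_agree()) from main checks that the two variants
behave identically; only the stride type differs. Whether a given compiler
actually emits fewer instructions for the long version depends on the
compiler version and flags, so compare the output of gcc -O2 -S rather than
taking this for granted.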
--
Måns Rullgård
mru at inprovide.com