[FFmpeg-devel] [PATCH] IFF: Change decodeplane8 while to do ... while for getting 1% speedup
Sebastian Vater
cdgs.basty
Mon May 10 00:50:56 CEST 2010
Måns Rullgård wrote:
> Sebastian Vater <cdgs.basty at googlemail.com> writes:
>
>
>> Benchmarking it resulted in a 1% speedup using MRLake.iff.
>>
>
> Just a note on benchmarking. Any perceived speedup of less than 5% or
> so must be carefully checked by statistical methods. Run-to-run
> variance is often of that magnitude, especially if there is anything
> else running on the system.
>
Since the timing is averaged over every call (around 4096 calls for
MRLake.iff), that issue shouldn't be a problem here.
After all, I get differences of +/- 10 decicycles from run to run... I think
that's accurate enough.
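For anyone who wants to check a small speedup like this more formally, here is a minimal sketch of the kind of statistical test suggested above. It assumes you have collected the reported average from several independent runs of each variant; the function names and the "k standard errors" criterion are illustrative choices, not anything from FFmpeg.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples (n > 0). */
static double mean(const double *x, size_t n)
{
    double sum = 0.0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum / (double)n;
}

/* Unbiased sample variance (n > 1). */
static double sample_variance(const double *x, size_t n)
{
    double m, acc = 0.0;
    assert(n > 1);
    m = mean(x, n);
    for (size_t i = 0; i < n; i++)
        acc += (x[i] - m) * (x[i] - m);
    return acc / (double)(n - 1);
}

/*
 * Returns 1 if the means of the two sets of runs differ by more than
 * k standard errors of the difference, 0 otherwise. The comparison is
 * done on squared values so that no square root is needed.
 */
static int differs_by_k_sigma(const double *a, size_t na,
                              const double *b, size_t nb, double k)
{
    double diff = mean(a, na) - mean(b, nb);
    double se2  = sample_variance(a, na) / (double)na +
                  sample_variance(b, nb) / (double)nb;
    return diff * diff > k * k * se2;
}
```

With k = 2 this is only a rough screen: a 1% difference in the means is not flagged when the run-to-run spread is of a similar size, which is exactly the case the warning is about.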
BTW, for those interested, here is the difference between while { ... }
and do { ... } while (source and generated code for each):
    START_TIMER;
    const uint64_t *lut = plane8_lut[plane];
    while (buf_size--) {
        const uint64_t v = AV_RN64A(dst) | lut[*buf++];
        AV_WN64A(dst, v);
        dst += 8;
    }

    STOP_TIMER("decodeplane8");

     9d0:   0f b6 75 00     movzbl 0x0(%ebp),%esi
     9d4:   83 c5 01        add    $0x1,%ebp
     9d7:   8b 4c 24 54     mov    0x54(%esp),%ecx
     9db:   8b 5f 04        mov    0x4(%edi),%ebx
     9de:   8b 07           mov    (%edi),%eax
     9e0:   8b 54 f1 04     mov    0x4(%ecx,%esi,8),%edx
     9e4:   0b 04 f1        or     (%ecx,%esi,8),%eax
     9e7:   09 da           or     %ebx,%edx
     9e9:   89 07           mov    %eax,(%edi)
     9eb:   89 57 04        mov    %edx,0x4(%edi)
     9ee:   83 c7 08        add    $0x8,%edi
     9f1:   3b 6c 24 2c     cmp    0x2c(%esp),%ebp
     9f5:   75 d9           jne    9d0 <decode_frame_ilbm+0x580>

    START_TIMER;
    const uint64_t *lut = plane8_lut[plane];
    do {
        const uint64_t v = AV_RN64A(dst) | lut[*buf++];
        AV_WN64A(dst, v);
        dst += 8;
    } while (--buf_size);
    STOP_TIMER("decodeplane8");

     9b0:   0f b6 75 00     movzbl 0x0(%ebp),%esi
     9b4:   83 c5 01        add    $0x1,%ebp
     9b7:   8b 4c 24 44     mov    0x44(%esp),%ecx
     9bb:   8b 07           mov    (%edi),%eax
     9bd:   8b 57 04        mov    0x4(%edi),%edx
     9c0:   0b 04 f1        or     (%ecx,%esi,8),%eax
     9c3:   0b 54 f1 04     or     0x4(%ecx,%esi,8),%edx
     9c7:   89 07           mov    %eax,(%edi)
     9c9:   89 57 04        mov    %edx,0x4(%edi)
     9cc:   83 c7 08        add    $0x8,%edi
     9cf:   83 eb 01        sub    $0x1,%ebx
     9d2:   75 dc           jne    9b0 <decode_frame_ilbm+0x560>
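
One thing worth keeping in mind when comparing the two forms: they only behave the same when buf_size is at least 1. The following standalone sketch illustrates that. It is not the FFmpeg code: load64/store64 are hypothetical portable stand-ins for AV_RN64A/AV_WN64A, the lookup table is passed in as a parameter instead of using plane8_lut, and the int type of buf_size is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for AV_RN64A / AV_WN64A: portable 64-bit load/store. */
static uint64_t load64(const uint8_t *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

static void store64(uint8_t *p, uint64_t v)
{
    memcpy(p, &v, sizeof v);
}

/* while form: the body is skipped entirely when buf_size == 0. */
static void plane_while(uint8_t *dst, const uint8_t *buf, int buf_size,
                        const uint64_t *lut)
{
    while (buf_size--) {
        store64(dst, load64(dst) | lut[*buf++]);
        dst += 8;
    }
}

/*
 * do-while form: the body always runs at least once, so the caller must
 * guarantee buf_size >= 1. With buf_size == 0 the counter would drop to
 * -1 and the loop would run far past the end of both buffers.
 */
static void plane_do_while(uint8_t *dst, const uint8_t *buf, int buf_size,
                           const uint64_t *lut)
{
    assert(buf_size > 0);
    do {
        store64(dst, load64(dst) | lut[*buf++]);
        dst += 8;
    } while (--buf_size);
}
```

For any buf_size >= 1 both functions write identical output, so the do-while version is only a safe replacement if the caller never passes an empty buffer.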
--
Best regards,
:-) Basty/CDGS (-: