[FFmpeg-devel] [PATCH] IFF: New heavy optimization of decodeplane32
Sebastian Vater
cdgs.basty
Mon May 10 15:24:56 CEST 2010
Following today's discussions here and on IRC, I have written a patch
that heavily optimizes decodeplane32.
So, have fun reviewing it ;-)
Testing with Ooze.iff gave the following results (the original code took around 55000 dezicycles):
basty at cdgs-basty:~/src/ffmpeg/build$ ./ffplay ../patches/Ooze.iff
FFplay version git-svn-r23070, Copyright (c) 2003-2010 the FFmpeg developers
built on May 9 2010 23:52:10 with gcc 4.2.4 (Ubuntu 4.2.4-1ubuntu4)
configuration: --disable-avfilter
libavutil 50.15. 2 / 50.15. 2
libavcodec 52.67. 0 / 52.67. 0
libavformat 52.62. 0 / 52.62. 0
libavdevice 52. 2. 0 / 52. 2. 0
libswscale 0.10. 0 / 0.10. 0
[IFF @ 0x8b3b820]Estimating duration from bitrate, this may be inaccurate
Input #0, IFF, from '../patches/Ooze.iff':
Duration: N/A, bitrate: N/A
Stream #0.0: Video: iff_byterun1, rgba, 666x536, PAR 1:1 DAR 333:268, 90k tbr, 90k tbn, 90k tbc
79480 dezicycles in decodeplane32, 1 runs, 0 skips
54340 dezicycles in decodeplane32, 2 runs, 0 skips
40167 dezicycles in decodeplane32, 4 runs, 0 skips
31891 dezicycles in decodeplane32, 8 runs, 0 skips
32240 dezicycles in decodeplane32, 16 runs, 0 skips
26869 dezicycles in decodeplane32, 32 runs, 0 skips
22631 dezicycles in decodeplane32, 64 runs, 0 skips
20534 dezicycles in decodeplane32, 128 runs, 0 skips
19453 dezicycles in decodeplane32, 256 runs, 0 skips
19017 dezicycles in decodeplane32, 512 runs, 0 skips
18698 dezicycles in decodeplane32, 1022 runs, 2 skips
18549 dezicycles in decodeplane32, 2046 runs, 2 skips
18505 dezicycles in decodeplane32, 4090 runs, 6 skips
18469 dezicycles in decodeplane32, 8183 runs, 9 skips
2.32 A-V: 0.000 s:0.0 aq= 0KB vq= 0KB sq= 0B f=0/0 0/0
The disassembly of the main loop is:
6d3: 0f b6 07 movzbl (%edi),%eax
6d6: c0 e8 02 shr $0x2,%al
6d9: 83 e0 3c and $0x3c,%eax
6dc: 8b 14 83 mov (%ebx,%eax,4),%edx
6df: 09 11 or %edx,(%ecx)
6e1: 8b 54 83 04 mov 0x4(%ebx,%eax,4),%edx
6e5: 09 51 04 or %edx,0x4(%ecx)
6e8: 8b 54 83 08 mov 0x8(%ebx,%eax,4),%edx
6ec: 09 51 08 or %edx,0x8(%ecx)
6ef: 8b 44 83 0c mov 0xc(%ebx,%eax,4),%eax
6f3: 09 41 0c or %eax,0xc(%ecx)
6f6: 0f b6 07 movzbl (%edi),%eax
6f9: 83 c7 01 add $0x1,%edi
6fc: c1 e0 02 shl $0x2,%eax
6ff: 83 e0 3f and $0x3f,%eax
702: 8b 14 83 mov (%ebx,%eax,4),%edx
705: 09 51 10 or %edx,0x10(%ecx)
708: 8b 54 83 04 mov 0x4(%ebx,%eax,4),%edx
70c: 09 51 14 or %edx,0x14(%ecx)
70f: 8b 54 83 08 mov 0x8(%ebx,%eax,4),%edx
713: 09 51 18 or %edx,0x18(%ecx)
716: 8b 44 83 0c mov 0xc(%ebx,%eax,4),%eax
71a: 09 41 1c or %eax,0x1c(%ecx)
71d: 83 c1 20 add $0x20,%ecx
720: 83 ee 01 sub $0x1,%esi
723: 75 ae jne 6d3 <decode_frame_byterun1+0x2f3>
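For readers who do not want to decode the listing by hand, here is a rough C sketch of what the loop above appears to do. It is reconstructed from the disassembly only, not taken from the attached patch: each source byte is split into two nibbles, each nibble selects a group of four 32-bit words in a lookup table, and those words are ORed into eight destination pixels. The names build_lut and decodeplane32_sketch, the table layout, and the runtime table construction are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill a 16x4 table for one bitplane.
 * Group n (0..15) holds four words, one per bit of the nibble n,
 * most significant bit first. A word is (1u << plane) if that bit
 * is set and 0 otherwise. */
static void build_lut(uint32_t lut[64], int plane)
{
    assert(plane >= 0 && plane < 32);
    for (unsigned nibble = 0; nibble < 16; nibble++)
        for (unsigned bit = 0; bit < 4; bit++)
            lut[nibble * 4 + bit] =
                ((nibble >> (3 - bit)) & 1) ? (1u << plane) : 0;
}

/* Sketch of the main loop: one source byte yields eight pixels.
 * dst must have room for 8 * buf_size words. */
static void decodeplane32_sketch(uint32_t *dst, const uint8_t *buf,
                                 size_t buf_size, const uint32_t lut[64])
{
    while (buf_size--) {
        /* High nibble times 4: matches shr $2 / and $0x3c. */
        unsigned idx = (*buf >> 2) & 0x3c;
        dst[0] |= lut[idx];
        dst[1] |= lut[idx + 1];
        dst[2] |= lut[idx + 2];
        dst[3] |= lut[idx + 3];

        /* Low nibble times 4: matches shl $2 / and $0x3f
         * (the two low bits are already zero after the shift). */
        idx = (*buf++ << 2) & 0x3c;
        dst[4] |= lut[idx];
        dst[5] |= lut[idx + 1];
        dst[6] |= lut[idx + 2];
        dst[7] |= lut[idx + 3];

        dst += 8;
    }
}
```

The point of the table is that the per-bit shift and mask work is replaced by four loads and four ORs per nibble, which is what the unrolled mov/or pairs in the listing show.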
--
Best regards,
:-) Basty/CDGS (-:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: iff-decoder-fix-heavy-dp32.patch
Type: text/x-patch
Size: 2827 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100510/db4743a4/attachment.bin>