[FFmpeg-devel] [PATCH] IFF: New heavy optimization of decodeplane32

Sebastian Vater cdgs.basty
Mon May 10 15:24:56 CEST 2010


Based on our discussions here and on IRC today, I have written a heavy
optimization patch for decodeplane32.

So, have fun reviewing it ;-)

Testing with Ooze.iff gave the following (the original was around 55000
dezicycles; this run settles near 18500, roughly a 3x speedup):
basty at cdgs-basty:~/src/ffmpeg/build$ ./ffplay ../patches/Ooze.iff
FFplay version git-svn-r23070, Copyright (c) 2003-2010 the FFmpeg developers
  built on May  9 2010 23:52:10 with gcc 4.2.4 (Ubuntu 4.2.4-1ubuntu4)
  configuration: --disable-avfilter
  libavutil     50.15. 2 / 50.15. 2
  libavcodec    52.67. 0 / 52.67. 0
  libavformat   52.62. 0 / 52.62. 0
  libavdevice   52. 2. 0 / 52. 2. 0
  libswscale     0.10. 0 /  0.10. 0
[IFF @ 0x8b3b820]Estimating duration from bitrate, this may be inaccurate
Input #0, IFF, from '../patches/Ooze.iff':
  Duration: N/A, bitrate: N/A
    Stream #0.0: Video: iff_byterun1, rgba, 666x536, PAR 1:1 DAR 333:268, 90k tbr, 90k tbn, 90k tbc
79480 dezicycles in decodeplane32, 1 runs, 0 skips
54340 dezicycles in decodeplane32, 2 runs, 0 skips
40167 dezicycles in decodeplane32, 4 runs, 0 skips
31891 dezicycles in decodeplane32, 8 runs, 0 skips
32240 dezicycles in decodeplane32, 16 runs, 0 skips
26869 dezicycles in decodeplane32, 32 runs, 0 skips
22631 dezicycles in decodeplane32, 64 runs, 0 skips
20534 dezicycles in decodeplane32, 128 runs, 0 skips
19453 dezicycles in decodeplane32, 256 runs, 0 skips
19017 dezicycles in decodeplane32, 512 runs, 0 skips
18698 dezicycles in decodeplane32, 1022 runs, 2 skips
18549 dezicycles in decodeplane32, 2046 runs, 2 skips
18505 dezicycles in decodeplane32, 4090 runs, 6 skips
18469 dezicycles in decodeplane32, 8183 runs, 9 skips
   2.32 A-V:  0.000 s:0.0 aq=    0KB vq=    0KB sq=    0B f=0/0   0/0

Disassembly of the main loop:
 6d3:   0f b6 07                movzbl (%edi),%eax
 6d6:   c0 e8 02                shr    $0x2,%al
 6d9:   83 e0 3c                and    $0x3c,%eax
 6dc:   8b 14 83                mov    (%ebx,%eax,4),%edx
 6df:   09 11                   or     %edx,(%ecx)
 6e1:   8b 54 83 04             mov    0x4(%ebx,%eax,4),%edx
 6e5:   09 51 04                or     %edx,0x4(%ecx)
 6e8:   8b 54 83 08             mov    0x8(%ebx,%eax,4),%edx
 6ec:   09 51 08                or     %edx,0x8(%ecx)
 6ef:   8b 44 83 0c             mov    0xc(%ebx,%eax,4),%eax
 6f3:   09 41 0c                or     %eax,0xc(%ecx)
 6f6:   0f b6 07                movzbl (%edi),%eax
 6f9:   83 c7 01                add    $0x1,%edi
 6fc:   c1 e0 02                shl    $0x2,%eax
 6ff:   83 e0 3f                and    $0x3f,%eax
 702:   8b 14 83                mov    (%ebx,%eax,4),%edx
 705:   09 51 10                or     %edx,0x10(%ecx)
 708:   8b 54 83 04             mov    0x4(%ebx,%eax,4),%edx
 70c:   09 51 14                or     %edx,0x14(%ecx)
 70f:   8b 54 83 08             mov    0x8(%ebx,%eax,4),%edx
 713:   09 51 18                or     %edx,0x18(%ecx)
 716:   8b 44 83 0c             mov    0xc(%ebx,%eax,4),%eax
 71a:   09 41 1c                or     %eax,0x1c(%ecx)
 71d:   83 c1 20                add    $0x20,%ecx
 720:   83 ee 01                sub    $0x1,%esi
 723:   75 ae                   jne    6d3 <decode_frame_byterun1+0x2f3>
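
For readers who prefer C to the listing above, here is a rough sketch of the
kind of loop that would compile to it. It is reconstructed from the
disassembly only, not copied from the attached patch, so the names lut,
build_lut and decodeplane32_sketch, the single global table, and the
MSB-first pixel order are all assumptions. Each source byte is split into two
nibbles; each nibble selects four precomputed 32-bit values that are OR'ed
into the destination pixels. The listing masks the second index with 0x3f;
the sketch uses 0x3c, which is equivalent after the shift by 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table layout: 16 nibble values x 4 pixels each. */
static uint32_t lut[16 * 4];

/* Fill the table for one bitplane (plane < 32): pixel k of nibble n gets
 * bit `plane` if bit (3 - k) of n is set, i.e. the leftmost pixel comes
 * from the most significant bit (assumed order). */
static void build_lut(unsigned plane)
{
    for (unsigned n = 0; n < 16; n++)
        for (unsigned k = 0; k < 4; k++)
            lut[n * 4 + k] = (uint32_t)((n >> (3 - k)) & 1u) << plane;
}

/* OR one bitplane row into 32-bit pixels, 8 pixels per source byte.
 * dst must hold at least 8 * buf_size entries. */
static void decodeplane32_sketch(uint32_t *dst, const uint8_t *buf,
                                 size_t buf_size)
{
    assert(buf_size > 0);                  /* do/while runs at least once */
    do {
        unsigned idx = (*buf >> 2) & 0x3c; /* high nibble * 4 */
        dst[0] |= lut[idx];
        dst[1] |= lut[idx + 1];
        dst[2] |= lut[idx + 2];
        dst[3] |= lut[idx + 3];
        idx = (*buf++ << 2) & 0x3c;        /* low nibble * 4 */
        dst[4] |= lut[idx];
        dst[5] |= lut[idx + 1];
        dst[6] |= lut[idx + 2];
        dst[7] |= lut[idx + 3];
        dst += 8;
    } while (--buf_size);
}
```

With the table built for plane 3, the byte 0xA5 (binary 1010 0101) sets bit 3
in pixels 0, 2, 5 and 7 and leaves the others untouched.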

-- 

Best regards,
                   :-) Basty/CDGS (-:

-------------- next part --------------
A non-text attachment was scrubbed...
Name: iff-decoder-fix-heavy-dp32.patch
Type: text/x-patch
Size: 2827 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100510/db4743a4/attachment.bin>


