[FFmpeg-devel] [PATCH] Optimization of original IFF codec

Sebastian Vater cdgs.basty
Mon Apr 26 20:08:59 CEST 2010


Hi Mans!

Måns Rullgård wrote:
> Sebastian Vater <cdgs.basty at googlemail.com> writes:
>
>   
>> Hi Mans!
>>
>> Måns Rullgård wrote:
>>     
>>> This is inefficient.  You are building the table afresh on each call
>>> to the function.  Make the table static const, dropping the shift, and
>>> instead shift the table value inside the loop.
>>>   
>>>       
>> I just benchmarked both, my solution is way faster:
>>     
>
> I don't believe that, simply because it has more work to do.  How did
> you benchmark it?
>   
Why? The table init is done only once per call, whereas moving the
bit shift into the inner loop means shifting on every iteration.
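
To make the trade-off concrete, here is a minimal sketch of the two
variants. It is not the code from the patch: base_lut, the function
names, the 16-entry nibble table and the word-wise output layout are
all assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical base table: nibble n expands to four byte lanes, each 0 or 1
 * (most significant nibble bit in the lowest lane). */
static const uint32_t base_lut[16] = {
    0x00000000, 0x01000000, 0x00010000, 0x01010000,
    0x00000100, 0x01000100, 0x00010100, 0x01010100,
    0x00000001, 0x01000001, 0x00010001, 0x01010001,
    0x00000101, 0x01000101, 0x00010101, 0x01010101,
};

/* Variant A: build a pre-shifted table once per call (16 shifts),
 * so the inner loop is only lookup + OR. */
static void decode_percall_lut(uint32_t *dst, const uint8_t *buf,
                               size_t buf_size, unsigned plane)
{
    uint32_t lut[16];
    for (unsigned n = 0; n < 16; n++)
        lut[n] = base_lut[n] << plane;
    for (size_t i = 0; i < buf_size; i++) {
        dst[2 * i]     |= lut[buf[i] >> 4];
        dst[2 * i + 1] |= lut[buf[i] & 0x0F];
    }
}

/* Variant B: static const table, shift applied on every lookup
 * (two shifts per input byte). */
static void decode_static_lut(uint32_t *dst, const uint8_t *buf,
                              size_t buf_size, unsigned plane)
{
    for (size_t i = 0; i < buf_size; i++) {
        dst[2 * i]     |= base_lut[buf[i] >> 4]   << plane;
        dst[2 * i + 1] |= base_lut[buf[i] & 0x0F] << plane;
    }
}

/* Returns 1 if both variants produce identical output for planes 0..7. */
static int variants_agree(void)
{
    const uint8_t buf[4] = { 0xA5, 0x3C, 0xFF, 0x00 };
    uint32_t a[8], b[8];
    memset(a, 0, sizeof a);
    memset(b, 0, sizeof b);
    for (unsigned plane = 0; plane < 8; plane++) {
        decode_percall_lut(a, buf, sizeof buf, plane);
        decode_static_lut(b, buf, sizeof buf, plane);
    }
    return memcmp(a, b, sizeof a) == 0;
}
```

In this sketch variant A pays 16 shifts plus a stack table per call,
while variant B pays 2 * buf_size shifts. Which one wins on a given
compiler and CPU is exactly what has to be measured.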

Benchmarking was done by putting START_TIMER _before_ lut init and
STOP_TIMER at the very end of decodeplane:
     GetBitContext gb;
     unsigned i;
+  START_TIMER;
     const unsigned b = (buf_size * 8) + bps - 1;
     const unsigned b32 = b & ~3;
     const uint32_t lut[] = {0x0000000,
[...]
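
For anyone wondering how to read the "runs" and "skips" columns in the
logs that follow, here is a rough sketch of the averaging I believe
the timer macros perform. It is modeled on my understanding of
libavutil's timer, not copied from it: TimerStats and timer_record are
hypothetical names, and the elapsed cycle count is passed in rather
than read from the CPU so that the logic can be checked in isolation.

```c
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint64_t sum;    /* accumulated cycles of accepted runs */
    unsigned runs;   /* accepted measurements */
    unsigned skips;  /* rejected outliers */
} TimerStats;

/* Record one measurement; returns 1 when a report line was printed. */
static int timer_record(TimerStats *t, uint64_t elapsed, const char *id)
{
    /* Assumed outlier rule: the first two runs always count, later ones
     * only if they took less than 8x the running average. */
    if (t->runs < 2 || elapsed < 8 * t->sum / t->runs) {
        t->sum += elapsed;
        t->runs++;
    } else {
        t->skips++;
    }

    /* Report whenever runs + skips is a power of two. */
    unsigned total = t->runs + t->skips;
    if ((total & (total - 1)) == 0) {
        /* "dezicycles" are tenths of a cycle, hence the factor 10. */
        printf("%llu dezicycles in %s, %u runs, %u skips\n",
               (unsigned long long)(t->sum * 10 / t->runs),
               id, t->runs, t->skips);
        return 1;
    }
    return 0;
}
```

This would be consistent with the logs: 4094 runs + 2 skips = 4096,
and 8187 runs + 5 skips = 8192. Since the timer is started before the
lut init, the per-call table setup is included in every measurement.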

BTW, the same holds for decodeplane32. With my patch:
basty at cdgs-basty:~/src/ffmpeg/build$ ./ffplay ../patches/Ooze.iff
FFplay version git-fb63232, Copyright (c) 2003-2010 the FFmpeg developers
  built on Apr 26 2010 19:54:23 with gcc 4.2.4 (Ubuntu 4.2.4-1ubuntu4)
  configuration:
  libavutil     50.14. 0 / 50.14. 0
  libavcodec    52.66. 0 / 52.66. 0
  libavformat   52.61. 0 / 52.61. 0
  libavdevice   52. 2. 0 / 52. 2. 0
  libswscale     0.10. 0 /  0.10. 0
[IFF @ 0x8b32790]Estimating duration from bitrate, this may be inaccurate
Input #0, IFF, from '../patches/Ooze.iff':
  Duration: N/A, bitrate: N/A
    Stream #0.0: Video: iff_byterun1, rgba, 666x536, PAR 1:1 DAR 333:268, 90k tbr, 90k tbn, 90k tbc
37660 dezicycles in decodeplane32, 1 runs, 0 skips
29235 dezicycles in decodeplane32, 2 runs, 0 skips
24687 dezicycles in decodeplane32, 4 runs, 0 skips
22337 dezicycles in decodeplane32, 8 runs, 0 skips
21055 dezicycles in decodeplane32, 16 runs, 0 skips
20382 dezicycles in decodeplane32, 32 runs, 0 skips
20107 dezicycles in decodeplane32, 64 runs, 0 skips
19890 dezicycles in decodeplane32, 128 runs, 0 skips
20120 dezicycles in decodeplane32, 256 runs, 0 skips
20015 dezicycles in decodeplane32, 512 runs, 0 skips
19850 dezicycles in decodeplane32, 1024 runs, 0 skips
19774 dezicycles in decodeplane32, 2048 runs, 0 skips
19780 dezicycles in decodeplane32, 4094 runs, 2 skips
19751 dezicycles in decodeplane32, 8187 runs, 5 skips
   1.93 A-V:  0.000 s:0.0 aq=    0KB vq=    0KB sq=    0B f=0/0   0/0

decodeplane32 with all 4 LUTs made static and the shift done in the
inner loop:
basty at cdgs-basty:~/src/ffmpeg/build$ ./ffplay ../patches/Ooze.iff
FFplay version git-fb63232, Copyright (c) 2003-2010 the FFmpeg developers
  built on Apr 26 2010 19:54:23 with gcc 4.2.4 (Ubuntu 4.2.4-1ubuntu4)
  configuration:
  libavutil     50.14. 0 / 50.14. 0
  libavcodec    52.66. 0 / 52.66. 0
  libavformat   52.61. 0 / 52.61. 0
  libavdevice   52. 2. 0 / 52. 2. 0
  libswscale     0.10. 0 /  0.10. 0
[IFF @ 0x8b32790]Estimating duration from bitrate, this may be inaccurate
Input #0, IFF, from '../patches/Ooze.iff':
  Duration: N/A, bitrate: N/A
    Stream #0.0: Video: iff_byterun1, rgba, 666x536, PAR 1:1 DAR 333:268, 90k tbr, 90k tbn, 90k tbc
42790 dezicycles in decodeplane32, 1 runs, 0 skips
35080 dezicycles in decodeplane32, 2 runs, 0 skips
30992 dezicycles in decodeplane32, 4 runs, 0 skips
28612 dezicycles in decodeplane32, 8 runs, 0 skips
27280 dezicycles in decodeplane32, 16 runs, 0 skips
26460 dezicycles in decodeplane32, 32 runs, 0 skips
26191 dezicycles in decodeplane32, 64 runs, 0 skips
25971 dezicycles in decodeplane32, 128 runs, 0 skips
25890 dezicycles in decodeplane32, 256 runs, 0 skips
25989 dezicycles in decodeplane32, 512 runs, 0 skips
25888 dezicycles in decodeplane32, 1024 runs, 0 skips
25839 dezicycles in decodeplane32, 2048 runs, 0 skips
25813 dezicycles in decodeplane32, 4093 runs, 3 skips
25816 dezicycles in decodeplane32, 8182 runs, 10 skips
   1.64 A-V:  0.000 s:0.0 aq=    0KB vq=    0KB sq=    0B f=0/0   0/0
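
Comparing the last reported averages of the two runs (19751 vs. 25816
dezicycles), the static-table variant comes out roughly 31% slower per
call on this machine. The helper below is just that arithmetic written
out. percent_slower is a hypothetical name, and the figures are the
ones from the logs above, nothing more.

```c
/* Extra cost of `slow` relative to `fast`, in percent. */
static double percent_slower(double fast, double slow)
{
    return (slow - fast) / fast * 100.0;
}
```

For example, percent_slower(19751.0, 25816.0) gives the extra cost of
the static-table variant relative to the per-call table.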

-- 

Best regards,
                   :-) Basty/CDGS (-:

Why am I spiritual? Quite simply, because I would rather search for
the formula for world peace than for the theory of everything.



