[FFmpeg-devel] [PATCH + RFC] Faster ff_celp_lp_synthesis_filterf() (and failed SSE SIMD version)

Vitor Sessak vitor1001
Sun Dec 13 20:55:08 CET 2009


Hi,

ff_celp_lp_synthesis_filterf() is used for QCELP and RA288, and in the 
future will be used for AMRNB, SIPR and WMAVoice. It always shows up as 
one of the most costly functions when profiling, so here is my attempt 
to optimize it.
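
For readers unfamiliar with the function, an LP synthesis filter is an all-pole recursion: each output sample is the input sample minus a weighted sum of the previous outputs. The following is only a minimal sketch of that recursion, not the code in the attached patch. The name lp_synthesis_sketch, the sign convention, and the assumption that the filter history is stored in out[-filter_length..-1] are all illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical reference version of an LP synthesis (all-pole) filter:
 *
 *   out[n] = in[n] - sum_{i=1..filter_length} coeffs[i-1] * out[n-i]
 *
 * Assumption: the caller has placed the previous filter_length output
 * samples just before out[0], so reading out[-1], out[-2], ... is valid.
 */
static void lp_synthesis_sketch(float *out, const float *coeffs,
                                const float *in, int buffer_length,
                                int filter_length)
{
    assert(out != NULL && coeffs != NULL && in != NULL);
    assert(buffer_length >= 0 && filter_length >= 0);

    for (int n = 0; n < buffer_length; n++) {
        float acc = in[n];

        /* Feedback part: walks backwards through earlier outputs. */
        for (int i = 1; i <= filter_length; i++)
            acc -= coeffs[i - 1] * out[n - i];

        out[n] = acc;
    }
}
```

Because out[n] depends on out[n-1], the outer loop cannot be run in parallel as written. The inner loop is a short dot product, so most of the available work per sample is there.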

Unpatched:

14048 dezicycles in old_c, 65478 runs, 58 skips
13902 dezicycles in old_c, 130976 runs, 96 skips
13843 dezicycles in old_c, 261942 runs, 202 skips
13801 dezicycles in old_c, 523888 runs, 400 skips
13779 dezicycles in old_c, 1047756 runs, 820 skips
13761 dezicycles in old_c, 2095523 runs, 1629 skips

Patched:

7830 dezicycles in new_c, 65509 runs, 27 skips
7764 dezicycles in new_c, 131011 runs, 61 skips
7726 dezicycles in new_c, 262026 runs, 118 skips
7717 dezicycles in new_c, 524004 runs, 284 skips
7695 dezicycles in new_c, 1048035 runs, 541 skips
7678 dezicycles in new_c, 2096178 runs, 974 skips

Patched and compiled with -mfpmath=sse:

6452 dezicycles in new_c_float_sse, 65511 runs, 25 skips
6421 dezicycles in new_c_float_sse, 131015 runs, 57 skips
6393 dezicycles in new_c_float_sse, 262044 runs, 100 skips
6380 dezicycles in new_c_float_sse, 524112 runs, 176 skips
6372 dezicycles in new_c_float_sse, 1048203 runs, 373 skips
6366 dezicycles in new_c_float_sse, 2096375 runs, 777 skips

My attempt at writing an SSE SIMD version of it (attached):

6797 dezicycles in sse, 65510 runs, 26 skips
6750 dezicycles in sse, 131022 runs, 50 skips
6722 dezicycles in sse, 262052 runs, 92 skips
6702 dezicycles in sse, 524093 runs, 195 skips
6692 dezicycles in sse, 1048193 runs, 383 skips
6684 dezicycles in sse, 2096412 runs, 740 skips
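
To illustrate what a SIMD version of such a filter has to deal with, here is one possible way to vectorize the feedback dot product with SSE intrinsics. It is a sketch under stated assumptions and is not the contents of the attached a.c: the helper name lp_history_dot is hypothetical, the length is assumed to be a multiple of 4, and a scalar fallback is used when SSE is not available at compile time.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/*
 * Hypothetical helper: returns
 *   sum_{i=1..len} coeffs[i-1] * out_n[-i]
 * where out_n points at the sample currently being computed.
 * Assumption: len is a multiple of 4 and out_n[-len..-1] is readable.
 */
static float lp_history_dot(const float *coeffs, const float *out_n, int len)
{
    assert(len >= 0 && len % 4 == 0);
#if defined(__SSE__)
    __m128 acc = _mm_setzero_ps();

    for (int i = 0; i < len; i += 4) {
        /* Four coefficients in increasing order. */
        __m128 c = _mm_loadu_ps(coeffs + i);
        /* Four history samples, stored oldest first in memory. */
        __m128 h = _mm_loadu_ps(out_n - i - 4);
        /* Reverse the lanes so the newest sample pairs with coeffs[i]. */
        h = _mm_shuffle_ps(h, h, _MM_SHUFFLE(0, 1, 2, 3));
        acc = _mm_add_ps(acc, _mm_mul_ps(c, h));
    }

    /* Horizontal sum of the four partial sums. */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    return (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
#else
    /* Scalar fallback with the same result for exactly representable data. */
    float sum = 0.0f;
    for (int i = 1; i <= len; i++)
        sum += coeffs[i - 1] * out_n[-i];
    return sum;
#endif
}
```

The history is stored in the opposite order to the coefficients, so each iteration needs a lane reversal, and every output sample needs a horizontal sum before the next sample can start. That per-sample overhead may be part of why a SIMD version gains little over scalar code here, although this was not measured.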

Any comments/suggestions are welcome.

-Vitor
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lp_synthesis.diff
Type: text/x-patch
Size: 3592 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20091213/6ccd08ae/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: a.c
Type: text/x-csrc
Size: 3992 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20091213/6ccd08ae/attachment.c>


