[FFmpeg-cvslog] celp: optimise ff_celp_lp_synthesis_filter()

Mans Rullgard git at videolan.org
Mon Aug 13 14:49:43 CEST 2012


ffmpeg | branch: master | Mans Rullgard <mans at mansr.com> | Sat Aug 11 04:18:53 2012 +0100| [fddc5b9bea39968ed1f45c667869428865de7626] | committer: Mans Rullgard

celp: optimise ff_celp_lp_synthesis_filter()

Adding instead of subtracting the products in the loop allows the
compiler to generate more efficient multiply-accumulate instructions
when 16-bit multiply-subtract is not available. ARM has only
multiply-accumulate for 16-bit operands.  In general, if only one
variant exists, it is usually accumulate rather than subtract.
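
A minimal sketch of why the two loop forms give the same result. The
function names and the `hist` pointer (standing in for `out + n`) are
illustrative and not part of the patch; the sketch assumes a 32-bit `int`
and an arithmetic right shift on negative values, as the original code does.

```c
#include <stdint.h>

/* Form used before the patch: start from +rounder and subtract each product.
 * hist points at the current output position, so hist[-i] is out[n-i]. */
static int lp_sum_subtract(const int16_t *coeffs, const int16_t *hist,
                           int filter_length, int rounder)
{
    int sum = rounder;
    for (int i = 1; i <= filter_length; i++)
        sum -= coeffs[i - 1] * hist[-i];
    return sum >> 12;
}

/* Form used after the patch: start from -rounder, add each product
 * (a plain multiply-accumulate), and negate once after the loop. */
static int lp_sum_accumulate(const int16_t *coeffs, const int16_t *hist,
                             int filter_length, int rounder)
{
    int sum = -rounder;
    for (int i = 1; i <= filter_length; i++)
        sum += coeffs[i - 1] * hist[-i];
    return -sum >> 12;
}
```

Both functions compute `rounder - sum(coeffs[i-1] * hist[-i])` before the
shift; the second merely moves the negation out of the inner loop.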

In the same spirit, using the dedicated saturation function
av_clip_int16() instead of an open-coded range check allows any
platform-specific optimised version of that function to be used.
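
As a rough illustration, the open-coded range check that the patch removes
can be compared with a generic 16-bit clip. Here `clip_int16()` is a
hypothetical stand-in written for this sketch, not the libavutil
implementation, and `saturate_open_coded()` is an illustrative wrapper
around the removed expression. The sketch assumes a 32-bit `int`, an
arithmetic right shift, and inputs far enough from `INT_MAX` that
`sum + 0x8000` does not overflow.

```c
#include <stdint.h>

/* Hypothetical stand-in for a saturation helper such as av_clip_int16(). */
static int clip_int16(int a)
{
    if (a < INT16_MIN)
        return INT16_MIN;
    if (a > INT16_MAX)
        return INT16_MAX;
    return a;
}

/* The range check and saturation as written in the removed lines.
 * The unsigned comparison is true exactly when sum is outside
 * [-32768, 32767]; (sum >> 31) ^ 32767 then yields -32768 for
 * negative sum and 32767 for positive sum. */
static int saturate_open_coded(int sum)
{
    if (sum + 0x8000 > 0xFFFFU)
        sum = (sum >> 31) ^ 32767;
    return sum;
}
```

With a helper of this shape, overflow can be detected by comparing the
clipped value against the unclipped one, which is what the
`sum != sum1` test in the patch does.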

Signed-off-by: Mans Rullgard <mans at mansr.com>

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=fddc5b9bea39968ed1f45c667869428865de7626
---

 libavcodec/celp_filters.c |   15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/libavcodec/celp_filters.c b/libavcodec/celp_filters.c
index 4e5bcda..d764d19 100644
--- a/libavcodec/celp_filters.c
+++ b/libavcodec/celp_filters.c
@@ -63,17 +63,16 @@ int ff_celp_lp_synthesis_filter(int16_t *out, const int16_t *filter_coeffs,
     int i,n;
 
     for (n = 0; n < buffer_length; n++) {
-        int sum = rounder;
+        int sum = -rounder, sum1;
         for (i = 1; i <= filter_length; i++)
-            sum -= filter_coeffs[i-1] * out[n-i];
+            sum += filter_coeffs[i-1] * out[n-i];
 
-        sum = ((sum >> 12) + in[n]) >> shift;
+        sum1 = ((-sum >> 12) + in[n]) >> shift;
+        sum  = av_clip_int16(sum1);
+
+        if (stop_on_overflow && sum != sum1)
+            return 1;
 
-        if (sum + 0x8000 > 0xFFFFU) {
-            if (stop_on_overflow)
-                return 1;
-            sum = (sum >> 31) ^ 32767;
-        }
         out[n] = sum;
     }
 


