[FFmpeg-cvslog] avcodec/scpr: optimize shift loop.
Michael Niedermayer
git at videolan.org
Mon Sep 11 20:45:14 EEST 2017
ffmpeg | branch: release/3.3 | Michael Niedermayer <michael at niedermayer.cc> | Fri Sep 8 23:29:13 2017 +0200| [b590758298cc6f7bac710ebaecb99a4de878c7f8] | committer: Michael Niedermayer
avcodec/scpr: optimize shift loop.
Speeds code up from 50sec to 15sec
Fixes Timeout
Fixes: 3242/clusterfuzz-testcase-5811951672229888
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: James Almer <jamrial at gmail.com>
Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
(cherry picked from commit 981f04b2ae2d6e0355386aaff39840eb5d390a36)
Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=b590758298cc6f7bac710ebaecb99a4de878c7f8
---
libavcodec/scpr.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/libavcodec/scpr.c b/libavcodec/scpr.c
index b4cc7df07f..78a6d5c0cd 100644
--- a/libavcodec/scpr.c
+++ b/libavcodec/scpr.c
@@ -824,8 +824,19 @@ static int decode_frame(AVCodecContext *avctx, void *data, int *got_frame,
     if (ret < 0)
         return ret;
 
+    // scale up each sample by 8
     for (y = 0; y < avctx->height; y++) {
-        for (x = 0; x < avctx->width * 4; x++) {
+        // If the image is sufficiently aligned, compute 8 samples at once
+        if (!(((uintptr_t)dst) & 7)) {
+            uint64_t *dst64 = (uint64_t *)dst;
+            int w = avctx->width>>1;
+            for (x = 0; x < w; x++) {
+                dst64[x] = (dst64[x] << 3) & 0xFCFCFCFCFCFCFCFCULL;
+            }
+            x *= 8;
+        } else
+            x = 0;
+        for (; x < avctx->width * 4; x++) {
             dst[x] = dst[x] << 3;
         }
         dst += frame->linesize[0];
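
For readers unfamiliar with the trick in the hunk above, here is a minimal standalone sketch of shifting eight byte samples per iteration and finishing the row with a scalar tail loop. It is not the code from scpr.c: the function names (shift3_scalar, shift3_swar, shift3_selfcheck) are hypothetical, it uses memcpy instead of the aligned pointer cast, and it masks with 0xF8 per byte so that the result equals the scalar loop for every possible byte value. The 0xFC mask in the patch clears only the two lowest bits of each byte, so by the same arithmetic it matches the scalar loop only when bit 7 of every sample is zero; the commit message does not say whether the decoder guarantees that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar reference: the one-byte-at-a-time loop the patch replaces. */
static void shift3_scalar(uint8_t *row, size_t n)
{
    for (size_t i = 0; i < n; i++)
        row[i] = (uint8_t)(row[i] << 3);
}

/* Hypothetical helper (not from scpr.c): shift 8 bytes per iteration. */
static void shift3_swar(uint8_t *row, size_t n)
{
    size_t i = 0;

    assert(row != NULL || n == 0);

    for (; i + 8 <= n; i += 8) {
        uint64_t v;
        /* memcpy avoids alignment and aliasing assumptions. */
        memcpy(&v, row + i, sizeof(v));
        /* The 64-bit shift leaks the top 3 bits of each byte into its
         * neighbour's low 3 bits; the 0xF8 mask clears them again. */
        v = (v << 3) & 0xF8F8F8F8F8F8F8F8ULL;
        memcpy(row + i, &v, sizeof(v));
    }

    /* Scalar tail for the remaining n % 8 bytes. */
    for (; i < n; i++)
        row[i] = (uint8_t)(row[i] << 3);
}

/* Returns 1 if both versions agree on every byte value, else 0. */
static int shift3_selfcheck(void)
{
    uint8_t a[256 + 5], b[256 + 5];

    for (size_t i = 0; i < sizeof(a); i++)
        a[i] = b[i] = (uint8_t)i;

    shift3_scalar(a, sizeof(a));
    shift3_swar(b, sizeof(b));
    return memcmp(a, b, sizeof(a)) == 0;
}
```

Because the mask is the same in every byte, the sketch does not depend on host endianness. In the patch, each pixel occupies 4 bytes, which is why a row of avctx->width pixels is covered by avctx->width>>1 64-bit words, with the trailing byte loop handling whatever is left.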