[FFmpeg-devel] [PATCH 3/3] avcodec/scpr: optimize shift loop.
James Almer
jamrial at gmail.com
Sat Sep 9 00:43:06 EEST 2017
On 9/8/2017 6:29 PM, Michael Niedermayer wrote:
> Speeds code up from 50sec to 15sec
>
> Fixes Timeout
> Fixes: 3242/clusterfuzz-testcase-5811951672229888
>
> Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
> ---
> libavcodec/scpr.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/libavcodec/scpr.c b/libavcodec/scpr.c
> index 37fbe7a106..2ef63a7bf8 100644
> --- a/libavcodec/scpr.c
> +++ b/libavcodec/scpr.c
> @@ -827,7 +827,16 @@ static int decode_frame(AVCodecContext *avctx, void *data, int *got_frame,
> return ret;
>
> for (y = 0; y < avctx->height; y++) {
> - for (x = 0; x < avctx->width * 4; x++) {
> + if (!(((uintptr_t)dst) & 7)) {
> + uint64_t *dst64 = (uint64_t *)dst;
> + int w = avctx->width>>1;
> + for (x = 0; x < w; x++) {
> + dst64[x] = (dst64[x] << 3) & 0xFCFCFCFCFCFCFCFCULL;
Shouldn't this be used only if HAVE_FAST_64BIT is true, with a version
shifting four bytes at a time used otherwise? That's how we do it almost
everywhere else.
The chances of anyone bothering to write SIMD for this decoder are
almost none, so adding optimized C loops is OK in this case.
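
To make the suggestion concrete, here is a rough, untested-in-tree sketch of
what a word-size-dependent loop could look like. The function name
shift_bytes_left3 is made up for illustration, and the #ifndef fallback is
only a stand-in so the snippet builds on its own (in FFmpeg the macro comes
from config.h). Note that the sketch masks each byte with 0xF8: a left shift
by 3 carries three bits across every byte boundary, so three low bits per
byte need clearing, whereas the 0xFC in the quoted hunk clears only two.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifndef HAVE_FAST_64BIT
#define HAVE_FAST_64BIT 1 /* stand-in for this sketch; FFmpeg defines it in config.h */
#endif

/* Hypothetical helper: shift every byte of buf left by 3 bits. */
static void shift_bytes_left3(uint8_t *buf, size_t len)
{
    size_t x = 0;

#if HAVE_FAST_64BIT
    /* Eight bytes per iteration; the mask drops bits carried in from the
     * neighbouring byte inside the same word. */
    for (; x + 8 <= len; x += 8) {
        uint64_t v;
        memcpy(&v, buf + x, sizeof(v));
        v = (v << 3) & 0xF8F8F8F8F8F8F8F8ULL;
        memcpy(buf + x, &v, sizeof(v));
    }
#else
    /* Four bytes per iteration for targets without fast 64-bit ops. */
    for (; x + 4 <= len; x += 4) {
        uint32_t v;
        memcpy(&v, buf + x, sizeof(v));
        v = (v << 3) & 0xF8F8F8F8U;
        memcpy(buf + x, &v, sizeof(v));
    }
#endif

    /* Leftover bytes that do not fill a whole word. */
    for (; x < len; x++)
        buf[x] = (uint8_t)(buf[x] << 3);
}
```

Going through memcpy keeps the sketch free of alignment and aliasing
assumptions; compilers usually turn a fixed-size memcpy into a plain load or
store. Whether that is preferable to the pointer cast used in the patch is a
separate question.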
> + }
> + x *= 8;
> + } else
> + x = 0;
How does this fix the timeout if the new code only runs when the pointer
is eight-byte aligned (or four-byte aligned, once you add that)?
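
To illustrate the concern: with a per-row check like the one in the hunk,
whether a row takes the fast path depends on both the start address and the
stride. A small sketch, where rows_on_fast_path is a made-up helper and the
addresses are passed as plain integers purely for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count the rows whose start address is 8-byte
 * aligned, i.e. the rows that would pass a ((uintptr_t)dst & 7) == 0 test. */
static int rows_on_fast_path(uintptr_t base, ptrdiff_t linesize, int height)
{
    int count = 0;

    for (int y = 0; y < height; y++) {
        /* Address of row y; unsigned arithmetic wraps for negative strides. */
        uintptr_t addr = base + (uintptr_t)((ptrdiff_t)y * linesize);
        if (!(addr & 7))
            count++;
    }
    return count;
}
```

For example, rows_on_fast_path(0, 16, 4) counts all four rows, while a base
of 4 with the same stride counts none, and a stride of 12 from an aligned
base only counts every other row. So the speedup holds only as long as the
buffer and linesize are suitably aligned.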
> + for (; x < avctx->width * 4; x++) {
> dst[x] = dst[x] << 3;
> }
> dst += frame->linesize[0];
>