[FFmpeg-devel] [PATCH 3/3] avcodec/scpr: optimize shift loop.
James Almer
jamrial at gmail.com
Sat Sep 9 01:15:43 EEST 2017
On 9/8/2017 6:47 PM, Kieran Kunhya wrote:
> On Fri, 8 Sep 2017 at 22:29 Michael Niedermayer <michael at niedermayer.cc>
> wrote:
>
>> Speeds code up from 50sec to 15sec
>>
>> Fixes Timeout
>> Fixes: 3242/clusterfuzz-testcase-5811951672229888
>>
>> Found-by: continuous fuzzing process
>> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
>> Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
>> ---
>> libavcodec/scpr.c | 11 ++++++++++-
>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/libavcodec/scpr.c b/libavcodec/scpr.c
>> index 37fbe7a106..2ef63a7bf8 100644
>> --- a/libavcodec/scpr.c
>> +++ b/libavcodec/scpr.c
>> @@ -827,7 +827,16 @@ static int decode_frame(AVCodecContext *avctx, void *data, int *got_frame,
>> return ret;
>>
>> for (y = 0; y < avctx->height; y++) {
>> - for (x = 0; x < avctx->width * 4; x++) {
>> + if (!(((uintptr_t)dst) & 7)) {
>> + uint64_t *dst64 = (uint64_t *)dst;
>> + int w = avctx->width>>1;
>> + for (x = 0; x < w; x++) {
>> + dst64[x] = (dst64[x] << 3) & 0xFCFCFCFCFCFCFCFCULL;
>> + }
>> + x *= 8;
>> + } else
>> + x = 0;
>> + for (; x < avctx->width * 4; x++) {
>> dst[x] = dst[x] << 3;
>> }
>> dst += frame->linesize[0];
>> --
>> 2.14.1
>>
>
> This is as clear as mud.
It processes eight bytes at a time if the buffer is 8-byte aligned,
then finishes the remaining bytes one at a time.
If the buffer is unaligned, it processes everything one byte at a time like
it used to.
See ff_h2645_extract_rbsp() and add_bytes_c() for another example of
this optimization.
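
As a rough standalone sketch of the same pattern (shift_row_left3() is a
hypothetical helper name, not something in scpr.c), assuming the row is a
plain byte buffer of len bytes: take the 64-bit path when the pointer is
8-byte aligned, then let the byte loop handle the tail, or the whole row if
the pointer is unaligned. The sketch masks with 0xF8 per byte so the wide
path matches the byte loop for any input. The patch above masks with 0xFC,
which gives the same result only when the top bit of every byte is clear.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of scpr.c: shift every byte of a row
 * left by 3 bits, the same operation as the original per-byte loop. */
static void shift_row_left3(uint8_t *dst, size_t len)
{
    size_t x = 0;

    if (!((uintptr_t)dst & 7)) {
        /* Aligned start: process 8 bytes per iteration. */
        uint64_t *dst64 = (uint64_t *)dst;
        size_t words = len / 8;

        for (x = 0; x < words; x++) {
            /* A 64-bit shift carries the top 3 bits of each byte into the
             * low 3 bits of its neighbour; 0xF8 per byte clears them. */
            dst64[x] = (dst64[x] << 3) & 0xF8F8F8F8F8F8F8F8ULL;
        }
        x *= 8; /* turn the word count into a byte offset */
    }

    /* Remaining tail bytes, or the whole row if dst was unaligned. */
    for (; x < len; x++)
        dst[x] = (uint8_t)(dst[x] << 3);
}
```

The per-byte mask makes the result independent of endianness, since each
byte lane ends up holding its own value shifted by 3 regardless of where the
carried bits came from.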