[FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

Michael Niedermayer michael at niedermayer.cc
Wed Jan 16 00:33:03 EET 2019


On Sun, Dec 30, 2018 at 07:15:49PM +0100, Marton Balint wrote:
> 
> 
> On Fri, 28 Dec 2018, Michael Niedermayer wrote:
> 
> >On Wed, Dec 26, 2018 at 10:16:47PM +0100, Marton Balint wrote:
> >>
> >>
> >>On Wed, 26 Dec 2018, Paul B Mahol wrote:
> >>
> >>>On 12/26/18, Michael Niedermayer <michael at niedermayer.cc> wrote:
> >>>>On Wed, Dec 26, 2018 at 04:32:17PM +0100, Paul B Mahol wrote:
> >>>>>On 12/25/18, Michael Niedermayer <michael at niedermayer.cc> wrote:
> >>>>>>Fixes: Timeout
> >>>>>>Fixes: 11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> >>>>>>Before: Executed clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920 in 11294 ms
> >>>>>>After : Executed clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920 in 4249 ms
> >>>>>>
> >>>>>>Found-by: continuous fuzzing process
> >>>>>>https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> >>>>>>Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
> >>>>>>---
> >>>>>> libavutil/imgutils.c | 6 ++++++
> >>>>>> 1 file changed, 6 insertions(+)
> >>>>>>
> >>>>>>diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
> >>>>>>index 4938a7ef67..cc38f1e878 100644
> >>>>>>--- a/libavutil/imgutils.c
> >>>>>>+++ b/libavutil/imgutils.c
> >>>>>>@@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t dst_size, uint8_t *clear,
> >>>>>>         }
> >>>>>>     } else if (clear_size == 4) {
> >>>>>>         uint32_t val = AV_RN32(clear);
> >>>>>>+        uint64_t val8 = val * 0x100000001ULL;
> >>>>>>+        for (; dst_size >= 32; dst_size -= 32) {
> >>>>>>+            AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
> >>>>>>+            AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
> >>>>>>+            dst += 32;
> >>>>>>+        }
> >>>>>>         for (; dst_size >= 4; dst_size -= 4) {
> >>>>>>             AV_WN32(dst, val);
> >>>>>>             dst += 4;
> >>>>>>--
> >>>>>>2.20.1
> >>>>>>
> >>>>>
> >>>>>NAK, implement special memset function instead.
> >>>>
> >>>>I can move the added loop into a separate function, if that's what you
> >>>>suggest?
> >>>
> >>>No, don't do that.
> >>>
> >>>>All the code is already in a "special" memset though, this is
> >>>>memset_bytes()
> >>>>
> >>>
> >>>I guess the function is less useful if it's static. So any duplication
> >>>should be avoided in the codebase.
> >>
> >>Doesn't av_memcpy_backptr do almost exactly what is needed here? That can
> >>also be optimized further if needed.
> >
> >av_memcpy_backptr() copies data with overlap; it's more like a recursive
> >memmove().
> 
> So? As far as I see the memset_bytes function in imgutils.c can be replaced
> with this:
> 
>     if (clear_size > dst_size)
>         clear_size = dst_size;
>     memcpy(dst, clear, clear_size);
>     av_memcpy_backptr(dst + clear_size, clear_size, dst_size - clear_size);
> 
> I am not against an av_memset_bytes API addition, but I believe it should
> share code with av_memcpy_backptr to avoid duplication.

I've implemented this; it does not seem to be really faster on the test
case: 10948 ms versus 11294 ms before, while the unrolled 64-bit loop
gets it down to 4249 ms.

Patches below for reference:
From 30549c4e674abce48608d99ed3e7f8ccbd557ada Mon Sep 17 00:00:00 2001
From: Michael Niedermayer <michael at niedermayer.cc>
Date: Tue, 25 Dec 2018 23:15:20 +0100
Subject: [PATCH] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

Fixes: Timeout
Fixes: 11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
Before: Executed clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920 in 11294 ms
After : Executed clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920 in 4249 ms

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
---
 libavutil/imgutils.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
index 4938a7ef67..6c0d3950de 100644
--- a/libavutil/imgutils.c
+++ b/libavutil/imgutils.c
@@ -529,6 +529,14 @@ static void memset_bytes(uint8_t *dst, size_t dst_size, uint8_t *clear,
         }
     } else if (clear_size == 4) {
         uint32_t val = AV_RN32(clear);
+#if HAVE_FAST_64BIT
+        uint64_t val8 = val * 0x100000001ULL;
+        for (; dst_size >= 32; dst_size -= 32) {
+            AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
+            AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
+            dst += 32;
+        }
+#endif
         for (; dst_size >= 4; dst_size -= 4) {
             AV_WN32(dst, val);
             dst += 4;
-- 
2.20.1
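
For readers without the FFmpeg tree at hand, the idea of the patch above
can be sketched in standalone C. This is only an illustration, not the
code from the patch: fill_pattern4() is a hypothetical name, memcpy()
stands in for the AV_RN32/AV_WN32/AV_WN64 macros, and the HAVE_FAST_64BIT
guard and the 4x unrolling are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill dst with a repeating 4-byte pattern,
 * storing 8 bytes per iteration where possible. */
static void fill_pattern4(uint8_t *dst, size_t dst_size, const uint8_t *clear)
{
    uint32_t val;
    uint64_t val8;
    size_t pos = 0;

    assert(dst && clear);

    /* Native-endian load of the pattern (stand-in for AV_RN32). */
    memcpy(&val, clear, 4);

    /* val * 0x100000001 == val + (val << 32): both 32-bit halves hold
     * val, so the stored bytes are the pattern twice on any endianness. */
    val8 = val * 0x100000001ULL;

    /* Wide stores (stand-in for AV_WN64). */
    for (; dst_size >= 8; dst_size -= 8) {
        memcpy(dst, &val8, 8);
        dst += 8;
    }
    /* At most one 4-byte store is left (stand-in for AV_WN32). */
    for (; dst_size >= 4; dst_size -= 4) {
        memcpy(dst, &val, 4);
        dst += 4;
    }
    /* Tail of 0..3 bytes; the pattern restarts at offset 0 here because
     * everything written so far is a multiple of 4 bytes. */
    for (; dst_size; dst_size--)
        *dst++ = clear[pos++ % 4];
}
```

The multiplication is the only non-obvious step; the rest is the existing
32-bit loop with a wider store placed in front of it.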


From 8e5140bf92d7e41090bfca1c6163f9c428402904 Mon Sep 17 00:00:00 2001
From: Michael Niedermayer <michael at niedermayer.cc>
Date: Tue, 25 Dec 2018 23:15:20 +0100
Subject: [PATCH] avutil/imgutils: Optimize memset_bytes() by using
 av_memcpy_backptr()

This is strongly based on code by Marton Balint

Fixes: Timeout
Fixes: 11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
Before: Executed clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920 in 11294 ms
After:  Executed clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920 in 10948 ms

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
---
 libavutil/imgutils.c | 26 +++++---------------------
 1 file changed, 5 insertions(+), 21 deletions(-)

diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
index 4938a7ef67..cf06afde3f 100644
--- a/libavutil/imgutils.c
+++ b/libavutil/imgutils.c
@@ -521,28 +521,12 @@ static void memset_bytes(uint8_t *dst, size_t dst_size, uint8_t *clear,
     if (clear_size == 1) {
         memset(dst, clear[0], dst_size);
         dst_size = 0;
-    } else if (clear_size == 2) {
-        uint16_t val = AV_RN16(clear);
-        for (; dst_size >= 2; dst_size -= 2) {
-            AV_WN16(dst, val);
-            dst += 2;
-        }
-    } else if (clear_size == 4) {
-        uint32_t val = AV_RN32(clear);
-        for (; dst_size >= 4; dst_size -= 4) {
-            AV_WN32(dst, val);
-            dst += 4;
-        }
-    } else if (clear_size == 8) {
-        uint32_t val = AV_RN64(clear);
-        for (; dst_size >= 8; dst_size -= 8) {
-            AV_WN64(dst, val);
-            dst += 8;
-        }
+    } else {
+        if (clear_size > dst_size)
+            clear_size = dst_size;
+        memcpy(dst, clear, clear_size);
+        av_memcpy_backptr(dst + clear_size, clear_size, dst_size - clear_size);
     }
-
-    for (; dst_size; dst_size--)
-        *dst++ = clear[pos++ % clear_size];
 }
 
 // Maximum size in bytes of a plane element (usually a pixel, or multiple pixels
-- 
2.20.1

[...]
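
The second patch relies on an overlapping forward copy: seed the buffer
with one copy of the pattern, then copy from clear_size bytes behind the
write position so that bytes written earlier are read again later. A
minimal standalone sketch of that technique follows. backptr_copy() is a
hypothetical byte-wise stand-in for av_memcpy_backptr(), not its actual
implementation, and fill_pattern() is likewise an illustrative name; the
empty-input check is an addition of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for av_memcpy_backptr(): copy cnt bytes to dst
 * from dst - back, one byte at a time, so that source and destination
 * may overlap and freshly written bytes get reused. */
static void backptr_copy(uint8_t *dst, size_t back, size_t cnt)
{
    const uint8_t *src = dst - back;

    assert(back > 0);
    while (cnt--)
        *dst++ = *src++;
}

/* Fill dst with a repeating pattern of clear_size bytes. */
static void fill_pattern(uint8_t *dst, size_t dst_size,
                         const uint8_t *clear, size_t clear_size)
{
    if (!dst_size || !clear_size)
        return;
    if (clear_size > dst_size)
        clear_size = dst_size;

    /* Seed: one (possibly truncated) copy of the pattern. */
    memcpy(dst, clear, clear_size);
    /* Replicate: every further byte equals the byte clear_size before it. */
    backptr_copy(dst + clear_size, clear_size, dst_size - clear_size);
}
```

This handles any pattern size with one code path, which is why it removes
the per-size branches. In the test case above it was not noticeably
faster than the original loops, though.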

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Let us carefully observe those good qualities wherein our enemies excel us
and endeavor to excel them, by avoiding what is faulty, and imitating what
is excellent in them. -- Plutarch