[FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()
Michael Niedermayer
michael at niedermayer.cc
Sun Jan 20 22:14:59 EET 2019
On Sat, Jan 19, 2019 at 12:28:25AM +0100, Marton Balint wrote:
>
>
> On Thu, 17 Jan 2019, Michael Niedermayer wrote:
>
> >On Wed, Jan 16, 2019 at 08:00:22PM +0100, Marton Balint wrote:
> >>
> >>
> >>On Tue, 15 Jan 2019, Michael Niedermayer wrote:
> >>
> >>>On Sun, Dec 30, 2018 at 07:15:49PM +0100, Marton Balint wrote:
> >>>>
> >>>>
> >>>>On Fri, 28 Dec 2018, Michael Niedermayer wrote:
> >>>>
> >>>>>On Wed, Dec 26, 2018 at 10:16:47PM +0100, Marton Balint wrote:
> >>>>>>
> >>>>>>
> >>>>>>On Wed, 26 Dec 2018, Paul B Mahol wrote:
> >>>>>>
> >>>>>>>On 12/26/18, Michael Niedermayer <michael at niedermayer.cc> wrote:
> >>>>>>>>On Wed, Dec 26, 2018 at 04:32:17PM +0100, Paul B Mahol wrote:
> >>>>>>>>>On 12/25/18, Michael Niedermayer <michael at niedermayer.cc> wrote:
> >>>>>>>>>>Fixes: Timeout
> >>>>>>>>>>Fixes: 11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
> >>>>>>>>>>Before: Executed clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920 in 11294 ms
> >>>>>>>>>>After : Executed clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920 in 4249 ms
> >>>>>>>>>>
> >>>>>>>>>>Found-by: continuous fuzzing process
> >>>>>>>>>>https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> >>>>>>>>>>Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
> >>>>>>>>>>---
> >>>>>>>>>>libavutil/imgutils.c | 6 ++++++
> >>>>>>>>>>1 file changed, 6 insertions(+)
> >>>>>>>>>>
> >>>>>>>>>>diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
> >>>>>>>>>>index 4938a7ef67..cc38f1e878 100644
> >>>>>>>>>>--- a/libavutil/imgutils.c
> >>>>>>>>>>+++ b/libavutil/imgutils.c
> >>>>>>>>>>@@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t dst_size, uint8_t *clear,
> >>>>>>>>>>         }
> >>>>>>>>>>     } else if (clear_size == 4) {
> >>>>>>>>>>         uint32_t val = AV_RN32(clear);
> >>>>>>>>>>+        uint64_t val8 = val * 0x100000001ULL;
> >>>>>>>>>>+        for (; dst_size >= 32; dst_size -= 32) {
> >>>>>>>>>>+            AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
> >>>>>>>>>>+            AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
> >>>>>>>>>>+            dst += 32;
> >>>>>>>>>>+        }
> >>>>>>>>>>         for (; dst_size >= 4; dst_size -= 4) {
> >>>>>>>>>>             AV_WN32(dst, val);
> >>>>>>>>>>             dst += 4;
> >>>>>>>>>>--
> >>>>>>>>>>2.20.1
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>NAK, implement a special memset function instead.
> >>>>>>>>
> >>>>>>>>I can move the added loop into a separate function, if that's what you
> >>>>>>>>suggest?
> >>>>>>>
> >>>>>>>No, don't do that.
> >>>>>>>
> >>>>>>>>All the code is already in a "special" memset though, this is
> >>>>>>>>memset_bytes()
> >>>>>>>>
> >>>>>>>
> >>>>>>>I guess the function is less useful if it's static. So any duplication
> >>>>>>>should be avoided in the codebase.
> >>>>>>
> >>>>>>Doesn't av_memcpy_backptr do almost exactly what is needed here? That can
> >>>>>>also be optimized further if needed.
> >>>>>
> >>>>>av_memcpy_backptr() copies data with overlap, it's more like a recursive
> >>>>>memmove().
> >>>>
> >>>>So? As far as I see the memset_bytes function in imgutils.c can be replaced
> >>>>with this:
> >>>>
> >>>>    if (clear_size > dst_size)
> >>>>        clear_size = dst_size;
> >>>>    memcpy(dst, clear, clear_size);
> >>>>    av_memcpy_backptr(dst + clear_size, clear_size, dst_size - clear_size);
> >>>>
> >>>>I am not against an av_memset_bytes API addition, but I believe it should
> >>>>share code with av_memcpy_backptr to avoid duplication.
> >>>
> >>>I've implemented this; it does not seem to be really faster in the testcase
> >>
> >>I guess it is not faster because you have not applied your original
> >>optimization to fill32 in libavutil/mem.c. Could you compare the speed after
> >>optimizing that the same way your original patch did with imgutils
> >>memset_bytes?
> >
> >sure, that makes it faster:
>
> Thanks, both patches LGTM.
will apply
thanks
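
For anyone reading this thread in the archive, here is a rough standalone sketch of what the quoted 4-byte hunk does: the 32-bit pattern is duplicated into both halves of a 64-bit word, and the bulk of the buffer is then written 32 bytes per iteration. The name fill_pattern4() is hypothetical, and memcpy() stands in for AV_RN32/AV_WN32/AV_WN64 so that it builds without the FFmpeg headers. The final partial-pattern copy is added only to make the sketch complete on its own; it is not part of the hunk.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the clear_size == 4 branch of memset_bytes().
 * memcpy() replaces the AV_RN32/AV_WN32/AV_WN64 macros (assumption made
 * so the sketch compiles without libavutil). */
static void fill_pattern4(uint8_t *dst, size_t dst_size, const uint8_t *clear)
{
    uint32_t val;
    uint64_t val8;

    memcpy(&val, clear, 4);       /* native-endian 32-bit read */
    val8 = val * 0x100000001ULL;  /* same 32-bit value in both halves */

    /* Bulk loop: four 8-byte stores, i.e. eight patterns per iteration. */
    for (; dst_size >= 32; dst_size -= 32) {
        memcpy(dst,      &val8, 8);
        memcpy(dst +  8, &val8, 8);
        memcpy(dst + 16, &val8, 8);
        memcpy(dst + 24, &val8, 8);
        dst += 32;
    }
    /* Remaining whole patterns, one 4-byte store each. */
    for (; dst_size >= 4; dst_size -= 4) {
        memcpy(dst, &val, 4);
        dst += 4;
    }
    /* Trailing 0..3 bytes: a partial pattern (added for this sketch). */
    memcpy(dst, clear, dst_size);
}
```

Because both halves of val8 hold the same value, the bytes that land in memory are the same on little- and big-endian hosts.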
[...]
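
The av_memcpy_backptr() variant from the quoted discussion can be sketched in the same way. copy_backptr() below is a deliberately naive byte-by-byte stand-in for av_memcpy_backptr(), which in libavutil takes int arguments and has optimized paths (such as the fill32 mentioned above); fill_pattern() and the early return for empty input are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Naive stand-in for av_memcpy_backptr(): copy cnt bytes from dst - back
 * to dst in forward order. The regions may overlap, so bytes written
 * earlier are read again and the leading pattern repeats itself. */
static void copy_backptr(uint8_t *dst, size_t back, size_t cnt)
{
    const uint8_t *src = dst - back;

    while (cnt--)
        *dst++ = *src++;
}

/* Hypothetical wrapper following the four lines quoted in the thread. */
static void fill_pattern(uint8_t *dst, size_t dst_size,
                         const uint8_t *clear, size_t clear_size)
{
    if (!clear_size || !dst_size)   /* guard added for this sketch */
        return;
    if (clear_size > dst_size)
        clear_size = dst_size;
    memcpy(dst, clear, clear_size); /* seed one copy of the pattern */
    copy_backptr(dst + clear_size, clear_size, dst_size - clear_size);
}
```

This handles any clear_size with one code path, which is why the speed then depends on how well the back-pointer copy itself is optimized.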
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Observe your enemies, for they first find out your faults. -- Antisthenes