[FFmpeg-devel] [PATCH 4/5] startcode: Don't return false positives

Michael Niedermayer michael at niedermayer.cc
Tue Jun 4 14:17:39 EEST 2019


On Mon, Jun 03, 2019 at 10:59:00PM +0000, Andreas Rheinhardt wrote:
> Michael Niedermayer:
> > On Sun, Jun 02, 2019 at 12:47:18AM +0200, Andreas Rheinhardt wrote:
> >> Until now the function ff_startcode_find_candidate_c did not really
> >> search for startcodes (the startcode 0x00 0x00 0x01 (used in
> >> MPEG-1/2/4, VC-1 and H.264/5) is the only startcode meant here). Instead
> >> it searched for zero bytes and returned the earliest position of a zero
> >> byte. This of course led to lots of false positives - millions per GB
> >> of video.
> >> This has been changed: The first position of the buffer that
> >> may be part of a four-byte startcode is now returned. This excludes zero
> >> bytes that are known not to belong to a startcode, but zero bytes at the
> >> end of a buffer that might be part of a startcode whose second part is in
> >> the next buffer are also returned. This is in accordance with the
> >> expectations of the current callers of ff_startcode_find_candidate_c,
> >> namely the H.264 parser and the VC-1 parser. This is also the reason why
> >> "find_candidate" in the name of the function has been kept.
> >>
> >> Getting rid of lots of function calls with their accompanying overhead
> >> of course brings a speedup with it: Here are some benchmarks for
> >> a 30.2 Mb/s transport stream A (10 iterations of 8192 runs each) and
> >> another 7.4 Mb/s transport stream B (10 iterations of 131072 runs each)
> >> on an x64 Haswell:
> >>
> >>   |   vanilla   | aligned | current: aligned,
> >>   | (unaligned) |  reads  | no false positives
> >> --|-------------|---------|-------------------
> >> A |    411578   |  424503 |    323355
> >> B |     55476   |   58326 |     44147
> >>
> >> vanilla refers to the state before the switch to aligned reads;
> >> "aligned reads" refers to the state after the switch to aligned reads.
> >>
> >> (Calls to h264_find_frame_end have been timed; given that the amount
> >> of calls to ff_startcode_find_candidate_c has been reduced considerably,
> >> timing them directly would be worthless.)
> >>
> >> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt at gmail.com>
> >> ---
> >>  libavcodec/h264dsp.h   |  7 +--
> >>  libavcodec/startcode.c | 98 +++++++++++++++++++++++++++++++++++++-----
> >>  libavcodec/vc1dsp.h    |  6 +--
> >>  3 files changed, 91 insertions(+), 20 deletions(-)
> >>
> >> diff --git a/libavcodec/h264dsp.h b/libavcodec/h264dsp.h
> >> index cbea3173c6..51de7a4e21 100644
> >> --- a/libavcodec/h264dsp.h
> >> +++ b/libavcodec/h264dsp.h
> >> @@ -108,11 +108,8 @@ typedef struct H264DSPContext {
> >>      void (*h264_add_pixels4_clear)(uint8_t *dst, int16_t *block, int stride);
> >>  
> >>      /**
> >> -     * Search buf from the start for up to size bytes. Return the index
> >> -     * of a zero byte, or >= size if not found. Ideally, use lookahead
> >> -     * to filter out any zero bytes that are known to not be followed by
> >> -     * one or more further zero bytes and a one byte. Better still, filter
> >> -     * out any bytes that form the trailing_zero_8bits syntax element too.
> >> +     * Search buf from the start for up to size bytes. Return the first 0x00
> >> +     * that might be part of a (three or four) byte startcode.
> >>       */
> >>      int (*startcode_find_candidate)(const uint8_t *buf, int size);
> >>  } H264DSPContext;
> >> diff --git a/libavcodec/startcode.c b/libavcodec/startcode.c
> >> index b027c191c0..f6105289f1 100644
> >> --- a/libavcodec/startcode.c
> >> +++ b/libavcodec/startcode.c
> >> @@ -22,7 +22,8 @@
> >>   * @file
> >>   * Accelerated start code search function for start codes common to
> >>   * MPEG-1/2/4 video, VC-1, H.264/5
> >> - * @author Michael Niedermayer <michaelni at gmx.at>
> >> + * @author Michael Niedermayer <michaelni at gmx.at>,
> >> + * @author Andreas Rheinhardt <andreas.rheinhardt at gmail.com>
> >>   */
> >>  
> >>  #include "startcode.h"
> >> @@ -31,20 +32,60 @@
> >>  
> >>  int ff_startcode_find_candidate_c(const uint8_t *buf, int size)
> >>  {
> >> -    const uint8_t *start = buf, *end = buf + size;
> >> +    const uint8_t *start = buf, *end = buf + size, *temp;
> >>  
> >>  #define INITIALIZATION(mod) do {                                           \
> >> -    for (; buf < end && (uintptr_t)buf % mod; buf++)                       \
> >> -        if (!*buf)                                                         \
> >> -            return buf - start;                                            \
> >> +        if (end - start <= mod + 1)                                        \
> >> +            goto near_end;                                                 \
> >> +        /* From this point on until the end of the MAIN_LOOP,              \
> >> +         * buf is the earliest possible position of a 0x00                 \
> >> +         * immediately preceding a startcode's 0x01, i.e.                  \
> >> +         * everything from start to buf (inclusive) is known               \
> >> +         * to not contain a startcode's 0x01. */                           \
> >> +        buf += 1;                                                          \
> > 
> >> +        temp = (const uint8_t *)((uintptr_t)(buf + mod - 1) / mod * mod);  \
> > 
> > smells like FFALIGN()
> > 
> Will change. (I knew there had to be such a macro, but I only looked
> in libavutil/common.h and so I didn't find it. And now I see that
> macro.h (containing FFALIGN) is actually included in common.h.)
> 
> Have you made any benchmarks of your own (preferably on non X86
> systems)? I'd like to add them to the commit messages as well.

nope, i just tested on x86 and mips IIRC but didnt look at any speeds just
if things still work


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Avoid a single point of failure, be that a person or equipment.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20190604/352c0d54/attachment.sig>


More information about the ffmpeg-devel mailing list