[FFmpeg-devel] [PATCH 4/5] startcode: Don't return false positives

Andreas Rheinhardt andreas.rheinhardt at gmail.com
Tue Jun 4 01:59:00 EEST 2019


Michael Niedermayer:
> On Sun, Jun 02, 2019 at 12:47:18AM +0200, Andreas Rheinhardt wrote:
>> Until now the function ff_startcode_find_candidate_c did not really
>> search for startcodes (the startcode 0x00 0x00 0x01 (used in
>> MPEG-1/2/4, VC-1 and H.264/5) is the only startcode meant here). Instead
>> it searched for zero bytes and returned the earliest position of a zero
>> byte. This of course led to lots of false positives - millions per GB
>> of video.
>> This has been changed: The first position of the buffer that
>> may be part of a four-byte startcode is now returned. This excludes zero
>> bytes that are known not to belong to a startcode, but zero bytes at the
>> end of a buffer that might be part of a startcode whose second part is in
>> the next buffer are also returned. This is in accordance with the
>> expectations of the current callers of ff_startcode_find_candidate_c,
>> namely the H.264 parser and the VC-1 parser. This is also the reason why
>> "find_candidate" in the name of the function has been kept.
>>
>> Getting rid of lots of function calls with their accompanying overhead
>> of course brings a speedup with it: Here are some benchmarks for
>> a 30.2 Mb/s transport stream A (10 iterations of 8192 runs each) and
>> another 7.4 Mb/s transport stream B (10 iterations of 131072 runs each)
>> on an x64 Haswell:
>>
>>   |   vanilla   | aligned | current: aligned,
>>   | (unaligned) |  reads  | no false positives
>> --|-------------|---------|-------------------
>> A |    411578   |  424503 |    323355
>> B |     55476   |   58326 |     44147
>>
>> vanilla refers to the state before the switch to aligned reads;
>> "aligned reads" refers to the state after the switch to aligned reads.
>>
>> (Calls to h264_find_frame_end have been timed; given that the amount
>> of calls to ff_startcode_find_candidate_c has been reduced considerably,
>> timing them directly would be worthless.)
>>
>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt at gmail.com>
>> ---
>>  libavcodec/h264dsp.h   |  7 +--
>>  libavcodec/startcode.c | 98 +++++++++++++++++++++++++++++++++++++-----
>>  libavcodec/vc1dsp.h    |  6 +--
>>  3 files changed, 91 insertions(+), 20 deletions(-)
>>
>> diff --git a/libavcodec/h264dsp.h b/libavcodec/h264dsp.h
>> index cbea3173c6..51de7a4e21 100644
>> --- a/libavcodec/h264dsp.h
>> +++ b/libavcodec/h264dsp.h
>> @@ -108,11 +108,8 @@ typedef struct H264DSPContext {
>>      void (*h264_add_pixels4_clear)(uint8_t *dst, int16_t *block, int stride);
>>  
>>      /**
>> -     * Search buf from the start for up to size bytes. Return the index
>> -     * of a zero byte, or >= size if not found. Ideally, use lookahead
>> -     * to filter out any zero bytes that are known to not be followed by
>> -     * one or more further zero bytes and a one byte. Better still, filter
>> -     * out any bytes that form the trailing_zero_8bits syntax element too.
>> +     * Search buf from the start for up to size bytes. Return the first 0x00
>> +     * that might be part of a (three or four) byte startcode.
>>       */
>>      int (*startcode_find_candidate)(const uint8_t *buf, int size);
>>  } H264DSPContext;
>> diff --git a/libavcodec/startcode.c b/libavcodec/startcode.c
>> index b027c191c0..f6105289f1 100644
>> --- a/libavcodec/startcode.c
>> +++ b/libavcodec/startcode.c
>> @@ -22,7 +22,8 @@
>>   * @file
>>   * Accelerated start code search function for start codes common to
>>   * MPEG-1/2/4 video, VC-1, H.264/5
>> - * @author Michael Niedermayer <michaelni at gmx.at>
>> + * @author Michael Niedermayer <michaelni at gmx.at>,
>> + * @author Andreas Rheinhardt <andreas.rheinhardt at gmail.com>
>>   */
>>  
>>  #include "startcode.h"
>> @@ -31,20 +32,60 @@
>>  
>>  int ff_startcode_find_candidate_c(const uint8_t *buf, int size)
>>  {
>> -    const uint8_t *start = buf, *end = buf + size;
>> +    const uint8_t *start = buf, *end = buf + size, *temp;
>>  
>>  #define INITIALIZATION(mod) do {                                           \
>> -    for (; buf < end && (uintptr_t)buf % mod; buf++)                       \
>> -        if (!*buf)                                                         \
>> -            return buf - start;                                            \
>> +        if (end - start <= mod + 1)                                        \
>> +            goto near_end;                                                 \
>> +        /* From this point on until the end of the MAIN_LOOP,              \
>> +         * buf is the earliest possible position of a 0x00                 \
>> +         * immediately preceding a startcode's 0x01, i.e.                  \
>> +         * everything from start to buf (inclusive) is known               \
>> +         * to not contain a startcode's 0x01. */                           \
>> +        buf += 1;                                                          \
> 
>> +        temp = (const uint8_t *)((uintptr_t)(buf + mod - 1) / mod * mod);  \
> 
> smells like FFALIGN()
> 
Will change. (I knew there had to be such a macro, but I only looked
in libavutil/common.h and so I didn't find it. And now I see that
macro.h (containing FFALIGN) is actually included in common.h.)

Have you made any benchmarks of your own (preferably on non X86
systems)? I'd like to add them to the commit messages as well.

- Andreas


More information about the ffmpeg-devel mailing list