[FFmpeg-devel] [PATCH] avcodec/utils/avpriv_find_start_code: optimization. If HAVE_FAST_UNALIGNED is true, handle "1 + sizeof(long)" bytes per step.

zhaoxiu.zeng zhaoxiu.zeng at gmail.com
Thu Jan 1 18:27:51 CET 2015


On 2015/1/1 11:49, Michael Niedermayer wrote:
> On Thu, Jan 01, 2015 at 10:13:58AM +0800, zhaoxiu.zeng wrote:
>>  libavcodec/utils.c | 68 ++++++++++++++++++++++++++++++++++++++++++++----------
>>  1 file changed, 56 insertions(+), 12 deletions(-)
>>
>> diff --git a/libavcodec/utils.c b/libavcodec/utils.c
>> index 1ec5cae..14a43e2 100644
>> --- a/libavcodec/utils.c
>> +++ b/libavcodec/utils.c
>> @@ -3772,30 +3772,74 @@ const uint8_t *avpriv_find_start_code(const uint8_t *av_restrict p,
>>                                        uint32_t *av_restrict state)
>>  {
>>      int i;
>> +    uint32_t stat;
>>  
>>      av_assert0(p <= end);
>>      if (p >= end)
>>          return end;
>>  
>> +    stat = *state;
>>      for (i = 0; i < 3; i++) {
>> -        uint32_t tmp = *state << 8;
>> -        *state = tmp + *(p++);
>> -        if (tmp == 0x100 || p == end)
>> +        uint32_t tmp = stat << 8;
>> +        stat = tmp + *(p++);
>> +        if (tmp == 0x100 || p == end) {
>> +            *state = stat;
>>              return p;
>> +        }
>>      }
>>  
>> -    while (p < end) {
>> -        if      (p[-1] > 1      ) p += 3;
>> -        else if (p[-2]          ) p += 2;
>> -        else if (p[-3]|(p[-1]-1)) p++;
>> -        else {
>> +#if HAVE_FAST_UNALIGNED
>> +#if HAVE_FAST_64BIT
>> +    for (; p + 6 <= end; p += 9) {
>> +        uint64_t t = AV_RN64A(p - 2);
>> +        if (!((t - 0x0100010001000101ULL) & ~(t | 0x7fff7fff7fff7f7fULL)))
>> +            continue;
>> +#else
>> +    for (; p + 2 <= end; p += 5) {
>> +        uint32_t t = AV_RN32A(p - 2);
>> +        if (!((t - 0x01000101U) & ~(t | 0x7fff7f7fU)))
>> +            continue;
>> +#endif
>> +        /* find the first zero byte in t */
>> +#if HAVE_BIGENDIAN
>> +        while (t >> (sizeof(t) * 8 - 8)) {
>> +            t <<= 8;
>> +            p++;
>> +        }
>> +#else
>> +        while (t & 0xff) {
>> +            t >>= 8;
>> +            p++;
>> +        }
>> +#endif
> 
> this can maybe be simplified by using ff_startcode_find_candidate_c()
> 
There is a small difference: ff_startcode_find_candidate_c() finds the first "0x00" byte, but here we only care about "0x00 0x00" pairs.
Using 0x0100010001000101ULL instead of 0x0101010101010101ULL reduces the hit rate on lone "0x00" bytes, so the loop is faster
when the buffer contains isolated "0x00" bytes.
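
To illustrate the point about the two constants, here is a small standalone sketch (not part of the patch). The helper names pack_le64, any_zero_byte, zero_in_checked_lanes and has_zero_pair are made up for this example, and the word is packed by hand in little-endian order so the result does not depend on the host. With the sparse constant only bytes 0, 1, 3, 5 and 7 of the word are tested, yet any two adjacent bytes inside the word include at least one of those positions, so a "0x00 0x00" pair is never missed. Byte 0 appears to be tested as well because the loop advances by 9, leaving one unread byte between consecutive 8-byte windows.

```c
#include <stdint.h>

/* Pack 8 bytes so that b[0] lands in the least significant byte, i.e. the
 * value a little-endian machine would load from memory. */
static uint64_t pack_le64(const uint8_t b[8])
{
    uint64_t t = 0;
    for (int i = 7; i >= 0; i--)
        t = (t << 8) | b[i];
    return t;
}

/* Classic test: returns 1 iff any of the 8 bytes is zero. */
static int any_zero_byte(uint64_t t)
{
    return ((t - 0x0101010101010101ULL) & ~(t | 0x7f7f7f7f7f7f7f7fULL)) != 0;
}

/* Test with the constants from the patch: returns 1 iff byte 0, 1, 3, 5
 * or 7 is zero. Bytes 2, 4 and 6 are ignored. */
static int zero_in_checked_lanes(uint64_t t)
{
    return ((t - 0x0100010001000101ULL) & ~(t | 0x7fff7fff7fff7f7fULL)) != 0;
}

/* Reference check: returns 1 iff two adjacent bytes in b[0..7] are zero. */
static int has_zero_pair(const uint8_t b[8])
{
    for (int i = 0; i + 1 < 8; i++)
        if (!b[i] && !b[i + 1])
            return 1;
    return 0;
}
```

For the bytes 01 02 00 03 04 05 06 07 (a lone zero at index 2), any_zero_byte() reports a hit while zero_in_checked_lanes() does not, so the fast loop keeps going. For 01 02 00 00 04 05 06 07 both report a hit. A lone zero at one of the tested positions still triggers the slow path, so the sparse constant lowers the false-hit rate rather than removing false hits.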
> 
> [...]
> 


