[FFmpeg-devel] [PATCH 1/3] Add av_build_index

Tue Aug 17 13:55:26 CEST 2010

On Sun, Aug 15, 2010 at 08:43:16AM +0200, Michael Chinen wrote:
> Hi,
> 
> On Sun, Aug 15, 2010 at 2:32 AM, Michael Chinen <mchinen at gmail.com> wrote:
> [...]
> >> With these 3 plus FLAC parser patches, some formats should provide
> >> sample-accurate seeking, right? Which are those? What would have to be
> >> done to, say, the (ASF demuxer +) WMA decoder (or any other audio
> >> format) to make it sample-accurate?
> >
> > ogg, FLAC, mp3 are the audio formats provide sample accurate seeking.
> > asf doesn't work well yet, and I'm looking into it.
> 
> Okay, this updated patch plus this 0004 patch for asfdec.c which
> depends on 0001 (but is independent of 0002 and 0003) will give
> support for asfdec.c and allow formats without parsers to work by
> introducing a new virtual function into AVInputFormat for flushing.
> 
> There are also some bugfixes detailed in the patch log.
> 
> I wasn't sure how to send 0004 since it would change the email subject
> and break the thread.
> Please let me know if there is a more preferred way.
> 
> Michael

>  avformat.h |   34 +++++
>  utils.c    |  407 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
>  2 files changed, 424 insertions(+), 17 deletions(-)
> 85e730e8c6800c0bf1996310942d5b8405242666  0001-Add-av_build_index.patch
> From 91192c8e856a7baa64cae45d725d3e5d8f8b02d2 Mon Sep 17 00:00:00 2001
> From: Michael Chinen <mchinen at gmail.com>
> Date: Fri, 13 Aug 2010 05:43:02 +0200
> Subject: [PATCH 1/4] Add av_build_index
>  Also adds static av_seek_frame_table function and integration into existing seek api
> 
> Second patch submission touchups:
> make sure first timestamp is set
> build index correctly for formats that don't have parsers or AVFMT_GENERIC_INDEX
> flush and update timestamp for non-parallel building.
> Add read_flush to AVInputFormat
> Provides av_build_index support for AVInputFormats without parsers that hold state
> ---
>  libavformat/avformat.h |   34 ++++
>  libavformat/utils.c    |  407 ++++++++++++++++++++++++++++++++++++++++++++++--
>  2 files changed, 424 insertions(+), 17 deletions(-)
> 
> diff --git a/libavformat/avformat.h b/libavformat/avformat.h
> index 452aea6..13184c9 100644
> --- a/libavformat/avformat.h
> +++ b/libavformat/avformat.h
> @@ -408,6 +408,11 @@ typedef struct AVInputFormat {
>  
>      const AVMetadataConv *metadata_conv;
>  
> +    /**
> +     * Reset the format's state to one that does not rely on previous input.
> +     */
> +    int (*read_flush)(struct AVFormatContext *s);

iam not sure about this one, can we postpone this hunk please to later

> +
>      /* private fields */
>      struct AVInputFormat *next;
>  } AVInputFormat;
> @@ -586,6 +591,19 @@ typedef struct AVStream {
>       * Number of frames that have been demuxed during av_find_stream_info()
>       */
>      int codec_info_nb_frames;
> +
> +    /* new av_seek_frame_table() support */
> +#define AV_SEEKTABLE_BUILDING   0x0001 /**< if set the index is being built   */

> +#define AV_SEEKTABLE_CBR        0x0002 /**< if set the file is constant bit
> +                                            rate and the index is implicit    */

the doxy is a bit terse

[...]
> @@ -1337,8 +1338,28 @@ void ff_reduce_index(AVFormatContext *s, int stream_index)
>  
>      if((unsigned)st->nb_index_entries >= max_entries){
>          int i;
> -        for(i=0; 2*i<st->nb_index_entries; i++)
> -            st->index_entries[i]= st->index_entries[2*i];
> +        if(st->seek_table_flags & AV_SEEKTABLE_BUILDING) {
> +            int j            = 0;
> +            int nb_reduced   = max_entries / 2;
> +            int64_t ts_inc   = (st->index_entries[st->nb_index_entries - 1].timestamp
> +                                - st->index_entries[0].timestamp) / nb_reduced;
> +            int64_t start_ts = st->index_entries[0].timestamp;
> +            /* reinsert for even distribution of timestamps */
> +            for (i = 0; i < nb_reduced; i++) {
> +                if (j >= st->nb_index_entries)
> +                    break;
> +                st->index_entries[i] = st->index_entries[j++];
> +                while (j < st->nb_index_entries &&
> +                       (st->index_entries[j].timestamp <
> +                        start_ts + (i + 1) * ts_inc)) {
> +                    j++;
> +                }
> +            }
> +            av_log(s,AV_LOG_DEBUG,"reduced index to size %d (max = %d)\n", i, max_entries);
> +        } else {
> +            for (i = 0; 2 * i < st->nb_index_entries; i++)
> +                st->index_entries[i] = st->index_entries[2*i];
> +        }

this is reimplementing existing code in a different way.
you can use the existing code or improve the existing code and use it
but you cant duplicate it.

>          st->nb_index_entries= i;
>      }
>  }
> @@ -1349,6 +1370,10 @@ int av_add_index_entry(AVStream *st,
>      AVIndexEntry *entries, *ie;
>      int index;
>  
> +    /* Don't add index if we already have a complete one */
> +    if (st->seek_table_flags & AV_SEEKTABLE_FINISHED)
> +        return av_index_search_timestamp(st, timestamp, AVSEEK_FLAG_ANY);
> +
>      if((unsigned)st->nb_index_entries + 1 >= UINT_MAX / sizeof(AVIndexEntry))
>          return -1;
>  

why is this hunk needed?

> @@ -1361,21 +1386,31 @@ int av_add_index_entry(AVStream *st,
>  
>      st->index_entries= entries;
>  
> -    index= av_index_search_timestamp(st, timestamp, AVSEEK_FLAG_ANY);
> +    /* If we are building the table, the indicies added in order, so we don't
> +       have to do expensive searching.                                        */
> +    if (st->seek_table_flags & AV_SEEKTABLE_BUILDING) {
> +        index = st->nb_index_entries++;
> +        ie = &st->index_entries[index];
> +    } else {
> +        index = av_index_search_timestamp(st, timestamp, AVSEEK_FLAG_ANY);

improve av_index_search_timestamp() if there is a problem

>  
> -    if(index<0){
> -        index= st->nb_index_entries++;
> -        ie= &entries[index];
> -        assert(index==0 || ie[-1].timestamp < timestamp);
> -    }else{
> -        ie= &entries[index];
> -        if(ie->timestamp != timestamp){
> -            if(ie->timestamp <= timestamp)
> -                return -1;
> -            memmove(entries + index + 1, entries + index, sizeof(AVIndexEntry)*(st->nb_index_entries - index));
> -            st->nb_index_entries++;
> -        }else if(ie->pos == pos && distance < ie->min_distance) //do not reduce the distance
> -            distance= ie->min_distance;
> +        if (index < 0){
> +            index = st->nb_index_entries++;
> +            ie = &entries[index];
> +            assert(index == 0 || ie[-1].timestamp < timestamp);
> +        } else {
> +            ie= &entries[index];
> +            if (ie->timestamp != timestamp){
> +                if (ie->timestamp <= timestamp)

reindention

[...]

> +        av_log(s, AV_LOG_DEBUG,
> +               "SEEK TABLE: copying over parallel frames\n");
> +        for(i = 0; i < s->nb_streams; i++) {
> +            st = s->streams[i];
> +            av_free(st->index_entries);
> +            st->index_entries                = st->parallel_index_entries;
> +            st->nb_index_entries             = st->parallel_nb_index_entries;
> +            st->index_entries_allocated_size = st->parallel_index_entries_allocated_size;
> +            st->parallel_index_entries       = NULL;
> +            st->seek_table_flags |= AV_SEEKTABLE_FINISHED;
> +        }
> +    }
> +

> +    st = s->streams[stream_index];
> +    if (st->seek_table_flags & AV_SEEKTABLE_FINISHED)
> +        return av_seek_frame_table(s, stream_index, timestamp, flags);

why do we need this?
once the index is complete the existing seek functions should use the index

> +
>      /* first, we try the format specific seek */
>      if (s->iformat->read_seek)
>          ret = s->iformat->read_seek(s, stream_index, timestamp, flags);
> @@ -3686,3 +3813,249 @@ int ff_write_chained(AVFormatContext *dst, int dst_stream, AVPacket *pkt,
>      return av_write_frame(dst, &local_pkt);
>  }
>  
> +#define FF_DETECT_CBR_FPS_GUESS 10 /**< fps to assume if unknown */

this doesnt look very nice

> +static int av_detect_cbr(AVFormatContext *s, AVStream *st) {
> +    //check the stream's codec to see if a number of frames are listed,
> +    //but we can't assume that it is available.
> +    if (s->data_offset > 0 && s->file_size > 0 &&
> +        s->data_offset != s->file_size && st->codec && st->duration > 0) {
> +        double sec_duration = st->duration * av_q2d(st->time_base);

floats should be avoided to have exact and reproduceable across platform
behavior.

[...]

> +        /* Delete old index entries to get a clean table. */
> +        for(i = 0; i < build_ic->nb_streams; i++) {
> +            int64_t start_ts;
> +            build_ic->streams[i]->seek_table_flags |= AV_SEEKTABLE_BUILDING;
> +            build_ic->streams[i]->nb_index_entries = 0;
> +            AVStream* seek_st;

mixes declarations and statements
also please try tools/patcheck

> +            seek_st = build_ic->streams[i];
> +            start_ts = seek_st->start_time != AV_NOPTS_VALUE ?
> +                       seek_st->start_time :
> +                      (seek_st->first_dts  != AV_NOPTS_VALUE ?
> +                       seek_st->first_dts  : 0);
> +
> +            av_update_cur_dts(build_ic, build_ic->streams[i],
> +                              start_ts);
> +        }
> +
> +        orig_flags = build_ic->flags;
> +        build_ic->flags |= AVFMT_FLAG_GENPTS;
> +        for (;;) {
> +            do {
> +                ret = av_read_frame(build_ic, &pkt);
> +            } while (ret == AVERROR(EAGAIN));
> +            if (ret < 0)
> +                break;
> +            av_free_packet(&pkt);
> +        }

duplicates code from av_seek_frame_generic()

> +        build_ic->flags = orig_flags;
> +        for(i = 0; i < build_ic->nb_streams; i++)
> +            build_ic->streams[i]->seek_table_flags ^= AV_SEEKTABLE_BUILDING;

i think |= and &=~ are more clear in terms of readability

> +
> +        ret = 0;
> +    cleanup:
> +        /* If built on a parallel context, mark the original context. */
> +        if (flags & AV_BUILD_INDEX_PARALLEL) {
> +            if (build_ic) {
> +                for (i = 0; i < build_ic->nb_streams; i++) {
> +                    if (ret >= 0) {
> +                        av_log(s, AV_LOG_DEBUG,
> +                               "SEEK TABLE: marking %i frames from"
> +                               " parallel stream ready for copy\n",
> +                               build_ic->streams[i]->nb_index_entries);
> +                        s->streams[i]->parallel_index_entries =
> +                            build_ic->streams[i]->index_entries;
> +                        s->streams[i]->parallel_nb_index_entries =
> +                            build_ic->streams[i]->nb_index_entries;
> +                        s->streams[i]->parallel_index_entries_allocated_size =
> +                            build_ic->streams[i]->index_entries_allocated_size;

i think this should ignore the 80 char limit as it would be more readable then

> +                        build_ic->streams[i]->index_entries = NULL;
> +                    }
> +                    avcodec_close(build_ic->streams[i]->codec);
> +                }

> +                s->flags |= AVFMT_FLAG_TABLE_READY;

this is possibly not thread safe or at least asking for future bugs
a variable should not be written to by more than 1 thread without
locking. |= is not guranteed to be an atomic operation,
bits are not independent variables

> +                av_close_input_file(build_ic);
> +            }
> +        }
> +        if (ret < 0)
> +            return ret;
> +    }
> +

> +    /* Since we probably have moved the read cursor to the end,
> +       return seek to start of stream for non-parallel clients.
> +       Not sure if this the desired behavior, but it seems convinient. */
> +    if (!(flags & AV_BUILD_INDEX_PARALLEL) &&
> +        !(st->seek_table_flags & AV_SEEKTABLE_COPIED)) {
> +        int64_t start_ts;
> +
> +        ff_read_frame_flush(s);
> +        for(i = 0; i < s->nb_streams; i++)
> +            if (s->streams[i]->nb_index_entries)
> +                s->streams[i]->seek_table_flags |= AV_SEEKTABLE_FINISHED;
> +
> +        start_ts = st->start_time != AV_NOPTS_VALUE ? st->start_time :
> +                  (st->first_dts  != AV_NOPTS_VALUE ? st->first_dts  : 0);
> +        if ((ret = av_seek_frame(s, stream_index,
> +                                 start_ts, AVSEEK_FLAG_ANY)) < 0) {
> +            int data_offset = (s->data_offset < 0) ||
> +                (s->data_offset >= s->file_size) ? 0 : s->data_offset;
> +
> +            av_log(s, AV_LOG_DEBUG,
> +                   "SEEK TABLE: error seeking using new index: %i,"
> +                   " trying url_fseek\n",
> +                   ret);
> +
> +            /* last ditch effort to seek using the file pointer. */
> +            if ((ret = url_fseek(s->pb, data_offset, SEEK_SET)) < 0) {
> +                av_log(s,AV_LOG_DEBUG,
> +                       "SEEK TABLE: error seeking with url_fseek: %i\n",
> +                       ret);
> +                return ret;
> +            }
> +        }
> +    }

is all this needed?
or does a subset of this work as well?

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Incandescent light bulbs waste a lot of energy as heat so the EU forbids them.
Their replacement, compact fluorescent lamps, much more expensive, dont fit in
many old lamps, flicker, contain toxic mercury, produce a fraction of the light
that is claimed and in a unnatural spectrum rendering colors different than
in natural light. Ah and we now need to turn the heaters up more in winter to
compensate the lower wasted heat. Who wins? Not the environment, thats for sure
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 190 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100817/e0da36b9/attachment.pgp>