[FFmpeg-devel] [PATCH] lavf/mux: bypass interleave delay early when waiting on subtitle streams

Andreas Rheinhardt andreas.rheinhardt at gmail.com
Thu Mar 12 09:51:25 EET 2020


rcombs:
> This avoids long delays when converting live streams that have sparse subtitles
> ---
>  libavformat/avformat.h      |  9 +++++++++
>  libavformat/mux.c           | 25 +++++++++++++++++++++----
>  libavformat/options_table.h |  1 +
>  libavformat/version.h       |  2 +-
>  4 files changed, 32 insertions(+), 5 deletions(-)
> 
> diff --git a/libavformat/avformat.h b/libavformat/avformat.h
> index 9b9b634ec3..da7e5755e8 100644
> --- a/libavformat/avformat.h
> +++ b/libavformat/avformat.h
> @@ -1957,6 +1957,15 @@ typedef struct AVFormatContext {
>       * - decoding: set by user
>       */
>      int max_probe_packets;
> +
> +    /**
> +     * Maximum buffering duration for interleaving sparse streams.
> +     *
> +     * @see max_interleave_delta
> +     *
> +     * Applies only to subtitle and data streams.
> +     */
> +    int64_t max_sparse_interleave_delta;
>  } AVFormatContext;
>  
>  #if FF_API_FORMAT_GET_SET
> diff --git a/libavformat/mux.c b/libavformat/mux.c
> index d88746e8c5..f2f272cf65 100644
> --- a/libavformat/mux.c
> +++ b/libavformat/mux.c
> @@ -1033,6 +1033,7 @@ int ff_interleave_packet_per_dts(AVFormatContext *s, AVPacket *out,
>      AVPacketList *pktl;
>      int stream_count = 0;
>      int noninterleaved_count = 0;
> +    int sparse_count = 0;
>      int i, ret;
>      int eof = flush;
>  
> @@ -1044,6 +1045,9 @@ int ff_interleave_packet_per_dts(AVFormatContext *s, AVPacket *out,
>      for (i = 0; i < s->nb_streams; i++) {
>          if (s->streams[i]->last_in_packet_buffer) {
>              ++stream_count;
> +        } else if (s->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_SUBTITLE ||
> +                   s->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_DATA) {
> +            ++sparse_count;
>          } else if (s->streams[i]->codecpar->codec_type != AVMEDIA_TYPE_ATTACHMENT &&
>                     s->streams[i]->codecpar->codec_id != AV_CODEC_ID_VP8 &&
>                     s->streams[i]->codecpar->codec_id != AV_CODEC_ID_VP9) {
> @@ -1054,10 +1058,12 @@ int ff_interleave_packet_per_dts(AVFormatContext *s, AVPacket *out,
>      if (s->internal->nb_interleaved_streams == stream_count)
>          flush = 1;
>  
> -    if (s->max_interleave_delta > 0 &&
> -        s->internal->packet_buffer &&
> +    if (s->internal->packet_buffer &&
>          !flush &&
> -        s->internal->nb_interleaved_streams == stream_count+noninterleaved_count
> +        ((s->max_interleave_delta > 0 &&
> +          s->internal->nb_interleaved_streams == stream_count+noninterleaved_count+sparse_count) ||
> +         (s->max_sparse_interleave_delta > 0 &&
> +          s->internal->nb_interleaved_streams == stream_count+sparse_count))

If max_sparse_interleave_delta has its default value and
max_interleave_delta has been explicitly set to zero (thereby
indicating that one should wait until one has one packet for each
stream), the check used here might output a packet even if one does
not have packets for some of the sparse streams. This is in
contradiction to the documentation of max_interleave_delta.

(Nit: Use stream_count+sparse_count+noninterleaved_count.)

>      ) {
>          AVPacket *top_pkt = &s->internal->packet_buffer->pkt;
>          int64_t delta_dts = INT64_MIN;
> @@ -1078,12 +1084,23 @@ int ff_interleave_packet_per_dts(AVFormatContext *s, AVPacket *out,
>              delta_dts = FFMAX(delta_dts, last_dts - top_dts);
>          }
>  
> -        if (delta_dts > s->max_interleave_delta) {
> +        if (s->max_interleave_delta > 0 &&
> +            delta_dts > s->max_interleave_delta &&
> +            s->internal->nb_interleaved_streams == stream_count+noninterleaved_count+sparse_count) {
>              av_log(s, AV_LOG_DEBUG,
>                     "Delay between the first packet and last packet in the "
>                     "muxing queue is %"PRId64" > %"PRId64": forcing output\n",
>                     delta_dts, s->max_interleave_delta);
>              flush = 1;
> +        } else if (s->max_sparse_interleave_delta > 0 &&
> +                   delta_dts > s->max_sparse_interleave_delta &&
> +                   s->internal->nb_interleaved_streams == stream_count+sparse_count) {
> +            av_log(s, AV_LOG_DEBUG,
> +                   "Delay between the first packet and last packet in the "
> +                   "muxing queue is %"PRId64" > %"PRId64" and all delayed "
> +                   "streams are sparse: forcing output\n",
> +                   delta_dts, s->max_sparse_interleave_delta);
> +            flush = 1;
>          }
>      }
>  
Do all of these additional checks have a measurable performance impact?

Why didn't you include attached picture-streams (whether or not they
are timed thumbnails) among the sparse streams? (They are counted
among the nb_interleaved_streams and cause lots of delay*.)

The muxing code has even more gotchas: a06fac35 added these exceptions
for VP8 and VP9 (which exempted them from max_interleave_delta which
is actually against said field's documented behaviour) because of
delay caused by libvpx (does this issue still exist?), yet it is
applied indiscriminately even on streamcopy. This and the fact that
the check for how to treat streams without packets will be pretty long
when the checks for sparse streams (incl. attached pictures) have been
added leads me to the following proposal: We add a new field
interleavement_type to AVStream that the caller may set before
avformat_write_header() (or avformat_init_output() if that is called)
and its possible values are
AV_INTERLEAVEMENT_TYPE_NORMAL (this is a normal stream, i.e.
max_interleave_delta applies to it)
AV_INTERLEAVEMENT_TYPE_SPARSE (this is a sparse stream;
max_sparse_interleave_delta applies to it)
AV_INTERLEAVEMENT_TYPE_ESSENTIAL (this is an essential stream and one
should not write any output if one doesn't have a packet for it
(unless explicit flushing/EOF); this is how VP8 and VP9 are treated
right now)
AV_INTERLEAVEMENT_TYPE_IGNORE (not having a packet of a stream of this
type does not block outputting packets; attachment streams must not be
set to anything else)
and a default value that if set will be replaced during initilization
by a sane default value.
This gives a caller more fine-grained control of interleavement and
e.g. allows ffmpeg to treat VP8 and VP9 special when it comes out of
libvpx and normal in other situations.

Finally, this patch is only about reducing delay when waiting for
subtitles and if I am not mistaken, it does not help with situations
where subtitles are available too early like in ticket #7064; and it
only minimally affects e.g. ticket #6037 (with this patch, more
packets are output before writing the trailer (the length of the
muxing queue when writing the trailer is about
max_sparse_interleave_delta instead of max_interleave_delta)), doesn't it?

- Andreas

*: And for some reason if I just use ffmpeg -i <file with attached
pic> -map 0 -c copy -t 1 -f null - it reads the whole file; but that
has to be a bug somewhere else (not the muxing code).


More information about the ffmpeg-devel mailing list