[FFmpeg-devel] [PATCH] libavformat/matroskaenc: Add WebM DASH support

Thu May 22 00:13:48 CEST 2014

On Wed, May 21, 2014 at 02:03:47PM -0700, Vignesh Venkatasubramanian wrote:
> On Mon, May 19, 2014 at 2:37 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> > On Mon, May 19, 2014 at 10:09:59AM -0700, Vignesh Venkatasubramanian wrote:
> >> On Tue, May 13, 2014 at 10:27 PM, wm4 <nfxjfg at googlemail.com> wrote:
> >> > On Wed, 14 May 2014 01:23:35 +0200
> >> > Michael Niedermayer <michaelni at gmx.at> wrote:
> >> >
> >> >> On Tue, May 13, 2014 at 03:56:45PM -0700, Vignesh Venkatasubramanian wrote:
> >> >> > On Tue, May 13, 2014 at 3:50 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> >> >> > > On Tue, May 13, 2014 at 03:30:19PM -0700, Vignesh Venkatasubramanian wrote:
> >> >> > >> On Tue, May 13, 2014 at 10:07 AM, Vignesh Venkatasubramanian
> >> >> > >> <vigneshv at google.com> wrote:
> >> >> > >> > On Tue, May 13, 2014 at 6:08 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> >> >> > >> >> On Mon, May 12, 2014 at 11:25:08PM -0700, Vignesh Venkatasubramanian wrote:
> >> >> > >> >>> On Thu, May 8, 2014 at 10:11 AM, Vignesh Venkatasubramanian
> >> >> > >> >>> <vigneshv at google.com> wrote:
> >> >> > >> >>> > On Wed, May 7, 2014 at 3:37 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> >> >> > >> >>> >> On Wed, May 07, 2014 at 02:32:26PM -0700, Vignesh Venkatasubramanian wrote:
> >> >> > >> >>> >>> On Wed, May 7, 2014 at 1:40 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> >> >> > >> >>> >>> > On Wed, May 07, 2014 at 09:46:40AM -0700, Vignesh Venkatasubramanian wrote:
> >> >> > >> >>> >>> >> WebM DASH specification [1] requires the Clusters and Cues to be output in a
> >> >> > >> >>> >>> >> specific way. Adding a flag to matroskaenc that will enable support for
> >> >> > >> >>> >>> >> creating WebM/Mkv files conforming to the WebM DASH specification.
> >> >> > >> >>> >>> >>
> >> >> > >> >>> >>> >> [1] http://wiki.webmproject.org/adaptive-streaming/webm-dash-specification
> >> >> > >> >>> >>> >>
> >> >> > >> >>> >>> >> Signed-off-by: Vignesh Venkatasubramanian <vigneshv at google.com>
> >> >> > >> >>> >>> >> ---
> >> >> > >> >>> >>> >>  libavformat/matroskaenc.c | 68 ++++++++++++++++++++++++++++++++++-------------
> >> >> > >> >>> >>> >>  1 file changed, 50 insertions(+), 18 deletions(-)
> >> >> > >> >>> >>> >>
> >> >> > >> >>> >>> >> diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c
> >> >> > >> >>> >>> >> index e5e6116..3cf6e73 100644
> >> >> > >> >>> >>> >> --- a/libavformat/matroskaenc.c
> >> >> > >> >>> >>> >> +++ b/libavformat/matroskaenc.c
> >> >> > >> >>> >>> >> @@ -70,6 +70,7 @@ typedef struct mkv_seekhead {
> >> >> > >> >>> >>> >>
> >> >> > >> >>> >>> >>  typedef struct {
> >> >> > >> >>> >>> >>      uint64_t        pts;
> >> >> > >> >>> >>> >> +    int             stream_idx;
> >> >> > >> >>> >>> >>      int             tracknum;
> >> >> > >> >>> >>> >>      int64_t         cluster_pos;        ///< file offset of the cluster containing the block
> >> >> > >> >>> >>> >>      int64_t         relative_pos;       ///< relative offset from the position of the cluster containing the block
> >> >> > >> >>> >>> >> @@ -114,6 +115,8 @@ typedef struct MatroskaMuxContext {
> >> >> > >> >>> >>> >>      int cluster_size_limit;
> >> >> > >> >>> >>> >>      int64_t cues_pos;
> >> >> > >> >>> >>> >>      int64_t cluster_time_limit;
> >> >> > >> >>> >>> >> +    int is_dash;
> >> >> > >> >>> >>> >> +    int dash_track_number;
> >> >> > >> >>> >>> >>
> >> >> > >> >>> >>> >>      uint32_t chapter_id_offset;
> >> >> > >> >>> >>> >>      int wrote_chapters;
> >> >> > >> >>> >>> >> @@ -399,8 +402,8 @@ static mkv_cues * mkv_start_cues(int64_t segment_offset)
> >> >> > >> >>> >>> >>      return cues;
> >> >> > >> >>> >>> >>  }
> >> >> > >> >>> >>> >>
> >> >> > >> >>> >>> >> -static int mkv_add_cuepoint(mkv_cues *cues, int stream, int64_t ts, int64_t cluster_pos, int64_t relative_pos,
> >> >> > >> >>> >>> >> -                            int64_t duration)
> >> >> > >> >>> >>> >> +static int mkv_add_cuepoint(mkv_cues *cues, int stream, int tracknum, int64_t ts,
> >> >> > >> >>> >>> >> +                            int64_t cluster_pos, int64_t relative_pos, int64_t duration)
> >> >> > >> >>> >>> >>  {
> >> >> > >> >>> >>> >>      mkv_cuepoint *entries = cues->entries;
> >> >> > >> >>> >>> >>
> >> >> > >> >>> >>> >> @@ -413,7 +416,8 @@ static int mkv_add_cuepoint(mkv_cues *cues, int stream, int64_t ts, int64_t clus
> >> >> > >> >>> >>> >>      cues->entries = entries;
> >> >> > >> >>> >>> >>
> >> >> > >> >>> >>> >>      cues->entries[cues->num_entries].pts           = ts;
> >> >> > >> >>> >>> >> -    cues->entries[cues->num_entries].tracknum      = stream + 1;
> >> >> > >> >>> >>> >> +    cues->entries[cues->num_entries].stream_idx    = stream;
> >> >> > >> >>> >>> >> +    cues->entries[cues->num_entries].tracknum      = tracknum;
> >> >> > >> >>> >>> >>      cues->entries[cues->num_entries].cluster_pos   = cluster_pos - cues->segment_offset;
> >> >> > >> >>> >>> >>      cues->entries[cues->num_entries].relative_pos  = relative_pos;
> >> >> > >> >>> >>> >>      cues->entries[cues->num_entries++].duration    = duration;
> >> >> > >> >>> >>> >> @@ -443,7 +447,7 @@ static int64_t mkv_write_cues(AVIOContext *pb, mkv_cues *cues, mkv_track *tracks
> >> >> > >> >>> >>> >>          for (j = 0; j < num_tracks; j++)
> >> >> > >> >>> >>> >>              tracks[j].has_cue = 0;
> >> >> > >> >>> >>> >>          for (j = 0; j < cues->num_entries - i && entry[j].pts == pts; j++) {
> >> >> > >> >>> >>> >> -            int tracknum = entry[j].tracknum - 1;
> >> >> > >> >>> >>> >> +            int tracknum = entry[j].stream_idx;
> >> >> > >> >>> >>> >>              av_assert0(tracknum>=0 && tracknum<num_tracks);
> >> >> > >> >>> >>> >>              if (tracks[tracknum].has_cue)
> >> >> > >> >>> >>> >>                  continue;
> >> >> > >> >>> >>> >> @@ -646,8 +650,10 @@ static int mkv_write_tracks(AVFormatContext *s)
> >> >> > >> >>> >>> >>              get_aac_sample_rates(s, codec, &sample_rate, &output_sample_rate);
> >> >> > >> >>> >>> >>
> >> >> > >> >>> >>> >>          track = start_ebml_master(pb, MATROSKA_ID_TRACKENTRY, 0);
> >> >> > >> >>> >>> >> -        put_ebml_uint (pb, MATROSKA_ID_TRACKNUMBER     , i + 1);
> >> >> > >> >>> >>> >> -        put_ebml_uint (pb, MATROSKA_ID_TRACKUID        , i + 1);
> >> >> > >> >>> >>> >> +        put_ebml_uint (pb, MATROSKA_ID_TRACKNUMBER,
> >> >> > >> >>> >>> >> +                       mkv->is_dash ? mkv->dash_track_number : i + 1);
> >> >> > >> >>> >>> >> +        put_ebml_uint (pb, MATROSKA_ID_TRACKUID,
> >> >> > >> >>> >>> >> +                       mkv->is_dash ? mkv->dash_track_number : i + 1);
> >> >> > >> >>> >>> >>          put_ebml_uint (pb, MATROSKA_ID_TRACKFLAGLACING , 0);    // no lacing (yet)
> >> >> > >> >>> >>> >>
> >> >> > >> >>> >>> >>          if ((tag = av_dict_get(st->metadata, "title", NULL, 0)))
> >> >> > >> >>> >>> >> @@ -1425,7 +1431,8 @@ static void mkv_write_block(AVFormatContext *s, AVIOContext *pb,
> >> >> > >> >>> >>> >>
> >> >> > >> >>> >>> >>      put_ebml_id(pb, blockid);
> >> >> > >> >>> >>> >>      put_ebml_num(pb, size+4, 0);
> >> >> > >> >>> >>> >> -    avio_w8(pb, 0x80 | (pkt->stream_index + 1));     // this assumes stream_index is less than 126
> >> >> > >> >>> >>> >> +    // this assumes stream_index is less than 126
> >> >> > >> >>> >>> >> +    avio_w8(pb, 0x80 | (mkv->is_dash ? mkv->dash_track_number : (pkt->stream_index + 1)));
> >> >> > >> >>> >>> >>      avio_wb16(pb, ts - mkv->cluster_pts);
> >> >> > >> >>> >>> >>      avio_w8(pb, flags);
> >> >> > >> >>> >>> >>      avio_write(pb, data + offset, size);
> >> >> > >> >>> >>> >> @@ -1539,7 +1546,7 @@ static void mkv_flush_dynbuf(AVFormatContext *s)
> >> >> > >> >>> >>> >>      mkv->dyn_bc = NULL;
> >> >> > >> >>> >>> >>  }
> >> >> > >> >>> >>> >>
> >> >> > >> >>> >>> >> -static int mkv_write_packet_internal(AVFormatContext *s, AVPacket *pkt)
> >> >> > >> >>> >>> >> +static int mkv_write_packet_internal(AVFormatContext *s, AVPacket *pkt, int add_cue)
> >> >> > >> >>> >>> >>  {
> >> >> > >> >>> >>> >>      MatroskaMuxContext *mkv = s->priv_data;
> >> >> > >> >>> >>> >>      AVIOContext *pb = s->pb;
> >> >> > >> >>> >>> >> @@ -1596,8 +1603,13 @@ static int mkv_write_packet_internal(AVFormatContext *s, AVPacket *pkt)
> >> >> > >> >>> >>> >>          end_ebml_master(pb, blockgroup);
> >> >> > >> >>> >>> >>      }
> >> >> > >> >>> >>> >>
> >> >> > >> >>> >>> >> -    if ((codec->codec_type == AVMEDIA_TYPE_VIDEO && keyframe) || codec->codec_type == AVMEDIA_TYPE_SUBTITLE) {
> >> >> > >> >>> >>> >> -        ret = mkv_add_cuepoint(mkv->cues, pkt->stream_index, ts, mkv->cluster_pos, relative_packet_pos,
> >> >> > >> >>> >>> >> +    if ((codec->codec_type == AVMEDIA_TYPE_VIDEO && keyframe) ||
> >> >> > >> >>> >>> >> +        codec->codec_type == AVMEDIA_TYPE_SUBTITLE ||
> >> >> > >> >>> >>> >> +        add_cue) {
> >> >> > >> >>> >>> >> +        ret = mkv_add_cuepoint(mkv->cues,
> >> >> > >> >>> >>> >> +                               pkt->stream_index,
> >> >> > >> >>> >>> >> +                               mkv->is_dash ? mkv->dash_track_number : pkt->stream_index + 1,
> >> >> > >> >>> >>> >> +                               ts, mkv->cluster_pos, relative_packet_pos,
> >> >> > >> >>> >>> >>                                 codec->codec_type == AVMEDIA_TYPE_SUBTITLE ? duration : -1);
> >> >> > >> >>> >>> >>          if (ret < 0) return ret;
> >> >> > >> >>> >>> >>      }
> >> >> > >> >>> >>> >> @@ -1615,6 +1627,7 @@ static int mkv_write_packet(AVFormatContext *s, AVPacket *pkt)
> >> >> > >> >>> >>> >>      int64_t cluster_time;
> >> >> > >> >>> >>> >>      AVIOContext *pb;
> >> >> > >> >>> >>> >>      int ret;
> >> >> > >> >>> >>> >> +    int start_new_cluster;
> >> >> > >> >>> >>> >>
> >> >> > >> >>> >>> >>      if (mkv->tracks[pkt->stream_index].write_dts)
> >> >> > >> >>> >>> >>          cluster_time = pkt->dts - mkv->cluster_pts;
> >> >> > >> >>> >>> >> @@ -1632,11 +1645,26 @@ static int mkv_write_packet(AVFormatContext *s, AVPacket *pkt)
> >> >> > >> >>> >>> >>          cluster_size = avio_tell(pb);
> >> >> > >> >>> >>> >>      }
> >> >> > >> >>> >>> >>
> >> >> > >> >>> >>> >> -    if (mkv->cluster_pos != -1 &&
> >> >> > >> >>> >>> >> -        (cluster_size > mkv->cluster_size_limit ||
> >> >> > >> >>> >>> >> -         cluster_time > mkv->cluster_time_limit ||
> >> >> > >> >>> >>> >> -         (codec_type == AVMEDIA_TYPE_VIDEO && keyframe &&
> >> >> > >> >>> >>> >> -          cluster_size > 4 * 1024))) {
> >> >> > >> >>> >>> >> +    if (mkv->is_dash && codec_type == AVMEDIA_TYPE_VIDEO) {
> >> >> > >> >>> >>> >> +        // WebM DASH specification states that the first block of every cluster
> >> >> > >> >>> >>> >> +        // has to be a key frame. So for DASH video, we only create a cluster
> >> >> > >> >>> >>> >> +        // on seeing key frames.
> >> >> > >> >>> >>> >
> >> >> > >> >>> >>> >> +        start_new_cluster = keyframe;
> >> >> > >> >>> >>> >
> >> >> > >> >>> >>> > is it intended to have 1 frame per cluster for intra only videos ?
> >> >> > >> >>> >>>
> >> >> > >> >>> >>> yes. but this only done when -dash parameter is explicitly set. so
> >> >> > >> >>> >>> it's very unlikely that someone would want intra-only video for dash.
> >> >> > >> >>> >>
> >> >> > >> >>> >> Someone might already have a intra only video and want to transmit
> >> >> > >> >>> >> it without reencoding or
> >> >> > >> >>> >> a encoder might choose to encode multiple frames in a row to be
> >> >> > >> >>> >> encoded as intra frames (for example in a very high motion scene)
> >> >> > >> >>> >>
> >> >> > >> >>> >>
> >> >> > >> >>> >
> >> >> > >> >>> > that looks like a valid use case. i will rather limit it by
> >> >> > >> >>> > cluster_time_limit and force a keyframe with the encoder whenever a
> >> >> > >> >>> > cluster is to be created. will update the patch soon.
> >> >> > >> >>> >
> >> >> > >> >>>
> >> >> > >> >>> on second thought, using cluster_time_limit as a sole factor will
> >> >> > >> >>> produce incorrect DASH files in case of remuxing without encoding (as
> >> >> > >> >>> there might not be a key frame when the cluster_time_limit is
> >> >> > >> >>> reached).
> >> >> > >> >>>
> >> >> > >> >>> so there are two options here. one is to use cluster_time_limit as the
> >> >> > >> >>> sole factor (and force keyframes with the encode whenever the cluster
> >> >> > >> >>> time limit is reached). this will create incorrect files in case of
> >> >> > >> >>> remux. the second option is to use the existing patch. this will
> >> >> > >> >>> always produce correct DASH files (although it will be inefficient in
> >> >> > >> >>> the case of intra-only files).
> >> >> > >> >>>
> >> >> > >> >>> thoughts?
> >> >> > >> >>
> >> >> > >> >> I guess we should get it working & compliant first before trying to
> >> >> > >> >> make it behave better for less common cases
> >> >> > >> >> as this seems not as trivial and obvious to fix as it seemed first
> >> >> > >> >>
> >> >> > >> >
> >> >> > >> > ok fine.
> >> >> > >> >
> >> >> > >> >> did you had a chance to look at a fate test for this ?
> >> >> > >> >
> >> >> > >> > working on it. will update the patch when i have it.
> >> >> > >> >
> >> >> > >>
> >> >> > >> is there a way to check for container specific stuff in fate?
> >> >> > >> md5/framemd5/crc/framecrc all seem to operate on either individual
> >> >> > >> packets (not the container) or incrementally on individual packets.
> >> >> > >> since this patch is not going to change the actual contents of the
> >> >> > >> packet itself, none of this is useful in this case. i could just use
> >> >> > >> md5 on the whole file, but that will be way to fragile (for e.g.,
> >> >> > >> muxing application field on the track element is set to lavfxx.yy.zzz
> >> >> > >> and it will break when that changes). any other way to test this?
> >> >> > >
> >> >> > > what kind of check would you (ideall) want to have for this ?
> >> >> >
> >> >> > all we need to check is whether we are creating a new cluster on every
> >> >> > key frame when "-dash 1" is passed.
> >> >>
> >> >> maybe ffprobe and the mov demuxer could be extended to export this
> >> >> kind of information then the output from ffprobe could be used in a
> >> >> fate test.
> >> >> Could also be useful for debuging other mov/mp4 things
> >> >>
> >> >> using one of the unused AVPacket flags might be a easy option, not
> >> >> sure its the prettiest, maybe someone has a better idea
> >> >
> >> > Much less intrusive: add webm dash support to the mkv demuxer, and make
> >> > it blow up if the file is not compliant, and standard compliance is
> >> > enabled.
> >>
> >> i'm sorry. can you please explain this a little more? how exactly do i
> >> make it blow up if the file isn't compliant? because, this patch is
> >> always going to create a standard compliant file when the flag is
> >> passed.
> >
> > one way, check AV_EF_EXPLODE is set in the demuxer and fail by
> > returning AVERROR_... if the file is not compliant somehow
> >
> > then make sure AV_EF_EXPLODE is set for the fate test in question
> >
> 
> i still don't understand how to do this. the patch in it's current
> state will *always* produce a compliant file. so where exactly do i
> set the AV_EF_EXPLODE flag?

the patch you submited adds a feature to the muxer
a fate test would use that feature to mux a file and then demux it
to compare some crc or whatever
but as you noted, this wouldnt detect if the file was somehow broken
but stil demuxed correctly

so one idea would be to make the demuxer (optionally) more picky
and enable this so that when something is wrong it would fail hard
and not return a crc at all or return one for whaveter truncated
result there would be left after the failure.
This would be detected by fate
i dont think its mandatory to implement this if you consider it
not worth the troubble ...

> is there an example elsewhere that i can
> follow?

probably not directly or one could argue any validity check in any
demuxer is one, depending on the viewpoint

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If you think the mosad wants you dead since a long time then you are either
wrong or dead since a long time.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 181 bytes
Desc: Digital signature
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20140522/cdaec63b/attachment.asc>