[FFmpeg-devel] [PATCH 03/23] avformat/matroskaenc: Use random TrackUID

Andreas Rheinhardt andreas.rheinhardt at gmail.com
Sun Nov 24 11:08:00 EET 2019


Michael Niedermayer:
> On Wed, Nov 06, 2019 at 03:49:02AM +0100, Andreas Rheinhardt wrote:
>> Up until now, the TrackUID of a Matroska track which is supposed to be
>> random was not random at all: It always coincided with the TrackNumber
>> which is usually the 1-based index of the corresponding stream in the
>> array of AVStreams. This has been changed: It is now set via an AVLFG
>> if AVFMT_FLAG_BITEXACT is not set. Otherwise it is set like it is set
>> now (the only change happens if an explicit track number has been
>> choosen via dash_track_number, because the system used in the normal
>> situation is now used, too). In particular, no FATE tests need to be
>> updated.
>>
>> This also fixes a bug in case the dash_track_number option was used:
>> In this case the TrackUID was set to the track number, but the tags were
>> written with a TagTrackUID simply based upon the index, so that the tags
>> didn't apply to the track they ought to apply to.
>>
>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt at gmail.com>
>> ---
>> mkv_get_uid() might be overkill, but I simply wanted to be sure that
>> there are no collisions.
>>
>>  libavformat/matroskaenc.c | 65 ++++++++++++++++++++++++++++++++++-----
>>  1 file changed, 57 insertions(+), 8 deletions(-)
>>
>> diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c
>> index de57e474be..b87d15b7ff 100644
>> --- a/libavformat/matroskaenc.c
>> +++ b/libavformat/matroskaenc.c
>> @@ -94,6 +94,7 @@ typedef struct mkv_cues {
>>  typedef struct mkv_track {
>>      int             write_dts;
>>      int             has_cue;
>> +    uint32_t        uid;
>>      int             sample_rate;
>>      int64_t         sample_rate_offset;
>>      int64_t         last_timestamp;
>> @@ -1199,8 +1200,7 @@ static int mkv_write_track(AVFormatContext *s, MatroskaMuxContext *mkv,
>>      track = start_ebml_master(pb, MATROSKA_ID_TRACKENTRY, 0);
>>      put_ebml_uint (pb, MATROSKA_ID_TRACKNUMBER,
>>                     mkv->is_dash ? mkv->dash_track_number : i + 1);
>> -    put_ebml_uint (pb, MATROSKA_ID_TRACKUID,
>> -                   mkv->is_dash ? mkv->dash_track_number : i + 1);
>> +    put_ebml_uint (pb, MATROSKA_ID_TRACKUID, mkv->tracks[i].uid);
>>      put_ebml_uint (pb, MATROSKA_ID_TRACKFLAGLACING , 0);    // no lacing (yet)
>>  
>>      if ((tag = av_dict_get(st->metadata, "title", NULL, 0)))
>> @@ -1650,7 +1650,8 @@ static int mkv_write_tags(AVFormatContext *s)
>>          if (!mkv_check_tag(st->metadata, MATROSKA_ID_TAGTARGETS_TRACKUID))
>>              continue;
>>  
>> -        ret = mkv_write_tag(s, st->metadata, MATROSKA_ID_TAGTARGETS_TRACKUID, i + 1);
>> +        ret = mkv_write_tag(s, st->metadata, MATROSKA_ID_TAGTARGETS_TRACKUID,
>> +                            mkv->tracks[i].uid);
>>          if (ret < 0) return ret;
>>      }
>>  
>> @@ -1658,13 +1659,15 @@ static int mkv_write_tags(AVFormatContext *s)
>>          for (i = 0; i < s->nb_streams; i++) {
>>              AVIOContext *pb;
>>              AVStream *st = s->streams[i];
>> +            mkv_track *track = &mkv->tracks[i];
>>              ebml_master tag_target;
>>              ebml_master tag;
>>  
>>              if (st->codecpar->codec_type == AVMEDIA_TYPE_ATTACHMENT)
>>                  continue;
>>  
>> -            mkv_write_tag_targets(s, MATROSKA_ID_TAGTARGETS_TRACKUID, i + 1, &tag_target);
>> +            mkv_write_tag_targets(s, MATROSKA_ID_TAGTARGETS_TRACKUID,
>> +                                  track->uid, &tag_target);
>>              pb = mkv->tags_bc;
>>  
>>              tag = start_ebml_master(pb, MATROSKA_ID_SIMPLETAG, 0);
>> @@ -1863,10 +1866,6 @@ static int mkv_write_header(AVFormatContext *s)
>>              version = 4;
>>      }
>>  
>> -    mkv->tracks = av_mallocz_array(s->nb_streams, sizeof(*mkv->tracks));
>> -    if (!mkv->tracks) {
>> -        return AVERROR(ENOMEM);
>> -    }
>>      ebml_header = start_ebml_master(pb, EBML_ID_HEADER, MAX_EBML_HEADER_SIZE);
>>      put_ebml_uint  (pb, EBML_ID_EBMLVERSION       ,           1);
>>      put_ebml_uint  (pb, EBML_ID_EBMLREADVERSION   ,           1);
>> @@ -2667,8 +2666,42 @@ static int webm_query_codec(enum AVCodecID codec_id, int std_compliance)
>>      return 0;
>>  }
>>  
>> +static uint32_t mkv_get_uid(const mkv_track *tracks, int i, AVLFG *c)
>> +{
>> +    uint32_t uid;
>> +
>> +    for (int j = 0, k; j < 5; j++) {
>> +        uid = av_lfg_get(c);
>> +        if (!uid)
>> +            continue;
>> +        for (k = 0; k < i; k++) {
>> +            if (tracks[k].uid == uid) {
>> +                /* Was something wrong with our random seed? */
>> +                av_lfg_init(c, av_get_random_seed());
>> +                break;
>> +            }
>> +        }
>> +        if (k == i)
>> +            return uid;
>> +    }
>> +
>> +    /* Test the numbers from 1 to i. */
>> +    for (int j = 1, k; j < i + 1; j++) {
>> +        for (k = 0; k < i; k++) {
>> +            if (tracks[k].uid == j)
>> +                break;
>> +        }
>> +        if (k == i)
>> +            return j;
>> +    }
>> +    /* Return i + 1. It can't coincide with another uid. */
>> +    return i + 1;
>> +}
>> +
>>  static int mkv_init(struct AVFormatContext *s)
>>  {
>> +    MatroskaMuxContext *mkv = s->priv_data;
>> +    AVLFG c;
>>      int i;
>>  
>>      if (s->nb_streams > MAX_TRACKS) {
>> @@ -2697,7 +2730,23 @@ static int mkv_init(struct AVFormatContext *s)
>>          s->internal->avoid_negative_ts_use_pts = 1;
>>      }
>>  
>> +    mkv->tracks = av_mallocz_array(s->nb_streams, sizeof(*mkv->tracks));
>> +    if (!mkv->tracks) {
>> +        return AVERROR(ENOMEM);
>> +    }
>> +
> 
>> +    if (!(s->flags & AVFMT_FLAG_BITEXACT))
>> +        av_lfg_init(&c, av_get_random_seed());
> 
> would there be a disadvantage if the configuration of a stream / its content
> is used a seed ?
> That way it would be more deterministic
> 
I see several disadvantages:
1. If one duplicates a stream (i.e. muxes it twice in the same file),
they would get the same uid (unless one takes additional measures).
2. If the packet content is used, one can determine the uid only after
having received a packet. This would mean that one has to postpone
writing the Tracks element until one has a packet and the same goes
for the tags given that the uid is used to convey what they apply to.
One could solve this problem in the seekable, non-livestreaming case
by reserving size to be overwritten later, but this is nontrivial and
I'd like to avoid that.
3. If the configuration record is used, then you won't get anything
remotely unique: Using the same encoder settings will likely produce
files with identical uids.

Furthermore, this approach means that I can reuse the code for both
attachments as well as tracks. (See the next patch in this series.)

- Andreas


More information about the ffmpeg-devel mailing list