[FFmpeg-devel] [PATCH 12/20] avformat/matroskaenc: Improve Cues in case of no video

Mon Apr 6 16:08:09 EEST 2020

Jan Chren (rindeal):
> On Sun, 5 Apr 2020 at 16:01, Andreas Rheinhardt
> <andreas.rheinhardt at gmail.com> wrote:
>>
>> The Matroska muxer currently only adds CuePoints in three cases:
>> a) For video keyframes. b) For the first audio frame in a new Cluster if
>> in DASH-mode. c) For subtitles. This means that ordinary Matroska audio
>> files won't have any Cues which impedes seeking.
>>
>> This commit changes this. For every track in a file without video track
>> it is checked and tracked whether a Cue entry has already been added
>> for said track for the current Cluster. This is used to add a Cue entry
>> for each first packet of each track in each Cluster.
>>
>> Implements #3149.
>>
>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt at gmail.com>
>> ---
> 
> Fixed at last, danke schön! This was a very annoying bug.
> 
> One thing I noticed is, however, that the spec recommends "CuePoint
> Elements SHOULD reference audio keyframes at most once every 500
> milliseconds" [1], but this is not checked currently.
> 
> [1]: https://cellar-wg.github.io/matroska-specification/cues.html

I actually pondered whether I should opt for a time-based approach (like
mkvmerge with its 2s) or for the approach that I implemented here. I
chose the latter because with the former one has to perform two seeks
when seeking (when one makes use of the CueRelativePosition element) or
one has to parse a potentially nonnegligible amount of data from the
Cluster. It is the same reason one nowadays starts new Clusters upon
seeing a video keyframe.
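
To illustrate the per-Cluster approach from the commit message, here is a
minimal sketch. It is not the code from the patch: CueTracker, MAX_TRACKS
and both function names are hypothetical, and it only covers the case of
a file without a video track.

```c
#include <stdbool.h>

#define MAX_TRACKS 8 /* hypothetical fixed upper bound, for this sketch only */

/* Hypothetical bookkeeping: one flag per track, reset for every Cluster. */
typedef struct CueTracker {
    int  nb_tracks;
    bool has_video;            /* true if the file contains a video track */
    bool has_cue[MAX_TRACKS];  /* Cue already added in the current Cluster? */
} CueTracker;

/* Call whenever a new Cluster is started. */
static void cue_tracker_new_cluster(CueTracker *t)
{
    for (int i = 0; i < t->nb_tracks; i++)
        t->has_cue[i] = false;
}

/*
 * Returns true if a Cue entry should be added for a packet of the given
 * track, i.e. if it is the first packet of that track in the current
 * Cluster. Files with a video track are not handled by this sketch.
 */
static bool cue_tracker_want_cue(CueTracker *t, int track)
{
    if (t->has_video || track < 0 || track >= t->nb_tracks)
        return false;
    if (t->has_cue[track])
        return false;
    t->has_cue[track] = true;
    return true;
}
```

With this bookkeeping every Cue points at the first packet of a track
within a Cluster, so a demuxer never has to skip over a large amount of
data after seeking to the position given by the Cue.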

That being said it is very hard to break this limit when using the
default values for cluster_size_limit (5 MiB) and cluster_time_limit.
You'd need to have 80 Mib/s (i.e. 5 MiB within 500 ms) to do that and
that would be very uncommon for non-video (even for short spikes).
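
The 80 Mib/s figure follows from filling a 5 MiB Cluster within the 500 ms
from the spec's recommendation. A small sketch of that arithmetic (the
function name is hypothetical and not part of the muxer):

```c
#include <stdint.h>

/*
 * Bitrate in bits per second needed to fill a Cluster of cluster_bytes
 * bytes within interval_ms milliseconds. Only above this rate would the
 * size limit start new Clusters (and hence add Cues) more often than
 * once per interval.
 */
static int64_t bitrate_to_fill_cluster(int64_t cluster_bytes, int64_t interval_ms)
{
    return cluster_bytes * 8 * 1000 / interval_ms;
}
```

For cluster_bytes = 5 * 1024 * 1024 and interval_ms = 500 this gives
83886080 bits per second, i.e. 80 Mib/s.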

And if the user uses non-default values, then it is because the user
explicitly wants to have smaller Clusters and such a user probably wants
to have more Cue entries. I don't see a reason why an arbitrary value
from the spec should override this (mkvmerge btw also allows creating
files that violate this recommendation if the user opts to add Cue
entries for every frame). I think it should rather be the recommendation
from the spec that should be adapted.

- Andreas