[FFmpeg-devel] [PATCH] mov.c log qt ref external essence metadata

Sat Mar 6 14:11:06 EET 2021

On 2021-03-06 10:48, Jan Ekström wrote:
> On Sat, Mar 6, 2021 at 10:38 AM emcodem <emcodem at ffastrans.com> wrote:
>> 
>> ---
>>  libavformat/mov.c | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>> 
>> In quicktime reference files, exposing the parsed info for external 
>> essences location can be very handy for users
>> 
> 
> Unfortunately, as per the discussion we had yesterday on #ffmpeg, Data
> References are not as simple as mov.c might make it feel like.
> 

Thanks again for the chat yesterday! I thought I had better open the 
topic here so I can do the work async :D

> So, if we think that a single MOV/MP4 Track is:
> 1. A set of decode'able packets of a certain media type (as far as I
> can tell that has been the limitation, while other things can change
> during a Track the media type is one that doesn't).
> 2. Which is then presented according to a virtual time line (edit
> lists, which we will in this case ignore since they get applied on top
> of the decoded result on the presentation layer, and data references
> are on the packet set level).
> 
> Thus if we go through the layers:
> 1. We have samples (packets in FFmpeg parlance more or less, stsz
> defines the sizes of them and so forth).
> 2. Samples get put into chunks, which are basically a tuple of
> (sample_description_index, data offset) - see stsc, stco, co64.
> 3. Sample Description can be thought of as a tuple of (AVCodec, the
> extradata (if any) required, data reference index), there is a list of
> them in the Track's stsd box.
> 4. Then finally we get to the data reference list in the dref box of 
> the track.
> 
> Currently as far as I can tell from reading mov_read_stsd /
> ff_mov_read_stsd_entries, it does generate an extradata buffer for each
> sample description, but effectively only keeps a single data reference
> around in the MOVStreamContext, skipping the whole chunk matching etc
> part of things :) (if I am reading the code correctly, which I might
> not be).
> 
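
To check that I read the layers right, here is a minimal sketch of the 
lookup chain from the list above (chunk -> sample description -> data 
reference). The struct and function names are made up for illustration 
and are not what mov.c uses. I am also assuming both indexes are 
1-based, as they are in the file format.

```c
#include <stddef.h>

/* Hypothetical types for illustration; these are NOT the structs in mov.c. */
typedef struct DataRef {
    const char *path;       /* location of the essence; NULL means "this file" */
} DataRef;

typedef struct SampleDesc {
    int codec_id;           /* stand-in for the codec + extradata */
    int data_ref_index;     /* 1-based index into the dref list (assumption) */
} SampleDesc;

typedef struct Chunk {
    int sample_desc_index;  /* 1-based index into the stsd list (assumption) */
    long long offset;       /* byte offset inside the referenced file */
} Chunk;

/* Follow chunk -> sample description -> data reference.
 * Returns NULL if either index is out of range. */
static const DataRef *chunk_data_ref(const Chunk *c,
                                     const SampleDesc *descs, int nb_descs,
                                     const DataRef *drefs, int nb_drefs)
{
    if (c->sample_desc_index < 1 || c->sample_desc_index > nb_descs)
        return NULL;
    const SampleDesc *sd = &descs[c->sample_desc_index - 1];
    if (sd->data_ref_index < 1 || sd->data_ref_index > nb_drefs)
        return NULL;
    return &drefs[sd->data_ref_index - 1];
}
```

With this picture, two chunks of the same track can end up pointing at 
different data references, which a single per-stream value cannot 
express.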

Hmmm, I am not sure why you refer to extradata. How is it connected to 
the dref, besides both being stored at the stream context level? (But 
yes, I also understand that it would be overwritten for the current 
pseudotrack if it is called multiple times; not sure if that can happen 
though.)

> So yea, there's two questions:
> 1. Should this be exposed?

Well, it is vital information. mov.c unfortunately lacks a feature 
that the original QuickTime engine has: trying to resolve the 
referenced path at multiple locations (e.g. every connected root 
device/drive letter), so it occasionally fails to process qt ref 
files.
I am not experienced enough to add this missing part cross-OS in C, 
but exposing the path is cheap and simple. Once it is exposed, API 
users or scripters have a much easier time locating the media and 
setting cwd accordingly, or even working with the referenced media 
directly.
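
As a rough idea of what an API user could do with the exposed path, 
here is a minimal sketch that tries a list of candidate roots and 
returns the first location that can be opened. resolve_ref_path and 
its parameters are hypothetical, nothing like this exists in 
libavformat as far as I know, and I am assuming "/" as the separator 
and plain fopen() as the existence check.

```c
#include <stdio.h>

/* Hypothetical helper, not part of libavformat: try each candidate root in
 * order and write the first "<root>/<rel_path>" that can be opened for
 * reading into out. Returns 0 on success, -1 if no candidate works. */
static int resolve_ref_path(const char *const *roots, int nb_roots,
                            const char *rel_path, char *out, size_t out_size)
{
    for (int i = 0; i < nb_roots; i++) {
        int n = snprintf(out, out_size, "%s/%s", roots[i], rel_path);
        if (n < 0 || (size_t)n >= out_size)
            continue;               /* candidate does not fit, skip it */
        FILE *f = fopen(out, "rb");
        if (f) {
            fclose(f);
            return 0;               /* out now holds the working path */
        }
    }
    if (out_size)
        out[0] = '\0';              /* nothing found: leave an empty string */
    return -1;
}
```

A caller would fill roots with whatever makes sense on its platform 
(mount points, drive letters, the directory of the .mov itself) and 
then open the returned path directly or change cwd to it.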

> 2. If it should be exposed, how? A set of metadata this should not be,
> as this at the very least would end up being a weird set/list of byte
> offsets/sizes and references :)
> 
> So yea, sorry for things not actually being as simple as they look by
> the code in mov.c.

How could it end up as a weird set of byte offsets/references? I 
totally see your point that the dref is more on the same level as the 
media type, so kind of top-level, but I am missing an example of how 
to present an array of objects on that level.
What my code definitely lacks is the dref_id, so I imagine the "key", 
e.g. dref_path, would be better built as 
snprintf(key, sizeof(key), "dref_path_%d", sc->dref_id).
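
The key naming suggested above could be sketched like this. 
make_dref_key is a hypothetical helper, not something in the patch; in 
mov.c the resulting key would presumably be handed to av_dict_set() 
together with the path, but that part is left out so the snippet stays 
standalone.

```c
#include <stdio.h>

/* Hypothetical helper: build a per-reference metadata key such as
 * "dref_path_1". Returns 0 on success, -1 if the buffer is too small. */
static int make_dref_key(char *buf, size_t buf_size,
                         const char *base, int dref_id)
{
    int n = snprintf(buf, buf_size, "%s_%d", base, dref_id);
    return (n < 0 || (size_t)n >= buf_size) ? -1 : 0;
}
```

Calling make_dref_key(key, sizeof(key), "dref_path", sc->dref_id) once 
per reference would give each entry its own key instead of overwriting 
a single dref_path.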

What do you think?