[FFmpeg-devel] [PATCH] Metadata

Aurelien Jacobs aurel
Tue Jan 6 00:55:06 CET 2009


Baptiste Coudurier wrote:

> Michael Niedermayer wrote:
> > On Mon, Jan 05, 2009 at 11:40:12AM -0800, Baptiste Coudurier wrote:
> >> Hi Michael,
> >>
> >> Michael Niedermayer wrote:
> >>> On Sat, Jan 03, 2009 at 03:26:05PM -0800, Baptiste Coudurier wrote:
> >>>> Hi Michael,
> >>>>
> >>>> Michael Niedermayer wrote:
> >>>>> [...]
> >>>>  >
> >>>>> + * 3. A tag whichs value is translated has the ISO 639 3-letter language code
> >>>>> + *    with a '-' between appended. So for example Author-ger=Michael, Author-eng=Mike
> >>>>> + *    the original/default language is in the unqualified "Author"
> >>>>> + *    A demuxer should set a default if it sets any translated tag.
> >>>>  >
> >>>>> [...]
> >>>>  >
> >>>>> +typedef struct {
> >>>>> +    char *key;
> >>>>> +    char *value;
> >>>>> +}AVMetaDataTag;
> >>>> Maybe it would be simpler and more extensible to have a "const char 
> >>>> **attributes" field where to store language, or anything else related to 
> >>>> the AVMetaDataTag entry. This would avoid parsing the '-'.
> >>>>
> >>>> What do people think ?
> >>> I am against it, let me explain why
> >>>
> >>> First, currently metadata support in svn is "too little" that is nothing
> >>> is really supported, no preserving of arbitrary tags, no way for users to
> >>> add anything but 5 standard tags ...
> >> I definitely agree.
> >>
> >>> Aurel's variant, that had a language field and did use a tree based metadata
> >>> system allowing metadata about metadata is IMHO "too much". It's not something
> >>> anyone should need, nor is it really needed for language & metadata about
> >>> metadata, and still it wouldn't be able to handle all metadata about other
> >>> metadata like "the email address of the child of the author and producer"
> >>>
> >>> my suggestion of a simple key-value based system
> >>> can be stored in any container that supports key-value string based
> >>> metadata, and still can represent language and metadata about other metadata.
> >>> Also it can very easily be implemented efficiently, currently all operations
> >>> are O(n) thus it would become slow if there are many tags. But if we would
> >>> use tree.c/h it would all just be O(log n) and its very easy to use tree.c/h
> >>> with it ...
> >>>
> >>> Now if we do add attributes
> >>> * The api to search for tags becomes more complex
> >>> * It is more difficult to use tree.c/h (it needs, like qsort, a sanely
> >>>   behaving comparison function, which is trivial for char*, less
> >>>   so with an additional attribute list, and even a lot less if we want
> >>>   to actually search for specific attributes)
> >>>
> >>> * No container I know supports arbitrary attributes, thus muxers would
> >>>   either have to convert the attribute list into a string or extract the
> >>>   2 or 3 they support.
> >> Well, these are good points.
> >> To be clear, I'm not suggesting a tree metadata scheme, but a way to
> >> easily specify these key/value metadata details.
> >>
> >> Like language, type (comes from .mov so expect '\r' as line separator,
> >> encoding is UTF8, etc...)
> >>
> > 
> >> Parsing for '-' is not convenient, 
> > 
> > either there's a single string, in which case some muxers have to parse for -
> > or
> > there are many fields, in which case some other muxers have to combine them
> > in a single string.
> 
> Which muxers ?
> How does .mkv store lang metadata info if it does so ?

mkv is similar to mov. It stores the tag language in its own field (ISO 639-2).

> All I see is that for .mov you would have to concatenate key name and
> lang, and muxer would have to split lang from metadata.

In fact you would just call av_metadata_set_with_lang() and
av_metadata_get_with_lang() instead of the basic API
(possibly with better function names).
Duplicating this concatenation/splitting in every (de)muxer
that needs it would be bad.
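
A minimal sketch of the concatenation/splitting such helpers could share,
assuming the "key-lang" convention from the patch (e.g. "Author-ger").
The names meta_join_key_lang() and meta_split_key_lang() are hypothetical
and are not existing FFmpeg API; how the real av_metadata_*_with_lang()
functions would store or look up the resulting key is left open here.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not FFmpeg API): write "key" or "key-lang" into buf.
 * Returns 0 on success, -1 if the result does not fit in size bytes. */
static int meta_join_key_lang(char *buf, size_t size,
                              const char *key, const char *lang)
{
    int n;

    assert(buf && key);
    if (lang && *lang)
        n = snprintf(buf, size, "%s-%s", key, lang);
    else
        n = snprintf(buf, size, "%s", key);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical helper (not FFmpeg API): split "key-lang" back into its parts.
 * The text after the last '-' counts as a language only if it is exactly
 * three lowercase letters; otherwise the whole string is the key and lang
 * is set to "". Returns 0 on success, -1 if key does not fit in key_size. */
static int meta_split_key_lang(const char *full, char *key, size_t key_size,
                               char lang[4])
{
    const char *dash;
    size_t key_len;

    assert(full && key && lang);
    dash    = strrchr(full, '-');
    key_len = strlen(full);
    lang[0] = '\0';

    if (dash && dash != full && strlen(dash + 1) == 3 &&
        islower((unsigned char)dash[1]) &&
        islower((unsigned char)dash[2]) &&
        islower((unsigned char)dash[3])) {
        memcpy(lang, dash + 1, 4);      /* three letters plus the NUL */
        key_len = (size_t)(dash - full);
    }
    if (key_len >= key_size)
        return -1;
    memcpy(key, full, key_len);
    key[key_len] = '\0';
    return 0;
}
```

With these, a mov or mkv demuxer would pass ("Author", "ger") to the join
helper, and the matching muxer would call the split helper to get the key and
language back. Note that the split is a heuristic: a key that happens to end
in '-' plus three lowercase letters would be misread as carrying a language,
so a real implementation would likely need a stricter rule.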

> I don't know of any container that uses "key"-"lang" as metadata scheme
> (nut maybe ?).

This is indeed the currently proposed format for nut.

> > The conversion doesn't disappear and because of this IMHO I would prefer the
> > simpler internal representation.
> 
> How would you specify to the user that data stored in value is raw data
>  (like jpeg cover), encoded in UTF8/16, special like '\r' line ended ?
> 
> I believe we need a way to specify this to facilitate usage, and this
> could fall into attributes.

Ideally we should support only one common format (UTF8, etc...) so that
data can be re-muxed between any containers.

Aurel



