[FFmpeg-devel] [PATCH] metadata conversion API

Baptiste Coudurier baptiste.coudurier
Sat Feb 28 22:24:41 CET 2009

On 2/28/2009 5:57 AM, Michael Niedermayer wrote:
> On Fri, Feb 27, 2009 at 06:15:16PM -0800, Baptiste Coudurier wrote:
>> Michael Niedermayer wrote:
>>> On Fri, Feb 27, 2009 at 04:44:29PM -0800, Baptiste Coudurier
>>> wrote:
>>>> On 2/27/2009 4:30 PM, Aurelien Jacobs wrote:
>>>>> Baptiste Coudurier wrote:
>>>>>> On 2/25/2009 5:13 PM, Aurelien Jacobs wrote:
>>>>>>> Hi,
>>>>>>> There is one last and important issue I want to address
>>>>>>> with the new metadata API. Old API allowed client apps
>>>>>>> and muxers to get a few select well known tags (title,
>>>>>>> author...). With the new API, there is no simple way to 
>>>>>>> do that, right now. For example, if you demux an ASF
>>>>>>> file, and want to get the name of the album,
>>>>>>> av_metadata_get(..., "album", ...) won't give you any
>>>>>>> results, because ASF stores this information in a tag 
>>>>>>> named "AlbumTitle". There are lots of examples with
>>>>>>> various demuxers, even for simple common tags. This also
>>>>>>> prevents correct remuxing between different containers.
>>>>>> First, thanks for your work Aurel, this is greatly
>>>>>> appreciated.
>>>>> You're welcome :-)
>>>>>> I have a few ideas: Would it be possible to also export
>>>>>> container information as "metadata" ?
>>>>> This is definitely possible.
>>>>>> Like aspect ratio, width, height. This would avoid
>>>>>> duplicating fields from AVCodecContext, and make this
>>>>>> information available as a simple way to user wanting for
>>>>>> example to use libavformat to retrieve media information.
>>>>> I think it highly depends on the target usage of the various
>>>>> fields you want to export. If those fields are mostly
>>>>> targeted for textual user information, using metadata is
>>>>> probably the way to go. But if the fields are mostly targeted
>>>>> for numerical parameters to the software, then metadata may
>>>>> not be the most practical way to go.
>>>> It would be an easy atoi/atoll conversion though. I will propose again
>>>> my solution to have some "type" field where we could set "INT",
>>>> "INT64", "UTF8", "RATIONAL", etc...
>>> what is the difference between a string representing INT and
>>> INT64 ? if atoi() is no problem then it also should be no problem
>>> to check if a string is a number. This has the advantage that it
>>> can be muxed in containers that do not support storing such
>>> information.
>> The advantage is a direct mapping atoi/atoll.
> i do not understand what you mean
>> If the muxer wants to take the risk to store INT64 into INT it will
>> check the value.
> elaborate please, what you say makes no sense at all to me
> a muxer always needs to check the value it parses from an unknown
> string, some metadata might be marked as INT but really won't be
> numeric at all and what you suggest sounds like
>     if (type == INT64) { x = atol(str); if (x < INT_MIN || x > INT_MAX) ... }
>     else { x = atoi(str); }
> while no matter if there are separate types or not the muxer has to do
>     x = strtoll(str, &tail, 10);
>     if (tail == str || tail != str + strlen(str) || x < MIN || x > MAX) ...
> atol/atoi should not be used, even the libc docs say this "The `atoi'
> function is also considered obsolete; use `strtol' instead."

OK for strtol/strtoll. What I meant, and I'll try to explain it better, is:
if you export "mov/duration" along with INT64, the user easily knows that
the number stored as a string is potentially an INT64, so he can act
accordingly, with a table, as is done with AVOptions.
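
A minimal sketch of that idea, assuming a hypothetical type hint attached
to each exported tag (MetaType, TypedTag and typed_tag_get_int are made up
for illustration and are not part of libavformat). The hint only selects
the range to validate against; the string itself is still parsed strictly
with strtoll, as suggested above.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical type hints; not part of any FFmpeg API. */
enum MetaType { META_UTF8, META_INT, META_INT64 };

/* Hypothetical typed tag: key, string value, and a hint about the value. */
struct TypedTag {
    const char   *key;
    const char   *value;
    enum MetaType type;
};

/*
 * Parse tag->value as a decimal integer, using the hint only to pick the
 * allowed range. Returns 0 on success, -1 if the tag is not hinted as
 * numeric, is malformed, or is out of range.
 */
static int typed_tag_get_int(const struct TypedTag *tag, int64_t *out)
{
    long long lo, hi, v;
    char *tail;

    switch (tag->type) {
    case META_INT:   lo = INT_MIN;   hi = INT_MAX;   break;
    case META_INT64: lo = INT64_MIN; hi = INT64_MAX; break;
    default:         return -1;
    }

    errno = 0;
    v = strtoll(tag->value, &tail, 10);
    /* Reject empty input, trailing garbage and overflow. */
    if (tail == tag->value || *tail != '\0' || errno == ERANGE)
        return -1;
    if (v < lo || v > hi)
        return -1;

    *out = v;
    return 0;
}
```

With this, a "mov/duration" value of "5000000000" is accepted when hinted
as INT64 and rejected when hinted as INT on platforms with a 32-bit int.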

>>> Or how would you store these types? If they are lost on remuxing
>>> or their types are randomized then they aren't particularly
>>> useful IMHO
>>>> I don't think it might be too bloated, and would be really
>>>> generic and useful. We have too many fields in AVCodecContext
>>>> IMHO.
>>> I dont mind at all if information is exported using some
>>> name-value system but there are some things that are very
>>> important 1. it must be simple & clean & small & fast 2. These
>>> fields would be the public API and as such would be constrained by
>>> the same compatibility issues that apply to fields in
>>> AVCodecContext. And it seems to me you do not realize this at all
>>> due to your example usage, but a demuxer could under no
>>> circumstances export a random "width" "720" that differed in
>>> meaning from a "width" value of another demuxer.
>> "width" in .mov is "width" in .mov, not "width" in .mkv.
> yes but it should be called ".mov/width" and not "width" the reason
> is as i already said because of the mess you create when remuxing
> this.

If your concern is exporting ".mov/width" or "mov/width" then yes I
agree this might be better, you could have said that it was the naming
which you were against.
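
To make the naming concrete, here is a small sketch of format-prefixed keys,
assuming a plain array of key/value pairs in place of the real metadata
structure (struct Tag, make_format_key and find_tag are hypothetical names
used only for illustration).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical key/value pair standing in for a metadata tag. */
struct Tag {
    const char *key;
    const char *value;
};

/*
 * Build "<format>/<field>", e.g. "mov/width".
 * Returns 0 on success, -1 if buf is too small.
 */
static int make_format_key(char *buf, size_t size,
                           const char *format, const char *field)
{
    int n = snprintf(buf, size, "%s/%s", format, field);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Exact-match lookup; returns the value, or NULL when the key is absent. */
static const char *find_tag(const struct Tag *tags, size_t count,
                            const char *key)
{
    for (size_t i = 0; i < count; i++)
        if (!strcmp(tags[i].key, key))
            return tags[i].value;
    return NULL;
}
```

A lookup of "mov/width" then finds the demuxer-specific value, while a
lookup of the bare "width" finds nothing unless some demuxer exports a tag
with that generic name.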

>> Well I think 100% mapping between containers is _impossible_, and 
>> therefore I don't see a need to find a common denominator.
> Sounds like the common argument of: "a solution working in 100% of
> the cases is impossible so i will choose one that works in 0% instead
> of one that works in 99%."
> I mean what you just said is pretty much that you don't want the user to be
> able to use libavformat without having specific code for each
> demuxer.
> and again i like to repeat, that i don't mind you exporting the width
> from mov but you CANNOT export this in a field that is ambiguously
> named, that is ".mov/width" is fine "width" is not

Not really, what I meant is that a solution working in 90% of the cases is
better than nothing. I want the user to be able to use lavf easily, and I
also want power users with knowledge to be able to do what they want.

>> I don't see the need to "restrict" a metadata to a meaning.
>> Someone who knows even more the format can retrieve this value if
>> he wishes and do whatever he wants with it.
> only if you export the whole set of parameters, do you plan to do
> this?

I plan to export as many values as I will need: the "pasp", "colr" and
"fiel" atoms for mov at least. I plan to export many values from the MXF
demuxer too. Unfortunately, these values cannot really be mapped to
anything existing anywhere else.

I'd like to export some values from the MPEG-2 bitstream if possible,
without adding an AVCodecContext field for each piece of information.

> there are parameters allowing a full affine or linear transformation
> of the image IIRC. just knowing the final width is not enough to the
> user application author who knows mov better than the mov demuxer
> maintainer as you seem to suggest ...

This may be true. The mechanism I propose makes it easy to export
information that can be useful to me, without adding new fields.

The .mov width is one problem. I think, however, that a simple crop will
work in 90% of the cases, so access to this information will fix 90% of
the problem, this is better than nothing.

>> I certainly don't understand 100% of .mp4/.mov, I don't see why
>> user should be restricted by my ignorance, while I can export
>> everything in "udta" assuming the formatting is standardized.
> The problem is you don't export the information a user would need, you
> just suggested to export the final width in a misnamed metadata tag

Well, you think it is misnamed. The term "width" is written in the .mov and
.mp4 specs, so I don't think it is _mis_named. It may not be complete
enough for your taste and you want to add "mov/" before it; I'm not against
this, though checking AVInputFormat could be enough.

>> User can retrieve "width" in .mov if he wants, a simple check on
>> input format is enough.
> but why? if you export it as ".mov/width" the check won't be needed
> ...

I'm not against exporting "mov/width" or ".mov/width".

>> Yes, this would be part of public api, I don't see this a problem.
>>> Thus i believe that in the end this thing is adding a lot of shiny
>>> layers that do nothing at all. But feel free to prove me wrong
>>> ...
>> Many people want to use FFmpeg to display information about a
>> file, including metadata. This is IMHO a good and simple way to
>> achieve it
> AVOptions allows full enumeration of all fields with names so no new 
> possibilities are opened by making AVCodecContext.* switch to this 
> name/value system

AVOptions is useful for setting fields in a structure, I don't
think it would be the easiest way to export arbitrary information.

> [...]
>>> So what would this all really solve? * Fewer inactive fields,
>>> currently large parts of AVCodecContext are unused for some
>>> codecs though the bigger codecs probably use most. * No
>>> psychological issue with adding field to a large struct, though
>>> people would have to add them to an equally long documentation
>>> that documented the exact meaning and format, who is allowed to
>>> set (app/lavc/lavf), ...
>>> What does it not solve? * exporting mov cropping rectangle with an
>>> API that differs from mpeg1/2 and h264 cropping rectangles.
>> It can solve this issue, if user wants to honor it, user can
>> because he has access to it. Granted he knows what the value means,
>> but again our ignorance should not prevent him from achieving his
>> goal.
> see above, you first have to export the whole transformation matrix,
> the width/height is not enough to differentiate scale from crop no
> matter how smart the user is. Also a user knowing mov internals so
> well is likely going to write his own demuxer.

Exporting "mov/transformation_matrix" is possible.
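
As a rough sketch of what such a tag value could look like, assuming the
nine matrix entries are exported as raw fixed-point integers in a
comma-separated string (the string format and both helper names are made
up here; nothing like this is defined in lavf):

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Write nine integers as "a,b,c,d,e,f,g,h,i".
 * Returns 0 on success, -1 if buf is too small.
 */
static int matrix_to_string(const int32_t m[9], char *buf, size_t size)
{
    size_t used = 0;

    for (int i = 0; i < 9; i++) {
        int n = snprintf(buf + used, size - used, "%s%" PRId32,
                         i ? "," : "", m[i]);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}

/*
 * Parse exactly nine comma-separated integers back into m.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
static int matrix_from_string(const char *s, int32_t m[9])
{
    for (int i = 0; i < 9; i++) {
        char *tail;
        long v;

        errno = 0;
        v = strtol(s, &tail, 10);
        if (tail == s || errno == ERANGE || v < INT32_MIN || v > INT32_MAX)
            return -1;
        /* Expect a comma between entries and end of string after the last. */
        if (*tail != (i < 8 ? ',' : '\0'))
            return -1;
        m[i] = (int32_t)v;
        if (i < 8)
            s = tail + 1;
    }
    return 0;
}
```

Keeping the raw integers avoids any rounding, so a mov to mov remux could
write the same matrix back unchanged.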

>>> * allowing demuxer maintainers to export fields with arbitrary
>>> name and value, instead each addition would need to be discussed
>>> to find the format for the value and the name that is best for
>>> most demuxers.
>> I don't agree with this, see above. "width" in .mov != "width" in
>> .mkv, IMHO it's utopian to try to make everything fit into the
>> same shape.
> then do NOT use the name "width" if it is as you say not "width"

It's ok if it's just about the name :>

>>> * more compatibility for apps, apps already can through
>>> AVOptions set and get by name and enumerate fields.
>> AVOptions uses OPT_<type>, doesn't it? Why don't you want to apply
>> this to AVMetadata ?
> i explained it already above:
>>> [...] This has the advantage that it can be muxed in containers
>>> that do not support storing such information.
> [...]
>>> Or how would you store these types? If they are lost on remuxing
>>> or their types are randomized then they aren't particularly
>>> useful IMHO

Well, they are useful for gathering information, printing metadata and
debugging, maybe less useful for remuxing between containers. However, mov
to mov remuxing could end up pretty accurate.
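
One way a remuxing path could use the prefix, sketched under the assumption
that keys without a "/" are generic and keys of the form "<format>/<name>"
only make sense for that format (tag_applies_to is a hypothetical helper,
not an existing function):

```c
#include <string.h>

/*
 * Decide whether a tag should be passed to a muxer for out_format.
 * Keys without a "<format>/" prefix are treated as generic and always kept;
 * prefixed keys are kept only when the prefix equals out_format.
 * Returns 1 to keep the tag, 0 to drop it.
 */
static int tag_applies_to(const char *key, const char *out_format)
{
    const char *slash = strchr(key, '/');
    size_t len;

    if (!slash)
        return 1; /* no namespace: generic tag */

    len = (size_t)(slash - key);
    return strlen(out_format) == len && !strncmp(key, out_format, len);
}
```

With this rule "mov/width" survives a mov to mov remux and is dropped when
writing another container, while a generic "title" is always passed on.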

Exporting all information using AVFormatContext fields will lead to a
huge struct.

Baptiste COUDURIER                              GnuPG Key Id: 0x5C1ABAAA
Key fingerprint                 8D77134D20CC9220201FC5DB0AC9325C5C1ABAAA
checking for life_signs in -lkenny... no
FFmpeg maintainer                                  http://www.ffmpeg.org
