[FFmpeg-devel] [PATCH 1/3] spherical: Add tiled equirectangular type and projection-specific properties

Aaron Colwell acolwell at google.com
Wed Feb 15 21:54:34 EET 2017

Hi Vittorio,

I could reply to all your further comments and clarify some
miscommunication, but I'm just tired of arguing about this. As long as you
are convinced that you can round trip the information defined in the spec,
then I am fine with whatever you folks choose to do. I would rather make
some progress than nothing at all. I also appreciate your willingness to
create alternative patches and advocate for other use cases.

This may not be a blocker for this patch, but one issue with converting
the bounds to pixels, as you do here, is that resizing a video could
result in incorrect metadata being generated when muxing. If you keep
the bounds in the normalized 0.0-1.0 (fixed-point) space this problem
doesn't happen, since that is a resolution-independent representation.
If you still want to use pixels here, then the resizing filter will need
to become aware of AVSphericalMapping and adjust it accordingly. I think
cubemap padding in pixels may have a similar issue. This is in no way
intended to be a blocking comment; it is just meant to raise awareness
of something you might not have considered.

Thank you for your patience with me.


On Wed, Feb 15, 2017 at 10:43 AM Vittorio Giovara <
vittorio.giovara at gmail.com> wrote:

> Hi Aaron,
> On Wed, Feb 15, 2017 at 11:48 AM, Aaron Colwell <acolwell at google.com>
> wrote:
> >> If the spec changes, it will be the contents of the equi/cbmp/mesh.
> >> By exporting them raw as extradata, said changes in the spec would
> >> require no changes to our implementation.
> >
> >
> > This is one of the main reasons I was suggesting this path. I think
> > of these extradata fields much like the extra data that codecs have.
> > It really is only important to the code that needs to render a
> > specific projection. For transcoding, you mainly just need to convey
> > the information in a lossless manner from demuxer to muxer.
> I understand that, but "transcoding" is not the only task performed here.
> Yes, I agree that for your single use case of converting mp4 metadata
> to mkv metadata your solution is a feasible one, but the code here
> also has to:
> 1. convey information to the users
> 2. fill in side data and frame data (remember this is packet side
> data that is exported to frames, since spherical is a frame property;
> it doesn't matter that the spec is container-level only for now)
> 3. offer it to any muxer for additional parsing
> You can't expect to fill in binary data and have all three places
> behave correctly, and handle multiple theoretical versions while at it.
> You need an abstraction like the one provided by the API in this
> patch.
> > I anticipate the spec will change in the future. My plan is that no
> > change will break what is currently specified in the spec right now,
> > but I anticipate some changes will be made. Having a solution that
> > can gracefully handle this would be nice.
> A *binary* solution is certainly not nice, especially if you have
> multiple versions.
> Besides, you should really make an effort to change the spec as little
> as possible now that it is public, as more and more software will
> depend on it and you can't realistically expect that everyone will be
> always up-to-date (ffmpeg included).
> >> >> Wouldn't it be a better idea to export the binary data of the
> >> >> equi/cbmp/mesh boxes into an extradata-like field in the
> >> >> AVSphericalMapping struct, and let the downstream application parse
> >> >> it instead?
> >> >
> >> > No I don't think so, lavf is an abstraction layer and one of its
> >> > tasks is to provide a (simple?) unified entry layer, and letting
> >> > the user parse binary data is IMO bad design and very fragile.
> >> > Also it's not impossible that another standard for tagging
> >> > spherical metadata emerges in the future: the current API could
> >> > very easily wrap it, while exporting the binary entry would be
> >> > too specification-specific and it would be tied too heavily to
> >> > the google draft.
> >>
> >
> > I agree with Vittorio that having some form of abstraction is a good
> > thing and having binary data in places can be problematic. It feels
> > like we could find some middle ground here by providing helper
> > functions that parse the binary data into projection-specific
> > structs and back, just like codec code tends to do. I feel like this
> > provides a reasonable balance between having a common set of fields
> > where things actually have common semantics, like projection_type,
> > yaw/pitch/roll, & extra_data, while also providing a way to get
> > access to projection-specific information in a simple way.
> I don't feel like this has anything to do with the current patchset.
> This set only allows for properly conveying information in a clean
> manner to the user (and a muxer, if you will, can be considered a
> plain user). Having a single parsing routine is IMO a useless
> optimization,
> and can be discussed in another thread.
> > At the end of the day players really just need to care about a
> > rendering mesh, so in some sense it would be nice to have that be
> > the abstraction for the player use case. That is basically what we
> > have done in our internal player implementations. That could easily
> > be handled by helper functions, but would be a bad representation
> > for AVSphericalMapping because it would make transcoding/transmuxing
> > painful.
> Again, a "player" is not your only user here; you also have probing
> tools that need to present data in a human-readable form. And please
> stop saying transcoding/transmuxing is painful; on the contrary, it's
> incredibly simple: you just need to read AVSphericalMapping (exactly
> as any other user) and write down this information. If you think that
> new (non-binary) fields should be added to further simplify muxing,
> please do tell.
> >> Wait for Aaron's opinion before addressing reviews and pushing. He
> >> sent a different patchset himself and it wouldn't be nice to push
> >> yours without at least giving him a chance to comment.
> >>
> >
> > Thank you. I really appreciate this. I hope it is clear I'm really
> > trying to find some middle ground here. My team has been working on
> > this stuff inside Google for several years and I'm just trying to
> > bring some of that experience and learning into the open here. I
> > defer to you because this is your project and I am a relatively
> > infrequent contributor.
> Please don't get me wrong, it's really commendable that Google is
> opening up more and more, after using open source for so many years.
> On the other hand, do understand that Google is not the only ffmpeg
> user in the world, and that it needs to respect additional, incredibly
> different, and possibly boring use cases all around that might matter
> to other users.
> --
> Vittorio