[FFmpeg-devel] [PATCH] libavutil: add clean aperture (CLAP) side data.

Sat May 30 13:41:51 EEST 2020

Hi,

On Fri 29 May 2020, 22:47 Neil Birkbeck, <neil.birkbeck at gmail.com> wrote:

> On Mon, May 11, 2020 at 9:37 PM Neil Birkbeck <neil.birkbeck at gmail.com>
> wrote:
>
> >
> >
> > On Wed, May 6, 2020 at 8:45 AM James Almer <jamrial at gmail.com> wrote:
> >
> >> On 5/6/2020 12:22 PM, Neil Birkbeck wrote:
> >> > On Tue, May 5, 2020 at 5:11 AM Kieran O Leary <
> kieran.o.leary at gmail.com
> >> >
> >> > wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> I broke the threading with my last reply, i apologise. Here goes
> >> another
> >> >> attempt:
> >> >>
> >> >> On Tue, Apr 28, 2020 at 6:23 PM Neil Birkbeck <
> neil.birkbeck at gmail.com
> >> >
> >> >> wrote:
> >> >>
> >> >>> On Tue, Apr 28, 2020 at 3:18 AM Nicolas George <george at nsup.org>
> >> wrote:
> >> >>>
> >> >>>> Andreas Rheinhardt (12020-04-28):
> >> >>>>> That's expected. The patch provided only provides the structure in
> >> >>> which
> >> >>>>> the values are intended to be exported; it does not add any
> demuxing
> >> >> or
> >> >>>>> muxing capabilities for mov or mkv (as you can see from the fact
> >> that
> >> >>>>> none of these (de)muxers have been changed in the patch).
> >> >>>>
> >> >>>> Which is something I intended to comment on: adding code without
> >> users
> >> >>>> is rarely a good idea. I suggest we do not commit until at least
> one
> >> >>>> demuxer use it, preferably at least two. Otherwise, we may realize
> >> that
> >> >>>> “oh crap, it doesn't work” because of a tiny unforeseen detail.
> >> >>>
> >> >>>
> >> >>> Thanks for the feedback. I also have patches for the demux (MOV/MKV)
> >> and
> >> >>> mux (MOV/MKV).
> >> >>>
> >> >>> As there is still the alternative of using the fields in the
> >> >>> AVCodecParameters/AVCodecContext, my intention was to keep the first
> >> >> patch
> >> >>> small to resolve discussion on that point.
> >> >>>
> >> >>> I've included the patches, if you'd like to try test it, Kieren. I
> >> see on
> >> >>> your test file that there may be some slight rounding error making
> >> output
> >> >>> crop 704 not 703 (MKV file ends up with pixel_crop_{left,right} =
> 8).
> >> >>>
> >> >>> /ffprobe ../testdata/clap.mov 2>&1 | grep -A1 "Side"
> >> >>>     Side data:
> >> >>>       Clean aperture:[width 41472/59 height:576/1 h_offset:0/1
> >> >>> v_offset:0/1]
> >> >>> ./ffmpeg -i ../testdata/clap.mov  -vcodec copy -acodec copy
> >> /tmp/clap.mkv
> >> >>> ./ffprobe /tmp/clap.mkv 2>&1 | grep -A1 "Side"
> >> >>>     Side data:
> >> >>>       Clean aperture:[width 704/1 height:576/1 h_offset:0/1
> >> v_offset:0/1]
> >> >>>
> >> >>
> >> >> I have to look deeper into the MKV side of things and most likely
> >> raise it
> >> >> with the cellar mailing list so that better minds than mine can weigh
> >> in. I
> >> >> do see that the rounding up to 704 could be an issue alright.
> >> >> As for MOV, your patch appears to generate the same output clap
> values
> >> as
> >> >> the input, so that's really great! command line and mediainfo trace
> >> below:
> >> >>
> >> >
> >> > Thanks for testing, Kieran and for linking the discussion on the
> cellar
> >> > list.
> >> >
> >> > Any additional thoughts from ffmpeg devs on container-level SideData
> vs
> >> > adding the extra fields into AVCodecParameters/AVCodecContext and
> >> plumbing
> >> > into the frame instead? I anticipate some situations where there can
> be
> >> > interaction between cropping in bitstream and container-level
> cropping.
> >> > Maybe the best way forward is for me to share some sample patches for
> >> the
> >> > alternative to validate whether it supports the various use cases
> >> > (transmux, decode+crop, any other application level handling), and to
> >> > confirm the interface changes for the structs.
> >>
> >> One option could be to also introduce a frame side data for this new
> >> struct and have it replace the AVFrame fields, which would then be
> >> deprecated (But of course keep working until removed).
> >> This would allow to either inject stream side data from clap atoms and
> >> Matroska crop fields into packets (Which would then be propagated to
> >> frames to be applied during decoding), or use and export the bitstream
> >> cropping information as it's the case right now, all while preventing
> >> the addition of new fields to AVCodecParameters.
> >>
> >>
> > I agree that sharing the SideData with the frame could be nice for the
> > above reasons. I guess any interaction or conflict between bitstream &
> > container cropping could then get resolved at the decoder (e.g., assuming
> > that say SPS from h264 and container-level cropping should be composed).
> >
> >
> >> I would like a developer that makes use of this feature to also comment,
> >> especially seeing how the AVFrame fields and this clap side data are
> >> pretty different.
> >>
> >
> > The current CleanAperture representation was in part motivated to 1) keep
> > the representation capable of representing the CLAP atom (same
> > representation and rationals), and 2) to make it unambiguous that this
> was
> > container-level stream metadata. The representation is a tiny bit more
> > tedious when trying to actually perform a crop (e.g., to extract the
> > top-left corner offset). If sharing as AVFrame side data, a
> representation
> > closer to AVFrame's crop_{top,bottom,left,right} may be more natural.
> >
> > Within ffmpeg, I see that the existing codecs using AVFrame cropping
> > include:
> > -libavcodec/h264_slice.c: propagating the crop_fields from the SPS in
> h264
> > in
> > -libavcodec/hevc_refs.c: (sly to h264)
> > -libavcodec/agm.c: crop_{left, top} inferred from
> > avctx->coded_{width,height} - avctx->{width,height}
> > -libavcodec/videotoolbox.c: crop fields explicitly set to zero.
> > -libvacodec/vp3.c: crop_{left,top} set from coded bitstream’s
> offset_{x,y}
> > (right,bottom from coded_{width,height})
> >
> > Upon review, I see there is one other format that appears to export
> > similar cropping information into key/value pair metadata that could
> > potentially also use the stream-level side data: (libavformat/cinedec.c:
> > set_metadata_int(&st->metadata, "crop_left", avio_rl32(pb), 1))
> >
>
> I've changed the side data to be PixelCrop (instead of CleanAperture) given
> the intent to reuse as cropping elsewhere.
> -For now, I've kept the rational representation--although CLAP seems to be
> the only required case of it. Maybe Kieran could comment on the requirement
> of having maintaining the rationals for CLAP (only works on mov to mov
> transmuxing).
>
I'm no expert, but I think a lot of this just comes from video standards
that stipulate those rational numbers? I've cc'd tobias Rapp and Christoph
Gerstbauer of NOA just to bring this to their attention, as I'm pretty sure
that they use cropping values in AVI, so perhaps all of this could be
useful to them in some way.

K