[FFmpeg-devel] [PATCH] libavutil: add clean aperture (CLAP) side data.

Andreas Rheinhardt andreas.rheinhardt at gmail.com
Fri May 8 01:27:06 EEST 2020


Neil Birkbeck:
> On Tue, Apr 28, 2020 at 3:18 AM Nicolas George <george at nsup.org> wrote:
> 
>> Andreas Rheinhardt (12020-04-28):
>>> That's expected. The patch provided only provides the structure in which
>>> the values are intended to be exported; it does not add any demuxing or
>>> muxing capabilities for mov or mkv (as you can see from the fact that
>>> none of these (de)muxers have been changed in the patch).
>>
>> Which is something I intended to comment on: adding code without users
>> is rarely a good idea. I suggest we do not commit until at least one
>> demuxer use it, preferably at least two. Otherwise, we may realize that
>> “oh crap, it doesn't work” because of a tiny unforeseen detail.
> 
> 
> Thanks for the feedback. I also have patches for the demux (MOV/MKV) and
> mux (MOV/MKV).
> 
> As there is still the alternative of using the fields in the
> AVCodecParameters/AVCodecContext, my intention was to keep the first patch
> small to resolve discussion on that point.
> 
> I've included the patches, if you'd like to try test it, Kieren. I see on
> your test file that there may be some slight rounding error making output
> crop 704 not 703 (MKV file ends up with pixel_crop_{left,right} = 8).
> 
> /ffprobe ../testdata/clap.mov 2>&1 | grep -A1 "Side"
>     Side data:
>       Clean aperture:[width 41472/59 height:576/1 h_offset:0/1 v_offset:0/1]
> ./ffmpeg -i ../testdata/clap.mov  -vcodec copy -acodec copy /tmp/clap.mkv
> ./ffprobe /tmp/clap.mkv 2>&1 | grep -A1 "Side"
>     Side data:
>       Clean aperture:[width 704/1 height:576/1 h_offset:0/1 v_offset:0/1]
> 
> 

> 
> Write out stream-level AVCleanAperture side data in the muxer.

You should separate the mov and Matroska patches.

> 
> Signed-off-by: Neil Birkbeck <neil.birkbeck at gmail.com>
> ---
>  libavformat/matroskaenc.c | 37 +++++++++++++++++++++++++++++++++++++
>  libavformat/movenc.c      | 28 ++++++++++++++++++++--------
>  2 files changed, 57 insertions(+), 8 deletions(-)
> 
> diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c
> index 784973a951..19ff29853e 100644
> --- a/libavformat/matroskaenc.c
> +++ b/libavformat/matroskaenc.c
> @@ -901,6 +901,38 @@ static int mkv_write_video_color(AVIOContext *pb, const AVCodecParameters *par,
>      return 0;
>  }
>  
> +static int mkv_write_video_crop(AVIOContext *pb, AVCodecParameters *par, AVStream *st)
> +{
> +    int cropb = 0, cropt = 0, cropl = 0, cropr = 0;
> +    double width, height, h_offset, v_offset;
> +    int side_data_size = 0;
> +    const AVCleanAperture *clap =
> +        (const AVCleanAperture *)av_stream_get_side_data(
> +            st, AV_PKT_DATA_CLEAN_APERTURE, &side_data_size);
> +    if (!clap)
> +        return 0;
> +
> +    width = av_q2d(clap->width);
> +    height = av_q2d(clap->height);
> +    h_offset = av_q2d(clap->horizontal_offset);
> +    v_offset = av_q2d(clap->vertical_offset);
> +    cropb = (int)(par->height - (par->height / 2. + v_offset + height / 2));
> +    cropt = (int)(par->height / 2. + v_offset - height / 2);
> +    cropr = (int)(par->width - (par->width / 2. + h_offset + width / 2));
> +    cropl = (int)(par->width / 2. + h_offset - width / 2);
> +    cropb = FFMAX(cropb, 0);
> +    cropt = FFMAX(cropt, 0);
> +    cropr = FFMAX(cropr, 0);
> +    cropl = FFMAX(cropl, 0);
> +    if (!cropr && !cropl && !cropt && !cropb)
> +        return 0;
> +    put_ebml_uint(pb, MATROSKA_ID_VIDEOPIXELCROPL, cropl);
> +    put_ebml_uint(pb, MATROSKA_ID_VIDEOPIXELCROPR, cropr);
> +    put_ebml_uint(pb, MATROSKA_ID_VIDEOPIXELCROPB, cropb);
> +    put_ebml_uint(pb, MATROSKA_ID_VIDEOPIXELCROPT, cropt);

There is no reason to write all four elements when only some of them are
different from zero.

> +    return 0;
> +}
> +
>  static int mkv_write_video_projection(AVFormatContext *s, AVIOContext *pb,
>                                        const AVStream *st)
>  {
> @@ -1287,6 +1319,11 @@ static int mkv_write_track(AVFormatContext *s, MatroskaMuxContext *mkv,
>          } else if (mkv->mode != MODE_WEBM)
>              put_ebml_uint(pb, MATROSKA_ID_VIDEODISPLAYUNIT, MATROSKA_VIDEO_DISPLAYUNIT_UNKNOWN);
>  
> +        // Write out pixel crop
> +        ret = mkv_write_video_crop(pb, par, st);
> +        if (ret < 0)
> +            return ret;

This is the wrong place for this: It needs to be performed before we
write the display dimensions and you need to return the cropped width
and height via pointer arguments, so that the code doing display
dimensions (which needs to be updated) knows about the cropping. This is
necessary, because cropping is supposed to happen before scaling: If the
code for writing DisplayWidth/Height didn't know about the real
dimensions to scale, it would write the DisplayWidth/Height values that
are appropriate for a video with the given pixel aspect ratio and the
uncropped pixel dimensions.

We should btw error out if someone sends stereo 3d data that implies
that a dimension is doubled if cropping is intended to be applied in the
same dimension, as this scenario is not really well defined.

And let me also add other points about the sorry state of the PixelCrop*
elements:
a) The display aspect ratio usually changes when using cropping; the
pixel aspect ratio doesn't change, yet unfortunately (and inexplicably)
Matroska does not support that. This means that if one sets the display
aspect ratio explicitly when using cropping, a player that ignores the
cropping will not only display the video with the area that ought to be
cropped away; it will also display it with a wrong pixel aspect ratio.

Given that almost all players don't support the cropping flags simply
saying that these players are broken and need to be fixed is not an option.

But there is a method to create videos that will be played with the
correct pixel aspect ratio by both types of players. It makes use of the
fact that (when using pixels as DisplayUnit (it's the default value
anyway)) the default value of DisplayWidth and DisplayHeight are given
by the dimensions of the cropped frame.
It works especially well with nonanamorphic videos. Consider the case of
a 1280x720 nonanamorphic video with 40 pixels on top and bottom cropping
on top and bottom (each) and 240 pixels on the left and right (each). If
one leaves the DisplayWidth/Height away, a player that ignores the crop
values will infer DisplayWidth to be 1280 and DisplayHeight to be 720,
whereas a player that supports the crop values will infer DisplayHeight
to be 640 and DisplayWidth to be 800. Both players will use quadratic
pixels.
It does not work so well with anamorphic videos. If one only wants to
crop in one dimension (say, the vertical one), then one can still
produce videos that work well with both players: Leave out the
DisplayHeight and set DisplayWidth to the common value suitable for
both. (This already has a downside: An absolutely precise display aspect
ratio is no longer attainable with this method in general.)
If one has anamorphic video and wants to crop in both dimensions, then
there is no way of making both types of players display the video with
the same pixel aspect ratio (unless the cropping happens to be so that
the ratio of the uncropped frame and the ratio of the cropped frame
coincide).

We already don't write the display dimensions when nonanamorphic and
your (revised) patch needs to preserve that (should be trivial). Given
that the method outlined above in case cropping is restricted to one
dimension might have side-effects I don't think it should be implemented
(would also be too niche); but one (needn't be you) should add a warning
for users in case players not supporting the crop elements might play
the output with the wrong pixel aspect ratio. (Said warning should
preferably be added in a separate commit.)

b) MKVToolNix behaviour wrt cropping is unfortunately buggy: mkvmerge
reads and propagates the cropping parameters upon remuxing, but it does
not take them into account when inferring the display dimensions (which
it explicitly sets); therefore it changes the pixel aspect ratio upon
remuxing (with the usual exception of the ratios of cropped and
uncropped frames agreeing).
And the GUIs help text does not mention that one should also modify the
display dimensions when editing the cropping parameters. E.g. both
videos from ticket 4489 (and also the original in VLC ticket 13982) are
anamorphic when the specs are followed. This was certainly not intended.
Certainly a large part of the files in existence that have crop values
set have them set in the wrong way. Adding a workaround would
unfortunately incur the possibility of misparsing valid files. My
current approach is to wait and see how many users complain before
deciding on whether to add a workaround.

> +
>          if (par->codec_id == AV_CODEC_ID_RAWVIDEO) {
>              uint32_t color_space = av_le2ne32(par->codec_tag);
>              put_ebml_binary(pb, MATROSKA_ID_VIDEOCOLORSPACE, &color_space, sizeof(color_space));

> 
> diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c
> index 8e1326abf6..de520be247 100644
> --- a/libavformat/matroskadec.c
> +++ b/libavformat/matroskadec.c
> @@ -190,6 +190,10 @@ typedef struct MatroskaTrackVideo {
>      uint64_t display_height;
>      uint64_t pixel_width;
>      uint64_t pixel_height;
> +    uint64_t pixel_cropb;
> +    uint64_t pixel_cropt;
> +    uint64_t pixel_cropl;
> +    uint64_t pixel_cropr;
>      EbmlBin  color_space;
>      uint64_t display_unit;
>      uint64_t interlaced;
> @@ -480,10 +484,10 @@ static EbmlSyntax matroska_track_video[] = {
>      { MATROSKA_ID_VIDEOALPHAMODE,      EBML_UINT,  0, offsetof(MatroskaTrackVideo, alpha_mode) },
>      { MATROSKA_ID_VIDEOCOLOR,          EBML_NEST,  sizeof(MatroskaTrackVideoColor), offsetof(MatroskaTrackVideo, color), { .n = matroska_track_video_color } },
>      { MATROSKA_ID_VIDEOPROJECTION,     EBML_NEST,  0, offsetof(MatroskaTrackVideo, projection), { .n = matroska_track_video_projection } },
> -    { MATROSKA_ID_VIDEOPIXELCROPB,     EBML_NONE },
> -    { MATROSKA_ID_VIDEOPIXELCROPT,     EBML_NONE },
> -    { MATROSKA_ID_VIDEOPIXELCROPL,     EBML_NONE },
> -    { MATROSKA_ID_VIDEOPIXELCROPR,     EBML_NONE },
> +    { MATROSKA_ID_VIDEOPIXELCROPB,     EBML_UINT,  0, offsetof(MatroskaTrackVideo, pixel_cropb) },
> +    { MATROSKA_ID_VIDEOPIXELCROPT,     EBML_UINT,  0, offsetof(MatroskaTrackVideo, pixel_cropt) },
> +    { MATROSKA_ID_VIDEOPIXELCROPL,     EBML_UINT,  0, offsetof(MatroskaTrackVideo, pixel_cropl) },
> +    { MATROSKA_ID_VIDEOPIXELCROPR,     EBML_UINT,  0, offsetof(MatroskaTrackVideo, pixel_cropr) },
>      { MATROSKA_ID_VIDEODISPLAYUNIT,    EBML_UINT,  0, offsetof(MatroskaTrackVideo, display_unit), { .u= MATROSKA_VIDEO_DISPLAYUNIT_PIXELS } },
>      { MATROSKA_ID_VIDEOFLAGINTERLACED, EBML_UINT,  0, offsetof(MatroskaTrackVideo, interlaced),  { .u = MATROSKA_VIDEO_INTERLACE_FLAG_UNDETERMINED } },
>      { MATROSKA_ID_VIDEOFIELDORDER,     EBML_UINT,  0, offsetof(MatroskaTrackVideo, field_order), { .u = MATROSKA_VIDEO_FIELDORDER_UNDETERMINED } },
> @@ -2695,7 +2699,8 @@ static int matroska_parse_tracks(AVFormatContext *s)
>              MatroskaTrackPlane *planes = track->operation.combine_planes.elem;
>              int display_width_mul  = 1;
>              int display_height_mul = 1;
> -
> +            const int cropped_width = track->video.pixel_width - track->video.pixel_cropl - track->video.pixel_cropr;
> +            const int cropped_height = track->video.pixel_height - track->video.pixel_cropt - track->video.pixel_cropb;

Missing check for bogus values in which the total crop is more than is
available in that dimension. (Yes, I know, there are lots of missing
checks already in this function. But let's not add another one.)

>              st->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
>              st->codecpar->codec_tag  = fourcc;
>              if (bit_depth >= 0)
> @@ -2703,6 +2708,29 @@ static int matroska_parse_tracks(AVFormatContext *s)
>              st->codecpar->width      = track->video.pixel_width;
>              st->codecpar->height     = track->video.pixel_height;
>  
> +            if (track->video.pixel_cropt || track->video.pixel_cropb ||
> +                track->video.pixel_cropl || track->video.pixel_cropr) {
> +                AVCleanAperture *metadata = av_clean_aperture_alloc();
> +                if (!metadata)
> +                    return AVERROR(ENOMEM);
> +                metadata->width.num = cropped_width;
> +                metadata->height.num = cropped_height;
> +                metadata->width.den = 1;
> +                metadata->height.den = 1;

Can be aligned on '='.

> +                av_reduce(&metadata->horizontal_offset.num,
> +                          &metadata->horizontal_offset.den,
> +                          (int)track->video.pixel_cropl - (int)track->video.pixel_cropr, 2, INT_MAX);
> +                av_reduce(&metadata->vertical_offset.num,
> +                          &metadata->vertical_offset.den,
> +                          (int)track->video.pixel_cropt - (int)track->video.pixel_cropb, 2, INT_MAX);
> +                ret = av_stream_add_side_data(st, AV_PKT_DATA_CLEAN_APERTURE,
> +                                              (uint8_t *)metadata, sizeof(*metadata));
> +                if (ret < 0) {
> +                    av_freep(&metadata);
> +                    return ret;
> +                }
> +            }
> +
>              if (track->video.interlaced == MATROSKA_VIDEO_INTERLACE_FLAG_INTERLACED)
>                  st->codecpar->field_order = mkv_field_order(matroska, track->video.field_order);
>              else if (track->video.interlaced == MATROSKA_VIDEO_INTERLACE_FLAG_PROGRESSIVE)
> @@ -2714,8 +2742,8 @@ static int matroska_parse_tracks(AVFormatContext *s)
>              if (track->video.display_unit < MATROSKA_VIDEO_DISPLAYUNIT_UNKNOWN) {
>                  av_reduce(&st->sample_aspect_ratio.num,
>                            &st->sample_aspect_ratio.den,
> -                          st->codecpar->height * track->video.display_width  * display_width_mul,
> -                          st->codecpar->width  * track->video.display_height * display_height_mul,
> +                          cropped_height * track->video.display_width  * display_width_mul,
> +                          cropped_width  * track->video.display_height * display_height_mul,

The display_width/height values used here might not be directly coded in
the file, but inferred. And the default value depends upon the crop
values, so this needs to be updated, too. Actually, if one has a
non-anamorphic Matroska video with DisplayWidth/Height values explicitly
coded and crop values set, then with your patches applied the demuxer
would correctly demux it and the muxer would leave DisplayWidth/Height
values out (so that it would still be non-anamorphic), but the demuxer
would then infer that the new file is anamorphic (unless the ratio of
the cropped and uncropped frames coincide, as usual).

Thank you for tackling this. I always wanted to add support for this and
would have done so if there had been a way to export the cropping
information.

- Andreas


More information about the ffmpeg-devel mailing list