[FFmpeg-devel] [PATCH] Extract QP from h264 encoded videos

Thu Aug 8 01:46:42 EEST 2019

On Tue, Aug 6, 2019 at 5:07 PM Mark Thompson <sw at jkqxz.net> wrote:

> This should be in libavcodec, not libavutil - it relates directly to
> codecs.  (Indeed, you've ended up having to define a special non-libavcodec
> enum of codecs below to make it work in libavutil at all.)
>
If this belongs in avcodec I can move it there, but I don't see a similar
data structure in that library.
I believe declaring different IDs for supported codecs here is a better
approach.

> > +enum AVQPArrIndexesH264 {  // varaible names in spec document
>
> I don't think giving these enums a tag has any value?
>
They are not used in the code, but keeping them makes the purpose of each
enum clearer.

> > +    AV_QP_ARR_SIZE_H264
> > +};
> > +
> > +enum AVQPArrIndexesVP9 {   // variable names in spec document
> > +    AV_QP_YAC_VP9 = 0,     // get_dc_quant[][base_q_idx]
>
> This isn't right - get_dc_quant() is a function of one variable, not a
> two-dimensional array (confused with dc_qlookup[][] somehow?).
>
Thank you, I think I meant: ac_q(get_qindex()).

Why are you including higher-level frame values for VP9 and AV1, but not
> including similar ones for H.264?
>
Again, I meant get_qindex(), it is supposed to represent the quant index
for the specified segment, not frame quant index.

What is the motivation for AV1 returning the exponential coefficient
> scaling values (range 4..29247) rather than the linear parameter (range
> 0..255) as you do for H.264?

Exposing both the values was a requirement by my team.

> +#define AV_QP_ARR_MAX_SIZE AV_QP_ARR_SIZE_AV1
>
> Fixing this for all time for a particular codec which happens to need the
> most space when it is defined doesn't seem like a good idea.  E.g. you
> can't support JPEG with only this number (it would need all entries in up
> to four tables).

It might be better if the structure size wasn't fixed forever by the first
> version of the API/ABI.  Perhaps an approach something like that used for
> AVRegionOfInterest would work?
>
Each instance of AVQuantizationParams has an array of qp values/indexes
(qp_type[]) for which I need a constant to allocate memory.
The approach AVRegionOfInterest uses does not solve that problem.

> +    /**
> > +     * x and y coordinates of the block in pixels
> > +     */
> > +    int x, y;
>
> Don't call these x/y coordinates because it not clear exactly what that
> means (what is the scale, where is the origin, which direction is positive,
> where in the block is being referred to, etc.).

Instead follow the same convention as other structures in FFmpeg and define
> them as the distance in pixels from the left or the top edge of the picture
> to the top-left corner of the block.
>
That's exactly their purpose, the distance in pixels from the top-left
corner of the frame, to the top-left corner of the block. I will make the
description clearer, thank you.

> On 30/07/2019 03:19, Juan De León wrote:
> > On Mon, Jul 29, 2019 at 12:48 PM Mark Thompson <sw at jkqxz.net> wrote:
> >>
> >> How do these values interact with cropping?
> >
> > I'm not sure I understand, could you elaborate?
>
> For codecs which include cropping such as H.26[45], the decoder may
> directly apply cropping from the stream (controlled by
> AVCodecContext.apply_cropping), possibly modified by alignment (with
> AV_CODEC_FLAG_UNALIGNED), and then sets the AVFrame cropping fields to
> reflect the remainder.

For example, in H.264 try setting the
> frame_crop_left_offset/frame_crop_top_offset fields in a stream to large
> values (h264_metadata can do this for an existing stream).  What do your
> x/y values then refer to in the result?  They could be negative to indicate
> macroblocks which are off the edges of the cropped picture, or they might
> be relative to the uncropped picture in which case you would need
> additional information to reconstruct which blocks they refer to in the
> frame you actually have.
>
The coordinates of the blocks should correspond to the coded picture,
quantization is still applied to cropped MBs outside of the frame so that
should be considered for the logging and avg calculation, similar to an
analyzer.

> > +    /**
> > +     * Stores an id corresponding to one of the supported codecs
> > +     */
> > +    enum AVExtractQPSupportedCodecs codec_id;
>
> enum AVCodecID, with this in libavcodec.
>
Like Michael said, this could cause conflict when extracting QP. It might
be better to leave it as a separate ID.

> +/**
> > + * Get the string describing the qp type for the given codec
> > + */
> > +const char* av_get_qp_type_string(enum AVExtractQPSupportedCodecs
> codec_id, int index);
>
> I'm not sure there is a good reason to embed this in the public API - what
> user is ever going to call this function?  Anyone using the enum values
> must already know exactly what each of them mean to do anything with them
> at all, so if they need string names they'll already have clearer ones than
> the cryptic short names you provide here.

I think it would probably be better to just include your string names in
> the showinfo filter (or some other) and not have it in the public API.
>
I'm using this for logging purposes in a filter that calculates
min/max/avg. I believe it's better to leave them in the public API than
limit them only to the filter.