[FFmpeg-devel] [PATCH v3 1/3] doc: Explain what "context" means

Sun May 5 10:29:10 EEST 2024

On date Monday 2024-04-29 10:10:35 +0100, Andrew Sayers wrote:
> On Mon, Apr 22, 2024 at 07:05:12PM +0200, Stefano Sabatini wrote:
[...]
> > I don't have a strong opinion, but I'd probably focus on providing a
> > typical example of a common API (check doc/examples). Also I see here
> > there is a strong focus on OOP, this might be counter-productive in
> > case the reader is not familiar with OOP terminology.
> > 
> > OTOH the content might be useful for readers coming from an OOP
> > background and terminology. I wonder if this content might be isolated
> > in a dedicated section, so that non-OOP readers can simply skip it.
> > 
> > But this is not a strong objection, and can be possibly reworked in a
> > later step.
> > 
> 
> This is really a document for FFmpeg newbies, so we need to assume as
> little prior knowledge as possible.  After a few days to think it
> over, I think we should avoid assuming...
> 
> Knowledge of object-oriented programming.  For example, this should be
> useful to a research mathematician with a project that involves codec
> algorithms.  So the next draft should feel less like "FFmpeg for OOP
> devs" and more like "FFmpeg for newbies (with some optional OOP
> background reading)".
> 
> Knowing that programming doesn't *have* to be object-oriented.
> OOP has become so ubiquitous nowadays, there are plenty of programmers
> who will insist everything is OOP if you just speak loudly and slowly.
> This is a harder problem in some ways, because someone who doesn't
> understand can always re-read until they do, while someone who jumps
> to the wrong conclusion will just keep reading and try to make things
> fit their assumption (e.g. my earlier messages in this thread!).
> So the "how it differs from OOP" stuff needs to stay fairly prominent.
> 

> Knowing anything about FFmpeg (or multimedia in general).  I like the
> idea of tweaking `doc/examples` to better introduce FFmpeg
> fundamentals, but explaining "context" is a steep enough learning
> curve on its own - introducing concepts like "stream" and "codec" at
> the same time seems like too much.

But even if you show the API, that does not mean you need to explain
it in full; you only need to highlight the structural relationships:

    // create an instance context, whatever it is
    c = avcodec_alloc_context3(codec);
    if (!c) {
        fprintf(stderr, "Could not allocate video codec context\n");
        exit(1);
    }

    // set context parameters directly
    c->bit_rate = 400000;
    /* resolution must be a multiple of two */
    c->width = 352;
    c->height = 288;
    /* frames per second */
    c->time_base = (AVRational){1, 25};
    c->framerate = (AVRational){25, 1};

    // use av_opt API to set the options?
    ...

    // open the codec context provided a codec
    ret = avcodec_open2(c, codec, NULL);
    if (ret < 0) {
        fprintf(stderr, "Could not open codec: %s\n", av_err2str(ret));
        exit(1);
    }

You might even replace avcodec_ with fooblargh_ and get the same
effect, with the added benefit that with avcodec_ you are already
familiarizing the user with the concrete API rather than with a
hypothetical one.

[...]

> I've also gone through the code looking for edge cases we haven't covered.
> Here are some questions trying to prompt an "oh yeah I forgot to mention
> that"-type answer.  Anything where the answer is more like "that should
> probably be rewritten to be clearer", let me know and I'll avoid confusing
> newbies with it.
> 

> av_ambient_viewing_environment_create_side_data() takes an AVFrame as its
> first argument, and returns a new AVAmbientViewingEnvironment.  What is the
> context object for that function - AVFrame or AVAmbientViewingEnvironment?

But this should be clear from the doxy:

/**
 * Allocate and add an AVAmbientViewingEnvironment structure to an existing
 * AVFrame as side data.
 *
 * @return the newly allocated struct, or NULL on failure
 */
AVAmbientViewingEnvironment *av_ambient_viewing_environment_create_side_data(AVFrame *frame);

Also, you are assuming that all functions should have a context.
That's not the case, as you don't always need to keep track of a
"context" when performing operations.

> 
> av_register_bitstream_filter() (deprecated 4.0, removed 5.0) took an
> `AVBitStreamFilter *` as its first argument, but I don't think you'd say
> the argument provided "context" for the function.  So would I be right in
> saying `AVBitStreamFilter *` is not a context, despite looking like one?

This function was eventually dropped, so it is not even present
anymore. But again, it seems you are assuming that all functions
should take a context, which is not the case. In that case you had:
av_register_bitstream_filter(filter)

which registered the filter in the program's global state.

> av_buffersink_*() all take a `const AVFilterContext *` argument.
> What does the difference between av_buffersink prefix and AVFilter type mean?

None; it should probably have been named avfilter_buffersink, since
this is a libavfilter API (see below).

> av_channel_description_bprint() takes a `struct AVBPrint *` as its first
> argument, then `enum AVChannel`.  Is the context AVBPrint, AVChannel,
> or both?  Does it make sense for a function to have two contexts?

Again, this should be clear from the doxy:
/**
 * Get a human readable string describing a given channel.
 *
 * @param buf pre-allocated buffer where to put the generated string
 * @param buf_size size in bytes of the buffer.
 * @param channel the AVChannel whose description to get
 * @return amount of bytes needed to hold the output string, or a negative AVERROR
 *         on failure. If the returned value is bigger than buf_size, then the
 *         string was truncated.
 */
int av_channel_description(char *buf, size_t buf_size, enum AVChannel channel);

/**
 * bprint variant of av_channel_description().
 *
 * @note the string will be appended to the bprint buffer.
 */
void av_channel_description_bprint(struct AVBPrint *bp, enum AVChannel channel_id);

> Related to the previous question, does `av_cmp_q()` count as a function
> with two contexts?  Or no contexts?

And again, it looks like you are overgeneralizing and requiring that
all functions take a context. In general, the C language is procedural
and so is the FFmpeg API. The fact that we adopt some constructs which
might mimic or resemble OOP does not mean that the whole API was
designed that way.

> 
> Finally, a general question - functions of the form "avfoo" seem like they
> are more consistent than "av_foo".  Does the underscore mean anything?

This is because, despite our efforts, implementing a consistent API is
difficult, not least because different reviewers/contributors pick
different conventions.

In some cases we prefer av_ (most of libavutil) for the generic API,
while libavcodec/libavformat/libavfilter tend to use avLIB_, but there
might be exceptions.