[FFmpeg-devel] [PATCH v3 1/3] doc: Explain what "context" means

Andrew Sayers ffmpeg-devel at pileofstuff.org
Mon May 6 00:04:36 EEST 2024


I'm still travelling, so the following thoughts might be a bit
half-formed.  But I wanted to get some feedback before sitting down
for a proper think.

On Sun, May 05, 2024 at 09:29:10AM +0200, Stefano Sabatini wrote:
> On date Monday 2024-04-29 10:10:35 +0100, Andrew Sayers wrote:
> > On Mon, Apr 22, 2024 at 07:05:12PM +0200, Stefano Sabatini wrote:
> [...]
> > > I don't have a strong opinion, but I'd probably focus on providing a
> > > typical example of a common API (check doc/examples). Also I see here
> > > there is a strong focus on OOP, this might be counter-productive in
> > > case the reader is not familiar with OOP terminology.
> > > 
> > > OTOH the content might be useful for readers coming from an OOP
> > > background and terminology. I wonder if this content might be isolated
> > > in a dedicated section, so that non-OOP readers can simply skip it.
> > > 
> > > But this is not a strong objection, and can be possibly reworked in a
> > > later step.
> > > 
> > 
> > This is really a document for FFmpeg newbies, so we need to assume as
> > little prior knowledge as possible.  After a few days to think it
> > over, I think we should avoid assuming...
> > 
> > Knowledge of object-oriented programming.  For example, this should be
> > useful to a research mathematician with a project that involves codec
> > algorithms.  So the next draft should feel less like "FFmpeg for OOP
> > devs" and more like "FFmpeg for newbies (with some optional OOP
> > background reading)".
> > 
> > Knowing that programming doesn't *have* to be object-oriented.
> > OOP has become so ubiquitous nowadays, there are plenty of programmers
> > who will insist everything is OOP if you just speak loudly and slowly.
> > This is a harder problem in some ways, because someone who doesn't
> > understand can always re-read until they do, while someone who jumps
> > to the wrong conclusion will just keep reading and try to make things
> > fit their assumption (e.g. my earlier messages in this thread!).
> > So the "how it differs from OOP" stuff needs to stay fairly prominent.
> > 
> 
> > Knowing anything about FFmpeg (or multimedia in general).  I like the
> > idea of tweaking `doc/examples` to better introduce FFmpeg
> > fundamentals, but explaining "context" is a steep enough learning
> > curve on its own - introducing concepts like "stream" and "codec" at
> > the same time seems like too much.
> 
> But even if you show the API that does not mean you need to explain
> it entirely, you only need to highlight the structural relationships:
> 
>     // create an instance context, whatever it is
>     c = avcodec_alloc_context3(codec);
>     if (!c) {
>         fprintf(stderr, "Could not allocate video codec context\n");
>         exit(1);
>     }
> 
>     // set context parameters directly
>     c->bit_rate = 400000;
>     /* resolution must be a multiple of two */
>     c->width = 352;
>     c->height = 288;
>     /* frames per second */
>     c->time_base = (AVRational){1, 25};
>     c->framerate = (AVRational){25, 1};
> 
>     // use av_opt API to set the options?
>     ...
> 
>     // open the codec context provided a codec
>     ret = avcodec_open2(c, codec, NULL);
>     if (ret < 0) {
>         fprintf(stderr, "Could not open codec: %s\n", av_err2str(ret));
>         exit(1);
>     }
> 
> You might even replace avcodec_ with fooblargh_ and get the same
> effect, with the addition that with avcodec_ you are already
> familiarizing the user with the concrete API rather than with a
> hypothetical one.
> 
> [...]
> 
> > I've also gone through the code looking for edge cases we haven't covered.
> > Here are some questions trying to prompt an "oh yeah I forgot to mention
> > that"-type answer.  Anything where the answer is more like "that should
> > probably be rewritten to be clearer", let me know and I'll avoid confusing
> > newbies with it.
> > 
> 
> > av_ambient_viewing_environment_create_side_data() takes an AVFrame as its
> > first argument, and returns a new AVAmbientViewingEnvironment.  What is the
> > context object for that function - AVFrame or AVAmbientViewingEnvironment?
> 
> But this should be clear from the doxy:
> 
> /**
>  * Allocate and add an AVAmbientViewingEnvironment structure to an existing
>  * AVFrame as side data.
>  *
>  * @return the newly allocated struct, or NULL on failure
>  */
> AVAmbientViewingEnvironment *av_ambient_viewing_environment_create_side_data(AVFrame *frame);

I'm afraid it's not clear, at least to me.  I think you're saying the
AVFrame is the context because a "create" function can't have a
context any more than a C++ "new" can be called as a member.  But the
function's prefix points to the other conclusion, and neither signal
is clear enough on its own.

My current thinking is to propose separate patches renaming arguments
to `ctx` whenever I find functions I can't parse.  That's not as good
as a simple rule like "the first argument is always the context", but
better than adding a paragraph or two about how to read the docs.

> Also, you are assuming that all the functions should have a
> context. That's not the case, as you don't always need to keep track
> of a "context" when performing operations.

It sounds like there's a subtle distinction to be made here: when a
struct is named "FooContext" or a variable is named `ctx`, the writer
has signalled that "context" is the (only) correct interpretation.
OTOH, when something acts like a context but isn't explicitly
labelled as such, it may or may not have been intended as a context,
and treating it as one may or may not be helpful - it's more "if it
helps" than an objective fact.  That means reasonable people can
disagree about edge cases - for example, it's genuinely unclear
whether av_register_bitstream_filter()'s argument is a context, and
TBH if I weren't writing this doc I would just avoid thinking about
av_ambient_viewing_environment_create_side_data() altogether.

> 
> > 
> > av_register_bitstream_filter() (deprecated 4.0, removed 5.0) took an
> > `AVBitStreamFilter *` as its first argument, but I don't think you'd say
> > the argument provided "context" for the function.  So would I be right in
> > saying `AVBitStreamFilter *` is not a context, despite looking like one?
> 
> This was finally dropped, so it is not even there any more. But again, it seems
> you are assuming that all the functions should take a context, which
> is not the case. In that case you had:
> av_register_bitstream_filter(filter)
> 
> which was registering the filter in the program global state.
>  
> > av_buffersink_*() all take a `const AVFilterContext *` argument.
> > What does the difference between av_buffersink prefix and AVFilter type mean?
> 
> None, probably it should have been named avfilter_buffersink since
> this is a libavfilter API, see below.
> 
> > av_channel_description_bprint() takes a `struct AVBPrint *` as its first
> > argument, then `enum AVChannel`.  Is the context AVBPrint, AVChannel,
> > or both?  Does it make sense for a function to have two contexts?
> 
> Again, this should be clear from the doxy:
> /**
>  * Get a human readable string describing a given channel.
>  *
>  * @param buf pre-allocated buffer where to put the generated string
>  * @param buf_size size in bytes of the buffer.
>  * @param channel the AVChannel whose description to get
>  * @return amount of bytes needed to hold the output string, or a negative AVERROR
>  *         on failure. If the returned value is bigger than buf_size, then the
>  *         string was truncated.
>  */
> int av_channel_description(char *buf, size_t buf_size, enum AVChannel channel);
> 
> /**
>  * bprint variant of av_channel_description().
>  *
>  * @note the string will be appended to the bprint buffer.
>  */
> void av_channel_description_bprint(struct AVBPrint *bp, enum AVChannel channel_id);

I think you're saying that I should look at which word appears more
often in the doxy ("channel") rather than which word appears first in
the argument list ("buf")?  As above, the solution might be to rename
the variable in a separate patch rather than teach people another
special case.

> 
> > Related to the previous question, does `av_cmp_q()` count as a function
> > with two contexts?  Or no contexts?
> 
> And again, it looks like you are overgeneralizing and require that all
> the functions take a context. In general, the C language is procedural
> and so is the FFmpeg API. The fact that we are assuming some
> constructs which might mimic or resemble OOP does not mean that all
> the API was designed in that way.
> 
> > 
> > Finally, a general question - functions of the form "avfoo" seem like they
> > are more consistent than "av_foo".  Does the underscore mean anything?
> 
> This is due to the fact that despite the effort, implementing a
> consistent API is difficult, also due to different
> reviewers/contributors picking different conventions.
> 
> In some cases we prefer av_ (most of libavutil) for the generic API,
> while libavcodec/libavformat/libavfilter tend to use avLIB_, but there
> might be exceptions.

Given all of the above (and Zhao Zhili's useful examples), I think
the document might work better structured like this:

Section 1: context is a good way to think about some code constructs.
If you've ever used a callback function with an arbitrary callback
argument, you've used a context.  <quick example of a hypothetical
callback function with a context variable>.
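
A minimal sketch of what such an example could look like, in plain C
with no FFmpeg dependency.  Every name here (visit_fn, for_each,
sum_ctx, add_to_sum, sum_items) is hypothetical and invented for
illustration.  The point is that for_each() never looks inside `ctx`;
it just hands it back to the callback, which uses it to remember
state between calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: `ctx` is opaque to whoever calls the callback. */
typedef void (*visit_fn)(void *ctx, int value);

/* Hypothetical helper: calls `cb` once per element, passing `ctx` through untouched. */
void for_each(const int *items, size_t n, visit_fn cb, void *ctx)
{
    assert(cb != NULL);
    for (size_t i = 0; i < n; i++)
        cb(ctx, items[i]);
}

/* The context: everything the callback needs to remember between calls. */
struct sum_ctx {
    long   total;
    size_t count;
};

/* The callback: recovers its own context type from the opaque pointer. */
void add_to_sum(void *ctx, int value)
{
    struct sum_ctx *s = ctx;
    s->total += value;
    s->count++;
}

/* Usage: the caller owns the context and reads the result out of it afterwards. */
long sum_items(const int *items, size_t n)
{
    struct sum_ctx s = { 0, 0 };
    for_each(items, n, add_to_sum, &s);
    return s.total;
}
```

Swapping add_to_sum() for a different callback with a different
context struct would change what gets computed without touching
for_each() at all, which is the property the word "context" is
meant to capture here.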

Section 2: some projects explicitly use "context" as a metaphor.
<comparison of curl and FFmpeg's md5 contexts>.  This should be
understood as a broad metaphor that different people use to mean
slightly different things.
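
As a rough illustration of the shape such a comparison would be
pointing at, here is a toy context with the same alloc/init/update/
final life cycle that libavutil's av_md5_alloc(), av_md5_init(),
av_md5_update() and av_md5_final() follow.  It is a sketch under
stated assumptions, not real FFmpeg or curl code: the toy_sum_*
names are hypothetical, and the "hash" is just a byte sum so the
example stays self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical context: the running state of a toy additive checksum. */
struct toy_sum {
    uint32_t acc;
};

/* Allocate a context; the caller releases it with free(). */
struct toy_sum *toy_sum_alloc(void)
{
    return calloc(1, sizeof(struct toy_sum));
}

/* Reset the context so it can be used (or reused) for a new message. */
void toy_sum_init(struct toy_sum *ctx)
{
    assert(ctx != NULL);
    ctx->acc = 0;
}

/* Feed more data; the context carries the state from call to call. */
void toy_sum_update(struct toy_sum *ctx, const uint8_t *src, size_t len)
{
    assert(ctx != NULL);
    for (size_t i = 0; i < len; i++)
        ctx->acc += src[i];
}

/* Read out the result accumulated so far. */
uint32_t toy_sum_final(const struct toy_sum *ctx)
{
    assert(ctx != NULL);
    return ctx->acc;
}

/* Usage: the data arrives in two chunks, but the context makes it one sum. */
uint32_t toy_sum_two_chunks(void)
{
    struct toy_sum *ctx = toy_sum_alloc();
    uint32_t result;

    if (!ctx)
        return 0;   /* allocation failed; a real API would report an error */
    toy_sum_init(ctx);
    toy_sum_update(ctx, (const uint8_t *)"ab", 2);
    toy_sum_update(ctx, (const uint8_t *)"c", 1);
    result = toy_sum_final(ctx);
    free(ctx);
    return result;
}
```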

Section 3: FFmpeg has developed various conventions around contexts.
<discussion of a non-AVClass example that follows most patterns but
breaks some others>.

Section 4: user-facing contexts use AVClass+AVOptions.  This is a
common and highly visible special case, but not the inevitable
endpoint of all contexts.  <discussion of an AVClass-enabled object>.
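
To make the AVClass+AVOptions idea concrete without pulling in
FFmpeg headers, here is a heavily reduced sketch.  It assumes only
the documented convention that an AVOptions-enabled struct starts
with a pointer to its class; everything else (toy_option, toy_class,
toy_encoder, toy_opt_set_int) is a hypothetical stand-in and not the
real API, which handles many types, ranges, and flags.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an option: a name and where the field lives. */
struct toy_option {
    const char *name;
    size_t      offset;   /* byte offset of an int field inside the context */
};

/* Hypothetical stand-in for a class: describes a kind of context. */
struct toy_class {
    const char              *class_name;
    const struct toy_option *options;   /* terminated by a NULL name */
};

/* A context whose first member points at its class. */
struct toy_encoder {
    const struct toy_class *cls;
    int width;
    int height;
};

static const struct toy_option toy_encoder_options[] = {
    { "width",  offsetof(struct toy_encoder, width)  },
    { "height", offsetof(struct toy_encoder, height) },
    { NULL, 0 },
};

static const struct toy_class toy_encoder_class = {
    "toy_encoder", toy_encoder_options,
};

/* Generic setter: works on any context that starts with a toy_class pointer.
 * Returns 0 on success, -1 if the class has no option with that name. */
int toy_opt_set_int(void *obj, const char *name, int value)
{
    const struct toy_class *cls = *(const struct toy_class **)obj;

    assert(cls != NULL);
    for (const struct toy_option *o = cls->options; o->name; o++) {
        if (strcmp(o->name, name) == 0) {
            memcpy((char *)obj + o->offset, &value, sizeof(value));
            return 0;
        }
    }
    return -1;
}

/* Usage: set fields by name, then read them back directly. */
int toy_encoder_demo(void)
{
    struct toy_encoder enc = { &toy_encoder_class, 0, 0 };

    if (toy_opt_set_int(&enc, "width", 352) < 0)
        return -1;
    if (toy_opt_set_int(&enc, "height", 288) < 0)
        return -1;
    if (toy_opt_set_int(&enc, "no_such_option", 1) == 0)
        return -1;   /* unknown names must be rejected */
    return enc.width * enc.height;
}
```

The setter knows nothing about toy_encoder itself, only about the
class pointer at the start of the struct.  That is what lets one
generic function serve every context type that follows the
convention, and also why a context without that first member cannot
be used with it.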

