[FFmpeg-devel] [PATCH v2 1/3] doc: Explain what "context" means

Andrew Sayers ffmpeg-devel at pileofstuff.org
Sun Apr 21 01:17:57 EEST 2024


On Sat, Apr 20, 2024 at 06:48:32PM +0200, Stefano Sabatini wrote:
> On date Saturday 2024-04-20 13:19:41 +0100, Andrew Sayers wrote:
> > Based largely on the explanation by Stefano Sabatini:
> > https://ffmpeg.org/pipermail/ffmpeg-devel/2024-April/325854.html
> > ---
> >  doc/jargon.md | 169 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 169 insertions(+)
> >  create mode 100644 doc/jargon.md
> > 
> > diff --git a/doc/jargon.md b/doc/jargon.md
> > new file mode 100644
> > index 0000000000..f967b5c8bc
> > --- /dev/null
> > +++ b/doc/jargon.md
> > @@ -0,0 +1,169 @@
> > +# Jargon
> > +
> > +Terms used throughout the code that developers may need to know.
> > +
> > +@anchor context
> > +
> 
> > +## Context
> > +
> 
> > +A design pattern that stores the context (e.g. configuration) for a series
> > +of operations in a "context" structure, and moves other information with
> > +a longer or shorter lifetime elsewhere.
> 
> I'd skip the mention of a design pattern since this is about the
> jargon.
> 
> So a simplified variant would be:
> 
> A "context" is a structure used to store information
> (e.g. configuration and/or internal state) for a series of operations
> working on the same data.

I think there's a pattern to the problems I'm having in this thread -
*anchoring effects*.

If you ask someone "is 5 a big number?" then "is 5 thousand a big number?",
they'll probably say "yes" to the second question.  But if you ask them
"is 5 billion a big number?" then "is 5 thousand a big number?", they'll
probably say "no".  In each case, their concept of "bigness" has been
anchored by the first question you asked.

When I originally tried to learn FFmpeg back in the day, I got nowhere with my
default OOP mindset.  It wasn't until I thought to read the examples with a
procedural mindset that it started making any sense, and I think that has
*anchored* my mental model of FFmpeg to a mindset that made it hard to think
deeply about its object-oriented bits.

Yesterday I would have agreed this was just one piece of jargon that needed
pinning down.  But if other people have similarly mis-anchored themselves,
this question might need to be a bit easier for them to find.

> 
> > +
> > +Consider a code snippet to modify text then print it:
> > +
> > +```c
> > +/**
> > + * Contextual information about printing a series of messages
> > + */
> > +struct ModifyThenPrintContext {
> > +
> > +    /**
> > +     * Members of the context are usually part of its public API...
> > +     */
> > +    FILE *out;
> > +
> > +    /**
> > +     * ... but check the documentation just in case
> > +     */
> > +    [[deprecated]]
> > +    int no_longer_part_of_the_public_api;
> > +
> > +    /**
> > +     * The "internal context" is private to the context itself.
> > +     *
> > +     * Unlike the members above, the private context is not guaranteed
> > +     * and can change arbitrarily between versions.
> > +     */
> > +    void* priv_data;
> > +};
> > +
> > +/**
> > + * Long-lifetime information, reused by many contexts
> > + */
> > +enum ModifyThenPrintDialect {
> > +    MODIFY_THEN_PRINT_PLAIN_TEXT,
> > +    MODIFY_THEN_PRINT_REGEX,
> > +    MODIFY_THEN_PRINT_REGEX_PCRE
> > +};
> > +
> > +/**
> > + * Short-lifetime information, used repeatedly in a single context
> > + */
> > +struct ModifyThenPrintMessage {
> > +    char *str;
> > +    char *replace_this;
> > +    char *with_this;
> > +};
> > +
> > +/**
> > + * Allocate and initialize a ModifyThenPrintContext
> > + *
> > + * This creates a new pointer, then fills in some sensible defaults.
> > + *
> > + * We can reasonably assume this function will initialise `priv_data`
> > + * with a dialect-specific object, but shouldn't make any assumptions
> > + * about what that object is.
> > + *
> > + */
> > +int ModifyThenPrintContext_alloc_context(struct ModifyThenPrintContext **ctx,
> > +                                         FILE *out,
> > +                                         enum ModifyThenPrintDialect dialect);
> > +
> > +/**
> > + * Uninitialize and deallocate a ModifyThenPrintContext
> > + *
> > + * This does any work required by the private context in `priv_data`
> > + * (e.g. deallocating it), then deallocates the main context itself.
> > + *
> > + */
> > +int ModifyThenPrintContext_free(struct ModifyThenPrintContext *ctx);
> > +
> > +/**
> > + * Print a single message
> > + */
> > +int ModifyThenPrintContext_print(struct ModifyThenPrintContext *ctx,
> > +                                 struct ModifyThenPrintMessage *msg);
> > +
> > +int print_hello_world(void)
> > +{
> > +
> > +    int ret = 0;
> > +
> > +    struct ModifyThenPrintContext *ctx;
> > +
> > +    struct ModifyThenPrintMessage hello_world;
> > +
> > +    if ( ModifyThenPrintContext_alloc_context( &ctx, stdout, MODIFY_THEN_PRINT_REGEX ) < 0 ) {
> > +        ret = -1;
> > +        goto EXIT_WITHOUT_CLEANUP;
> > +    }
> > +
> > +    hello_world.replace_this = "Hi|Hullo";
> > +    hello_world.with_this    = "Hello";
> > +
> > +    hello_world.str = "Hi, world!\n";
> > +    if ( ModifyThenPrintContext_print( ctx, &hello_world ) < 0 ) {
> > +        ret = -1;
> > +        goto FINISH;
> > +    }
> > +
> > +    hello_world.str = "Hullo, world!\n";
> > +    if ( ModifyThenPrintContext_print( ctx, &hello_world ) < 0 ) {
> > +        ret = -1;
> > +        goto FINISH;
> > +    }
> > +
> > +    FINISH:
> > +    if ( ModifyThenPrintContext_free( ctx ) < 0 ) {
> > +        ret = -1;
> > +        goto EXIT_WITHOUT_CLEANUP;
> > +    }
> > +
> > +    EXIT_WITHOUT_CLEANUP:
> > +    return ret;
> > +
> > +}
> > +```
> > +
> 
> > +In the example above, the `ModifyThenPrintContext` object contains information
> > +that's needed for exactly the lifetime of the current job (i.e. how to modify
> > +and where to print).  Information with a longer or shorter lifetime is moved
> > +to `ModifyThenPrintDialect` and `ModifyThenPrintMessage`.
> 
> I still find this overly complex, I would rather use a typical example
> of AVCodecContext for encoding or decoding or something even simpler
> (for example md5.h).
> 
> About the internal "private" context, this is mostly relevant for
> FFmpeg development, and not really useful for API users (basically
> they don't even need to know about the private data).
> 
> For example all they need to know is that for AVCodecContext generic
> options they can set the fields in the context itself, or use
> AVOptions, but they can only use AVOptions for "private" options.
> 
> We are still not enforcing the use of AVOption to set all options,
> although we might want to in the future.

I think you're saying that "context structure" is synonymous with "context",
and is FFmpeg's term for a common style of C structure; but that other projects
might use a different word, or write that style of struct without naming it at
all?  If so, I'd argue it's important to give people a non-FFmpeg-specific
*anchor*, but that we should expand the later FFmpeg-specific example, so they
have an idea of how it's used around here.

A quick grep of the source suggests that "private context" is an accepted
synonym for "internal context".  And it sounds like it fulfils the same purpose
as C++ "private" access.  If both statements are true, then yes, it doesn't need
to go in the example, and the whole topic can be cut down to a line like "the
main context is for public members, the private context is for private members".
Sound good?
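
To check I've understood the split, here is a rough sketch in plain C.  None of
these names (`ExampleContext`, `ExamplePriv`, `example_alloc` and so on) exist
in FFmpeg; they are made up for illustration, and I'm assuming the private
context is simply hidden behind a `void *`:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical main context: callers may read and write these members. */
typedef struct ExampleContext {
    FILE *out;        /* public member */
    void *priv_data;  /* private context: opaque to callers */
} ExampleContext;

/* Hypothetical private context.  In a real library this definition would
 * live in a .c file, so it could change between versions. */
typedef struct ExamplePriv {
    int messages_printed;
} ExamplePriv;

/* Allocate the main context and its private context together.
 * Returns NULL if either allocation fails. */
ExampleContext *example_alloc(FILE *out)
{
    ExampleContext *ctx = calloc(1, sizeof(*ctx));
    if (!ctx)
        return NULL;
    ctx->priv_data = calloc(1, sizeof(ExamplePriv));
    if (!ctx->priv_data) {
        free(ctx);
        return NULL;
    }
    ctx->out = out;
    return ctx;
}

/* Print one message.  Only library code looks inside priv_data.
 * Returns the number of messages printed so far, or -1 on write error. */
int example_print(ExampleContext *ctx, const char *msg)
{
    ExamplePriv *priv;

    assert(ctx && msg);
    priv = ctx->priv_data;
    if (fputs(msg, ctx->out) == EOF)
        return -1;
    return ++priv->messages_printed;
}

/* Free the private context first, then the main context. */
void example_free(ExampleContext *ctx)
{
    if (!ctx)
        return;
    free(ctx->priv_data);
    free(ctx);
}
```

A caller would only ever touch `ctx->out`; the counter in `ExamplePriv` could be
renamed or removed without them noticing.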

If we have public and private members, and then AVOption members are a third
thing, the document should probably address the natural assumption that they're
equivalent to C++ "protected" members (i.e. not fully private to the class, but
not fully open to the public).  How about "It might help to think of
AVOption-accessible public members as having 'protected' access, in that you
should access them through the AVOptions API unless you know what you're
doing.  This rule isn't always followed in practice, especially in older code"?
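
To illustrate what I mean by "protected", here is a minimal sketch of a
name-to-offset option table.  This is not the real AVOption API: `DemoContext`,
`DemoOption` and `demo_opt_set_int` are hypothetical names, and the error codes
are invented.  It only shows why going through the table is safer than writing
the member directly:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical context with one option-accessible public member. */
typedef struct DemoContext {
    int bit_rate;   /* reachable directly or through the option table */
    int frame_no;   /* public, but deliberately not in the option table */
} DemoContext;

/* Hypothetical option descriptor: a name, help text, where the value
 * lives inside the context, and the range of values it accepts. */
typedef struct DemoOption {
    const char *name;
    const char *help;
    size_t      offset;
    int         min, max;
} DemoOption;

static const DemoOption demo_options[] = {
    { "b", "target bitrate in bits/s",
      offsetof(DemoContext, bit_rate), 0, 100000000 },
    { NULL, NULL, 0, 0, 0 } /* terminator */
};

/* Set an int option by name.
 * Returns 0 on success, -1 for an unknown name, -2 if out of range. */
int demo_opt_set_int(DemoContext *ctx, const char *name, int value)
{
    assert(ctx && name);
    for (const DemoOption *o = demo_options; o->name; o++) {
        if (strcmp(o->name, name) != 0)
            continue;
        if (value < o->min || value > o->max)
            return -2;
        /* Write through the recorded offset rather than a named member. */
        *(int *)((char *)ctx + o->offset) = value;
        return 0;
    }
    return -1;
}
```

Writing `ctx.bit_rate = -1` directly would compile and silently skip the range
check, whereas `demo_opt_set_int(&ctx, "b", -1)` reports the problem.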

> 
> > +
> > +FFmpeg uses the context pattern to solve a variety of problems. But the most
> > +common contexts (AVCodecContext, AVFormatContext etc.) tend to have a lot of
> > +requirements in common:
> > +
> > +- need to query, set and get options
> > +  - including options whose implementation is not part of the public API
> > +- need to configure log message verbosity and content
> > +
> > +FFmpeg gradually converged on the AVClass struct to store that information,
> > +then converged on the @ref avoptions "AVOptions" system to manipulate it.
> > +So the terms "context", "AVClass context structure" and "AVOptions-enabled
> > +struct" are often used interchangeably when it's not important to emphasise
> > +the difference.  But for example, AVMediaCodecContext uses the context
> > +pattern, but is not an AVClass context structure, so cannot be manipulated
> > +with AVOptions.
> > +
> > +To understand AVClass context structures, consider the `libx264` encoder:
> > +
> > +- it has to support common encoder options like "bitrate"
> > +- it has to support encoder-specific options like "profile"
> > +  - the exact options could change quickly if a legal ruling forces a change of backend
> > +- it has to provide useful feedback about unsupported options
> > +
> > +Common encoder options like "bitrate" are stored in the AVCodecContext class,
> > +while encoder-specific options like "profile" are stored in an X264Context
> > +instance in AVCodecContext::priv_data.  These options are then exposed through
> > +a tree of AVOption objects, which include user-visible help text and
> > +machine-readable information about the memory location to read/write
> > +each option.  Common @ref avoptions "AVOptions" functionality lets end users
> > +get and set those values, and provides readable feedback about errors.  But
> > +even though they can be manipulated through an API, the X264Context class is
> > +private and new releases can modify it without affecting the public interface.
> > +
> 
> I like this section, looks useful to explain the internals.
> 
> > +FFmpeg itself uses the context design pattern to solve many problems.
> > +You can use this pattern anywhere it would be useful, and may want to use
> > +AVClass and @ref avoptions "AVOptions" if they're relevant to your situation.
> 
> But again, I'm confused by this since it's confusing two levels:
> internal API development and API usage. When you write "may want to
> use" it seems to refer to the former, but the user should not really
> care about this (unless he wants to know how the internal
> implementation works).
> 
> In fact, while one user might want to use the FFmpeg API as a generic
> development toolkit (and therefore create its own custom API with
> AVClass and AVOptions) I don't think this is really very common.

I think this is another anchoring problem on my part.  The AVOptions docs[1]
describe how to add AVOptions in accessible language that made me think it was
aimed at ordinary programmers who happen to use FFmpeg.  Would it be better for
the line below "Implementing AVOptions" on that page to say something like:

 This section describes how to add AVOptions capabilities to a struct.
+It is intended for developers of new FFmpeg libraries, but use outside of FFmpeg
+is also possible.

If so, I'll make a separate patch for that and rewrite the document to match.

	- Andrew Sayers

[1] https://ffmpeg.org/doxygen/trunk/group__avoptions.html#details

