[FFmpeg-devel] [PATCH v5 1/4] doc: Explain what "context" means
Stefano Sabatini
stefasab at gmail.com
Sat May 25 14:00:14 EEST 2024
On date Thursday 2024-05-23 21:00:40 +0100, Andrew Sayers wrote:
> Derived from explanations kindly provided by Stefano Sabatini and others:
> https://ffmpeg.org/pipermail/ffmpeg-devel/2024-April/325903.html
> ---
> doc/context.md | 439 +++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 439 insertions(+)
> create mode 100644 doc/context.md
>
> diff --git a/doc/context.md b/doc/context.md
> new file mode 100644
> index 0000000000..21469a6e58
> --- /dev/null
> +++ b/doc/context.md
> @@ -0,0 +1,439 @@
> +@page Context Introduction to contexts
> +
> +@tableofcontents
> +
> +“Context” is a name for a widely-used programming idiom.
nit++: better to use plain ASCII "" quotes, to help people with plain
ASCII keyboard layouts.
nit++: I'd write: "Context" is a name employed for a widely-used
programming idiom. The word "context" is not the idiom itself.
> +This document explains the general idiom and some conventions used by FFmpeg.
> +
> +This document uses object-oriented analogies for readers familiar with
> +[object-oriented programming](https://en.wikipedia.org/wiki/Object-oriented_programming).
> +But contexts can also be used outside of OOP, and even in situations where OOP
> +isn't helpful. So these analogies should only be used as a rough guide.
> +
> +@section Context_general “Context” as a general concept
> +
> +Many projects use some kind of “context” idiom. You can safely skip this
> +section if you have used contexts in another project. You might also prefer to
> +read @ref Context_comparison before continuing with the rest of the document.
> +
> +@subsection Context_think “Context” as a way to think about code
> +
> +A context is any data structure that is passed to several functions
> +(or several instances of the same function) that all operate on the same entity.
> +For example, [object-oriented programming](https://en.wikipedia.org/wiki/Object-oriented_programming)
> +languages usually provide member functions with a `this` or `self` value:
> +
> +```python
> +# Python methods (functions within classes) must start with an object argument,
> +# which does a similar job to a context:
> +class MyClass:
> +    def my_func(self):
> +        ...
> +```
nit: as noted, I'd skip this example altogether
> +Contexts can also be used in C-style procedural code. If you have ever written
> +a callback function, you have probably used a context:
> +
> +```c
> +struct FileReader {
> +    FILE* file;
> +};
> +
> +int my_callback(void *my_var_, uint8_t* buf, int buf_size) {
> +
> +    // my_var provides context for the callback function:
> +    struct FileReader *my_var = (struct FileReader *)my_var_;
> +
> +    return read(my_var->file, sizeof(*buf), buf_size);
> +}
> +
> +void init() {
> +
> +    struct FileReader my_var;
> +    my_var->file = fopen("my-file", "rb");
> +
> +    register_callback(my_callback, &my_var);
> +
> +    ...
> +
> +    fclose( my_var->file );
> +
> +}
> +```
I'm not convinced this is a good example: a struct with a single field
could simply be passed directly, so the wrapper looks a bit like
obfuscation.
Rather than this, maybe something like:
typedef struct AVBikeShedContext {
    uint8_t color_r;
    uint8_t color_g;
    uint8_t color_b;
    char *color;
    // callback: receives the context it operates on
    int (*set_default_color)(struct AVBikeShedContext *bikeshed, const char *color);
} AVBikeShedContext;

AVBikeShedContext *av_bikeshed_context_alloc(void);
int av_bikeshed_context_set_color(AVBikeShedContext *bikeshed, const char *color);
const char *av_bikeshed_context_get_color(AVBikeShedContext *bikeshed);
void av_bikeshed_context_get_rgb_color(AVBikeShedContext *bikeshed, uint8_t *color_r, uint8_t *color_g, uint8_t *color_b);

Then you can use:

int set_bikeshed_pink_color(AVBikeShedContext *bikeshed, const char *color) {
    // the requested color is ignored: this bikeshed is always pink
    return av_bikeshed_context_set_color(bikeshed, "pink");
}

bikeshed->set_default_color = set_bikeshed_pink_color;

to provide a callback example
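In case it helps, here is a rough, self-contained sketch of how the above
could hang together. Everything in it is hypothetical: AVBikeShedContext and
the av_bikeshed_context_*() functions are not an existing FFmpeg API, the
color table is invented, and bikeshed_demo() only exists to exercise the
callback.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical context, for illustration only (not an FFmpeg API).
typedef struct AVBikeShedContext {
    uint8_t color_r, color_g, color_b;
    const char *color; // points at a static name, so there is nothing to free
    // Callback: receives the context it operates on as its first argument.
    int (*set_default_color)(struct AVBikeShedContext *bikeshed, const char *color);
} AVBikeShedContext;

AVBikeShedContext *av_bikeshed_context_alloc(void)
{
    return calloc(1, sizeof(AVBikeShedContext)); // zero-initialized
}

int av_bikeshed_context_set_color(AVBikeShedContext *bikeshed, const char *color)
{
    // Invented color table, just enough for the example.
    static const struct { const char *name; uint8_t r, g, b; } known[] = {
        { "pink",  255, 192, 203 },
        { "green",   0, 128,   0 },
    };
    for (size_t i = 0; i < sizeof(known) / sizeof(known[0]); i++) {
        if (!strcmp(color, known[i].name)) {
            bikeshed->color   = known[i].name;
            bikeshed->color_r = known[i].r;
            bikeshed->color_g = known[i].g;
            bikeshed->color_b = known[i].b;
            return 0;
        }
    }
    return -1; // unknown color
}

// A callback that ignores the requested color and always picks pink.
static int set_bikeshed_pink_color(AVBikeShedContext *bikeshed, const char *color)
{
    (void)color;
    return av_bikeshed_context_set_color(bikeshed, "pink");
}

// Returns 0 if the callback painted the bikeshed pink.
int bikeshed_demo(void)
{
    AVBikeShedContext *bikeshed = av_bikeshed_context_alloc();
    int ret;

    if (!bikeshed)
        return -1;
    bikeshed->set_default_color = set_bikeshed_pink_color;
    ret = bikeshed->set_default_color(bikeshed, "green");
    if (ret == 0) {
        assert(bikeshed->color_r == 255);
        ret = strcmp(bikeshed->color, "pink") ? -1 : 0;
    }
    free(bikeshed);
    return ret;
}
```

The point of the sketch is that the callback never touches global state: it
finds everything it needs through the bikeshed pointer it is handed.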
> +In the broadest sense, a context is just a way to think about some code.
> +You can even use it to think about code written by people who have never
> +heard the term, or who would disagree with you about what it means.
> +But when FFmpeg developers say “context”, they're usually talking about
> +a more specific set of conventions.
> +
> +@subsection Context_communication “Context” as a tool of communication
> +
> +“Context” can just be a word to understand code in your own head,
> +but it can also be a term you use to explain your interfaces.
> +Here is a version of the callback example that makes the context explicit:
> +
> +```c
> +struct FileReaderContext {
> +    FILE *file;
> +};
> +
> +int my_callback(void *ctx_, uint8_t *buf, int buf_size) {
> +
> +    // ctx provides context for the callback function:
> +    struct FileReaderContext *ctx = (struct FileReaderContext *)ctx_;
> +
> +    return read(ctx->file, sizeof(*buf), buf_size);
> +}
> +
> +void init() {
> +
> +    struct FileReader ctx;
> +    ctx->file = fopen("my-file", "rb");
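this should be ctx.file, since ctx is a struct here rather than a pointer
(the same goes for the fclose() call below); the declaration above should
presumably also be struct FileReaderContext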
> +
> +    register_callback(my_callback, &ctx);
I don't understand what this is for
> +
> +    ...
> +
> +    fclose( ctx->file );
> +
> +}
> +```
> +
> +The difference here is subtle, but important. If a piece of code
> +*appears compatible with contexts*, then you are *allowed to think
> +that way*, but if a piece of code *explicitly states it uses
> +contexts*, then you are *required to follow that approach*.
> +
> +For example, take a look at avio_alloc_context().
> +The function name and return value both state it uses contexts,
> +so failing to follow that approach is a bug you can report.
"failing to follow that approach" is meant at the API level?
> +But its arguments are a set of callbacks that merely appear compatible with
> +contexts, so it's fine to write a `read_packet` function that just reads
> +from standard input.
> +
> +When a programmer says their code is "a context", they're guaranteeing
> +to follow a set of conventions enforced by their community - for example,
> +the FFmpeg community enforces that contexts have separate allocation,
> +configuration, and initialization steps. That's different from saying
> +their code is "an object", which normally guarantees to follow conventions
> +enforced by their programming language (e.g. using a constructor function).
> +
> +@section Context_ffmpeg FFmpeg contexts
> +
> +This section discusses specific context-related conventions used in FFmpeg.
> +Some of these are used in other projects, others are unique to this project.
> +
> +@subsection Context_naming Naming: “Context” and “ctx”
> +
> +```c
> +// Context struct names usually end with `Context`:
> +struct AVSomeContext {
> +    ...
> +};
> +
> +// Functions are usually named after their context,
> +// context parameters usually come first and are often called `ctx`:
> +void av_some_function(AVSomeContext *ctx, ...);
> +```
> +
> +If an FFmpeg struct is intended for use as a context, its name usually
> +makes that clear. Exceptions to this rule include AVMD5, which is only
> +identified as a context by @ref libavutil/md5.c "the functions that call it".
> +
> +If a function is associated with a context, its name usually
> +begins with some variant of the context name (e.g. av_md5_alloc()
> +or avcodec_alloc_context3()). Exceptions to this rule include
> +@ref avformat.h "AVFormatContext's functions", many of which
> +begin with just `av_`.
> +
> +If a function has a context parameter, it usually comes first and its name
> +often contains `ctx`. Exceptions include av_bsf_alloc(), which puts the
> +context argument second to emphasise it's an out variable.
> +
> +Some functions fit awkwardly within FFmpeg's context idiom. For example,
> +av_ambient_viewing_environment_create_side_data() creates an
> +AVAmbientViewingEnvironment context, then adds it to the side-data of an
> +AVFrame context.
To go back to this unfitting example, can you state what would be
fitting in this case?
> If you find contexts a useful metaphor in these cases,
> +you might prefer to think of these functions as "receiving" and "producing"
> +contexts.
> +
> +@subsection Context_data_hiding Data hiding: private contexts
> +
> +```c
> +// Context structs often hide private context:
> +struct AVSomeContext {
> +    void *priv_data; // sometimes just called "internal"
> +};
> +```
> +
> +Contexts present a public interface, so changing a context's members forces
> +everyone that uses the library to at least recompile their program,
> +if not rewrite it to remain compatible. Many contexts reduce this problem
> +by including a private context with a type that is not exposed in the public
> +interface. Hiding information this way ensures it can be modified without
> +affecting downstream software.
> +
> +Private contexts often store variables users aren't supposed to see
> +(similar to an OOP private block), but can also store information shared between
> +some but not all instances of a context (e.g. codec-specific functionality),
> +and @ref Context_avoptions "AVOptions-enabled structs" can include options
> +that are accessible through the @ref avoptions "AVOptions API".
> +Object-oriented programmers thinking about private contexts should remember
> +that FFmpeg isn't *large enough* to need some common object-oriented techniques,
> +even though it's solving a problem *complex enough* to benefit from
> +some rarer techniques.
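A short example might help readers here. Below is a minimal sketch of the
public/private split, under the assumption that the private type lives in a
.c file the user never includes. All names (AVSomeContext as a concrete type,
SomePrivContext, av_some_context_count_frame(), priv_demo()) are made up for
illustration and are not FFmpeg API.

```c
#include <assert.h>
#include <stdlib.h>

// Public part (what a header would expose). Hypothetical names throughout.
typedef struct AVSomeContext {
    int width;       // public member: changing it affects every user
    void *priv_data; // private context: users never see its type
} AVSomeContext;

// Private part (would live in a .c file, free to change between releases).
typedef struct SomePrivContext {
    int frames_seen;
} SomePrivContext;

AVSomeContext *av_some_context_alloc(void)
{
    AVSomeContext *ctx = calloc(1, sizeof(*ctx));
    if (!ctx)
        return NULL;
    ctx->priv_data = calloc(1, sizeof(SomePrivContext));
    if (!ctx->priv_data) {
        free(ctx);
        return NULL;
    }
    return ctx;
}

// Only library code casts priv_data back to its real type.
int av_some_context_count_frame(AVSomeContext *ctx)
{
    SomePrivContext *priv = ctx->priv_data;
    return ++priv->frames_seen;
}

void av_some_context_free(AVSomeContext **pctx)
{
    if (!pctx || !*pctx)
        return;
    free((*pctx)->priv_data);
    free(*pctx);
    *pctx = NULL;
}

// Returns the number of frames counted, using only the public interface.
int priv_demo(void)
{
    AVSomeContext *ctx = av_some_context_alloc();
    int n;

    if (!ctx)
        return -1;
    av_some_context_count_frame(ctx);
    n = av_some_context_count_frame(ctx);
    av_some_context_free(&ctx);
    assert(ctx == NULL);
    return n;
}
```

Adding a member to SomePrivContext changes nothing that calling code can
observe, which is the property the paragraph above describes.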
> +
> +@subsection Context_lifetime Manage lifetime: allocate, initialize and free
> +
> +```c
> +void my_function( ... ) {
> +
> +    // Context structs are allocated then initialized with associated functions:
> +
> +    AVSomeContext *ctx = av_some_context_alloc(...);
> +
> +    // ... configure ctx ...
> +
> +    av_some_context_init(ctx, ...);
> +
> +    // ... use ctx ...
> +
> +    // Context structs are freed with associated functions:
> +
> +    av_some_context_close(ctx);
> +    av_some_context_free(ctx);
> +
> +}
> +```
> +
> +FFmpeg contexts go through the following stages of life:
> +
> +1. allocation (often a function that ends with `_alloc`)
> +   * a range of memory is allocated for use by the structure
> +   * memory is allocated on boundaries that improve caching
> +   * memory is reset to zeroes, some internal structures may be initialized
> +2. configuration (implemented by setting values directly on the context)
> +   * no function for this - calling code populates the structure directly
> +   * memory is populated with useful values
> +   * simple contexts can skip this stage
> +3. initialization (often a function that ends with `_init`)
> +   * setup actions are performed based on the configuration (e.g. opening files)
> +4. normal usage
> +   * most functions are called in this stage
> +   * documentation implies some members are now read-only (or not used at all)
> +   * some contexts allow re-initialization
> +5. closing (often a function that ends with `_close()`)
> +   * teardown actions are performed (e.g. closing files)
> +6. deallocation (often a function that ends with `_free()`)
> +   * memory is returned to the pool of available memory
> +
> +This can mislead object-oriented programmers, who expect something more like:
> +
> +1. allocation (usually a `new` keyword)
> +   * a range of memory is allocated for use by the structure
> +   * memory *may* be reset (e.g. for security reasons)
> +2. initialization (usually a constructor)
> +   * memory is populated with useful values
> +   * related setup actions are performed based on arguments (e.g. opening files)
> +3. normal usage
> +   * most functions are called in this stage
> +   * compiler enforces that some members are read-only (or private)
> +   * no going back to the previous stage
> +4. finalization (usually a destructor)
> +   * teardown actions are performed (e.g. closing files)
> +5. deallocation (usually a `delete` keyword)
> +   * memory is returned to the pool of available memory
> +
> +FFmpeg's allocation stage is broadly similar to the OOP stage of the same name.
> +Both set aside some memory for use by a new entity, but FFmpeg's stage can also
> +do some higher-level operations. For example, @ref Context_avoptions
> +"AVOptions-enabled structs" set their AVClass member during allocation.
> +
> +FFmpeg's configuration stage involves setting any variables you want to before
> +you start using the context. Complicated FFmpeg structures like AVCodecContext
> +tend to have many members you *could* set, but in practice most programs set
> +few if any of them. The freeform configuration stage works better than bundling
> +these into the initialization stage, which would lead to functions with
> +impractically many parameters, and would mean each new option was an
> +incompatible change to the API.
> +
> +FFmpeg's initialization stage involves calling a function that sets the context
> +up based on your configuration.
> +
> +FFmpeg's first three stages do the same job as OOP's first two stages.
> +This can mislead object-oriented developers, who expect to do less work in the
> +allocation stage, and more work in the initialization stage. To simplify this,
> +most FFmpeg contexts provide a combined allocator and initializer function.
> +For historical reasons, suffixes like `_alloc`, `_init`, `_alloc_context` and
> +even `_open` can indicate the function does any combination of allocation and
> +initialization.
> +
> +FFmpeg's "closing" stage is broadly similar to OOP's "finalization" stage,
> +but some contexts allow re-initialization after finalization. For example,
> +SwrContext lets you call swr_close() then swr_init() to reuse a context.
> +Be aware that some FFmpeg functions happen to use the word "finalize" in a way
> +that has nothing to do with the OOP stage (e.g. av_bsf_list_finalize()).
> +
> +FFmpeg's "deallocation" stage is broadly similar to OOP, but can perform some
> +higher-level functions (similar to the allocation stage).
> +
> +Closing functions usually end with "_close", while deallocation
> +functions usually end with "_free". Very few contexts need the flexibility of
> +separate "closing" and "deallocation" stages, so many "_free" functions
> +implicitly close the context first.
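It might be worth backing this section with a complete toy example. The
sketch below walks one context through the stages described above, including
a close followed by re-initialization. AVToyContext and the av_toy_*()
functions are hypothetical names invented for this illustration; they only
mimic the naming conventions and do not exist in FFmpeg.

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical context, for illustration only (not an FFmpeg API).
typedef struct AVToyContext {
    int sample_rate; // set directly by the caller during configuration
    int *buffer;     // created by av_toy_init(), released by av_toy_close()
} AVToyContext;

AVToyContext *av_toy_alloc(void)
{
    return calloc(1, sizeof(AVToyContext)); // allocation: zeroed memory
}

int av_toy_init(AVToyContext *ctx)
{
    if (ctx->sample_rate <= 0)
        return -1; // the configuration is only checked here
    ctx->buffer = calloc((size_t)ctx->sample_rate, sizeof(*ctx->buffer));
    return ctx->buffer ? 0 : -1;
}

void av_toy_close(AVToyContext *ctx)
{
    free(ctx->buffer);
    ctx->buffer = NULL;
}

void av_toy_free(AVToyContext **pctx)
{
    if (!pctx || !*pctx)
        return;
    av_toy_close(*pctx); // freeing implicitly closes first
    free(*pctx);
    *pctx = NULL;
}

// Walks through every stage once; returns 0 on success.
int lifetime_demo(void)
{
    AVToyContext *ctx = av_toy_alloc();      // allocation
    int too_early, ret;

    if (!ctx)
        return -1;
    too_early = av_toy_init(ctx);            // not configured yet: fails
    assert(too_early < 0);
    ctx->sample_rate = 44100;                // configuration
    if (av_toy_init(ctx) < 0) {              // initialization
        av_toy_free(&ctx);
        return -1;
    }
    ctx->buffer[0] = 1;                      // normal usage
    av_toy_close(ctx);                       // closing
    ctx->sample_rate = 48000;                // reconfigure...
    ret = av_toy_init(ctx);                  // ...and re-initialize
    av_toy_free(&ctx);                       // deallocation (closes first)
    return (ret == 0 && ctx == NULL) ? 0 : -1;
}
```

Taking a pointer-to-pointer in av_toy_free() is a design choice borrowed from
functions like avcodec_free_context(): it lets the library reset the caller's
pointer so it cannot be used after the free.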
> +
> +@subsection Context_avoptions Reflection: AVOptions-enabled structs
To clarify this, we should not treat reflection and AVOptions as
synonyms.
The AVOptions system clearly enables introspection of options (you can
query the available options, their default and current values, their
help text, etc.), but it is not only about introspection: it also lets
you set the values through a high-level API.
> +
> +Object-oriented programming puts more focus on data hiding than FFmpeg needs,
> +but it also puts less focus on
> +[reflection](https://en.wikipedia.org/wiki/Reflection_(computer_programming)).
> +
> +To understand FFmpeg's reflection requirements, run `ffmpeg -h full` on the
> +command-line, then ask yourself how you would implement all those options
> +with the C standard [`getopt` function](https://en.wikipedia.org/wiki/Getopt).
> +You can also ask the same question for any other programming languages you know.
> +[Python's argparse module](https://docs.python.org/3/library/argparse.html)
> +is a good example - its approach works well with far more complex programs
> +than `getopt`, but would you like to maintain an argparse implementation
> +with 15,000 options and growing?
To clarify this: the approach here is to use the introspection API to
expose the options programmatically, even if one might in fact use
getopt to build the CLI (or another toolkit to build a GUI).
In the case of ffmpeg, getopt cannot be used because the options are
positional (depending on where it appears, an option might apply to an
encoder, a format or a URL). For a simpler CLI (e.g. a hash API
wrapper) it might work.
> +
> +Most solutions assume you can just put all options in a single block,
> +which is unworkable at FFmpeg's scale. Instead, we split configuration
> +across many *AVOptions-enabled structs*, which use the @ref avoptions
> +"AVOptions API" to reflect information about their user-configurable members,
> +including members in private contexts.
> +
> +AVOptions-accessible members of a context should be accessed through the
> +@ref avoptions "AVOptions API" whenever possible, even if they're not hidden
> +in a private context. That ensures values are validated as they're set, and
> +means you won't have to do as much work if a future version of FFmpeg changes
> +the allowed values. This is broadly similar to the way object-oriented programs
> +recommend getters and setters over direct access.
> +
> +Object-oriented programmers may be tempted to compare AVOptions-accessible
> +members of a public context to protected members of a class. Both provide
> +global access through an API, and unrestricted access for trusted friends.
> +But this is just a happy accident, not a guarantee.
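To make the "introspection plus high-level setting" idea concrete, here is a
deliberately simplified sketch of an option table. It is not how AVOptions is
implemented: the real API (av_opt_set_int() and friends) finds the table
through the AVClass pointer at the start of the struct, whereas this sketch
uses one global table. ToyOption, ToyEncoderContext, toy_opt_set_int() and
the ranges shown are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// One entry describes one user-configurable member (hypothetical type).
typedef struct ToyOption {
    const char *name;
    const char *help;
    size_t offset; // where the member lives inside the context
    int min, max;  // allowed range, checked on every set
} ToyOption;

typedef struct ToyEncoderContext {
    int bitrate;
    int gop_size;
} ToyEncoderContext;

// The table can be walked to print help text, or searched to set a value.
static const ToyOption toy_options[] = {
    { "bitrate", "target bitrate in bits/s", offsetof(ToyEncoderContext, bitrate),  1, 100000000 },
    { "g",       "group of pictures size",   offsetof(ToyEncoderContext, gop_size), 0, 600 },
    { NULL, NULL, 0, 0, 0 }
};

// Returns 0 on success, -1 for an unknown option, -2 for an out-of-range value.
int toy_opt_set_int(void *ctx, const char *name, int value)
{
    for (const ToyOption *o = toy_options; o->name; o++) {
        if (strcmp(o->name, name))
            continue;
        if (value < o->min || value > o->max)
            return -2;
        *(int *)((char *)ctx + o->offset) = value;
        return 0;
    }
    return -1;
}

// Sets one valid option and tries two invalid ones; returns the bitrate.
int options_demo(void)
{
    ToyEncoderContext enc = { 0, 0 };
    int ok      = toy_opt_set_int(&enc, "bitrate", 128000);
    int range   = toy_opt_set_int(&enc, "g", -5);
    int unknown = toy_opt_set_int(&enc, "no_such_option", 1);

    assert(ok == 0 && enc.bitrate == 128000);
    assert(range == -2 && enc.gop_size == 0); // rejected, member untouched
    assert(unknown == -1);
    return enc.bitrate;
}
```

Because values go through the table, they are validated as they are set,
which is the benefit the patch text attributes to using the API rather than
writing to members directly.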
> +
> +@subsection Context_logging Logging: AVClass context structures
> +
> +FFmpeg's @ref lavu_log "logging facility" needs to be simple to use,
> +but flexible enough to let people debug problems. And much like reflection,
> +it needs to work the same across a wide variety of unrelated structs.
> +
> +FFmpeg structs that support the logging framework are called *@ref AVClass
> +context structures*. The name @ref AVClass was chosen early in FFmpeg's
> +development, but in practice it only came to store information about
> +logging, and about introspection.
nit: introspection => options
To further clarify this, I think the name AVClass was chosen because the
idea is to support a "class" of structures: you define logging and
options in a single place for all the encoders, decoders, filters etc.
(see for example avfilter_class in libavfilter/avfilter.c).
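A toy version of that idea, in case it is useful for the document: each
struct starts with a pointer to a shared class description, so one logging
function can label messages from unrelated structs. This only imitates the
general shape of AVClass; ToyClass, the two context types and toy_log() are
hypothetical, and the message format is an assumption rather than av_log()'s
actual output.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Shared description of a whole family of structs (hypothetical type).
typedef struct ToyClass {
    const char *class_name;
} ToyClass;

// Two unrelated contexts: both put the class pointer first.
typedef struct ToyFilterContext {
    const ToyClass *cls; // must be the first member
    int nb_inputs;
} ToyFilterContext;

typedef struct ToyDecoderContext {
    const ToyClass *cls; // must be the first member
    int width, height;
} ToyDecoderContext;

static const ToyClass toy_filter_class  = { "ToyFilter" };
static const ToyClass toy_decoder_class = { "ToyDecoder" };

// Works on any struct whose first member is a ToyClass pointer.
int toy_log(void *ctx, char *out, size_t out_size, const char *msg)
{
    const ToyClass *cls = *(const ToyClass **)ctx;
    return snprintf(out, out_size, "[%s @ %p] %s", cls->class_name, ctx, msg);
}

// Logs the same message from both contexts; returns 1 if the labels differ.
int logging_demo(void)
{
    ToyFilterContext  filter  = { &toy_filter_class, 1 };
    ToyDecoderContext decoder = { &toy_decoder_class, 640, 480 };
    char a[128], b[128];

    toy_log(&filter,  a, sizeof(a), "hello");
    toy_log(&decoder, b, sizeof(b), "hello");
    assert(strstr(a, "[ToyFilter @ ") == a);
    assert(strstr(b, "[ToyDecoder @ ") == b);
    return strcmp(a, b) != 0;
}
```

The logging code never needs to know the concrete type it was handed, only
that the class pointer comes first.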
> +
> +@section Context_further Further information about contexts
> +
> +So far, this document has provided a theoretical guide to FFmpeg contexts.
> +This final section provides some alternative approaches to the topic,
> +which may help round out your understanding.
> +
> +@subsection Context_example Learning by example: context for a codec
> +
> +It can help to learn contexts by doing a deep dive into a specific struct.
> +This section will discuss AVCodecContext - an AVOptions-enabled struct
> +that contains information about encoding or decoding one stream of data
> +(e.g. the video in a movie).
> +
> +The name "AVCodecContext" tells us this is a context. Many of
> +@ref libavcodec/avcodec.h "its functions" start with an `avctx` parameter,
> +indicating this parameter provides context for that function.
> +
> +AVCodecContext::internal contains the private context. For example,
> +codec-specific information might be stored here.
> +
> +AVCodecContext is allocated with avcodec_alloc_context3(), initialized with
> +avcodec_open2(), and freed with avcodec_free_context(). Most of its members
> +are configured with the @ref avoptions "AVOptions API", but for example you
> +can set AVCodecContext::opaque or AVCodecContext::draw_horiz_band() if your
> +program happens to need them.
> +
> +AVCodecContext provides an abstract interface to many different *codecs*.
> +Options supported by many codecs (e.g. "bitrate") are kept in AVCodecContext
> +and reflected as AVOptions. Options that are specific to one codec are
> +stored in the private context, and reflected from there.
> +
> +AVCodecContext::av_class contains logging metadata to ensure all codec-related
> +error messages look the same, plus implementation details about options.
> +
> +To support a specific codec, AVCodecContext's private context is set to
> +an encoder-specific data type. For example, the video codec
> +[H.264](https://en.wikipedia.org/wiki/Advanced_Video_Coding) is supported via
> +[the x264 library](https://www.videolan.org/developers/x264.html), and
> +implemented in X264Context. Although included in the documentation, X264Context
> +is not part of the public API. That means FFmpeg's @ref ffmpeg_versioning
> +"strict rules about changing public structs" aren't as important here, so a
> +version of FFmpeg could modify X264Context or replace it with another type
> +altogether. An adverse legal ruling or security problem could even force us to
> +switch to a completely different library without a major version bump.
> +
> +The design of AVCodecContext provides several important guarantees:
> +
> +- lets you use the same interface for any codec
> +- supports common encoder options like "bitrate" without duplicating code
> +- supports encoder-specific options like "profile" without bulking out the public interface
> +- reflects both types of options to users, with help text and detection of missing options
> +- provides uniform logging output
> +- hides implementation details (e.g. its encoding buffer)
> +
> +@subsection Context_comparison Learning by comparison: FFmpeg vs. Curl contexts
> +
> +It can help to learn contexts by comparing how different projects tackle
> +similar problems. This section will compare @ref AVMD5 "FFmpeg's MD5 context"
> +with [curl 8.8.0's equivalent](https://github.com/curl/curl/blob/curl-8_8_0/lib/md5.c#L48).
> +
> +The [MD5 algorithm](https://en.wikipedia.org/wiki/MD5) produces
> +a fixed-length digest from arbitrary-length data. It does this by calculating
> +the digest for a prefix of the data, then loading the next part and adding it
> +to the previous digest, and so on.
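The incremental pattern described here could be shown with a few lines of
code. The sketch below uses the much simpler FNV-1a hash in place of MD5
(an explicit substitution, chosen only because it fits in a few lines), with
a hypothetical ToyHashContext carrying the running state between calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// The context holds whatever must survive between update calls.
typedef struct ToyHashContext {
    uint32_t state;
} ToyHashContext;

void toy_hash_init(ToyHashContext *ctx)
{
    ctx->state = 2166136261u; // FNV-1a 32-bit offset basis
}

// Folds the next chunk of data into the running state.
void toy_hash_update(ToyHashContext *ctx, const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        ctx->state ^= data[i];
        ctx->state *= 16777619u; // FNV 32-bit prime
    }
}

uint32_t toy_hash_final(const ToyHashContext *ctx)
{
    return ctx->state;
}

// Hashes a message in one call and in two chunks; returns 1 if they agree.
int hash_demo(void)
{
    const char *msg = "hello world";
    ToyHashContext whole, parts;
    int same;

    toy_hash_init(&whole);
    toy_hash_update(&whole, (const uint8_t *)msg, strlen(msg));

    toy_hash_init(&parts);
    toy_hash_update(&parts, (const uint8_t *)msg, 6);
    toy_hash_update(&parts, (const uint8_t *)msg + 6, strlen(msg) - 6);

    same = toy_hash_final(&whole) == toy_hash_final(&parts);
    assert(same);
    return same;
}
```

Feeding the data in one piece or in several gives the same result because
all intermediate state lives in the context, not in the caller.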
> +
> +```c
> +// FFmpeg's MD5 context looks like this:
> +typedef struct AVMD5 {
> +    uint64_t len;
> +    uint8_t block[64];
> +    uint32_t ABCD[4];
> +} AVMD5;
> +
> +// Curl 8.8.0's MD5 context looks like this:
> +struct MD5_context {
> +    const struct MD5_params *md5_hash; /* Hash function definition */
> +    void *md5_hashctx; /* Hash function context */
> +};
> +```
> +
> +Curl's struct name ends with `_context`, guaranteeing contexts are the correct
> +interpretation. FFmpeg's struct does not explicitly say it's a context, but
> +@ref libavutil/md5.c "its functions do" so we can reasonably assume
> +it's the intended interpretation.
> +
> +Curl's struct uses `void *md5_hashctx` to avoid guaranteeing
> +implementation details in the public interface, whereas FFmpeg makes
> +everything accessible. This disagreement about data hiding is a good example
> +of how contexts can be used differently. Hiding the data means changing the
> +layout in a future version of curl won't break downstream programs that used
> +that data. But the MD5 algorithm has been stable for 30 years, and making the
> +data public makes it easier for people to follow a bug in their own code.
> +
> +Curl's struct is declared as `struct <type> { ... }`, whereas FFmpeg uses
> +`typedef struct <type> { ... } <type>`. These conventions are used with both
> +context and non-context structs, so don't say anything about contexts as such.
> +Specifically, FFmpeg's convention is a workaround for an issue with C grammar:
> +
> +```c
> +void my_function( ... ) {
> +    int my_var;                     // good
> +    MD5_context my_curl_ctx;        // error: C needs you to explicitly say "struct"
> +    struct MD5_context my_curl_ctx; // good: added "struct"
> +    AVMD5 my_ffmpeg_ctx;            // good: typedef's avoid the need for "struct"
> +}
> +```
> +
> +Both MD5 implementations are long-tested, widely-used examples of contexts
> +in the real world. They show how contexts can solve the same problem
> +in different ways.
I'm fine with keeping this section at the end of the document.