[FFmpeg-devel] [PATCH v4 1/4] doc: Explain what "context" means

Andrew Sayers ffmpeg-devel at pileofstuff.org
Sun May 26 15:06:52 EEST 2024


It feels like we've got through most of the mid-level "how FFmpeg works" stuff,
and now we're left with language choices (e.g. "options" vs. "introspection")
and philosophical discussions (e.g. the relationship between contexts and OOP).
It's probably best to philosophise first, then come back to language.

This message has been sent as a reply to one specific message, but is actually
springboarding off messages from two sub-threads.  Hopefully that will keep
the big questions contained in one place.

On Sat, May 25, 2024 at 11:49:48AM +0200, Stefano Sabatini wrote:
> What perplexes me is that "context" is not part of the standard OOP
> jargon, so this is probably adding more to the confusion.

This actually speaks to a more fundamental issue about how we learn.
To be clear, everything I'm about to describe applies to every human who ever
lived, but starting with this message makes it easier to explain
(for reasons that will hopefully become obvious).

When you ask why "context" is not part of OOP jargon, one could equally ask
why "object" isn't part of FFmpeg jargon.  The document hints at some arguments:
their lifetime stages are different, their rules are enforced at the
language vs. community level, OOP encourages homogeneous interfaces while FFmpeg
embraces unique interfaces that precisely suit each use case, and so on.
But the honest answer is much simpler - humans are lazy, and we want the things
we learn today to resemble the things we learnt yesterday.

Put another way - if we had infinite time every day, we could probably write an
object-oriented interface to FFmpeg.  But our time is sadly finite so we stick
with the thing that's proven to work.  Similarly, if our readers had infinite
free time every day, they could probably learn a completely new approach to
programming.  But their time is finite, so they stick to what they know.

That means people reading this document aren't just passively soaking up
information, they're looking for shortcuts that fit their assumptions.
And as anyone who's ever seen a political discussion can tell you,
humans are *really good* at finding shortcuts that fit their assumptions.
For example, when an OOP developer sees the words "alloc" and "init",
they will assume these map precisely to OOP allocators and initializers.  One
reason the section about context lifetimes is so long is that it needs to
meet them where they are, then walk them step-by-step to a better place.
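
To make the "alloc"/"init" mismatch concrete, here is a minimal sketch of the
kind of lifetime an OOP developer might not expect.  Everything in it is
hypothetical (ToyContext and the toy_*() functions are invented for this
message and are not FFmpeg API), and it only assumes the general shape of
"allocate, set options, then initialize".  The point is the gap between the
two calls, where the context exists and can be configured but isn't usable.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context type, invented for illustration (not FFmpeg API). */
typedef struct ToyContext {
    int  width;        /* option: the caller may change it before init */
    int  initialized;  /* set to 1 by toy_init_context() */
    int *buffer;       /* resource acquired by toy_init_context() */
} ToyContext;

/* "alloc": reserve memory and apply defaults, but acquire no resources. */
ToyContext *toy_alloc_context(void)
{
    ToyContext *ctx = calloc(1, sizeof(*ctx));
    if (ctx)
        ctx->width = 16; /* default value */
    return ctx;
}

/* "init": validate the options and acquire resources.
 * Returns 0 on success, a negative value on error. */
int toy_init_context(ToyContext *ctx)
{
    if (!ctx || ctx->initialized || ctx->width <= 0)
        return -1;
    ctx->buffer = calloc((size_t)ctx->width, sizeof(*ctx->buffer));
    if (!ctx->buffer)
        return -1;
    ctx->initialized = 1;
    return 0;
}

/* "free": release everything and reset the caller's pointer to NULL. */
void toy_free_context(ToyContext **pctx)
{
    if (!pctx || !*pctx)
        return;
    free((*pctx)->buffer);
    free(*pctx);
    *pctx = NULL;
}

/* Walk through the whole lifetime.  Returns 0 if every stage behaved. */
int toy_lifecycle_demo(void)
{
    int ret, ok;
    ToyContext *ctx = toy_alloc_context();
    if (!ctx)
        return -1;

    /* Between alloc and init: configurable, but not yet usable. */
    assert(!ctx->initialized && !ctx->buffer);
    ctx->width = 64;

    ret = toy_init_context(ctx);
    ok  = ret == 0 && ctx->initialized && ctx->buffer != NULL;

    toy_free_context(&ctx);
    return (ok && ctx == NULL) ? 0 : -1;
}
```

An OOP constructor would normally fold the "set options" step into its
arguments, so there would be no moment where the object is half-built.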

Aside: if FFmpeg had a blog, I could turn this discussion into a great post
called something like "reflections on object- vs. context-oriented development".
But the project's voice is more objective than that, so this document is limited
to discussing the subset of issues that relate specifically to the FFmpeg API.


On Sat, May 25, 2024 at 01:00:14PM +0200, Stefano Sabatini wrote:
> > +Some functions fit awkwardly within FFmpeg's context idiom.  For example,
> > +av_ambient_viewing_environment_create_side_data() creates an
> > +AVAmbientViewingEnvironment context, then adds it to the side-data of an
> > +AVFrame context.
> 
> To go back to this unfitting example, can you state what would be
> fitting in this case?

"Awkwardly" probably isn't the right word to use, but that's a language choice
we can come back to.

The problem with FFmpeg's interface isn't that any one part is illogical,
it's that different parts of the interface follow incompatible logic.

It's hard to give specific examples, because any given learner's journey looks
like a random walk through the API, and you can always say "well nobody else
would have that problem".  But if everyone has a different problem, that means
everyone has *a* problem, even though there's no localised code fix.

For the sake of argument, let's imagine a user who was a world-leading expert in
Microsoft QBasic in the eighties, then fell into a forty-year coma and woke up
in front of the FFmpeg documentation.  In other words, a highly adept
programmer with zero knowledge of programming conventions more recent than
"a function is a special type of subroutine for returning a value".
Their journey might look like...

1. there's this thing called "context", and some functions "have" contexts
2. sws_init_context() says "Initialize the swscaler context sws_context",
   and `sws_context` is a `SwsContext *`, so I think it has a SwsContext context
3. sws_alloc_context() says "Allocate an empty SwsContext",
   and it returns a `SwsContext *`, so I think it has the same context
   as sws_init_context()
4. avio_alloc_context() and avio_open2() are both variations on this theme,
   so I should look for creative ways to interpret things as "having" contexts
5. av_alloc_format_context() puts the type in the middle of the function name,
   so I should only treat prefixes as a weak signal
6. av_ambient_viewing_environment_create_side_data() allocates like an alloc,
   so I think the return value is the context; but it also operates on AVFrame
   in a way that affects related functions, so I think the arg is the context.
   Its prefix is too weak a signal to be a tiebreaker, so I'll just guess
   one of them at random and wait until something goes wrong

In the above case, the interface rewarded the developer for looking harder and
harder for ways to call something a context, to the point where saying "neither"
becomes inconceivable (or at best an admission of defeat).
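
The ambiguity in step 6 can be sketched with a toy example.  All names here
(ToyFrame, ToyEnv, toy_env_create_side_data() and so on) are made up for
illustration and are not FFmpeg API, and the ownership rule (the frame owns
whatever gets attached to it) is an assumption of the sketch rather than a
statement about the real function.  It only shows why both "the return value
is the context" and "the argument is the context" look plausible.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types, invented for illustration (not FFmpeg API). */
typedef struct ToyEnv {
    int ambient_level;
} ToyEnv;

typedef struct ToyFrame {
    ToyEnv *side_data; /* owned by the frame once attached */
} ToyFrame;

/* Allocates a ToyEnv (so the return value looks like "the context"), but
 * also modifies the frame (so the argument looks like "the context").
 * Returns NULL on error or if the frame already has side data. */
ToyEnv *toy_env_create_side_data(ToyFrame *frame)
{
    ToyEnv *env;
    if (!frame || frame->side_data)
        return NULL;
    env = calloc(1, sizeof(*env));
    if (!env)
        return NULL;
    frame->side_data = env; /* assumption: the frame takes ownership */
    return env;             /* the caller only borrows this pointer */
}

/* Releases the side data through the frame, not through the ToyEnv. */
void toy_frame_free_side_data(ToyFrame *frame)
{
    if (!frame)
        return;
    free(frame->side_data);
    frame->side_data = NULL;
}

/* Returns 0 if the value written through the returned pointer is visible
 * through the frame, and the frame cleans up correctly afterwards. */
int toy_side_data_demo(void)
{
    int ok;
    ToyFrame frame = { NULL };
    ToyEnv *env = toy_env_create_side_data(&frame);
    if (!env)
        return -1;

    env->ambient_level = 5; /* looks like "env is the context" */
    ok = frame.side_data == env && frame.side_data->ambient_level == 5;

    toy_frame_free_side_data(&frame); /* looks like "frame is the context" */
    return (ok && frame.side_data == NULL) ? 0 : -1;
}
```

Neither reading is wrong in the sketch: the function's name points at ToyEnv,
while its cleanup path runs through ToyFrame.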

There's no way to avoid that sort of inconsistency in a project like FFmpeg,
and explaining the logic behind each choice would involve an order of magnitude
more documentation.  So the only practical choice is to present a sort of
conceptual buffet - "here are several ways to think about the problem, choose
whatever suits your tastes".

