[FFmpeg-devel] [RFC] FFmpeg Execution Graph Visualization

Fri Apr 11 20:09:09 EEST 2025

Michael Niedermayer (HE12025-03-29):
> can you repost the AVWriter patches ?

Sure, but I need to find the latest branch, and that is proving to be a
cause of procrastination for me. In the meantime, let me explain anew
the principles behind it; after a few years, I might be able to express
them better.

(I hope it means you are considering re-asserting your authority to
approve it on principle pending review of the code. I will not invest
more coding into it unless I know nobody has the authority to put it to
waste with “it does not belong”.)

1. Why we need it

We use a lot of text for small things. As a command-line tool, we use
text to communicate with users. Even as a library, when GUIs do not have
the proper widget they will use text, and we need to provide the
necessary support functions.

We should have a rule that for every type we add, we must have a
corresponding to-text API function. Furthermore, these functions should
all have the same type (up to the type of the opaque pointer), so that
we can store them in a structure that describes the type along with
other information (from string, ref/unref, etc.).
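
To illustrate the rule, here is a minimal sketch of what "same type up
to the opaque pointer" could look like. Everything in it (to_text_fn,
TypeDesc, Ratio, print_value) is a hypothetical name made up for this
mail, not an existing libavutil API, and it uses snprintf-style
semantics only to stay self-contained.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical: the single signature shared by all to-text functions.
 * Same contract as snprintf(): returns the full length needed. */
typedef int (*to_text_fn)(char *buf, size_t size, const void *val);

/* Hypothetical type descriptor: the to-text function sits next to the
 * other per-type information (from string, ref/unref, ...). */
typedef struct TypeDesc {
    const char *name;
    to_text_fn  to_text;
} TypeDesc;

typedef struct Ratio { int num, den; } Ratio;

static int int_to_text(char *buf, size_t size, const void *val)
{
    return snprintf(buf, size, "%d", *(const int *)val);
}

static int ratio_to_text(char *buf, size_t size, const void *val)
{
    const Ratio *r = val;
    return snprintf(buf, size, "%d/%d", r->num, r->den);
}

static const TypeDesc type_int   = { "int",   int_to_text };
static const TypeDesc type_ratio = { "ratio", ratio_to_text };

/* Generic code only needs the descriptor and an opaque pointer.
 * Assumes buf is a valid buffer of at least one byte. */
static int print_value(char *buf, size_t size,
                       const TypeDesc *type, const void *val)
{
    int n = snprintf(buf, size, "%s=", type->name);
    size_t off;

    if (n < 0)
        return n;
    off = (size_t)n < size ? (size_t)n : size; /* stay inside buf */
    return n + type->to_text(buf + off, size - off, val);
}
```

Because both functions have the same type, print_value() works for any
type that has a descriptor, without a switch on the type.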

We also need text for inter-process communication, like exporting
statistics from filters.

Some of these to-text conversions happen once per frame. That means they
should happen without dynamic allocation. Quite complex code is
warranted to prevent dynamic allocations once per frame: see the frame
and buffer pools too. But some of these to-text conversions have an
unbounded output. That means the API needs to be able to handle long
output.

BPrint meets the criteria: BPrint is as fast as a buffer on the stack
but can handle strings of arbitrary length.

Unfortunately, BPrint is ugly. The name is ugly, and more importantly,
it is inflexible: it works only with flat buffers in memory.

2. How AVWriter does it

AVWriter is an attempt — successful, in my opinion — at keeping what is
good in BPrint and fixing what is ugly, in particular by giving it
features similar to AVIO.

The design principle is the same as AVIO: a structure with callbacks
that perform the low-level operations.

The difference with AVIO is that efforts were made to keep the
structures small and allocatable by the caller.

Also, the back-end is allowed to provide only a few methods: printf-like
or write-like or get_buffer-like, and the framework will make sure to
use one that is available. That means quite a bit of code to handle
doing a printf() on a back-end that only has get_buffer, but it is code
isolated within a single file and with ample FATE coverage.
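
As a rough sketch of that fallback logic, assuming a back-end that
only provides a write-like method: the framework formats into a small
stack buffer and forwards the result. The names (BackendMethods,
sketch_printf, FixedBuf) are hypothetical and this is not the code from
the patches; in particular, falling back to malloc() for long output is
only a simplification for this example.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical method table: a back-end fills in what it supports. */
typedef struct BackendMethods {
    void (*write)(void *obj, const char *data, size_t len);
    void (*vprintf)(void *obj, const char *fmt, va_list ap);
} BackendMethods;

/* Framework entry point: use the native vprintf if the back-end has
 * one, otherwise emulate it on top of write. */
static void sketch_printf(const BackendMethods *m, void *obj,
                          const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    if (m->vprintf) {
        m->vprintf(obj, fmt, ap);
    } else if (m->write) {
        char small[64];
        va_list ap2;
        int len;

        va_copy(ap2, ap);
        len = vsnprintf(small, sizeof(small), fmt, ap2);
        va_end(ap2);
        if (len >= 0 && (size_t)len < sizeof(small)) {
            m->write(obj, small, (size_t)len);   /* common, fast path */
        } else if (len >= 0) {
            char *big = malloc((size_t)len + 1); /* rare, long output */
            if (big) {
                vsnprintf(big, (size_t)len + 1, fmt, ap);
                m->write(obj, big, (size_t)len);
                free(big);
            }
        }
    }
    va_end(ap);
}

/* A write-only back-end: fixed-size buffer that truncates silently. */
typedef struct FixedBuf { char data[128]; size_t len; } FixedBuf;

static void fixed_write(void *obj, const char *data, size_t len)
{
    FixedBuf *b = obj;
    size_t room = sizeof(b->data) - 1 - b->len;

    if (len > room)
        len = room;
    memcpy(b->data + b->len, data, len);
    b->len += len;
    b->data[b->len] = '\0';
}

static const BackendMethods fixed_methods = { .write = fixed_write };
```

The back-end author writes only fixed_write(); the emulation lives in
one place in the framework.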

One of the tricks I used to avoid dynamic allocations is to pair the
pointer to the structure of methods with the pointer to the instance in
a structure with two pointer elements, passed by value.
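
A minimal sketch of that two-pointer structure, with hypothetical names
(Writer, WriterMethods, count_writer) chosen for this mail rather than
taken from the patches:

```c
#include <stddef.h>

/* Hypothetical method table, shared by all writers of one kind. */
typedef struct WriterMethods {
    void (*write)(void *obj, const char *data, size_t len);
} WriterMethods;

/* The "object": two pointers, small enough to pass and return by
 * value, so nothing needs to be allocated for the writer itself. */
typedef struct Writer {
    const WriterMethods *methods;
    void *obj;
} Writer;

static void writer_write(Writer wr, const char *data, size_t len)
{
    wr.methods->write(wr.obj, data, len);
}

/* Example back-end: only counts the bytes it is given. */
static void count_write(void *obj, const char *data, size_t len)
{
    (void)data;
    *(size_t *)obj += len;
}

static const WriterMethods count_methods = { count_write };

static Writer count_writer(size_t *counter)
{
    Writer wr = { &count_methods, counter };
    return wr;
}
```

The caller owns the size_t on its stack; the Writer value is just a
view on it and can be copied freely.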

Another trick I used to avoid dynamic allocations is to store the size
of a structure in it. That way, if a caller is compiled with an old
version of the library, it stores a smaller size than the current one,
and the new version of the library can test and avoid accessing the new
fields. That lets the caller allocate the structure while keeping the
ability to add fields to the structure.
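
The size check can be sketched as follows. The structure and field
names (Opts, self_size, escape) are hypothetical. The old caller is
simulated by storing the size of the structure as it was before the
escape field existed.

```c
#include <stddef.h>

/* Hypothetical caller-allocated structure. The first field holds the
 * size of the structure as the caller knew it at compile time. */
typedef struct Opts {
    unsigned self_size;
    int indent;
    int escape;     /* added in a later version of the library */
} Opts;

/* True if the caller's version of the structure contains the field. */
#define OPTS_HAS(o, field) \
    ((o)->self_size >= offsetof(Opts, field) + sizeof((o)->field))

/* Library side: never touch a field the caller did not allocate. */
static int opts_get_escape(const Opts *o)
{
    return OPTS_HAS(o, escape) ? o->escape : 0;
}
```

A caller built against the old headers stores the old, smaller size,
so the library falls back to the default instead of reading past the
end of the caller's structure.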

Apart from that, it is rather straightforward. The default AVWriter is a
simple wrapper around BPrint, but there are also quite a few other
built-in ones: wrappers around stdio and av_log (without storing the
whole string), and a wrapper around a fixed-size buffer. AVWriter can
also act as a filter: given an AVWriter, you can create a new one that
will encode to base64 or HTML entities on the fly.
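
To show what "acting as a filter" means, here is a sketch of a writer
that escapes HTML entities and forwards the result to another writer.
All names (Writer, WriterMethods, FixedBuf, html_writer) are
hypothetical and only three entities are handled.

```c
#include <stddef.h>
#include <string.h>

typedef struct WriterMethods {
    void (*write)(void *obj, const char *data, size_t len);
} WriterMethods;

typedef struct Writer {
    const WriterMethods *methods;
    void *obj;
} Writer;

static void writer_write(Writer wr, const char *data, size_t len)
{
    wr.methods->write(wr.obj, data, len);
}

/* Sink: fixed-size buffer that truncates silently. */
typedef struct FixedBuf { char data[64]; size_t len; } FixedBuf;

static void fixed_write(void *obj, const char *data, size_t len)
{
    FixedBuf *b = obj;
    size_t room = sizeof(b->data) - 1 - b->len;

    if (len > room)
        len = room;
    memcpy(b->data + b->len, data, len);
    b->len += len;
    b->data[b->len] = '\0';
}

static const WriterMethods fixed_methods = { fixed_write };

/* Filter: its object is the downstream writer; it keeps no buffer. */
static void html_write(void *obj, const char *data, size_t len)
{
    Writer down = *(const Writer *)obj;
    size_t i;

    for (i = 0; i < len; i++) {
        switch (data[i]) {
        case '&': writer_write(down, "&amp;", 5);   break;
        case '<': writer_write(down, "&lt;", 4);    break;
        case '>': writer_write(down, "&gt;", 4);    break;
        default:  writer_write(down, &data[i], 1);  break;
        }
    }
}

static const WriterMethods html_methods = { html_write };

/* The downstream writer must outlive the returned filter. */
static Writer html_writer(Writer *down)
{
    Writer wr = { &html_methods, down };
    return wr;
}
```

Code that prints to the filter does not know that escaping happens, and
the escaped text is never stored as a whole.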

3. Where it will go

The trick of passing a struct of methods and the object as a structure
with two members passed by value is meant to become a very light-weight
object system. Thus if ctx is a pointer to a structure of type T, we can
define serialize_t as a structure with functions and consider
{ serialize_t, ctx } as an object of the class “serializable”. But we
can also make it an object of the class “refcounted” by pairing it with
another structure.

The object system, including casts from one class to another, can be
de-centralized: new classes can be defined anywhere in the code without
a central registry. That will be useful to enhance several parts of our
code:

- Side data would not need to be added in libavutil. A pair of
  demuxer-muxer in libavformat can define a new type of side data by
  defining the methods to handle it.

- Filters with unusual parsing needs can define a new type of option and
  plug it into the standard options parsing system. It is extremely
  useful if the needs of the filter are too specific to add a type in
  libavutil but too far from something existing to be able to use it
  conveniently.

- Plugging new types into the options system will automatically fix our
  “escaping hell” issue where users sometimes need 42 levels of \ in
  front of a : to achieve their goals.

- We can implement a full-featured av_printf() function that serializes
  on the fly:

  av_printf(out, "Stream %d at %d Hz, %@ with layout %@\n",
      st->id, st->sample_rate,
      avany_sample_fmt(st->sample_fmt),
      avany_layout(st->channel_layout));
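
How a %@ conversion could consume such an object can be sketched like
this. It is a toy that only understands %d and %@, writes to a
fixed-size buffer, and uses hypothetical names throughout (Out, Any,
any_ratio, sketch_printf). It is not the proposed av_printf().

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Output buffer; size must be at least 1, and buf[0] must be '\0'. */
typedef struct Out { char *buf; size_t size, len; } Out;

static void out_append(Out *o, const char *s, size_t n)
{
    size_t room = o->size - 1 - o->len;

    if (n > room)
        n = room;                    /* truncate silently */
    memcpy(o->buf + o->len, s, n);
    o->len += n;
    o->buf[o->len] = '\0';
}

/* { method, ctx } pair passed by value through the varargs. */
typedef struct Any {
    void (*to_text)(Out *o, const void *val);
    const void *val;
} Any;

typedef struct Ratio { int num, den; } Ratio;

static void ratio_to_text(Out *o, const void *val)
{
    const Ratio *r = val;
    char tmp[32];
    int n = snprintf(tmp, sizeof(tmp), "%d/%d", r->num, r->den);

    out_append(o, tmp, (size_t)n);
}

static Any any_ratio(const Ratio *r)
{
    Any a = { ratio_to_text, r };
    return a;
}

static void sketch_printf(Out *o, const char *fmt, ...)
{
    va_list ap;
    const char *p;

    va_start(ap, fmt);
    for (p = fmt; *p; p++) {
        if (p[0] == '%' && p[1] == '@') {
            Any a = va_arg(ap, Any);     /* serialize on the fly */
            a.to_text(o, a.val);
            p++;
        } else if (p[0] == '%' && p[1] == 'd') {
            char tmp[16];
            int n = snprintf(tmp, sizeof(tmp), "%d", va_arg(ap, int));
            out_append(o, tmp, (size_t)n);
            p++;
        } else {
            out_append(o, p, 1);
        }
    }
    va_end(ap);
}
```

The format function knows nothing about Ratio; the knowledge travels
with the argument.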

With this, it becomes possible to implement new features that rely a lot
on text. That includes:

- Better error reporting. Today the error message goes to the log,
  where nobody will see it if the application is a GUI, unless the
  application installs a log handler, with the responsibility to
  assemble lines together and filter out unrelated messages that just
  happened to arrive at the same time. Instead, the error message is
  stored in the relevant context and can be retrieved cleanly later.

- Built-in comprehensive documentation. You can do
  av_get_documentation(ctx, …) and get explanations on how to use it.
  Depending on the flags given, the returned documentation can be terse
  and suitable for a tooltip or comprehensive including hyperlinks and
  reference for the syntax of the options.

- Serialization of high-level data structures into various standard
  formats: JSON, XML, etc., precisely what started this thread. That
  allows us to factor the writers in ffprobe and gives us a means to
  exfiltrate statistics from filters.

4. Why it must be in FFmpeg

I acknowledge that nothing I described is even remotely specific to
FFmpeg. I could be arguing for AVWriter, a light-weight object system,
built-in documentation, etc., in a project that does something entirely
else.

That means in principle all this could go into a separate library, and
some people have argued for it. In principle it could, but in practice
it does not make sense, for multiple reasons. The main reason is this:

As a matter of principle, we do not demand that users install
third-party libraries in order to build FFmpeg. We rely on system
libraries, and we can rely on external libraries for optional features,
even very important ones (x264…), but if something is needed to run
FFmpeg at all then it must be in our own code base so that building from
sources will just work. Strings and error reporting cannot be optional,
they must be in FFmpeg.

I will come back later with the code as it was last time.

Regards,

-- 
  Nicolas George