[FFmpeg-user] How are ffmpeg internal frames structured?

Carl Eugen Hoyos ceffmpeg at gmail.com
Sun Feb 7 03:51:49 EET 2021


On Sun, 7 Feb 2021 at 01:34, Mark Filipak (ffmpeg)
<markfilipak at bog.us> wrote:
>
> [decoder] --> [filters] --> [codecs] --> [encoder]

This looks wrong / misleading.
There is a transcoding pipeline:
demuxer -> decoder -> filter -> encoder -> muxer
And there are formats, codecs, filters, devices (and more) as
part of the FFmpeg project.

> My question is:
> How are decoder.output and encoder.input structured?
>
> Yes, I know that filters can reformat the video, deinterlace, stack fields, etc.

> I have been assuming that the structure is frames of pixels like this:
> pixel[0,0] pixel[1,0] ... pixel[in_w-1,0] pixel[0,1] pixel[1,1] ... pixel[in_w-1,in_h-1]

(The usual convention is that rows come first, then columns.)

> In other words, raw video frames with lines abutted end-to-end

In abstract terms, this is of course correct, but note that both planar
and packed pix_fmts exist (see libavutil/pixfmt.h), and that lines and
frames are often padded for performance reasons.

> (i.e. not deinterlaced).

I may misunderstand, but I believe this part makes no sense
in the above sentence.

Carl Eugen
