[MPlayer-G2-dev] vp layer and config

Mon Dec 15 10:07:28 CET 2003

Despite vp config already being somewhat ugly and complicated, I'd
actually like to propose adding some more config-time negotiations:

* Strides for export & direct buffers.
* Sample aspect ratio.
* Time base.

Time base doesn't seem controversial. Maybe some people want float
time, but IMO using a rational time base makes things easier for
filters, and more precise. It's also very helpful for mencoder-g2,
since you get a good default for the output framerate for
fixed-framerate formats.
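
To make the time base point concrete, here is a minimal sketch. The
rational_t type and both function names are made up for illustration
and are not actual G2 code. It assumes timestamps are integer tick
counts and that a fixed-framerate stream uses one tick per frame, so
the default output framerate is simply the inverse of the time base.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rational type: one tick lasts num/den seconds. */
typedef struct {
    int num;
    int den;
} rational_t;

/* Convert a timestamp in ticks to microseconds, rounding down.
 * Timestamps stay exact integers until this final conversion. */
int64_t ticks_to_usec(int64_t ticks, rational_t tb)
{
    assert(tb.num > 0 && tb.den > 0);
    return ticks * tb.num * 1000000 / tb.den;
}

/* Assuming one tick per frame, the frame rate is the inverse of the
 * time base, e.g. 1001/30000 s per tick gives 30000/1001 fps. */
rational_t default_fps(rational_t tb)
{
    rational_t fps;
    assert(tb.num > 0 && tb.den > 0);
    fps.num = tb.den;
    fps.den = tb.num;
    return fps;
}
```

With a time base of 1001/30000, frame 30000 lands at exactly
1001000000 microseconds, with no accumulated float error.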

Next up is sample aspect ratio (SAR). Using SAR is MUCH BETTER than
using DAR, because it doesn't have to be adjusted by most filters.
That means no messy rational arithmetic and trying to reduce fractions
with huge denominators. Also, this means the beginning of the filter
chain (codec end) doesn't need to know anything about the monitor's
aspect ratio. The SAR from the source file can just pass all the way
down the chain (except through filters like scale, halfpack, il, ...
which will have to modify it) and get used for hardware scaling at the
very end.
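
For the few filters that do have to touch SAR, the adjustment could
look roughly like the sketch below. The sar_t type and scale_sar are
hypothetical names, and SAR is taken to mean sample width divided by
sample height. Everything else in the chain would just copy the value
through untouched.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical SAR type: sample width / sample height = num/den. */
typedef struct {
    int num;
    int den;
} sar_t;

/* Euclid's algorithm, used to reduce the resulting fraction. */
static int64_t gcd64(int64_t a, int64_t b)
{
    while (b) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* SAR after scaling from in_w x in_h to out_w x out_h. Shrinking the
 * width makes each sample wider; shrinking the height makes it taller. */
sar_t scale_sar(sar_t sar, int in_w, int in_h, int out_w, int out_h)
{
    int64_t num, den, g;
    sar_t out;

    assert(sar.num > 0 && sar.den > 0);
    assert(in_w > 0 && in_h > 0 && out_w > 0 && out_h > 0);

    num = (int64_t)sar.num * in_w * out_h;
    den = (int64_t)sar.den * out_w * in_h;
    g = gcd64(num, den);

    out.num = (int)(num / g);
    out.den = (int)(den / g);
    return out;
}
```

For example, a 720x480 source with SAR 8:9 scaled to 640x480 comes out
with SAR 1:1, and the monitor's aspect never enters into it.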

Now finally the controversial one: strides. There are two examples in
MPlayer G1 which provided the motivation for this:

One is a nasty conflict between lavc and mga_vid. Due to hardware
limitations, mga_vid requires stride to be aligned at 32-byte
boundaries. Due to software limitations (maybe a performance issue
too?) lavc requires stride not to change while decoding. Now, suppose
you try to play a 720x480 movie with lavc on mga_vid with direct
rendering of B frames. For the first (I) frame, MPlayer allocates a
720x480 buffer (stride 720) in readable system memory and lavc decodes
into this buffer. Same for the following P frame. Then lavc tries to
decode the B frame, so it requests a DR buffer from vo_mga. But this
time, the stride is rounded up to 736, the next multiple of 32, so lavc
fails to decode the frame!! Not only
does DR fail to work, but all B frames get dropped entirely. This is
not acceptable.
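
The arithmetic behind the mismatch, as a small sketch. align_stride is
a hypothetical helper, and it assumes an 8-bit luma plane, so a
720-pixel line is 720 bytes.

```c
#include <assert.h>

/* Round a stride in bytes up to the next multiple of align. */
int align_stride(int stride, int align)
{
    assert(stride >= 0 && align > 0);
    return (stride + align - 1) / align * align;
}
```

align_stride(720, 32) gives 736, which is not the 720 that lavc already
used for the I and P frames.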

Motivation #2 is a little more subtle, but still relevant. Recently
Michael wrote vf_fil, a filter which interleaves or deinterleaves the
two fields of an image by halving or doubling the stride. This is very
useful, but it can't work properly (unless the user gets lucky!)
because it needs to know the stride of the source images it will be
accepting when it configures the next filter (output width depends on
input stride). One workaround would be to delay configuring the rest
of the filter chain until it gets the first image (much like how
codecs delay config until they begin decoding the first frame), but I
don't think it's a very clean solution.
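
Here is a rough sketch of why the output width depends on the input
stride. This is only my reading of the geometry, not the actual vf_fil
code, and the type and function names are made up. It assumes one byte
per pixel and an even height. Doubling the stride makes every output
line span two input lines, so the second field starts "stride" bytes
into each output line.

```c
#include <assert.h>

/* Hypothetical description of one image plane, 1 byte per pixel. */
typedef struct {
    int width;   /* visible pixels per line */
    int height;  /* number of lines         */
    int stride;  /* bytes between lines     */
} plane_geom_t;

/* Geometry seen downstream when the two fields are deinterleaved by
 * doubling the stride: each output line holds one line of the first
 * field, any stride padding, then one line of the second field. */
plane_geom_t deinterleave_geom(plane_geom_t in)
{
    plane_geom_t out;

    assert(in.width > 0 && in.stride >= in.width);
    assert(in.height > 0 && in.height % 2 == 0);

    out.stride = in.stride * 2;
    out.height = in.height / 2;
    out.width  = in.stride + in.width;
    return out;
}
```

With stride 720 the next filter would be configured for 1440x240, but
with stride 736 it would be 1456x240, and the filter cannot know which
at config time.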

So the question that remains is _how_ to negotiate stride. Here's a
rather elaborate proposal that should do the job...

* Define a structure for representing stride restrictions. This
  structure will contain flags for various restriction types, as well
  as custom alignment values or explicit values the strides must take.

* Provide a function for comparing stride restrictions, so that it
  becomes easy to test whether a source and destination filter are
  compatible, and whether the source is compatible with the
  destination's DR buffers.

* When calling vp_config to configure the next filter, the source MUST
  pass its stride restrictions. It is ONLY allowed to pass NULL in
  place of the structure if it can draw into a buffer of ANY stride.

* When a filter's config function is called, it MUST verify that it
  can accept input from buffers meeting the source filter's stride
  restrictions. It may ONLY ignore the source stride restrictions if
  it can read from a buffer of ANY stride.

* If a filter is going to provide direct-rendering buffers, it MUST
  verify during its config function that it will be able to provide DR
  buffers meeting the source filter's stride restrictions. If not, it
  MUST disable direct rendering. Furthermore, the filter SHOULD store
  the stride it will use for DR buffers in the link structure before
  returning. If the filter does not store such a stride, then it MUST
  use whatever stride the vp layer stores in the link structure after
  config returns.

* After a filter's config function returns, the vp layer will choose
  an appropriate stride and store it in the link structure, if one was
  not already selected by the filter.

* When allocating automatic-type buffers, the vp layer will always use
  the stride stored in the link.

* When using export-type buffers, a source filter MUST ensure that
  they meet the stride restrictions of the destination. However, the
  source filter is not required to use the stride stored in the link
  structure, so long as it meets all the requirements.
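
The stride selection in the steps above could be sketched like this.
It is a deliberately simplified model: stride_req_t and choose_stride
are hypothetical names, only byte alignment and exact value are
modeled, and a return value of 0 stands for "no common stride exists",
in which case the destination would have to disable DR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified restriction set for a single plane. */
typedef struct {
    int align;  /* required byte alignment, 1 = no requirement */
    int exact;  /* required exact stride,   0 = no requirement */
} stride_req_t;

static int gcd_int(int a, int b)
{
    while (b) {
        int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Pick the smallest stride >= min_stride that satisfies both ends of
 * a link. NULL means "any stride is fine". Returns 0 if the two sets
 * of restrictions cannot be met at the same time. */
int choose_stride(const stride_req_t *src, const stride_req_t *dst,
                  int min_stride)
{
    static const stride_req_t any = { 1, 0 };
    int align, exact;

    if (!src) src = &any;
    if (!dst) dst = &any;
    assert(src->align > 0 && dst->align > 0 && min_stride > 0);

    /* Two different exact values can never both be satisfied. */
    if (src->exact && dst->exact && src->exact != dst->exact)
        return 0;

    /* Both alignments hold iff the stride is a multiple of their LCM. */
    align = src->align / gcd_int(src->align, dst->align) * dst->align;
    exact = src->exact ? src->exact : dst->exact;

    if (exact)
        return (exact >= min_stride && exact % align == 0) ? exact : 0;

    return (min_stride + align - 1) / align * align;
}
```

Modeling the lavc case as an exact stride of 720 against a 32-byte
aligned destination returns 0, so the problem would be caught at config
time instead of showing up as dropped B frames.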

Now a list of possible stride restrictions:

* Byte aligned: Stride must be aligned on the given byte boundary.
* Pixel aligned: Stride must be aligned on the given pixel boundary.
* Exact value: Stride must be the exact value specified.
* Positive: All strides must be positive numbers.
* Common stride: Strides for the U and V planes are both equal to the
  stride of the Y plane, shifted right by the horizontal chroma shift.
* Common plane: Planes follow one another immediately in memory.
* Static: Once a particular stride has been used, buffers with any
  other stride are not permitted.
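
One possible encoding of this list, as a sketch only. All flag, type
and function names here are hypothetical. The check covers a 3-plane
YUV image; "common plane" and "static" need buffer addresses and
history respectively, so they are only declared here, not checked.
Pixel alignment is interpreted per plane, in that plane's own samples,
which is an assumption on my part.

```c
#include <assert.h>

/* Hypothetical flags, one per restriction type. */
enum {
    STRIDE_BYTE_ALIGN   = 1 << 0,
    STRIDE_PIXEL_ALIGN  = 1 << 1,
    STRIDE_EXACT        = 1 << 2,
    STRIDE_POSITIVE     = 1 << 3,
    STRIDE_COMMON       = 1 << 4,
    STRIDE_COMMON_PLANE = 1 << 5,  /* not checked below */
    STRIDE_STATIC       = 1 << 6   /* not checked below */
};

typedef struct {
    unsigned flags;
    int byte_align;   /* used with STRIDE_BYTE_ALIGN  */
    int pixel_align;  /* used with STRIDE_PIXEL_ALIGN */
    int exact[3];     /* used with STRIDE_EXACT       */
} stride_restrict_t;

/* Return 1 if the Y/U/V strides (in bytes) meet the restrictions in r,
 * 0 otherwise. A NULL r means any stride is acceptable. */
int strides_ok(const stride_restrict_t *r, const int stride[3],
               int bytes_per_pixel, int chroma_x_shift)
{
    int i;

    if (!r)
        return 1;

    for (i = 0; i < 3; i++) {
        int s = stride[i];
        int mag = s < 0 ? -s : s;  /* alignment ignores the sign */

        if ((r->flags & STRIDE_POSITIVE) && s <= 0)
            return 0;
        if (r->flags & STRIDE_BYTE_ALIGN) {
            assert(r->byte_align > 0);
            if (mag % r->byte_align)
                return 0;
        }
        if (r->flags & STRIDE_PIXEL_ALIGN) {
            assert(r->pixel_align > 0 && bytes_per_pixel > 0);
            if (mag % (r->pixel_align * bytes_per_pixel))
                return 0;
        }
        if ((r->flags & STRIDE_EXACT) && s != r->exact[i])
            return 0;
    }

    if (r->flags & STRIDE_COMMON) {
        int chroma = stride[0] / (1 << chroma_x_shift);
        if (stride[1] != chroma || stride[2] != chroma)
            return 0;
    }
    return 1;
}
```

Note that with 4:2:0 data a 32-byte alignment on every plane rules out
a luma stride of 736, since the chroma stride would be 368; 768/384/384
passes.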

Anything I'm missing?

Rich