[MPlayer-G2-dev] Re: requirements for vo2 layer

Fri Jan 9 21:21:34 CET 2004

On Fri, Jan 09, 2004 at 09:57:12AM +0100, Alban Bedel wrote:
> > Does this case really exist? I've never heard of it.
> 
> 3dfx hardware has such colorspace converters on board: one YUV planar ->
> packed and one YUV packed -> RGB. They are heavily used by tdfxfb and
> tdfx_vid.
> 
> > Usually the conversion is done in the driver, not the hardware, and this
> > is slow so it shouldn't be supported.
> 
> ROTF ??? Are you proposing that I rewrite vo_tdfx_vid so it uses a sw
> converter in the driver instead of the hw converters? That would probably
> make things a lot faster !!!

No, the proposal was that you don't do any conversion and only support
the native format, but I agree that's slow in this nasty case. :)

> BTW, this rating I'm proposing should not be taken so seriously. I'm just
> thinking that it would be nice if the user could "easily" tell which is
> the fastest colorspace a given vo supports.

OK, how disgusting. Maybe we could indicate some sort of preference,
but IMO it will also work without it. The only time you'd be trying to
choose between yuy2 and yv12 is when vf_scale is picking its output
format, and it already has other considerations to use -- mainly
picking the format that minimizes both loss in quality and increase in
size. So for example, here are some choices vf_scale would make
(assuming both yv12 and yuy2 output are supported):

When converting from rgb:                               yuy2
When downscaling from rgb to 1/2 height or less:        yv12
When upscaling from yv12:                               yv12
When downscaling from yv12:                             yuy2
When upscaling from yuy2:                               yuy2
When downscaling from yuy2 to 1/2 height or less:       yv12
etc.

All of these minimize the amount of computation done in software, so
they should give the best performance even if the conversion in
hardware is sub-optimal.
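
As a rough sketch, the table above could be coded like this. The enum
and function names are made up for illustration (real code would use
the IMGFMT_* constants), this is not existing vf_scale code, and
treating an unchanged height like upscaling is an assumption the table
does not cover:

```c
#include <assert.h>

/* Hypothetical format tags, for illustration only. */
enum fmt { FMT_RGB, FMT_YV12, FMT_YUY2 };

/* Pick the scaler's output format following the table above,
 * assuming the vo accepts both yv12 and yuy2. */
enum fmt pick_output_format(enum fmt in, int in_h, int out_h)
{
    assert(in_h > 0 && out_h > 0);
    /* True when the output is 1/2 the input height or less. */
    int half_or_less = out_h * 2 <= in_h;

    switch (in) {
    case FMT_YV12:
        /* Upscaling keeps yv12; any downscaling goes to yuy2.
         * Equal height is treated like upscaling (assumption). */
        return out_h >= in_h ? FMT_YV12 : FMT_YUY2;
    case FMT_RGB:
    case FMT_YUY2:
    default:
        /* yuy2 unless the height drops to 1/2 or less. */
        return half_or_less ? FMT_YV12 : FMT_YUY2;
    }
}
```

For example, pick_output_format(FMT_RGB, 480, 240) gives FMT_YV12, while
pick_output_format(FMT_YV12, 480, 360) gives FMT_YUY2.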

It's rare that a source filter/codec can generate yuy2 and yv12 with
equal efficiency, so I doubt the performance of the hardware converter
would ever be a factor.

> > > > * Report or negotiate pixel aspect ratio of the output picture. For
> > > >   drivers without overlay which can change video mode, we might
> > > >   actually want some way of negotiating a video mode to meet certain
> > > >   size/aspect constraints.
> > > What we rather want is: the physical aspect ratio of the screen and
> > > all possible resolutions on this screen. This also implies a way to
> > > set these video modes. IMHO this should be, if possible, separated
> > > from the vo drivers. All X drivers can use the same code, most
> > > console-based ones can use the fb modedb thing, etc.
> > 
> > It's still part of the vo driver. If the vo driver wants to call
> > generic code from x11_helper, that's fine.
> 
> I was thinking more along the lines of a "video mode" driver to allow
> a single API for video mode query/setting while avoiding duplicate code.

This really belongs in the vo. The only common case is X11. Everything
else needs its own code (or doesn't care about video mode), and we
already have x11_helper for x11.

> > > I think that it must be up to the libvo2 user to query the available
> > > video modes and set a new mode before calling query_format, config,
> > > etc. A helper func which auto-selects the best video mode would be
> > > nice, but libvo2 users must have 100% control over video mode
> > > selection if they need to.
> > 
> > This is nonsense. You have no idea what video mode you want until the
> > filter chain starts configuring.
> 
> Right. But what I was thinking about is that most limits that a vo may
> have (stride, visible area, etc) will in most cases depend on the
> selected video mode and colorspace. Or taken the other way around:
> it will be hard for a vo to give its stride restrictions without
> knowing in which resolution/colorspace it will run.
> 
> And that's only resolution. Now if you change display depth you may
> also change the available colorspaces too :(
> 
> I think that we must assume the following to be safe with most hw:
>  - each video mode supports a number of colorspaces.

Each video mode supports a single colorspace. Colorspace is part of
the video mode.

>  - in each video mode, each colorspace has its own restrictions.

Is this really true or not? Stride restrictions are not necessarily
fixed numbers. They're things like "must be positive", "must be
aligned at N pixels", "must be aligned at N bytes", etc... Do these
really differ from mode to mode?
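
To make that concrete, here is a minimal sketch of restrictions
expressed as rules rather than fixed numbers. The struct and function
are hypothetical, not an existing MPlayer API. The rule itself stays
the same across modes; only the bytes-per-pixel value passed in would
change:

```c
#include <assert.h>

/* Hypothetical description of a stride restriction. */
struct stride_req {
    int must_be_positive;  /* nonzero: negative strides are rejected */
    int align_bytes;       /* stride must be a multiple of this; 1 = any */
    int align_pixels;      /* same, counted in pixels; 1 = any */
};

/* Return 1 if a stride (in bytes) satisfies the restriction for a
 * packed format with the given bytes per pixel, 0 otherwise. */
int stride_ok(const struct stride_req *r, int stride, int bytes_per_pixel)
{
    assert(r->align_bytes > 0 && r->align_pixels > 0 && bytes_per_pixel > 0);

    if (r->must_be_positive && stride <= 0)
        return 0;

    /* Alignment is checked on the magnitude of the stride. */
    int abs_stride = stride < 0 ? -stride : stride;
    if (abs_stride % r->align_bytes)
        return 0;
    if (abs_stride % (r->align_pixels * bytes_per_pixel))
        return 0;
    return 1;
}
```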

> This means that you can know the restrictions only once you have chosen
> a video mode. What I proposed is the simplest: first set the mode,
> then query the restrictions. It makes driver implementation easy.

This is possible as long as the vo exports its buffers. But it's not
possible for vo's that require draw_slice. They _must_ report their
restrictions in advance so vf_expand can be inserted if necessary,
_before_ the vo is configured.

Also this setup is incompatible with the STATIC stride restriction
(meaning lavc can't direct render into such vo's).

> I agree it would be better if one could query restrictions without
> really setting the video mode. But it makes implementation in drivers
> a bit harder and for some hw you probably have no other way than
> setting the mode at least once to get the limits :(

Some output API's suck. A few of those (SDL comes to mind) won't be
supported because they suck. Others will just have limited
capabilities (no DR). Crippling MPlayer for the sake of shitty video
api's is not acceptable.

> > > To summarize, I think that query_format should return:
> > >  colorspace rating (0 = not supported 1-255 speed rating)
> > >  flags (support oversized buffer, need aligned stride, etc)
> > >  max visible width,height
> > >  min,max strides
> > 
> > I don't like querying everything at once. Ultimately the only things
> > you absolutely _need_ to know when configuring are (a) supported
> > colorspace and (b) whether there's an absolute stride restriction
> > (only happens when buffers are indirect). All the other queries are
> > only needed for direct rendering.
> 
> Personally I don't like implementing too many queries.
> Again, as restrictions depend on colorspace it made sense to me to put
> that together.
> 
> > I still haven't decided exactly how query and config should work,
> > though... :(
> 
> That's why I was making some proposals ;)

OK, let's work out what all is needed:

Colorspace
Slice stride restrictions
Buffer stride restrictions
Borders/oversized buffers
Max w/h[/stride?]
Aspect

And now where they're needed:

At negotiation time, before actual config:
- Colorspace
- Slice stride restrictions
- Aspect
- Width/height
If source and dest can't agree on these, a converter must be loaded!
It must be possible to tell _which_ one is the problem so that the
right conversions can be done!!
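
One possible shape for "telling which one is the problem" is a bitmask
with one flag per negotiated property. This is only a sketch: the
struct, the flag names and the simple stride-alignment model are all
hypothetical. The caller would then pick a converter from the flags
that are set, e.g. vf_expand for a stride mismatch:

```c
#include <assert.h>

/* Hypothetical mismatch flags, one per property negotiated before config. */
enum {
    MISMATCH_COLORSPACE = 1 << 0,
    MISMATCH_STRIDE     = 1 << 1,
    MISMATCH_ASPECT     = 1 << 2,
    MISMATCH_SIZE       = 1 << 3
};

/* Hypothetical summary of what one end of a link offers or accepts. */
struct link_caps {
    unsigned colorspace;        /* opaque format id */
    int stride_align;           /* slice stride alignment, in bytes */
    int aspect_num, aspect_den; /* pixel aspect ratio as a fraction */
    int w, h;                   /* image width and height */
};

/* Return a mask of the properties on which src and dst disagree;
 * 0 means the two ends can be linked without a converter. */
unsigned link_mismatch(const struct link_caps *src, const struct link_caps *dst)
{
    unsigned m = 0;
    assert(src->stride_align > 0 && dst->stride_align > 0);

    if (src->colorspace != dst->colorspace)
        m |= MISMATCH_COLORSPACE;
    /* The source alignment must be a multiple of what dst requires. */
    if (src->stride_align % dst->stride_align)
        m |= MISMATCH_STRIDE;
    /* Compare the two fractions by cross-multiplication. */
    if ((long long)src->aspect_num * dst->aspect_den !=
        (long long)dst->aspect_num * src->aspect_den)
        m |= MISMATCH_ASPECT;
    if (src->w != dst->w || src->h != dst->h)
        m |= MISMATCH_SIZE;
    return m;
}
```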

At config time:
- Buffer stride restrictions
- Borders/oversized buffers
These will affect the border and stride options that get stored in the
link layer. But if the vo can't provide what's requested, that's ok --
direct rendering will just be disabled and the vp layer will meet the
requests using its allocated buffers.
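
The config-time fallback could look roughly like this. All names here
are hypothetical and the capability model is deliberately simplified (a
single border value for all four sides). The point is only that a
failed request selects vp-allocated buffers instead of failing config:

```c
#include <assert.h>

/* Hypothetical view of what the vo can offer for its exported buffers. */
struct vo_buf_caps {
    int exports_buffers;  /* 0 for vo's that only accept draw_slice */
    int stride_align;     /* alignment of the vo's buffer strides, bytes */
    int max_border;       /* largest border available on each side, pixels */
};

/* Hypothetical request as stored in the link layer. */
struct buf_request {
    int stride_align;     /* alignment the source needs, bytes */
    int border;           /* border needed on each side, pixels */
};

enum buf_source { BUF_FROM_VO, BUF_FROM_VP };

/* Decide where buffers come from. Direct rendering into the vo is used
 * only if every request can be met; otherwise vp allocates its own. */
enum buf_source choose_buffers(const struct vo_buf_caps *vo,
                               const struct buf_request *req)
{
    assert(vo->stride_align > 0 && req->stride_align > 0);

    if (!vo->exports_buffers)
        return BUF_FROM_VP;
    if (vo->stride_align % req->stride_align)
        return BUF_FROM_VP;
    if (req->border > vo->max_border)
        return BUF_FROM_VP;
    return BUF_FROM_VO;
}
```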

> > > And config should get:
> > >  colorspace
> > >  src width, height
> > >  visible width, height, x,y (only if oversized buf supported)
> > 
> > This is ugly, or at least very different from how vp will work. It
> > will have width, height, and 4 border sizes (top, bottom, left,
> > right). But I don't really care if vo is done differently since we get
> > to wrap it anyway.
> 
> How fields are named is something I don't care about, as long as the
> information is there in some form or another ;)
> IMHO it should be the same as in vp (or vp the same as vo ;)

Well from vp's perspective, the "visible" w/h is the only important
property, and the others are just scratch space to the sides of the
image to prevent sig11. :) Maybe the vo wants to think of it from a
different perspective, but I tend to agree that consistency between
the two would be better.
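
A minimal sketch of the "width, height, and 4 border sizes" layout, to
show how the allocated size and the position of the visible area follow
from it. The struct and helper names are hypothetical, and the offset
helper assumes a packed format:

```c
#include <assert.h>

/* Hypothetical image geometry: visible size plus four scratch borders. */
struct img_geom {
    int w, h;                      /* visible width and height, pixels */
    int top, bottom, left, right;  /* border sizes, pixels */
};

/* Full width of the allocated buffer, borders included. */
int alloc_width(const struct img_geom *g)
{
    return g->left + g->w + g->right;
}

/* Full height of the allocated buffer, borders included. */
int alloc_height(const struct img_geom *g)
{
    return g->top + g->h + g->bottom;
}

/* Byte offset of the first visible pixel inside the allocated buffer,
 * for a packed format with the given stride (bytes per line). */
long visible_offset(const struct img_geom *g, int stride, int bytes_per_pixel)
{
    /* Each line must be able to hold the visible area plus borders. */
    assert(stride >= alloc_width(g) * bytes_per_pixel);
    return (long)g->top * stride + (long)g->left * bytes_per_pixel;
}
```

A vo that prefers "src size plus visible x,y,w,h" can derive its values
from the same six numbers, so the two views carry the same information.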

> > >  desired num_buff
> > 
> > This is never known. :(
> > Very unfortunate, but I don't see any way we can possibly know it.
> > Maybe force each vp node (decoders and filters) to report how many
> > buffers they'll need at config time? This would be very helpful, but
> > might be impossible for some! Or at least very very difficult to guess
> > correctly...
> 
> I really understand that in vp it's nearly impossible to know. Anyway
> I thought of that as a hint only. It should just give the vo a "good
> number" to start with. In mplayer the default value would be 2 or 3,
> something a la

2 is never sufficient and 3 is only sufficient without B frames.
Default would have to be "unknown" since filters can do whatever they
want.

On the other hand, lots of problems arise out of not knowing how many
buffers we need. I'm almost inclined to completely overhaul the buffer
management system (again :/) and include some provision whereby you
set up the number of buffers you'll need to use, but I can't think of
any way to do it correctly anyway...
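
Just to illustrate the "each node reports its needs at config time"
idea, here is a sketch in which a single node that cannot predict its
needs makes the whole total unknown. The constant and function are
hypothetical, and it does not solve the real problem of a node guessing
its own count correctly:

```c
#include <assert.h>

/* Hypothetical marker for a node that cannot predict its buffer needs. */
#define BUFS_UNKNOWN (-1)

/* Sum the buffer counts reported by each node in the chain.
 * If any node reports BUFS_UNKNOWN, the total is unknown too. */
int total_buffers(const int *per_node, int n_nodes)
{
    int total = 0;
    for (int i = 0; i < n_nodes; i++) {
        if (per_node[i] == BUFS_UNKNOWN)
            return BUFS_UNKNOWN;
        assert(per_node[i] >= 0);
        total += per_node[i];
    }
    return total;
}
```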

> > >  flags (fullscreen, etc)
> > 
> > This is irrelevant to the filter chain. It should be configured some
> > other way.
> We are talking about libvo2, aren't we ? At least I am ;)

Yes. G1 had a stupid method of setting fullscreen flag at the
beginning of the chain and passing it all the way thru, even tho it
was irrelevant to everything but the UI. In G2 there's no reason for
it to know about fullscreen. That's entirely a UI matter, just like
positioning the window on the screen, resizing, etc.

Rich