[MPlayer-dev-eng] patch for quadbuffered stereo using "-vo gl2:stereo"

Stuart Levy slevy at ncsa.uiuc.edu
Thu Nov 29 17:54:01 CET 2007


On Thu, Nov 29, 2007 at 11:27:55AM +0100, Reimar Döffinger wrote:
> Hello,
> On Wed, Nov 28, 2007 at 07:10:17PM -0600, Stuart Levy wrote:
> > On Wed, Nov 28, 2007 at 11:09:58PM +0100, Reimar Döffinger wrote:
> > > Well,
> > > glPushMatrix();
> > > glScalef(2, 1, 1);
> > > drawTextureDisplay();
> > > glTranslatef(-0.5, 0, 0);
> > > drawTextureDisplay();
> > > glPopMatrix();
> > > (obviously I forgot the parts setting right/left backbuffer)
> > > 
> > > sure is not any more complex than fiddling with drawTextureDisplay!
> > 
> > But that sends all the pixels to the graphics system *twice*, and just depends
> > on opengl clipping to discard the extra texture tiles.  Surely it's
> > worth a few extra lines of code to avoid using twice the opengl bandwidth?
> 
> Huh? How do you think OpenGL works? The textures already have been fully
> transferred (that is what texdirty variable is there for), or do you think
> about the vertexes?
How about the glBindTexture() calls?  It's doing those regardless.
> I highly doubt we need to concern ourselves with saving probably
> at most 500 bytes of data per frame - which can be saved anyway by using a DisplayList.
> And the non-visible vertexes will be dropped far before the rasterization
> stage.
> 
> > > Not to mention that it can be sped up more easily via DisplayLists, it
> > > is less of a hindrance to further extending and maintaining (among other
> > > things, due to containing the code to support stereo in one place), is
> > > trivial to port to e.g. vo_gl etc.
> > 
> > Can vo_gl handle high-resolution textures?  If that code is doing texture tiling,
> > it's not obvious to me.  Can it display a 1920x1080 or larger image using -vo gl
> > if the graphics card only accepts 1K textures, as some do?
> 
> There are also graphics cards that can already handle 4096x4096 and
> more. vo_gl offers far more features, esp. concerning OSD rendering.
> vo_gl2 also does not support vsync handling, so unless enabling it in the
> driver (which usually means enabling it globally), it will flicker.
> 
> > Also, at least for the nvidia cards I've been playing with, it's tens of percent
> > faster to load images into many smaller textures (like 128x128 or 256x256) than
> > into fewer larger ones (like 1K or bigger).  E.g. on one card here,
> > I get 298 Mpixels/sec on 256x270 GL_TEXTURE_RECTANGLE_ARB textures to cover
> > a 1920x1080 window, but only 190 Mpix/sec using a 1Kx1K or 1920x1080 texture.
> 
> That problem is in my tests quite specific to AGP cards, so will soon be
> irrelevant (not to mention that it is driver stupidity).

I'm seeing it on PCIe cards too -- the figure above (298 Mpix/sec, the fastest
I've found on any system) is from one of those.  I can get higher speeds out of
PCIe than AGP, and some other factors differ (e.g. BGR texture format is
faster than RGB on the AGP cards, but they're equal on the PCIe cards I've tried).
But across the board, texture loading is a good deal faster for
medium-sized textures than for large ones, with both nVidia's Linux drivers and
Apple's driver.  And to be fair, the improvement isn't always 30-40%;
it's often more like 15%.  Still, it's significant.
Caveats:
    I haven't tried it for any ATi cards.
    I also haven't tried using glTexSubImage2D to load
    a large texture object in many smaller tiles -- that would be a good test.
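
For that glTexSubImage2D test, the sub-rectangles would have to be worked
out first.  Below is a rough sketch of just that arithmetic, not code from
vo_gl2.c or txspeed.c; TileRect, tiles_along() and tile_layout() are
hypothetical names made up for illustration.  Each rectangle would supply
the xoffset/yoffset/width/height arguments of one glTexSubImage2D() call
(presumably with GL_UNPACK_ROW_LENGTH set so the driver can step through
the full-width source image).

```c
#include <assert.h>

/* One sub-rectangle of the image (hypothetical type, for illustration). */
typedef struct {
    int x, y;   /* offset of the tile inside the image */
    int w, h;   /* tile size; tiles on the right/bottom edge may be smaller */
} TileRect;

/* Number of tiles of size `tile` needed to cover `extent` pixels. */
int tiles_along(int extent, int tile)
{
    assert(extent > 0 && tile > 0);
    return (extent + tile - 1) / tile;   /* round up */
}

/*
 * Fill `out` (capacity `cap`) with the rectangles covering an
 * img_w x img_h image, row by row.  Returns the number of tiles,
 * or -1 if `out` is too small to hold them.
 */
int tile_layout(int img_w, int img_h, int tile_w, int tile_h,
                TileRect *out, int cap)
{
    int nx = tiles_along(img_w, tile_w);
    int ny = tiles_along(img_h, tile_h);
    int n = 0;

    if (nx * ny > cap)
        return -1;

    for (int ty = 0; ty < ny; ty++) {
        for (int tx = 0; tx < nx; tx++) {
            TileRect *r = &out[n++];
            r->x = tx * tile_w;
            r->y = ty * tile_h;
            /* Clip the last column/row to the image edge. */
            r->w = (r->x + tile_w <= img_w) ? tile_w : img_w - r->x;
            r->h = (r->y + tile_h <= img_h) ? tile_h : img_h - r->y;
        }
    }
    return n;
}
```

With 256x270 tiles a 1920x1080 image needs 8 columns by 4 rows, with the
last column only 128 pixels wide; with a 1K texture limit it needs 2x2.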

In case anyone would like to play with this, there's a test program at
   http://virdir.ncsa.uiuc.edu/slevy/visbox/txspeed.c
with some example tests in its first few lines of comments.
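
To put the Mpix/sec figures quoted above in frame-rate terms, here is a
small back-of-the-envelope helper.  It is only a sketch: upload_fps() is
a made-up name, it takes "Mpixels" to mean 10^6 pixels, and it assumes
texture upload is the only cost per frame, which it certainly isn't in a
real player.

```c
#include <assert.h>

/* Frames per second sustainable at `mpix_per_sec` for a w x h frame,
 * assuming (unrealistically) that upload is the only per-frame cost. */
double upload_fps(double mpix_per_sec, int w, int h)
{
    assert(w > 0 && h > 0);
    return mpix_per_sec * 1e6 / ((double)w * (double)h);
}
```

For a 1920x1080 frame, 298 Mpix/sec works out to roughly 144 frames/sec
and 190 Mpix/sec to roughly 92 frames/sec.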

> In addition MPlayer already takes this into account and uploads the
> pictures not all at once, see slice-height suboption.
> Though there might be this kind of effect if the on-card memory is too
> small for the whole texture.
> Also note that this performance will be bought with worse quality: there
> is no interpolation at the texture borders, which will result in
> clearly visible artefacts from time to time, and it also makes it
> basically impossible to use more advanced scaling (like lscale=2) since
> the artefacts are even stronger there.
> 
> > And -- out of curiosity -- would a display list actually help here?
> > I'd expect that typically this routine would just load an image into a
> > batch of texture tiles, display each tile once, then advance to the next image.
> > (I can see it would help for on-screen display pieces, as vo_gl.c does.)
> 
> Sure, but it calculates and uploads the vertex coordinates for the
> texture tiles anew each time although they stay the same.
> 
> [...]
> > > I don't know. I simply loathe having any kind of logic in
> > > message-printing code. I somewhat doubt it is helpful, since it does not
> > > say what exactly was the problem, and I somewhat doubt that
> > > "quadbuffered stereo" will mean anything at all to someone who stumbles
> > > over it by accident.
> > 
> > Hey, at least you can feed the phrase to google or search driver documentation for it!
> 
> Yes, how about a simple "no suitable GLX mode found (note quadbuffered
> stereo is needed for :stereo)" instead?

Great, I'll do that.

> I completely forgot why I hate logic in messages: It makes it very hard
> to translate properly.

That makes sense.  I'll make it two separate messages.

Will send another patch...  This one returns vo_gl2.c's drawTextureDisplay()
to a version needing no arguments -- it encapsulates all the stereo logic
there now.  Will do red/cyan stereo too.
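
For reference, a minimal CPU-side sketch of what red/cyan compositing
produces.  This is not the patch: anaglyph_red_cyan() is a hypothetical
name, the images are assumed to be packed 8-bit RGB, and red is assumed
to go to the left eye.  In OpenGL the same effect would more likely come
from drawing each eye's view under a different glColorMask() setting
rather than from touching pixels on the CPU.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Combine two packed 8-bit RGB images into one red/cyan anaglyph:
 * red comes from the left-eye image, green and blue from the right-eye
 * image.  (Which eye gets red is an assumption here.)
 */
void anaglyph_red_cyan(const unsigned char *left,
                       const unsigned char *right,
                       unsigned char *out, size_t npixels)
{
    assert(left && right && out);
    for (size_t i = 0; i < npixels; i++) {
        out[3 * i + 0] = left[3 * i + 0];    /* R from left eye  */
        out[3 * i + 1] = right[3 * i + 1];   /* G from right eye */
        out[3 * i + 2] = right[3 * i + 2];   /* B from right eye */
    }
}
```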


