[MPlayer-dev-eng] [PATCH] Direct3D libvo driver

Tue Nov 18 18:07:38 CET 2008

On 11/18/08, Georgi Petrov <gogothebee at gmail.com> wrote:
> You have some very good points...
>
>> 1. The w32_common documentation part should be separate from the
>> vo_direct3d patch and probably committed right away.

It wasn't when I started writing this mail.
It is usual practice to give warning before committing patch, but as
Reimar is the last active coder in the project it is understandable
why he didn't want to bother.

> It got committed already - everything - the driver and the changes to
> w32_common.c...
>
>> 2. in w32_common I see that there is two changes of function prototype.
>> vo_w32_border() and vo_w32_uninit(). This is not acceptable.
>> In some (standard) C flavors function() is equivalent of function(int).
>>
>
> I'm sure Reimar wanted to correct the prototypes - they are supposed
> to be (void).

I don't see Raimer commit changing the declarations,
so he should have noticed and fixed them.

>> 3. There is something wrong in your draw_slice() implementation.
>> It looks like you are calling IDirect3DDevice9_StretchRect() for the whole
>> image on each function invocation. But draw_slice() is not used only by
>> D3DRenderFrame(), it is also used to update the screen on every 16
>> rendered lines. In short for 480p video you call it 30 times more than
>> necessary.
>> In theory the adapter may delay the stretch until it is exposition time,
>> and probably this is what happens to you. But it may not be true for the
>> above 90% cpu bugreport. Slice drawing could be disable with -noslices
>> options
>
> You are right. However, I never really understood one thing:
>
> 1. We have a draw_slices() libvo callback, which as you said updates
> the screen with the rendered content every 16 lines. I thought it's on
> every line though. I never realized that StretchRect is called so
> often. This is huge overkill. BUT - it's overkill for for every 16
> lines (even if it copies only them) as well!!!
>
> 2. We have a VOCTRL_DRAW_IMAGE in "control", which passes mpi
> structure with whole frame already rendered.
>
> This is stupid. Or it seems so. This way we have two "paths" for
> rendering. I never really understood when is VOCTRL_DRAW_IMAGE used
> for the whole frame and when draw_slices() for a 16 by 16 lines... On
> my test images, draw_slices() never called directly.
>
> Please somebody explain to me WHY are there those two draw paths and
> exactly when is the first and when the second one used? As far as I
> understand, this question is valid only for planar modes. In packed
> mode, only VOCTRL_DRAW_IMAGE is used, right?

Drawing slice by slice is used to increase cache efficiency.
Decoder works on macroblocks (MB) of 16x16 pixels. When it finishes decoding of 
one row of MB it is very likely that the whole row (e.g. 720x16*1.5=18KB) is
still inside the cache. So by working on smaller data that fits into the cache you get a better speed.
If you transfer the whole image (720x560*1.5=5904KB) it is very likely that when you finish writing the last pixels, the first would already be out of the cache.
It usually gives 5-10% speed boost.

So we call draw_slice() to move smaller data, right after it is ready. 
It is so called partial update.

When the frame is finished, we call DRAW_IMAGE to signal the driver 
that the frame is ready and the buffer is filled with valid data.

When the DRAW_IMAGE is called if first check few flags.
1. It checks if the buffer have been filled with valid data by using direct rendering.
(this is giving the buffer memory directly to the decoder as working memory).
MP_IMGFLAG_DIRECT

2. if the above is not true, then it checks if the buffer have been filled with valid data by calling draw_slice() functions.
MP_IMGFLAG_DRAW_CALLBACK

3. If none of the above is true, then fill the buffer on its own.

DRAW_IMAGE is always called. draw_slice is optional.

> StretchRect is expensive. I don't want to call it each 16 lines even
> if I update only 16 lines on the destination (with RECT given as a
> parameter to StretchRect). I would want to delay StretchRect until
> Present and call it in flip_page() then...
>
> So in conclusion - I the two StretchRects - the one from draw_slice()
> and from D3DRender() - to flip_page() just before Present, right?

If I've explained it correctly you should already be guessing that
StretchRects should be done when the frame is complete, 
this means at DRAW_IMAGE.

>> 4. You have to put "static" in front of function names, unless that
>> function is going to be called from outside the .c file. This keyword says
>> to the compiler that this function is only used inside this file and
>> doesn't export its symbols.
>> This is how we have control(), query_format() in each  vo_* drivers and
>> they don't clash at linking. The only exported function in vo_*drivers is
>> the one defined by the LIBVO_EXTERN(), it sets callbacks to the API
>> functions inside the driver.
>
> I'll do it.
>
>> 5. I don't like naming your functions  D3D<something>. The D3D is used by
>> Microsoft (e.g. D3DFORMAT) and we can't say with single look if this is
>> API function or something you wrote. The possibility of name clash is also
>> bigger than zero, even if your functions are static.
>>
>
> Ok, I'll do it.
>
>> You may notice that all function (and variables) in mplayer are in
>> lowercase, with "_" separator between words. It would be welcome if you
>> follow that style too.
>
> Is this REALLY the coding style in the whole MPlayer? If so, I'll
> change mines as well, no problem.

MPlayer doesn't really have coding style. You are free to do whatever horrible
mess you like. A lot of people however complain that this make MPlayer source
unreadable.
As of recently we try to follow at least FFmpeg style guidelines.

As I said, you are free to ignore my requests.

>> Also, I don't like the use of gVariable, I would recommend you to make
>> single struct containing all local to the driver variables and use it like
>> this
>>   priv->SrcWidth; priv->DesktopFmt; //or like this:
>>   priv->src_width; priv->desktop_fmt;
>
> Funny - it was just like that when Diego wanted from me to drop
> gConfig struct, which was a container for all globals. I did it. Now
> you want me to introduce globals  container again. Please - you both
> decide which is preferable, because I will do it only once again.

Hum, I must have misunderstood him. I thought he complained 
against the gName, same as I did.

The old assumption was that locally global variables are enough
for the vo as you can have only one video output at a time.
However if you want to run 2 different vo's (e.g. same video on 2 different screens),
you need to make these locally global variables into struct and pass it via context.

I think that Uoti have done this already in his own fork, and I hope
that it would be done in the main tree too.

>> We use "priv" in video filter system, so it would be consistent and would
>> require less changes when/if we start updating the vo.