From dalias at aerifal.cx Fri Sep 12 01:53:01 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Thu, 11 Sep 2003 19:53:01 -0400 Subject: [MPlayer-G2-dev] A new vf layer proposal... Message-ID: <20030911235301.GA24980@brightrain.aerifal.cx> Early in G2 development we discussed changes to the vf layer, and some good improvements were made over G1, but IMO lots of it's still ugly and unsatisfactory. Here are the main 3 problems: 1) No easy way to expand the filter layer to allow branching rather than just linear filter chain. Think of rendering PIP (picture in picture, very cool for PVR type use!) or fading between separate clips in a video editing program based on G2. 2) The old get_image approach to DR does not make it possible for the destination filter/vo to know when the source filter/vd is done using a particular reference image. This means DR will not be possible with emerging advanced codecs which are capable of using multiple reference frames instead of the simple I/P/B model. 3) The whole vf_process_image loop is ugly and makes artificial distinction between "pending" images (pulled) and the old G1 push-model drawing. Actually (3) has a lot to do with (1). So the proposal for the new vf layer is that everything be "pull" model, i.e. the player "pulls" an image from the last filter, which in turn (recursively) pulls an image from the previous filter (or perhaps from multiple previous filters!). Such a design was discussed early on in G2 development, but we ran into problems with auto-insertion of conversion filters. However I think the following proposal solves the problem. vf_get_buffer (2 cases): 1) Next filter can accept our output format. If the next filter implements get_buffer, call that. Otherwise get a buffer from a pool for this filter-connection, growing the pool if all the buffers are already in use. 2) Next filter doesn't like our output format. Insert appropriate conversion filter and then do (1). In all cases, buffers obtained from get_buffer have reference counts. When a buffer is first obtained, it has reference count=1, meaning that the destination filter has a hold on it because it wants the output which the source filter is drawing into the buffer. If the source filter does not need to use the image as a reference for future frames, it can just return the image to the caller and the destination filter will unlock the buffer (thus freeing it for reuse) when it's finished using the image as input. On the other hand, if the source filter needs to keep the image as a reference for future frames, it can add its own lock (vf_lock_buffer) so that the image still has a nonzero reference count once the destination filter finishes using it. In addition to the above behavior, flags can be used to signal who (source or dest) is allowed to read/modify the image, and when. Thus, we have the equivalencies (old system to new system): TEMP: source does not lock the buffer TEMP+READABLE: source does not lock the buffer, but is allowed to read it (i.e. it can't be in video mem) IP: source locks buffer and is allowed to read it STATIC: source locks buffer and is allowed to write to it again after passing it on to the destination STATIC+READABLE: source locks buffer and is allowed to read it and write again after passing it on These explanations are fairly rough; they're just meant to give an idea of how things convert over. 
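To make the reference-count lifecycle above more concrete, here is a minimal C sketch. Only vf_get_buffer and vf_lock_buffer are names from this proposal; the structure layout, the VF_READABLE flag, vf_release_buffer and the helper functions are made up purely for illustration:

    struct vf_buffer {
        unsigned char *planes[3];
        int stride[3];
        int refcount;              /* == 1 right after vf_get_buffer: the destination holds it */
        int flags;                 /* e.g. a hypothetical VF_READABLE */
        struct vf_instance *dest;  /* the filter this buffer was obtained for */
    };

    void vf_lock_buffer(struct vf_buffer *buf)
    {
        buf->refcount++;           /* source keeps the frame, e.g. as a prediction reference */
    }

    void vf_release_buffer(struct vf_buffer *buf)
    {
        if (--buf->refcount == 0)
            return_buffer_to_pool(buf);   /* hypothetical pool helper; buffer is free for reuse */
    }

    /* "IP"-equivalent decoder usage: lock the new frame as the prediction
     * reference, drop the lock on the previous one. */
    struct vf_buffer *decode_one_frame(struct vd_instance *vd)
    {
        struct vf_buffer *buf = vf_get_buffer(vd->next, VF_READABLE);  /* refcount == 1 */
        decode_picture_into(vd, buf);       /* hypothetical decoding helper */
        vf_lock_buffer(buf);                /* refcount == 2: we still need it for prediction */
        if (vd->ref)
            vf_release_buffer(vd->ref);     /* previous reference frame no longer needed */
        vd->ref = buf;
        return buf;                         /* destination drops its hold when it's done reading */
    }

The point is that neither side ever has to guess when the other is finished with an image; the reference count reaching zero is the only signal needed.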
There's probably a need for a function similar to get_buffer, but which instead notifies the next filter that you want to reuse a buffer you already have (from previous locking) as the output for another frame. But as far as I can tell, all of this is minor detail that doesn't affect the proposal as a whole. Now we get to the more interesting function..... vf_pull_image: This one also has a couple cases: 1) The filter calling vf_pull_image was just created (by auto-insertion) during the previous filter's pull_image, so we do NOT want to call the previous filter's pull_image again. Instead we've saved the mpi returned by the previous filter somewhere in our vf structure, so we just clear that from the vf structure and immediately return it. This is a minor hack to make auto-inserted filters work, but it's not visible from outside of the vf_pull_image function itself, so IMO it's not ugly. 2) We call the previous filter's pull_image and get an image with destination == the calling filter. Return the image to the caller. 3) The previous filter's pull_image returns an image whose destination is *not* the calling filter. This means a conversion filter must have been inserted during the previous filter's pull_image (as a result of it calling get_buffer). Summary: (may have 10l bugs :) if (vf->pending_mpi) { mpi = vf->pending_mpi; vf->pending_mpi = NULL; return mpi; } while ((mpi = src_vf->pull_image(vf, src_vf)) && mpi->dest_vf != vf) { mpi->dest_vf->pending_mpi = mpi; src_vf = mpi->dest_vf; } return mpi; A couple comments about this. The nicest part of the design is that vf_pull_image doesn't need to know so much about the 'chain' structure of the filters. It should be called with something like: mpi = vf_pull_image(vf, vf->prev); so that a filter which wants multiple sources could do something like: mpi1 = vf_pull_image(vf, vf->priv->src1); mpi1 = vf_pull_image(vf, vf->priv->src2); or whatever. Actually the source should probably be passed to vf_pull_image as a pointer so that it can be updated when a conversion filter is auto-inserted. Also note that my proposal has mpi structure containing pointers to the dest (and possibly also source) filters with which the buffer is associated. I'm not sure this is entirely necessary, but it seems like a good idea. Of course the best part of all is that, from the calling program's perspective and the filter authors' perspective, vf_pull_image looks like a 100% transparent pull-based recursive frame processing system. No ugly process_image/get_pending_image distinction and push/pull mix, just a sequential flow of frames. Comments? I believe there are a few details to be worked out, especially in what happens when a filter gets auto-inserted by get_buffer, how buffer pools work, etc., but the basic design is sound. Concerns about get_buffer (e.g. whether you release a buffer before or after you return it, and if after, how) have been eliminated by use of reference counts and there seem to be no major obstacles to implementing the vf_pull_image system as described. At some point on the not-too-distant future I'd like to begin porting filters (especially pullup) to G2 and writing mencoder-g2, so I hope we can discuss the matter of overhauling the vf layer soon and then get around to some actual coding. Rich P.S. One more thing: I made no mention of how configuration (especially output size and all the resize nonsense Arpi was talking about :) works. 
I'll be happy to discuss that later, but I'd like to see what Arpi suggests first since that's all very confusing to me, and I don't think the design I've described above makes much difference to it... From andrej at lucky.net Fri Sep 12 02:06:05 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Fri, 12 Sep 2003 03:06:05 +0300 Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... In-Reply-To: <20030911235301.GA24980@brightrain.aerifal.cx> References: <20030911235301.GA24980@brightrain.aerifal.cx> Message-ID: <20030912000605.GO76003@lucky.net> Hi, D Richard Felker III! Sometime (on Friday, September 12 at 2:42) I've received something... >Early in G2 development we discussed changes to the vf layer, and some >good improvements were made over G1, but IMO lots of it's still ugly >and unsatisfactory. Here are the main 3 problems: >1) No easy way to expand the filter layer to allow branching rather > than just linear filter chain. Think of rendering PIP (picture in > picture, very cool for PVR type use!) or fading between separate > clips in a video editing program based on G2. >2) The old get_image approach to DR does not make it possible for the > destination filter/vo to know when the source filter/vd is done > using a particular reference image. This means DR will not be > possible with emerging advanced codecs which are capable of using > multiple reference frames instead of the simple I/P/B model. >3) The whole vf_process_image loop is ugly and makes artificial > distinction between "pending" images (pulled) and the old G1 > push-model drawing. >Actually (3) has a lot to do with (1). >So the proposal for the new vf layer is that everything be "pull" >model, i.e. the player "pulls" an image from the last filter, which in >turn (recursively) pulls an image from the previous filter (or perhaps >from multiple previous filters!). Agree 100%, it's that I hoped already so we could build custom chain with branches - filter with more than one input(s) may pull all inputs at the same. :) When we pull images from two or more streams then we could have a sync problem but that problem could be solved if we run pull for "expected" time. So decoder (or other stream source) puts pts into image structure and then any filter could decide if that pts is over expected then just return null frame and keep that pulled until it'll fit expected time. May be I said it not very clean - sorry for my bad English then. :) [...rest is skipped, sorry...] Thank you all. Andriy. From dalias at aerifal.cx Fri Sep 12 04:44:39 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Thu, 11 Sep 2003 22:44:39 -0400 Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... In-Reply-To: <20030912000605.GO76003@lucky.net> References: <20030911235301.GA24980@brightrain.aerifal.cx> <20030912000605.GO76003@lucky.net> Message-ID: <20030912024439.GL250@brightrain.aerifal.cx> On Fri, Sep 12, 2003 at 03:06:05AM +0300, Andriy N. Gritsenko wrote: > Hi, D Richard Felker III! > > Sometime (on Friday, September 12 at 2:42) I've received something... > >Early in G2 development we discussed changes to the vf layer, and some > >good improvements were made over G1, but IMO lots of it's still ugly > >and unsatisfactory. Here are the main 3 problems: > > >1) No easy way to expand the filter layer to allow branching rather > > than just linear filter chain. Think of rendering PIP (picture in > > picture, very cool for PVR type use!) or fading between separate > > clips in a video editing program based on G2. 
> > >2) The old get_image approach to DR does not make it possible for the > > destination filter/vo to know when the source filter/vd is done > > using a particular reference image. This means DR will not be > > possible with emerging advanced codecs which are capable of using > > multiple reference frames instead of the simple I/P/B model. > > >3) The whole vf_process_image loop is ugly and makes artificial > > distinction between "pending" images (pulled) and the old G1 > > push-model drawing. > > >Actually (3) has a lot to do with (1). > > >So the proposal for the new vf layer is that everything be "pull" > >model, i.e. the player "pulls" an image from the last filter, which in > >turn (recursively) pulls an image from the previous filter (or perhaps > >from multiple previous filters!). > > Agree 100%, it's that I hoped already so we could build custom chain with > branches - filter with more than one input(s) may pull all inputs at the > same. :) > When we pull images from two or more streams then we could have a sync > problem but that problem could be solved if we run pull for "expected" > time. So decoder (or other stream source) puts pts into image structure > and then any filter could decide if that pts is over expected then just > return null frame and keep that pulled until it'll fit expected time. > May be I said it not very clean - sorry for my bad English then. :) Image structure already has pts, so that's no problem. :) Normally for combining filters you'd be using several fixed-fps streams (with same fps) as input so it wouldn't matter too much anyway -- variable fps is mostly for ugly low quality stuff like asf and rm or for handling made-for-tv stuff from mixed sources (24/30/60 fps). BTW there's also the question of how to do filters that have multiple outputs, and it's a little more complicated, but I think they can be done as two filters sorta linked together. In any case, there doesn't seem to be anything in my design that precludes filters with multiple outputs, so I'm happy. Thanks for the comments! Rich From andrej at lucky.net Fri Sep 12 06:17:28 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Fri, 12 Sep 2003 07:17:28 +0300 Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... In-Reply-To: <20030912024439.GL250@brightrain.aerifal.cx> References: <20030911235301.GA24980@brightrain.aerifal.cx> <20030912000605.GO76003@lucky.net> <20030912024439.GL250@brightrain.aerifal.cx> Message-ID: <20030912041728.GP76003@lucky.net> Hi, D Richard Felker III! Sometime (on Friday, September 12 at 5:34) I've received something... >On Fri, Sep 12, 2003 at 03:06:05AM +0300, Andriy N. Gritsenko wrote: >> Hi, D Richard Felker III! >> Sometime (on Friday, September 12 at 2:42) I've received something... >> >Early in G2 development we discussed changes to the vf layer, and some >> >good improvements were made over G1, but IMO lots of it's still ugly >> >and unsatisfactory. Here are the main 3 problems: >> >1) No easy way to expand the filter layer to allow branching rather >> > than just linear filter chain. Think of rendering PIP (picture in >> > picture, very cool for PVR type use!) or fading between separate >> > clips in a video editing program based on G2. >> >2) The old get_image approach to DR does not make it possible for the >> > destination filter/vo to know when the source filter/vd is done >> > using a particular reference image. 
This means DR will not be >> > possible with emerging advanced codecs which are capable of using >> > multiple reference frames instead of the simple I/P/B model. >> >3) The whole vf_process_image loop is ugly and makes artificial >> > distinction between "pending" images (pulled) and the old G1 >> > push-model drawing. >> >Actually (3) has a lot to do with (1). >> >So the proposal for the new vf layer is that everything be "pull" >> >model, i.e. the player "pulls" an image from the last filter, which in >> >turn (recursively) pulls an image from the previous filter (or perhaps >> >from multiple previous filters!). >> Agree 100%, it's that I hoped already so we could build custom chain with >> branches - filter with more than one input(s) may pull all inputs at the >> same. :) >> When we pull images from two or more streams then we could have a sync >> problem but that problem could be solved if we run pull for "expected" >> time. So decoder (or other stream source) puts pts into image structure >> and then any filter could decide if that pts is over expected then just >> return null frame and keep that pulled until it'll fit expected time. >> May be I said it not very clean - sorry for my bad English then. :) >Image structure already has pts, so that's no problem. :) Normally for >combining filters you'd be using several fixed-fps streams (with same >fps) as input so it wouldn't matter too much anyway -- variable fps is >mostly for ugly low quality stuff like asf and rm or for handling >made-for-tv stuff from mixed sources (24/30/60 fps). Not only. With video editing program you mentioned above you may want to use some filter to scale speed of fragment (my former fiancee likes to make music videos so I saw that many times) to be faster or slower and/or even mix two video streams with different time scaling. :) >BTW there's also the question of how to do filters that have multiple >outputs, and it's a little more complicated, but I think they can be >done as two filters sorta linked together. In any case, there doesn't >seem to be anything in my design that precludes filters with multiple >outputs, so I'm happy. I think that multiple-output filter is very rare case and I even cannot see real example but two-screen display of video or cloning for network streaming. :) >Thanks for the comments! Thank you too! With best wishes. Andriy. From atmosfear at users.sourceforge.net Mon Sep 15 15:59:18 2003 From: atmosfear at users.sourceforge.net (Felix Buenemann) Date: Mon, 15 Sep 2003 15:59:18 +0200 Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... In-Reply-To: <20030912041728.GP76003@lucky.net> References: <20030911235301.GA24980@brightrain.aerifal.cx> <20030912024439.GL250@brightrain.aerifal.cx> <20030912041728.GP76003@lucky.net> Message-ID: <200309151559.18477.atmosfear@users.sourceforge.net> On Friday 12 September 2003 06:17, Andriy N. Gritsenko wrote: > >BTW there's also the question of how to do filters that have multiple > >outputs, and it's a little more complicated, but I think they can be > >done as two filters sorta linked together. In any case, there doesn't > >seem to be anything in my design that precludes filters with multiple > >outputs, so I'm happy. > > I think that multiple-output filter is very rare case and I even > cannot see real example but two-screen display of video or cloning for > network streaming. :) Or think about a filter that splits up the picture to it's planes for eg. 
dumping them to files, but that probably could be done inside the filter without the need of further processsing the output. -- Best Regards, Atmos ____________________________________________ - MPlayer Developer - http://mplayerhq.hu/ - ____________________________________________ From atmosfear at users.sourceforge.net Mon Sep 15 16:15:20 2003 From: atmosfear at users.sourceforge.net (Felix Buenemann) Date: Mon, 15 Sep 2003 16:15:20 +0200 Subject: [MPlayer-G2-dev] [?] hit 3 flies - aspect ratio, resize, query_format In-Reply-To: <200308221948.h7MJmwZW017658@mail.mplayerhq.hu> References: <200308221948.h7MJmwZW017658@mail.mplayerhq.hu> Message-ID: <200309151615.20520.atmosfear@users.sourceforge.net> On Friday 22 August 2003 21:48, Arpi wrote: > I would extend vf's query_format() by a int p[6] parameter. > (actually int* size, which points to an array of 6 integers) > these 6 values are: > > buff_w, buff_h - w/h of the image buffer (real pixels) > disp_w, disp_h - pre-scaled w/h (recommended display size) [for startup] > want_w, want_h - wanted output size [for window resizing] > > > query_format() of 'normal' filters (which dont alter aspect ratio nor > buffer size) would just pass thru the pointer to next filter. > other filters shoudl implement it this way: > query_format(...){ > - change buff_w/h (only filters which chaneg buffer dimensios) > - change disp_w/h (only filters which change aspect ratio) > - call next filter's query_format() > - change want_w/h (only filters which change buffer dimensios) > } hmm, I see something missing here: Where do you account for the aspect-discrepancy of Screen-Resolution-Aspect vs. Physical-Displaydevice-Aspect. Eg. think of the case where displaying video at 1280x1024 on a 4:3 19" CRT, which is a very common case. In this case we have to do slight aspect correction in order to retain correct aspect ratio. Another place is TV-Out, often the display-area from the graphics card doesn't fill the whole visible area of the TV's CRT, so that there are black areas above and below (sometimes also at the sides). With mplayer G1 id'd simply measure the display area from the graphics card, with a ruler or sth. and give that to MPlayer, eg: mplayer -monitoraspect 40:27 movie.avi In most cases it would then make the black bars above and below the movie smaller, so aspect would be correct again and I'd be happy. Maybe I've kinda lamely coded the aspect code in G1, but at least it works as expected =) > also, the scaling flags of vfcap.h shoudl be reviewed: merging HSWSCALE_UP > and HWSCALE_DOWN, it ha sno sence to keep them separated. > query_format() implementations can now check source and dest resolution so > can decide if sw/hw scaling is possible or not. if they can do the scaling > (or resize), they should change the want_w/h values. otherwise left > unchanged. hmm, I'm not sure about this. The bad thing about eg. XV is that you can tell it to scale down in most cases but then if the adapter can't do it, it'll simply crop away part of the image to get the desired size. So the idea was to be able to specify if the card can do hw downsizing/upsizing using the selected vo, so we can downsize/upsize by swscaler if needed whilst using the faster hw scaler for upscaling/downscaling. But maybe I misunderstood you Arpi. 
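To put a number on the 1280x1024-on-a-4:3-CRT case mentioned above, here is roughly the correction that aspect code has to apply (just a sketch; the variable names are made up, and 40:27 is the TV-out figure quoted above):

    static void correct_for_monitor(double movie_aspect, int *win_w, int *win_h)
    {
        double monitor_aspect = 4.0 / 3.0;   /* or 40.0/27.0 for the TV-out case */
        int xres = 1280, yres = 1024;

        /* pixels in this mode are not square: each one is ~6.7% wider than tall */
        double pixel_aspect = monitor_aspect * yres / xres;   /* (4/3)*1024/1280 = 1.0667 */

        *win_w = xres;
        *win_h = (int)(xres / movie_aspect * pixel_aspect + 0.5);
        /* a 4:3 movie: 1280 / (4/3) * 1.0667 = 1024, i.e. fullscreen; assuming
         * square pixels would give 960 and the picture would look ~7% too flat */
    }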
-- Best Regards, Atmos ____________________________________________ - MPlayer Developer - http://mplayerhq.hu/ - ____________________________________________ From dalias at aerifal.cx Mon Sep 15 18:17:56 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 15 Sep 2003 12:17:56 -0400 Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... In-Reply-To: <200309151559.18477.atmosfear@users.sourceforge.net> References: <20030911235301.GA24980@brightrain.aerifal.cx> <20030912024439.GL250@brightrain.aerifal.cx> <20030912041728.GP76003@lucky.net> <200309151559.18477.atmosfear@users.sourceforge.net> Message-ID: <20030915161756.GX250@brightrain.aerifal.cx> On Mon, Sep 15, 2003 at 03:59:18PM +0200, Felix Buenemann wrote: > On Friday 12 September 2003 06:17, Andriy N. Gritsenko wrote: > > >BTW there's also the question of how to do filters that have multiple > > >outputs, and it's a little more complicated, but I think they can be > > >done as two filters sorta linked together. In any case, there doesn't > > >seem to be anything in my design that precludes filters with multiple > > >outputs, so I'm happy. > > > > I think that multiple-output filter is very rare case and I even > > cannot see real example but two-screen display of video or cloning for > > network streaming. :) > Or think about a filter that splits up the picture to it's planes for eg. > dumping them to files, but that probably could be done inside the filter > without the need of further processsing the output. Or think about mplayer-g2-PVR, with simultaneous display and encoding of video. Maybe you have something like: ,-> vf_madei -> vo / tvin -> vd_raw < \ `-> ve ...or... ,-> vo / tvin -> vd_raw -> vf_pullup < \ `-> vf_scale -> ve :)))))))) Rich From dalias at aerifal.cx Mon Sep 15 18:35:03 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 15 Sep 2003 12:35:03 -0400 Subject: [MPlayer-G2-dev] [?] hit 3 flies - aspect ratio, resize, query_format In-Reply-To: <200309151615.20520.atmosfear@users.sourceforge.net> References: <200308221948.h7MJmwZW017658@mail.mplayerhq.hu> <200309151615.20520.atmosfear@users.sourceforge.net> Message-ID: <20030915163503.GY250@brightrain.aerifal.cx> On Mon, Sep 15, 2003 at 04:15:20PM +0200, Felix Buenemann wrote: > On Friday 22 August 2003 21:48, Arpi wrote: > > I would extend vf's query_format() by a int p[6] parameter. > > (actually int* size, which points to an array of 6 integers) > > these 6 values are: > > > > buff_w, buff_h - w/h of the image buffer (real pixels) > > disp_w, disp_h - pre-scaled w/h (recommended display size) [for startup] > > want_w, want_h - wanted output size [for window resizing] > > > > > > query_format() of 'normal' filters (which dont alter aspect ratio nor > > buffer size) would just pass thru the pointer to next filter. > > other filters shoudl implement it this way: > > query_format(...){ > > - change buff_w/h (only filters which chaneg buffer dimensios) > > - change disp_w/h (only filters which change aspect ratio) > > - call next filter's query_format() > > - change want_w/h (only filters which change buffer dimensios) > > } > > hmm, I see something missing here: Where do you account for the > aspect-discrepancy of Screen-Resolution-Aspect vs. > Physical-Displaydevice-Aspect. Eg. think of the case where displaying video > at 1280x1024 on a 4:3 19" CRT, which is a very common case. In this case we > have to do slight aspect correction in order to retain correct aspect ratio. 
The vo simply sets the wanted display width/height based on monitor aspect and disp_w/disp_h from filters. No problem there. However IMO this whole system where window resizes propagate back through the filter chain is a very very bad idea. Consider the following example filter chains: 1) scale=480:480,(deinterlace/ivtc) If the user resizes the window, Arpi's proposal would have the first scale filter get reconfigured for the new output size! If any vertical resizing takes place, this will ruin the deinterlacing!! 2) rgb codec => scale,denoise3d If user resizes the window, the scale filter will resize the image before denoising rather than just converting colorspace! This will ruin the denoising process. I'm sure there are more examples too. In all these cases, the basic problem is the same -- when resizes propagate back through the filter chain, the video gets resized at the wrong point, and the output is wrong. It's a similar problem to how mencoder skips frames at the beginning of the filterchain rather than at the end. IMO any final preparation for display like this needs to be done at the very end of the filter chain. I'd even suggest putting swscaler support in vf_vo2 rather than loading a filter for window resizing. That way the filter chain doesn't have to be aware of any silly resize signals. IIRC Arpi also considered putting swscaler in vf_vo2 when we were talking about it on IRC. > Another place is TV-Out, often the display-area from the graphics card doesn't > fill the whole visible area of the TV's CRT, so that there are black areas > above and below (sometimes also at the sides). With mplayer G1 id'd simply This means your TVout is horribly misconfigured! Try changing the timings (with Matrox, sync pulse length is used to control the black border size when in TV mode, so it may be similar on other cards). I've never seen a card which forces black borders when used on windows, so if you're getting black borders, I really do expect a driver/configuration problem, not a fundamental limit of the hardware. > > also, the scaling flags of vfcap.h shoudl be reviewed: merging HSWSCALE_UP > > and HWSCALE_DOWN, it ha sno sence to keep them separated. > > query_format() implementations can now check source and dest resolution so > > can decide if sw/hw scaling is possible or not. if they can do the scaling > > (or resize), they should change the want_w/h values. otherwise left > > unchanged. > hmm, I'm not sure about this. The bad thing about eg. XV is that you can tell > it to scale down in most cases but then if the adapter can't do it, it'll > simply crop away part of the image to get the desired size. So the idea was > to be able to specify if the card can do hw downsizing/upsizing using the > selected vo, so we can downsize/upsize by swscaler if needed whilst using the > faster hw scaler for upscaling/downscaling. But maybe I misunderstood you > Arpi. I think you just misunderstood. Arpi was saying that config would just return failure for scaling down, if the card didn't support it, rather than using a flag. Rich From atmosfear at users.sourceforge.net Mon Sep 15 18:52:09 2003 From: atmosfear at users.sourceforge.net (Felix Buenemann) Date: Mon, 15 Sep 2003 18:52:09 +0200 Subject: [MPlayer-G2-dev] [?] 
hit 3 flies - aspect ratio, resize, query_format In-Reply-To: <20030915163503.GY250@brightrain.aerifal.cx> References: <200308221948.h7MJmwZW017658@mail.mplayerhq.hu> <200309151615.20520.atmosfear@users.sourceforge.net> <20030915163503.GY250@brightrain.aerifal.cx> Message-ID: <200309151852.09495.atmosfear@users.sourceforge.net> On Monday 15 September 2003 18:35, D Richard Felker III wrote: > On Mon, Sep 15, 2003 at 04:15:20PM +0200, Felix Buenemann wrote: > > On Friday 22 August 2003 21:48, Arpi wrote: [...Arpis proposal...] > > hmm, I see something missing here: Where do you account for the > > aspect-discrepancy of Screen-Resolution-Aspect vs. > > Physical-Displaydevice-Aspect. Eg. think of the case where displaying > > video at 1280x1024 on a 4:3 19" CRT, which is a very common case. In this > > case we have to do slight aspect correction in order to retain correct > > aspect ratio. > > The vo simply sets the wanted display width/height based on monitor > aspect and disp_w/disp_h from filters. No problem there. However IMO > this whole system where window resizes propagate back through the > filter chain is a very very bad idea. Consider the following example > filter chains: > > 1) scale=480:480,(deinterlace/ivtc) > > If the user resizes the window, Arpi's proposal would have the first > scale filter get reconfigured for the new output size! If any vertical > resizing takes place, this will ruin the deinterlacing!! > > 2) rgb codec => scale,denoise3d > > If user resizes the window, the scale filter will resize the image > before denoising rather than just converting colorspace! This will > ruin the denoising process. > > I'm sure there are more examples too. In all these cases, the basic > problem is the same -- when resizes propagate back through the filter > chain, the video gets resized at the wrong point, and the output is > wrong. It's a similar problem to how mencoder skips frames at the > beginning of the filterchain rather than at the end. IMO any final > preparation for display like this needs to be done at the very end of > the filter chain. I'd even suggest putting swscaler support in vf_vo2 > rather than loading a filter for window resizing. That way the filter > chain doesn't have to be aware of any silly resize signals. IIRC Arpi > also considered putting swscaler in vf_vo2 when we were talking about > it on IRC. You are totally right, scaling at the wrong point in the chain can mess uo things badly. I, too, think it's a good idea to put a scaler in the video out filter. > > Another place is TV-Out, often the display-area from the graphics card > > doesn't fill the whole visible area of the TV's CRT, so that there are > > black areas above and below (sometimes also at the sides). With mplayer > > G1 id'd simply > > This means your TVout is horribly misconfigured! Try changing the > timings (with Matrox, sync pulse length is used to control the black > border size when in TV mode, so it may be similar on other cards). > I've never seen a card which forces black borders when used on > windows, so if you're getting black borders, I really do expect a > driver/configuration problem, not a fundamental limit of the hardware. Oh, I'm not talking about vanilla ice graphics cards, I'm talking about shit hardware, like Savage/MX Chip (my laptop), or Gefore 4 MX on Windows with cheapo TV-Codec (the one in a friends PC). 
> > Rich > -- Best Regards, Atmos ____________________________________________ - MPlayer Developer - http://mplayerhq.hu/ - ____________________________________________ From gsbarbieri at yahoo.com.br Mon Sep 15 22:32:51 2003 From: gsbarbieri at yahoo.com.br (=?iso-8859-1?q?Gustavo=20Sverzut=20Barbieri?=) Date: Mon, 15 Sep 2003 17:32:51 -0300 (ART) Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... In-Reply-To: <20030915161756.GX250@brightrain.aerifal.cx> Message-ID: <20030915203251.78909.qmail@web20904.mail.yahoo.com> --- D Richard Felker III escreveu: > On Mon, Sep 15, 2003 at 03:59:18PM +0200, Felix Buenemann wrote: > > On Friday 12 September 2003 06:17, Andriy N. Gritsenko wrote: > > > >BTW there's also the question of how to do filters that have > multiple > > > >outputs, and it's a little more complicated, but I think they > can be > > > >done as two filters sorta linked together. In any case, there > doesn't > > > >seem to be anything in my design that precludes filters with > multiple > > > >outputs, so I'm happy. > > > > > > I think that multiple-output filter is very rare case and I > even > > > cannot see real example but two-screen display of video or > cloning for > > > network streaming. :) > > Or think about a filter that splits up the picture to it's planes > for eg. > > dumping them to files, but that probably could be done inside the > filter > > without the need of further processsing the output. > > Or think about mplayer-g2-PVR, with simultaneous display and encoding > of video. Maybe you have something like: > > ,-> vf_madei -> vo > / > tvin -> vd_raw < > \ > `-> ve > > ...or... > > ,-> vo > / > tvin -> vd_raw -> vf_pullup < > \ > `-> vf_scale -> ve But here maybe we need something different, since would be cool if the VO could go back and forth, while VE keeps recording... (Tivo like) Anyway, this can be great to do Video Walls, so video is cropped and exported to different video heads, ie: .---> Top_Left | +---> Top_Right | vf_videowall(4) -+ | +---> Bottom_Left | `---> Bottom_Right Maybe something to crop each video display border out (Ie, using monitors, crop around 2") Gustavo _______________________________________________________________________ Desafio AntiZona: participe do jogo de perguntas e respostas que vai dar um Renault Clio, computadores, c?meras digitais, videogames e muito mais! www.cade.com.br/antizona From atmosfear at users.sourceforge.net Tue Sep 16 00:48:05 2003 From: atmosfear at users.sourceforge.net (Felix Buenemann) Date: Tue, 16 Sep 2003 00:48:05 +0200 Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... In-Reply-To: <20030915203251.78909.qmail@web20904.mail.yahoo.com> References: <20030915203251.78909.qmail@web20904.mail.yahoo.com> Message-ID: <200309160048.05307.atmosfear@users.sourceforge.net> On Monday 15 September 2003 22:32, Gustavo Sverzut Barbieri wrote: [...some ugly wrapped text...] > > > > Or think about mplayer-g2-PVR, with simultaneous display and encoding > > of video. Maybe you have something like: > > > > ,-> vf_madei -> vo > > / > > tvin -> vd_raw < > > \ > > `-> ve > > > > ...or... > > > > ,-> vo > > / > > tvin -> vd_raw -> vf_pullup < > > \ > > `-> vf_scale -> ve > > But here maybe we need something different, since would be cool if the > VO could go back and forth, while VE keeps recording... 
(Tivo like) It would probably be easier, if you record to file with one process and playback the currently recording file using another process =) > > Anyway, this can be great to do Video Walls, so video is cropped and > exported to different video heads, ie: > > .---> Top_Left > > +---> Top_Right > > vf_videowall(4) -+ > > +---> Bottom_Left > > `---> Bottom_Right > > Maybe something to crop each video display border out (Ie, using > monitors, crop around 2") hmm, do you remember this project where they animated a whole building using lights which represented grey-sahed pixels? ah, Blinkenlights Arcade, anyways there was an MPlayer plugin for it. -> http://www.blinkenlights.de/ > Gustavo -- Best Regards, Atmos ____________________________________________ - MPlayer Developer - http://mplayerhq.hu/ - ____________________________________________ From dalias at aerifal.cx Tue Sep 16 03:29:22 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 15 Sep 2003 21:29:22 -0400 Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... In-Reply-To: <200309160048.05307.atmosfear@users.sourceforge.net> References: <20030915203251.78909.qmail@web20904.mail.yahoo.com> <200309160048.05307.atmosfear@users.sourceforge.net> Message-ID: <20030916012922.GE250@brightrain.aerifal.cx> On Tue, Sep 16, 2003 at 12:48:05AM +0200, Felix Buenemann wrote: > On Monday 15 September 2003 22:32, Gustavo Sverzut Barbieri wrote: > [...some ugly wrapped text...] > > > > > > Or think about mplayer-g2-PVR, with simultaneous display and encoding > > > of video. Maybe you have something like: > > > > > > ,-> vf_madei -> vo > > > / > > > tvin -> vd_raw < > > > \ > > > `-> ve > > > > > > ...or... > > > > > > ,-> vo > > > / > > > tvin -> vd_raw -> vf_pullup < > > > \ > > > `-> vf_scale -> ve > > > > But here maybe we need something different, since would be cool if the > > VO could go back and forth, while VE keeps recording... (Tivo like) > > It would probably be easier, if you record to file with one process and > playback the currently recording file using another process =) It would also require more cpu power! Decoding 640x480 (or higher!) video takes a lot more than just copying it to video memory. IMO it would be worthwhile to have both approaches. Really what I described isn't like a PVR so much as just a simple video recorder that lets you watch while you record...for a full PVR you'd want to play from the file so you could seek during recording. > > Anyway, this can be great to do Video Walls, so video is cropped and > > exported to different video heads, ie: > > > > .---> Top_Left > > > > +---> Top_Right > > > > vf_videowall(4) -+ > > > > +---> Bottom_Left > > > > `---> Bottom_Right > > > > Maybe something to crop each video display border out (Ie, using > > monitors, crop around 2") > hmm, do you remember this project where they animated a whole building using > lights which represented grey-sahed pixels? ah, Blinkenlights Arcade, anyways > there was an MPlayer plugin for it. -> http://www.blinkenlights.de/ Yep. :) Rich From gsbarbieri at yahoo.com.br Wed Sep 17 00:14:48 2003 From: gsbarbieri at yahoo.com.br (=?iso-8859-1?q?Gustavo=20Sverzut=20Barbieri?=) Date: Tue, 16 Sep 2003 19:14:48 -0300 (ART) Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... 
In-Reply-To: <200309160048.05307.atmosfear@users.sourceforge.net> Message-ID: <20030916221448.85691.qmail@web20902.mail.yahoo.com> --- Felix Buenemann escreveu: > On Monday 15 September 2003 22:32, Gustavo Sverzut Barbieri wrote: > [...some ugly wrapped text...] > > > > > > Or think about mplayer-g2-PVR, with simultaneous display and > encoding > > > of video. Maybe you have something like: > > > > > > ,-> vf_madei -> vo > > > / > > > tvin -> vd_raw < > > > \ > > > `-> ve > > > > > > ...or... > > > > > > ,-> vo > > > / > > > tvin -> vd_raw -> vf_pullup < > > > \ > > > `-> vf_scale -> ve > > > > But here maybe we need something different, since would be cool if > the > > VO could go back and forth, while VE keeps recording... (Tivo like) > > It would probably be easier, if you record to file with one process > and > playback the currently recording file using another process =) > > > > > Anyway, this can be great to do Video Walls, so video is cropped > and > > exported to different video heads, ie: > > > > .---> Top_Left > > > > +---> Top_Right > > > > vf_videowall(4) -+ > > > > +---> Bottom_Left > > > > `---> Bottom_Right > > > > Maybe something to crop each video display border out (Ie, using > > monitors, crop around 2") > hmm, do you remember this project where they animated a whole > building using > lights which represented grey-sahed pixels? ah, Blinkenlights Arcade, > anyways > there was an MPlayer plugin for it. -> http://www.blinkenlights.de/ > Wow! Crazy! Gustavo _______________________________________________________________________ Desafio AntiZona: participe do jogo de perguntas e respostas que vai dar um Renault Clio, computadores, c?meras digitais, videogames e muito mais! www.cade.com.br/antizona From dalias at aerifal.cx Wed Sep 17 06:37:43 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Wed, 17 Sep 2003 00:37:43 -0400 Subject: [MPlayer-G2-dev] A new vf layer proposal... In-Reply-To: <20030911235301.GA24980@brightrain.aerifal.cx> References: <20030911235301.GA24980@brightrain.aerifal.cx> Message-ID: <20030917043743.GL250@brightrain.aerifal.cx> OK, as much as I appreciate the tangential discussion about branching filters and multi-monitor displays and stuff... What I was really looking for was comments on the design, whether there are any obvious mistakes or problems I'm overlooking, etc. It might also be nice to hear some thoughts on the way filter config and format negotiation should work, since I didn't address that at all, as well as how window resizing should be handled (special dynamically inserted scale filter, or swscaler code in vf_vo2) and how format conversion should be handled between filters... My original idea was to put in some minor, isolated hacks to allow a (swscaler) filter to be inserted dynamically between the called filter and the calling filter. But since then I've had a couple more ideas... 1) Put format conversion directly in the vf layer, like: vf_pull_image(vf_instance_t *vf_src, *vf_dest) { mpi = vf_src->pull_image(vf_src, vf_dest); if (mpi->imgfmt not usable by vf_dest) { mpi2 = vf_get_buffer(vf_dest, ...); swScale(mpi2, mpi, ...); /* Michael's coolJavaCaps! */ vf_release_buffer(mpi); return mpi2; } return mpi; } 2) Require filters to accept any input format (but their get_buffer can restrict which formats allow DR, and their query_format can report which formats are natively supported. Then, once a filter gets an image from vf_pull_image, it checks to see if it can use the format as-is. 
If not, it calls a conversion function in the vf api: mpi = vf_pull_image(vf->prev, vf); if (mpi->imgfmt not supported) mpi = vf_convert(vf, mpi, fmt); Don't assume I'm being sloppy in the way these functions are called for the sake of providing a simplified example. The idea is that the buffer lock count and ownership semantics would automatically "do the right thing" with code just about as simple as what I've written above. For instance, vf_convert woulc call vf_get_buffer(vf, ...), allowing direct rendering, and would decrement the lock count on the mpi passed into it (releasing this buffer as long as the current vf or the one that returned it hadn't established an extra lock with vf_lock_buffer). Actually, with this in mind, approaches (1) and (2) above are basically identical. It's just a matter of where the conversion code goes. After thinking about things more, I have various reasons of preferring approaches like the following instead of dynamically inserting a scale filter wherever there's a format mismatch. Please reply and make comments if you particularly like or dislike any of the approaches I've described. Rich From andrej at lucky.net Wed Sep 17 12:46:49 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Wed, 17 Sep 2003 13:46:49 +0300 Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... In-Reply-To: <20030917043743.GL250@brightrain.aerifal.cx> References: <20030911235301.GA24980@brightrain.aerifal.cx> <20030917043743.GL250@brightrain.aerifal.cx> Message-ID: <20030917104649.GA14942@lucky.net> Hi, D Richard Felker III! Sometime (on Wednesday, September 17 at 7:26) I've received something... >OK, as much as I appreciate the tangential discussion about branching >filters and multi-monitor displays and stuff... [.......] >After thinking about things more, I have various reasons of preferring >approaches like the following instead of dynamically inserting a scale >filter wherever there's a format mismatch. Please reply and make >comments if you particularly like or dislike any of the approaches >I've described. I like both very much and I wish it'll be done! I've thought about some alike that already and you've described it well. :) With best wishes. Andriy. From r_togni at libero.it Tue Sep 23 22:49:13 2003 From: r_togni at libero.it (Roberto Togni) Date: Tue, 23 Sep 2003 22:49:13 +0200 Subject: [MPlayer-G2-dev] Native codecs and g2 Message-ID: <20030923204913.GA1068@tower2.myhome.qwe> Any news about codecs in g2? Some time ago A'rpi was thinking about moving to a get/release buffer method instead of get_image/mpi as g1. At the end, IIRC, he said he's going to stay with mpi. It is a final decision or it's still a work in progress item? I was thinking about moving most of the native codecs from libmpcodecs to ffmpeg/libavcodec since a long time. This could be the right time to do it, and it's even more true if codecs have to be modified to be used in g2. The codecs i'm thinking about are old QT codecs (rle, rpza, smc, 8bps, ...) and old vfw codecs (Cinepak, cvid, msrle, ...), but probably every native codec can be moved (lcl and lzo depends on external libs, i have to check how ffmpeg handles it). Some codecs are already available in libavcodec, even if MPlayer uses its own version (various adpcm audio codecs, cyuv, realaudio 1.0 and 2.0, and probably others i don't remember now). 
Pro: less code to port and mantain, more people will be able to use them and fix bugs Cons: MPlayer without libavcodec will be unable to play most files (unless you use binary codecs), but i think that using MPlayer without libavcodec is not a wise choice even now. What's your opinion about it? Ciao, Roberto From michaelni at gmx.at Tue Sep 23 23:08:58 2003 From: michaelni at gmx.at (Michael Niedermayer) Date: Tue, 23 Sep 2003 23:08:58 +0200 Subject: [MPlayer-G2-dev] Native codecs and g2 In-Reply-To: <20030923204913.GA1068@tower2.myhome.qwe> References: <20030923204913.GA1068@tower2.myhome.qwe> Message-ID: <200309232308.58755.michaelni@gmx.at> Hi On Tuesday 23 September 2003 22:49, Roberto Togni wrote: > Any news about codecs in g2? > > Some time ago A'rpi was thinking about moving to a get/release buffer > method instead of get_image/mpi as g1. At the end, IIRC, he said he's > going to stay with mpi. > It is a final decision or it's still a work in progress item? > > I was thinking about moving most of the native codecs from libmpcodecs > to ffmpeg/libavcodec since a long time. > This could be the right time to do it, and it's even more true if > codecs have to be modified to be used in g2. > > The codecs i'm thinking about are old QT codecs (rle, rpza, smc, > 8bps, ...) and old vfw codecs (Cinepak, cvid, msrle, ...), but probably > every native codec can be moved (lcl and lzo depends on external libs, > i have to check how ffmpeg handles it). > Some codecs are already available in libavcodec, even if MPlayer uses > its own version (various adpcm audio codecs, cyuv, realaudio 1.0 and > 2.0, and probably others i don't remember now). cinepack, msvidc, msrle have been ported to ffdshow/libavcodec by someone (milan cutka maybe), melanson said (a long time ago) that he would port them to ffmpeg/libavcodec [...] -- Michael level[i]= get_vlc(); i+=get_vlc(); (violates patent EP0266049) median(mv[y-1][x], mv[y][x-1], mv[y+1][x+1]); (violates patent #5,905,535) buf[i]= qp - buf[i-1]; (violates patent #?) for more examples, see http://mplayerhq.hu/~michael/patent.html stop it, see http://petition.eurolinux.org & http://petition.ffii.org/eubsa/en From dalias at aerifal.cx Tue Sep 23 23:47:17 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Tue, 23 Sep 2003 17:47:17 -0400 Subject: [MPlayer-G2-dev] Native codecs and g2 In-Reply-To: <20030923204913.GA1068@tower2.myhome.qwe> References: <20030923204913.GA1068@tower2.myhome.qwe> Message-ID: <20030923214717.GS2856@brightrain.aerifal.cx> On Tue, Sep 23, 2003 at 10:49:13PM +0200, Roberto Togni wrote: > Any news about codecs in g2? > > Some time ago A'rpi was thinking about moving to a get/release buffer > method instead of get_image/mpi as g1. At the end, IIRC, he said he's > going to stay with mpi. > It is a final decision or it's still a work in progress item? Read my recent vf proposals. IMO we much switch to get/release buffer rather than the g1-style system; otherwise G2 will suck and won't be able to do any DR with new multi-reference-frame codecs or temporal filters. > I was thinking about moving most of the native codecs from libmpcodecs > to ffmpeg/libavcodec since a long time. > This could be the right time to do it, and it's even more true if > codecs have to be modified to be used in g2. Agree. > The codecs i'm thinking about are old QT codecs (rle, rpza, smc, > 8bps, ...) 
and old vfw codecs (Cinepak, cvid, msrle, ...), but probably > every native codec can be moved (lcl and lzo depends on external libs, > i have to check how ffmpeg handles it). IMO fix the external lib dependencies (i.e. remove them :). > Some codecs are already available in libavcodec, even if MPlayer uses > its own version (various adpcm audio codecs, cyuv, realaudio 1.0 and > 2.0, and probably others i don't remember now). > > Pro: less code to port and mantain, more people will be able to use > them and fix bugs > Cons: MPlayer without libavcodec will be unable to play most files > (unless you use binary codecs), but i think that using MPlayer without > libavcodec is not a wise choice even now. IMO this is not a con. :) Rich From dalias at aerifal.cx Sun Sep 28 02:02:04 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Sat, 27 Sep 2003 20:02:04 -0400 Subject: [MPlayer-G2-dev] more on g2 & video filters Message-ID: <20030928000204.GA19033@brightrain.aerifal.cx> Here's an updated and more thorough g2 video layer design. For consistency and for the sake of having a name to talk about the system, I'll call it "video pipeline" (vp). This encompasses filters as well as decoders, encoders, vo's, and the glue layer between it all. First, the structure of connection between the pieces. Nodes of the video pipeline can (in theory) be connected in many different ways. A simple implementation for the time being could be entirely linear like G1's filter chain, but the design should not require this. Thus, we'll talk about the video pipeline as a collection of nodes and links, where a link consists of a source, a destination, and a link structure which ties the two together and assists in managing buffers. ---------------------------------------------------------------------- The first topic, and probably the most important, is buffer management. G1 did a remarkably good job compared to any other player at the time, but it still has some big limitations. In particular: 1. There's no clear rule on what a filter is allowed to do with an image after calling vf_put_image on it. Can it still read the contents? Can it write more? Can it call vf_put_image more than once on the same mpi without calling vf_get_image again? In general the answer is probably no, but several filters (including ones I wrote) do stuff like this, and it's just not clear what's ok and what's not. 2. A filter that gives out DR buffers from its get_image has no way of knowing when the caller is done with those buffers. In theory, put_image should be a good indication (but see (1) above), and even worse, if the previous filter/dec_video drops frames, then put_image will never be called. 3. A video decoder (or filter) has no way of telling the system how long it needs its buffers preserved (for prediction or whatever). This works ok with standard IP[B] type codecs, but with more complicated prediction models it might totally break. So here's the new buffer model, based on get_buffer/release_buffer and reference counts: When a node of the video pipeline wants a buffer to return as output from its pull_frame (see next section below), it has three options for the buffer type: export, indirect, and direct. The first two are always available, but direct it only available if the destination's get_buffer function is willing to allocate a buffer with the desired format and flags (similar to G1). All buffers are associated with the link structure. Export -- almost exactly like in G1, with a few improvements. 
In the export case, the source filter is considered the owner of the buffer. It will be notified when the buffer's reference count reaches zero, so that it can in turn release any buffer it might be re-exporting (for example, the source buffer of which vf_crop is exporting a cropped version).

Direct -- destination sets up a buffer structure so that source can render directly into it. In this case, the destination is considered the owner of the buffer, and is notified when the buffer's reference count reaches zero, so that it can in turn release any buffer it might be using (for example, the full destination buffer, a small part of which vf_expand is making available to the source).

Indirect -- allocated and managed by the link layer.

The new video pipeline design also has certain flags analogous to the old image types and flags in G1:

Readable -- the buffer cannot reside in write-only memory, slow video memory, or anywhere that makes reading it slow, difficult, or restricted. This should always be set correctly when requesting a buffer, even though it generally applies only to direct-type buffers.

Preserve -- source can rely on destination not to clobber the buffer as long as it is valid. If destination is the owner of the buffer (direct-type), then it is still of course free to clobber the buffer after the reference count reaches zero.

Reusable -- source is free to continue writing to the buffer even after passing it on to destination (assuming it maintains a reference count) and to pass the same buffer to destination multiple times if desired. Note that as long as the reusable flag is NOT set, destination can rely on source not to clobber the buffer after source returns (the analogue of the preserve flag, in the reverse direction).

One should be particularly aware that the preserve flag applies to ALL image types, not just direct and indirect. That means that, unless source sets the preserve flag on exported buffers, destination is free to clobber them. (One example where this is useful is for rendering OSD onto the exported buffer of a filter before copying to video memory, instead of having to alpha-blend OSD in video memory.)

Now an overview of how to convert old G1-model filters/codecs to the new model:

IP[B] codecs -- call vp_get_buffer with readable+preserve flags for I and P frames, no flags for B frames. Increment the reference count for I/P frames (vp_lock_buffer) before returning, then release them (vp_release_buffer) when they're no longer needed for prediction. For the standard IP model this just involves keeping one buffer pointer in the codec's private data area (the previous I/P frame).

Filters and codecs that used the "static" buffer type in G1 (a code sketch follows below) -- on the first frame, call vp_get_buffer with preserve+reusable (and optionally readable) flags to get a buffer, then establish a lock (vp_lock_buffer) before returning the image to the caller so that the reference count does not reach zero. When rendering subsequent frames, don't call vp_get_buffer again; just increment the reference count (vp_lock_buffer) before returning, so that destination has an extra reference to release without the count reaching zero.

I-only codecs and filters that use temp buffers -- call vp_get_buffer with no flags and return the buffer after drawing into it.

This pretty much covers the G1 cases. Of course there are many more possibilities in G2 which weren't allowed in G1 and thus don't correspond to any old buffer model.
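A quick sketch of the "static buffer" recipe in C-like pseudocode: vp_get_buffer, vp_lock_buffer, vp_release_buffer and the readable/preserve/reusable flags are from the text above, while the pull_frame signature, the flag constants and the drawing helpers are assumptions:

    static struct vp_buffer *static_filter_pull_frame(struct vp_node *vf,
                                                      struct vp_link *out)
    {
        struct priv *p = vf->priv;

        if (!p->buf) {
            /* first frame: one long-lived buffer we keep drawing into */
            p->buf = vp_get_buffer(out, VP_PRESERVE | VP_REUSABLE | VP_READABLE);
            draw_whole_picture(p->buf);        /* hypothetical drawing helper */
        } else {
            update_changed_regions(p->buf);    /* hypothetical: redraw only what changed */
        }

        /* Extra lock before returning, so that when the destination calls
         * vp_release_buffer the count drops back to 1 (our hold), not to 0. */
        vp_lock_buffer(p->buf);
        return p->buf;
    }

    /* ...and at uninit time the filter drops its own long-lived reference:
     *     vp_release_buffer(p->buf);
     */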
----------------------------------------------------------------------

The second topic is flow of execution.

From a final destination (vo/ve), the pipeline is called in reverse order, using a "pull" model for obtaining frames. The main relevant function is vp_pull_frame, which takes as its argument a pointer to a link structure, and calls the source's pull_frame function asking for a frame for destination. A filter/codec's pull_frame, in turn, is responsible for obtaining a buffer (via vp_get_buffer), filling it with the picture, and returning it to the caller.

The reader would be advised to read and study the following example:

Filter chain: VD --L1--> Filter A --L2--> Filter B --L3--> VO

Let's say filter A is crop, exporting images, and B is scale, direct rendering into VO's video memory. L1, L2, L3 are the link structures.

Flow of execution:

    vp_pull_frame(L3)
      B->pull_frame(L3)
        sbuf=vp_pull_frame(L2)
          A->pull_frame(L2)
            sbuf=vp_pull_frame(L1)
              VD->pull_frame(L1)
                figure out video format, dimensions, etc.
                A->query_format [*1]
                B->query_format
                A->config
                B->config
                VO->query_format [*2]
                VO->config
                dbuf=vp_get_buffer(L1)
                  A->get_buffer(L1)
                    dr fails, return NULL
                  setup and return indirect image
                VD decodes video into dbuf
                return dbuf
            dbuf=vp_get_buffer(L2,export)
            setup export strides/pointers
            dbuf->priv->source=sbuf [*3]
            return dbuf
        dbuf=vp_get_buffer(L3)
          VO->get_buffer(L3)
            setup dr buffer and return it
        scale image from sbuf to dbuf
        vp_release_buffer(sbuf)
          A->release_buffer
            vp_release_buffer(...->priv->source)
        return dbuf

Notes:

[*1] query_format is called to determine which formats the destination supports natively. If no acceptable native formats are found, config will be called with whatever format source prefers to use, and destination will be responsible for converting images after receiving them from vp_pull_frame.

[*2] Here filter B waits to query which formats the VO supports until it is configured. Since scale's input and output formats are independent of one another, there's no need to know during scale's query_format which formats the VO supports.

[*3] Notice here that filter A does not release the source buffer it obtained from L1 at this time. Instead it stores it in the private data area for its exported destination buffer, so that it can release the source after (and only after) that buffer is no longer in use.

----------------------------------------------------------------------

The next (and maybe most controversial) topic: automatic format conversion!

Believe it or not, it is possible with the above design to dynamically insert a filter between source and destination during source's pull_frame. It only requires very minor hacks in vp_pull_frame. But instead I would like to propose doing away with auto-insertion of the scale filter for format conversion, and instead require filters/vo to accept any image format. Then, we introduce a new function to the vp api, vp_convert. I'll explain it with pseudocode:

    vp_buffer *vp_convert(vp_buffer *in)
    {
        vp_buffer *out = vp_get_buffer(in->link);
        swScaler(in, out);
        vp_release_buffer(in);
        return out;
    }

Note that each buffer stores which link it's associated with (in->link here). Of course vp_convert would also have to keep the sws context somewhere; in->link would be an appropriate place. Also note that this will direct-render the conversion if the calling filter supports direct rendering. :)

Now let's see how this affects format negotiation...
Now let's see how this affects format negotiation...

G1's model was to have query_format only return true if this filter and
ALL the subsequent filters/VO support the requested format. Since G1
could really only auto-insert scale at a few places in the chain
(beginning or end...?) this made sense. But a side effect of this
behavior is that conversion tends to get forced as early as possible in
the filter chain. Consider the example:

RGB codec ----> crop ----> YUV VO

If crop's query_format returns false because the VO does not support RGB,
then RGB->YUV conversion will happen before cropping. But this is stupid
and wastes cpu time.

Now suppose we're using the above model, with no auto-insertion of
filters. The RGB codec sees that crop's query_format returns false for
RGB, but since it can't output anything except RGB, it returns an RGB
image anyway. Now crop gets the RGB image, and it is free to crop the
image in RGB space, since it knows how to do that, totally oblivious to
what the VO wants. Then the VO gets an RGB image and has to call
vp_convert, which will direct-render the converted image into video
memory if possible.

On the other hand, vf_expand might want to be more careful about which
formats its destination filter supports natively (using query_format), so
that it doesn't force the destination to convert lots of useless black
bars.

Finally, one other benefit of converting as late as possible is that a
filter which drops frames might be able to determine that it wants to
drop the next frame before calling vp_convert. This could save a lot of
cpu time. But the following plan for frame dropping makes the situation
even better:

----------------------------------------------------------------------

What happens if, in addition to vp_pull_frame, we also have
vp_skip_frame, which notifies the source filter that the destination
wants to "run the pipeline" for a frame but throw away the output?

The idea is that this call could propagate back as far as possible
through the filter chain. It gives us the same behavior as -framedrop in
G1, but also much better: if a filter knows it's going to drop the next
frame before even looking at it, it can use vp_skip_frame instead of
vp_pull_frame, and earlier filters can skip processing the frame
altogether. BUT, if there are filters in the chain which cannot deal with
missing frames (for example, inverse telecine), they're not obligated to
propagate the vp_skip_frame call, and they can implement their skip_frame
with the same function as pull_frame. If vp_skip_frame propagates all the
way back to the decoder, and the next frame is a B frame (or the file is
I-only), then the decoder can of course skip decoding entirely!

As for filters which voluntarily drop frames (vf_decimate)... pull_frame
is required to return a valid image unless:

1. A fatal error occurred.
2. The end of the movie has been reached.

So, if a filter wants to drop some frames, that's ok, but it can't just
return NULL from pull_frame. Instead it could do something like the
following:

sbuf = vp_pull_frame(prev);
if (skip) {
    vp_release_buffer(sbuf);
    sbuf = vp_pull_frame(prev);
}

Or, if it knows which frame it wants to skip without looking at the image
contents first, it could call vp_skip_frame instead to save some cpu
time!

One more thing to keep in mind: PTS in G2 propagates through the video
pipeline! So, if a filter drops a frame, it has to add the relative_pts
of that frame to the relative_pts of the next non-skipped frame before
returning it! Otherwise you'll ruin A/V sync!
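Putting the two rules together (never return NULL for a dropped frame,
and carry the dropped frame's relative_pts forward), a decimate-style
filter's pull_frame might look roughly like this. should_drop() and the
decimate_priv struct are hypothetical, and relative_pts as a buffer field
is an assumption about the buffer layout; the pull/release/pull pattern
and the PTS accumulation are exactly what's described above.

/* Sketch of a vf_decimate-style pull_frame. */
static vp_buffer *decimate_pull_frame(vp_link *prev, struct decimate_priv *p)
{
    double carried_pts = 0.0;
    vp_buffer *buf = vp_pull_frame(prev);

    while (buf && should_drop(p, buf)) {     /* hypothetical drop decision */
        carried_pts += buf->relative_pts;    /* remember the dropped pts   */
        vp_release_buffer(buf);
        buf = vp_pull_frame(prev);           /* fetch a replacement frame  */
    }
    if (!buf)
        return NULL;                         /* fatal error or end of movie */

    buf->relative_pts += carried_pts;        /* keep A/V sync intact */
    return buf;
}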
----------------------------------------------------------------------

OK, I think that's about all for now. I hope this serves as a (mostly)
complete G2 video pipeline design document. If there are no objections, I
may start coding this within the next few weeks. Speak up now if you want
anything changed!!

Rich

From dalias at aerifal.cx Tue Sep 30 17:16:05 2003
From: dalias at aerifal.cx (D Richard Felker III)
Date: Tue, 30 Sep 2003 11:16:05 -0400
Subject: [MPlayer-G2-dev] more on g2 & video filters
In-Reply-To: <20030928000204.GA19033@brightrain.aerifal.cx>
References: <20030928000204.GA19033@brightrain.aerifal.cx>
Message-ID: <20030930151605.GP2856@brightrain.aerifal.cx>

A few additional things that came up while talking to Ivan on IRC:

* I forgot about slices.
* I should include examples of buffer alloc/release for IPB codecs.

About slices... actually I think there are two different types of slices:

1) Simple slices -- source gets a dummy buffer structure with no actual
pointers in it, and sends the picture to dest one slice at a time via
draw_slice, passing a pointer to the block to copy.

2) Hybrid slices -- source has an actual buffer (indirect or export, or
perhaps even direct) which it obtained with a slices flag set, and it
calls draw_slice with rectangles within this buffer as they become ready.

So now I need to explain why having both is beneficial...

Type (1), simple slices, corresponds to the way libmpeg2 slice-renders B
frames, reusing the same small buffer for each slice to ease stress on
the cache. It could also be used for other slice rendering, but type (2),
hybrid slices, has a big advantage. Suppose you have the following filter
chain:

vd ----> expand ----> vo

and suppose the vo's buffers are in video memory, so the (IP-type) codec
can't direct render through expand. Also suppose the user wants expand to
render OSD. Now, vd draws with slices to improve performance. The expand
filter could either pass them on to the vo, or get a direct (dr) buffer
from the vo and draw the slices right into it. And here's where the
performance benefit comes in. Let's say expand does direct rendering to
the vo. Expand's pull_image has called vd's pull_image, which responds
with a sequence of draw_slice calls and then returns the buffer
structure. If we're using hybrid slices, this returned buffer actually
has valid pointers to the decoded picture in it, so expand can use them
as the source for alpha-blending OSD onto the dr buffer in video memory.
No reads from video memory are needed!

Actually, for the OSD/expand example here, it should be possible to do
the alpha-blending during the actual draw_slice calls, as long as the OSD
contents are already known at that time. But there could be other
situations where it would be useful to do some computations at
slice-rendering time (certain localized computations that don't modify
the image -- maybe edge or combing detection) while the data is still in
the cache, and then use the results later, once the whole frame is
available, for more large-scale or global filtering.

A couple of proposed rules for slices:

1. No two slices for the same frame may overlap.
2. With hybrid-type slices, source may NOT modify any region of the
buffer which has already been submitted via draw_slice.

I'm still a bit undecided on rule #2; it may be better to make this
behavior dependent upon the "reusable" buffer flag.
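For what it's worth, here is a very rough sketch of the expand/OSD
scenario above, just to show where the hybrid buffer's valid pointers pay
off. The slice-callback plumbing is omitted entirely, and expand_priv,
render_osd(), and the flag/argument choices are all assumptions; the only
point is that the OSD blend reads from the returned (cache-resident)
hybrid buffer, never from video memory.

/* Sketch only: expand direct-renders into the vo and receives hybrid
 * slices from the decoder.  Incoming draw_slice rectangles are copied
 * straight into the vo's buffer by a slice callback (not shown); once
 * the whole frame has arrived, the hybrid buffer returned by the decoder
 * still has readable pointers, which serve as the OSD blend source. */
static vp_buffer *expand_pull_frame(vp_link *out, vp_link *prev,
                                    struct expand_priv *p)
{
    /* direct buffer from the vo (video memory); we only ever write to it */
    p->dbuf = vp_get_buffer(out, 0 /* direct, write-only */);
    if (!p->dbuf)
        return NULL;

    /* pulling from the decoder triggers its draw_slice calls, which our
       slice callback forwards into p->dbuf as the data arrives */
    vp_buffer *sbuf = vp_pull_frame(prev);
    if (!sbuf) {
        vp_release_buffer(p->dbuf);
        return NULL;
    }

    /* hybrid slices: sbuf's planes are still valid and readable, so blend
       OSD with sbuf as source and the video-memory buffer as destination */
    render_osd(p->osd, /* src */ sbuf, /* dst */ p->dbuf);

    vp_release_buffer(sbuf);   /* decoder's frame is no longer needed */
    return p->dbuf;
}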
Updated buffer types list (indirect renamed):

Direct -- owned by dest, allows direct rendering
Indirect -- owned by dest, but no pointers (slices required)
Export -- owned by source
Auto[matic] -- allocated/owned by the vp link layer

API issues: I'm a bit at a loss as to how to make the hybrid slices API
clean and usable (so that a filter/vd can detect the availability of the
different methods and select the optimal one), but plain simple slices
is, well, simple. You get the indirect buffer with vp_get_buffer, then
call vp_draw_slice to draw into/through it, and eventually return the
indirect buffer (not necessarily in order; out-of-order rendering is
possible just like with DR) to the caller. The problem with hybrid slices
is that some filters may only accept hybrid slices (if they need to write
into nonreadable memory but also need to be able to read the source image
again later -- see my OSD example above), while some filters and decoders
(e.g. libmpeg2 rendering a B frame) will prefer simple slices, and might
only support simple slices. The situation gets more complicated if slices
propagate through several filters.

A thought for Ivan: slices and XVMC.

From what I understand, the current XVMC code uses slices to pass the
motion vectors & dct coefficients to the vo, so that a function in the vo
will get called in coded-frame order. But since slices are used rather
than direct rendering, this wastes an extra copy (data has to be copied
from mplayer's codec's slice buffer to the shared-mem/X buffer). If we
find a good way to do hybrid slices with direct-type buffers, then the
codec could DR into shared mem to begin with, and the draw_slice calls
would just notify the vo that the data is ready. If someone's using XVMC,
they probably have a system that's barely fast enough for DVD playback,
so eliminating a copy could make the difference that allows
full-framerate DVD.

"Enough of slices..." or "Examples with IPB codecs" (also for Ivan)

Let's say we have a codec with IPB frames, rendering to the VO via direct
rendering. Coded frame order is IPB and display order is IBP (for the
first three frames).

The first time the codec's pull_image is called, the decoder...

1. Gets a direct buffer from the vo via vp_get_buffer.
2. Decodes the first I frame into the buffer.
3. Adds a reference count to the buffer with vp_lock_buffer so it can be
   kept for predicting the next frame, and stores the pointer in its
   private data area.
4. Returns the buffer.

Simple enough. Now the next call. The decoder...

1. Gets a direct buffer from the vo via vp_get_buffer.
2. Looks up the pointer for the previous I frame from its private data
   area.
3. Decodes the P frame into the new buffer.
4. Also stores the pointer to the new buffer in the private area.
5. Gets another direct buffer from the vo.
6. Renders the B frame into the new buffer based on the I and P buffers.
7. Returns the pointer to the B frame buffer without locking it.

We've now decoded 3 frames and output 2. On the third call, the decoder
does the following:

1. Sees that it's time to output the P frame, so the old I frame is no
   longer useful for prediction.
2. Releases the old I buffer. As far as the codec is concerned now, that
   buffer no longer exists.
3. Locks the P buffer so it won't be lost when the vo releases it.
4. Returns the P buffer.

The same procedure works in principle for slices, except that the decoder
must keep both indirect buffers (from the vo, for the purpose of
returning them in order to show them) and automatic buffers (from the
link layer, for the purpose of prediction).
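As a rough sketch, the direct-rendering walkthrough above boils down to
the following bookkeeping. All names here (ipb_priv, the decode
placeholders) are assumptions, the flags are omitted (see the conversion
notes in the previous mail), and a real decoder would of course be driven
by the bitstream rather than hard-wired to three calls; only the
lock/release pattern is the point.

/* Sketch of the three-call walkthrough (coded order I,P,B; display order
 * I,B,P), showing only the buffer reference handling. */
struct ipb_priv {
    vp_buffer *ref;      /* current prediction reference (I, later P) */
    vp_buffer *next_out; /* decoded P frame waiting to be displayed   */
};

static vp_buffer *ipb_pull_image(vp_link *link, struct ipb_priv *p)
{
    /* Third call: output the stored P frame.  The old I frame is no
       longer needed for prediction, so release it; lock the P frame so
       the vo's eventual release doesn't free it while we still need it. */
    if (p->next_out) {
        vp_buffer *out = p->next_out;
        p->next_out = NULL;
        vp_release_buffer(p->ref);
        vp_lock_buffer(out);
        p->ref = out;
        return out;
    }

    vp_buffer *buf = vp_get_buffer(link, 0 /* flags omitted for brevity */);

    /* First call: decode the I frame, lock it as reference, return it. */
    if (!p->ref) {
        /* ... decode I frame into buf ... */
        vp_lock_buffer(buf);
        p->ref = buf;
        return buf;
    }

    /* Second call: decode the P frame and keep it for later, then decode
       the B frame into a fresh buffer and return that, without locking. */
    /* ... decode P frame into buf, predicting from p->ref ... */
    p->next_out = buf;

    vp_buffer *b = vp_get_buffer(link, 0);
    /* ... decode B frame into b, predicting from p->ref and buf ... */
    return b;
}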
Since the slices API is not yet finalized, it may turn out to be
preferable to merge these indirect/automatic buffer pairs into one.

Rich