From dalias at aerifal.cx Fri Sep 12 01:53:01 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Thu, 11 Sep 2003 19:53:01 -0400 Subject: [MPlayer-G2-dev] A new vf layer proposal... Message-ID: <20030911235301.GA24980@brightrain.aerifal.cx> Early in G2 development we discussed changes to the vf layer, and some good improvements were made over G1, but IMO lots of it's still ugly and unsatisfactory. Here are the main 3 problems: 1) No easy way to expand the filter layer to allow branching rather than just linear filter chain. Think of rendering PIP (picture in picture, very cool for PVR type use!) or fading between separate clips in a video editing program based on G2. 2) The old get_image approach to DR does not make it possible for the destination filter/vo to know when the source filter/vd is done using a particular reference image. This means DR will not be possible with emerging advanced codecs which are capable of using multiple reference frames instead of the simple I/P/B model. 3) The whole vf_process_image loop is ugly and makes artificial distinction between "pending" images (pulled) and the old G1 push-model drawing. Actually (3) has a lot to do with (1). So the proposal for the new vf layer is that everything be "pull" model, i.e. the player "pulls" an image from the last filter, which in turn (recursively) pulls an image from the previous filter (or perhaps from multiple previous filters!). Such a design was discussed early on in G2 development, but we ran into problems with auto-insertion of conversion filters. However I think the following proposal solves the problem. vf_get_buffer (2 cases): 1) Next filter can accept our output format. If the next filter implements get_buffer, call that. Otherwise get a buffer from a pool for this filter-connection, growing the pool if all the buffers are already in use. 2) Next filter doesn't like our output format. Insert appropriate conversion filter and then do (1). In all cases, buffers obtained from get_buffer have reference counts. When a buffer is first obtained, it has reference count=1, meaning that the destination filter has a hold on it because it wants the output which the source filter is drawing into the buffer. If the source filter does not need to use the image as a reference for future frames, it can just return the image to the caller and the destination filter will unlock the buffer (thus freeing it for reuse) when it's finished using the image as input. On the other hand, if the source filter needs to keep the image as a reference for future frames, it can add its own lock (vf_lock_buffer) so that the image still has a nonzero reference count once the destination filter finishes using it. In addition to the above behavior, flags can be used to signal who (source or dest) is allowed to read/modify the image, and when. Thus, we have the equivalencies (old system to new system): TEMP: source does not lock the buffer TEMP+READABLE: source does not lock the buffer, but is allowed to read it (i.e. it can't be in video mem) IP: source locks buffer and is allowed to read it STATIC: source locks buffer and is allowed to write to it again after passing it on to the destination STATIC+READABLE: source locks buffer and is allowed to read it and write again after passing it on These explanations are fairly rough; they're just meant to give an idea of how things convert over. 
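To make the reference-count lifecycle above more concrete, here is a minimal C sketch. Only vf_get_buffer and vf_lock_buffer are names from this proposal; the structure layout, the VF_READABLE flag, vf_release_buffer and the helper functions are made up purely for illustration:

    struct vf_buffer {
        unsigned char *planes[3];
        int stride[3];
        int refcount;              /* == 1 right after vf_get_buffer: the destination holds it */
        int flags;                 /* e.g. a hypothetical VF_READABLE */
        struct vf_instance *dest;  /* the filter this buffer was obtained for */
    };

    void vf_lock_buffer(struct vf_buffer *buf)
    {
        buf->refcount++;           /* source keeps the frame, e.g. as a prediction reference */
    }

    void vf_release_buffer(struct vf_buffer *buf)
    {
        if (--buf->refcount == 0)
            return_buffer_to_pool(buf);   /* hypothetical pool helper; buffer is free for reuse */
    }

    /* "IP"-equivalent decoder usage: lock the new frame as the prediction
     * reference, drop the lock on the previous one. */
    struct vf_buffer *decode_one_frame(struct vd_instance *vd)
    {
        struct vf_buffer *buf = vf_get_buffer(vd->next, VF_READABLE);  /* refcount == 1 */
        decode_picture_into(vd, buf);       /* hypothetical decoding helper */
        vf_lock_buffer(buf);                /* refcount == 2: we still need it for prediction */
        if (vd->ref)
            vf_release_buffer(vd->ref);     /* previous reference frame no longer needed */
        vd->ref = buf;
        return buf;                         /* destination drops its hold when it's done reading */
    }

The point is that neither side ever has to guess when the other is finished with an image; the reference count reaching zero is the only signal needed.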
There's probably a need for a function similar to get_buffer, but which instead notifies the next filter that you want to reuse a buffer you already have (from previous locking) as the output for another frame. But as far as I can tell, all of this is minor detail that doesn't affect the proposal as a whole. Now we get to the more interesting function..... vf_pull_image: This one also has a couple cases: 1) The filter calling vf_pull_image was just created (by auto-insertion) during the previous filter's pull_image, so we do NOT want to call the previous filter's pull_image again. Instead we've saved the mpi returned by the previous filter somewhere in our vf structure, so we just clear that from the vf structure and immediately return it. This is a minor hack to make auto-inserted filters work, but it's not visible from outside of the vf_pull_image function itself, so IMO it's not ugly. 2) We call the previous filter's pull_image and get an image with destination == the calling filter. Return the image to the caller. 3) The previous filter's pull_image returns an image whose destination is *not* the calling filter. This means a conversion filter must have been inserted during the previous filter's pull_image (as a result of it calling get_buffer). Summary: (may have 10l bugs :) if (vf->pending_mpi) { mpi = vf->pending_mpi; vf->pending_mpi = NULL; return mpi; } while ((mpi = src_vf->pull_image(vf, src_vf)) && mpi->dest_vf != vf) { mpi->dest_vf->pending_mpi = mpi; src_vf = mpi->dest_vf; } return mpi; A couple comments about this. The nicest part of the design is that vf_pull_image doesn't need to know so much about the 'chain' structure of the filters. It should be called with something like: mpi = vf_pull_image(vf, vf->prev); so that a filter which wants multiple sources could do something like: mpi1 = vf_pull_image(vf, vf->priv->src1); mpi1 = vf_pull_image(vf, vf->priv->src2); or whatever. Actually the source should probably be passed to vf_pull_image as a pointer so that it can be updated when a conversion filter is auto-inserted. Also note that my proposal has mpi structure containing pointers to the dest (and possibly also source) filters with which the buffer is associated. I'm not sure this is entirely necessary, but it seems like a good idea. Of course the best part of all is that, from the calling program's perspective and the filter authors' perspective, vf_pull_image looks like a 100% transparent pull-based recursive frame processing system. No ugly process_image/get_pending_image distinction and push/pull mix, just a sequential flow of frames. Comments? I believe there are a few details to be worked out, especially in what happens when a filter gets auto-inserted by get_buffer, how buffer pools work, etc., but the basic design is sound. Concerns about get_buffer (e.g. whether you release a buffer before or after you return it, and if after, how) have been eliminated by use of reference counts and there seem to be no major obstacles to implementing the vf_pull_image system as described. At some point on the not-too-distant future I'd like to begin porting filters (especially pullup) to G2 and writing mencoder-g2, so I hope we can discuss the matter of overhauling the vf layer soon and then get around to some actual coding. Rich P.S. One more thing: I made no mention of how configuration (especially output size and all the resize nonsense Arpi was talking about :) works. 
I'll be happy to discuss that later, but I'd like to see what Arpi suggests first since that's all very confusing to me, and I don't think the design I've described above makes much difference to it... From andrej at lucky.net Fri Sep 12 02:06:05 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Fri, 12 Sep 2003 03:06:05 +0300 Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... In-Reply-To: <20030911235301.GA24980@brightrain.aerifal.cx> References: <20030911235301.GA24980@brightrain.aerifal.cx> Message-ID: <20030912000605.GO76003@lucky.net> Hi, D Richard Felker III! Sometime (on Friday, September 12 at 2:42) I've received something... >Early in G2 development we discussed changes to the vf layer, and some >good improvements were made over G1, but IMO lots of it's still ugly >and unsatisfactory. Here are the main 3 problems: >1) No easy way to expand the filter layer to allow branching rather > than just linear filter chain. Think of rendering PIP (picture in > picture, very cool for PVR type use!) or fading between separate > clips in a video editing program based on G2. >2) The old get_image approach to DR does not make it possible for the > destination filter/vo to know when the source filter/vd is done > using a particular reference image. This means DR will not be > possible with emerging advanced codecs which are capable of using > multiple reference frames instead of the simple I/P/B model. >3) The whole vf_process_image loop is ugly and makes artificial > distinction between "pending" images (pulled) and the old G1 > push-model drawing. >Actually (3) has a lot to do with (1). >So the proposal for the new vf layer is that everything be "pull" >model, i.e. the player "pulls" an image from the last filter, which in >turn (recursively) pulls an image from the previous filter (or perhaps >from multiple previous filters!). Agree 100%, it's that I hoped already so we could build custom chain with branches - filter with more than one input(s) may pull all inputs at the same. :) When we pull images from two or more streams then we could have a sync problem but that problem could be solved if we run pull for "expected" time. So decoder (or other stream source) puts pts into image structure and then any filter could decide if that pts is over expected then just return null frame and keep that pulled until it'll fit expected time. May be I said it not very clean - sorry for my bad English then. :) [...rest is skipped, sorry...] Thank you all. Andriy. From dalias at aerifal.cx Fri Sep 12 04:44:39 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Thu, 11 Sep 2003 22:44:39 -0400 Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... In-Reply-To: <20030912000605.GO76003@lucky.net> References: <20030911235301.GA24980@brightrain.aerifal.cx> <20030912000605.GO76003@lucky.net> Message-ID: <20030912024439.GL250@brightrain.aerifal.cx> On Fri, Sep 12, 2003 at 03:06:05AM +0300, Andriy N. Gritsenko wrote: > Hi, D Richard Felker III! > > Sometime (on Friday, September 12 at 2:42) I've received something... > >Early in G2 development we discussed changes to the vf layer, and some > >good improvements were made over G1, but IMO lots of it's still ugly > >and unsatisfactory. Here are the main 3 problems: > > >1) No easy way to expand the filter layer to allow branching rather > > than just linear filter chain. Think of rendering PIP (picture in > > picture, very cool for PVR type use!) or fading between separate > > clips in a video editing program based on G2. 
> > >2) The old get_image approach to DR does not make it possible for the > > destination filter/vo to know when the source filter/vd is done > > using a particular reference image. This means DR will not be > > possible with emerging advanced codecs which are capable of using > > multiple reference frames instead of the simple I/P/B model. > > >3) The whole vf_process_image loop is ugly and makes artificial > > distinction between "pending" images (pulled) and the old G1 > > push-model drawing. > > >Actually (3) has a lot to do with (1). > > >So the proposal for the new vf layer is that everything be "pull" > >model, i.e. the player "pulls" an image from the last filter, which in > >turn (recursively) pulls an image from the previous filter (or perhaps > >from multiple previous filters!). > > Agree 100%, it's that I hoped already so we could build custom chain with > branches - filter with more than one input(s) may pull all inputs at the > same. :) > When we pull images from two or more streams then we could have a sync > problem but that problem could be solved if we run pull for "expected" > time. So decoder (or other stream source) puts pts into image structure > and then any filter could decide if that pts is over expected then just > return null frame and keep that pulled until it'll fit expected time. > May be I said it not very clean - sorry for my bad English then. :) Image structure already has pts, so that's no problem. :) Normally for combining filters you'd be using several fixed-fps streams (with same fps) as input so it wouldn't matter too much anyway -- variable fps is mostly for ugly low quality stuff like asf and rm or for handling made-for-tv stuff from mixed sources (24/30/60 fps). BTW there's also the question of how to do filters that have multiple outputs, and it's a little more complicated, but I think they can be done as two filters sorta linked together. In any case, there doesn't seem to be anything in my design that precludes filters with multiple outputs, so I'm happy. Thanks for the comments! Rich From andrej at lucky.net Fri Sep 12 06:17:28 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Fri, 12 Sep 2003 07:17:28 +0300 Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... In-Reply-To: <20030912024439.GL250@brightrain.aerifal.cx> References: <20030911235301.GA24980@brightrain.aerifal.cx> <20030912000605.GO76003@lucky.net> <20030912024439.GL250@brightrain.aerifal.cx> Message-ID: <20030912041728.GP76003@lucky.net> Hi, D Richard Felker III! Sometime (on Friday, September 12 at 5:34) I've received something... >On Fri, Sep 12, 2003 at 03:06:05AM +0300, Andriy N. Gritsenko wrote: >> Hi, D Richard Felker III! >> Sometime (on Friday, September 12 at 2:42) I've received something... >> >Early in G2 development we discussed changes to the vf layer, and some >> >good improvements were made over G1, but IMO lots of it's still ugly >> >and unsatisfactory. Here are the main 3 problems: >> >1) No easy way to expand the filter layer to allow branching rather >> > than just linear filter chain. Think of rendering PIP (picture in >> > picture, very cool for PVR type use!) or fading between separate >> > clips in a video editing program based on G2. >> >2) The old get_image approach to DR does not make it possible for the >> > destination filter/vo to know when the source filter/vd is done >> > using a particular reference image. 
This means DR will not be >> > possible with emerging advanced codecs which are capable of using >> > multiple reference frames instead of the simple I/P/B model. >> >3) The whole vf_process_image loop is ugly and makes artificial >> > distinction between "pending" images (pulled) and the old G1 >> > push-model drawing. >> >Actually (3) has a lot to do with (1). >> >So the proposal for the new vf layer is that everything be "pull" >> >model, i.e. the player "pulls" an image from the last filter, which in >> >turn (recursively) pulls an image from the previous filter (or perhaps >> >from multiple previous filters!). >> Agree 100%, it's that I hoped already so we could build custom chain with >> branches - filter with more than one input(s) may pull all inputs at the >> same. :) >> When we pull images from two or more streams then we could have a sync >> problem but that problem could be solved if we run pull for "expected" >> time. So decoder (or other stream source) puts pts into image structure >> and then any filter could decide if that pts is over expected then just >> return null frame and keep that pulled until it'll fit expected time. >> May be I said it not very clean - sorry for my bad English then. :) >Image structure already has pts, so that's no problem. :) Normally for >combining filters you'd be using several fixed-fps streams (with same >fps) as input so it wouldn't matter too much anyway -- variable fps is >mostly for ugly low quality stuff like asf and rm or for handling >made-for-tv stuff from mixed sources (24/30/60 fps). Not only. With video editing program you mentioned above you may want to use some filter to scale speed of fragment (my former fiancee likes to make music videos so I saw that many times) to be faster or slower and/or even mix two video streams with different time scaling. :) >BTW there's also the question of how to do filters that have multiple >outputs, and it's a little more complicated, but I think they can be >done as two filters sorta linked together. In any case, there doesn't >seem to be anything in my design that precludes filters with multiple >outputs, so I'm happy. I think that multiple-output filter is very rare case and I even cannot see real example but two-screen display of video or cloning for network streaming. :) >Thanks for the comments! Thank you too! With best wishes. Andriy. From atmosfear at users.sourceforge.net Mon Sep 15 15:59:18 2003 From: atmosfear at users.sourceforge.net (Felix Buenemann) Date: Mon, 15 Sep 2003 15:59:18 +0200 Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... In-Reply-To: <20030912041728.GP76003@lucky.net> References: <20030911235301.GA24980@brightrain.aerifal.cx> <20030912024439.GL250@brightrain.aerifal.cx> <20030912041728.GP76003@lucky.net> Message-ID: <200309151559.18477.atmosfear@users.sourceforge.net> On Friday 12 September 2003 06:17, Andriy N. Gritsenko wrote: > >BTW there's also the question of how to do filters that have multiple > >outputs, and it's a little more complicated, but I think they can be > >done as two filters sorta linked together. In any case, there doesn't > >seem to be anything in my design that precludes filters with multiple > >outputs, so I'm happy. > > I think that multiple-output filter is very rare case and I even > cannot see real example but two-screen display of video or cloning for > network streaming. :) Or think about a filter that splits up the picture to it's planes for eg. 
dumping them to files, but that probably could be done inside the filter without the need of further processsing the output. -- Best Regards, Atmos ____________________________________________ - MPlayer Developer - http://mplayerhq.hu/ - ____________________________________________ From atmosfear at users.sourceforge.net Mon Sep 15 16:15:20 2003 From: atmosfear at users.sourceforge.net (Felix Buenemann) Date: Mon, 15 Sep 2003 16:15:20 +0200 Subject: [MPlayer-G2-dev] [?] hit 3 flies - aspect ratio, resize, query_format In-Reply-To: <200308221948.h7MJmwZW017658@mail.mplayerhq.hu> References: <200308221948.h7MJmwZW017658@mail.mplayerhq.hu> Message-ID: <200309151615.20520.atmosfear@users.sourceforge.net> On Friday 22 August 2003 21:48, Arpi wrote: > I would extend vf's query_format() by a int p[6] parameter. > (actually int* size, which points to an array of 6 integers) > these 6 values are: > > buff_w, buff_h - w/h of the image buffer (real pixels) > disp_w, disp_h - pre-scaled w/h (recommended display size) [for startup] > want_w, want_h - wanted output size [for window resizing] > > > query_format() of 'normal' filters (which dont alter aspect ratio nor > buffer size) would just pass thru the pointer to next filter. > other filters shoudl implement it this way: > query_format(...){ > - change buff_w/h (only filters which chaneg buffer dimensios) > - change disp_w/h (only filters which change aspect ratio) > - call next filter's query_format() > - change want_w/h (only filters which change buffer dimensios) > } hmm, I see something missing here: Where do you account for the aspect-discrepancy of Screen-Resolution-Aspect vs. Physical-Displaydevice-Aspect. Eg. think of the case where displaying video at 1280x1024 on a 4:3 19" CRT, which is a very common case. In this case we have to do slight aspect correction in order to retain correct aspect ratio. Another place is TV-Out, often the display-area from the graphics card doesn't fill the whole visible area of the TV's CRT, so that there are black areas above and below (sometimes also at the sides). With mplayer G1 id'd simply measure the display area from the graphics card, with a ruler or sth. and give that to MPlayer, eg: mplayer -monitoraspect 40:27 movie.avi In most cases it would then make the black bars above and below the movie smaller, so aspect would be correct again and I'd be happy. Maybe I've kinda lamely coded the aspect code in G1, but at least it works as expected =) > also, the scaling flags of vfcap.h shoudl be reviewed: merging HSWSCALE_UP > and HWSCALE_DOWN, it ha sno sence to keep them separated. > query_format() implementations can now check source and dest resolution so > can decide if sw/hw scaling is possible or not. if they can do the scaling > (or resize), they should change the want_w/h values. otherwise left > unchanged. hmm, I'm not sure about this. The bad thing about eg. XV is that you can tell it to scale down in most cases but then if the adapter can't do it, it'll simply crop away part of the image to get the desired size. So the idea was to be able to specify if the card can do hw downsizing/upsizing using the selected vo, so we can downsize/upsize by swscaler if needed whilst using the faster hw scaler for upscaling/downscaling. But maybe I misunderstood you Arpi. 
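To put a number on the 1280x1024-on-a-4:3-CRT case mentioned above, here is roughly the correction that aspect code has to apply (just a sketch; the variable names are made up, and 40:27 is the TV-out figure quoted above):

    static void correct_for_monitor(double movie_aspect, int *win_w, int *win_h)
    {
        double monitor_aspect = 4.0 / 3.0;   /* or 40.0/27.0 for the TV-out case */
        int xres = 1280, yres = 1024;

        /* pixels in this mode are not square: each one is ~6.7% wider than tall */
        double pixel_aspect = monitor_aspect * yres / xres;   /* (4/3)*1024/1280 = 1.0667 */

        *win_w = xres;
        *win_h = (int)(xres / movie_aspect * pixel_aspect + 0.5);
        /* a 4:3 movie: 1280 / (4/3) * 1.0667 = 1024, i.e. fullscreen; assuming
         * square pixels would give 960 and the picture would look ~7% too flat */
    }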
-- Best Regards, Atmos ____________________________________________ - MPlayer Developer - http://mplayerhq.hu/ - ____________________________________________ From dalias at aerifal.cx Mon Sep 15 18:17:56 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 15 Sep 2003 12:17:56 -0400 Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... In-Reply-To: <200309151559.18477.atmosfear@users.sourceforge.net> References: <20030911235301.GA24980@brightrain.aerifal.cx> <20030912024439.GL250@brightrain.aerifal.cx> <20030912041728.GP76003@lucky.net> <200309151559.18477.atmosfear@users.sourceforge.net> Message-ID: <20030915161756.GX250@brightrain.aerifal.cx> On Mon, Sep 15, 2003 at 03:59:18PM +0200, Felix Buenemann wrote: > On Friday 12 September 2003 06:17, Andriy N. Gritsenko wrote: > > >BTW there's also the question of how to do filters that have multiple > > >outputs, and it's a little more complicated, but I think they can be > > >done as two filters sorta linked together. In any case, there doesn't > > >seem to be anything in my design that precludes filters with multiple > > >outputs, so I'm happy. > > > > I think that multiple-output filter is very rare case and I even > > cannot see real example but two-screen display of video or cloning for > > network streaming. :) > Or think about a filter that splits up the picture to it's planes for eg. > dumping them to files, but that probably could be done inside the filter > without the need of further processsing the output. Or think about mplayer-g2-PVR, with simultaneous display and encoding of video. Maybe you have something like: ,-> vf_madei -> vo / tvin -> vd_raw < \ `-> ve ...or... ,-> vo / tvin -> vd_raw -> vf_pullup < \ `-> vf_scale -> ve :)))))))) Rich From dalias at aerifal.cx Mon Sep 15 18:35:03 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 15 Sep 2003 12:35:03 -0400 Subject: [MPlayer-G2-dev] [?] hit 3 flies - aspect ratio, resize, query_format In-Reply-To: <200309151615.20520.atmosfear@users.sourceforge.net> References: <200308221948.h7MJmwZW017658@mail.mplayerhq.hu> <200309151615.20520.atmosfear@users.sourceforge.net> Message-ID: <20030915163503.GY250@brightrain.aerifal.cx> On Mon, Sep 15, 2003 at 04:15:20PM +0200, Felix Buenemann wrote: > On Friday 22 August 2003 21:48, Arpi wrote: > > I would extend vf's query_format() by a int p[6] parameter. > > (actually int* size, which points to an array of 6 integers) > > these 6 values are: > > > > buff_w, buff_h - w/h of the image buffer (real pixels) > > disp_w, disp_h - pre-scaled w/h (recommended display size) [for startup] > > want_w, want_h - wanted output size [for window resizing] > > > > > > query_format() of 'normal' filters (which dont alter aspect ratio nor > > buffer size) would just pass thru the pointer to next filter. > > other filters shoudl implement it this way: > > query_format(...){ > > - change buff_w/h (only filters which chaneg buffer dimensios) > > - change disp_w/h (only filters which change aspect ratio) > > - call next filter's query_format() > > - change want_w/h (only filters which change buffer dimensios) > > } > > hmm, I see something missing here: Where do you account for the > aspect-discrepancy of Screen-Resolution-Aspect vs. > Physical-Displaydevice-Aspect. Eg. think of the case where displaying video > at 1280x1024 on a 4:3 19" CRT, which is a very common case. In this case we > have to do slight aspect correction in order to retain correct aspect ratio. 
The vo simply sets the wanted display width/height based on monitor aspect and disp_w/disp_h from filters. No problem there. However IMO this whole system where window resizes propagate back through the filter chain is a very very bad idea. Consider the following example filter chains: 1) scale=480:480,(deinterlace/ivtc) If the user resizes the window, Arpi's proposal would have the first scale filter get reconfigured for the new output size! If any vertical resizing takes place, this will ruin the deinterlacing!! 2) rgb codec => scale,denoise3d If user resizes the window, the scale filter will resize the image before denoising rather than just converting colorspace! This will ruin the denoising process. I'm sure there are more examples too. In all these cases, the basic problem is the same -- when resizes propagate back through the filter chain, the video gets resized at the wrong point, and the output is wrong. It's a similar problem to how mencoder skips frames at the beginning of the filterchain rather than at the end. IMO any final preparation for display like this needs to be done at the very end of the filter chain. I'd even suggest putting swscaler support in vf_vo2 rather than loading a filter for window resizing. That way the filter chain doesn't have to be aware of any silly resize signals. IIRC Arpi also considered putting swscaler in vf_vo2 when we were talking about it on IRC. > Another place is TV-Out, often the display-area from the graphics card doesn't > fill the whole visible area of the TV's CRT, so that there are black areas > above and below (sometimes also at the sides). With mplayer G1 id'd simply This means your TVout is horribly misconfigured! Try changing the timings (with Matrox, sync pulse length is used to control the black border size when in TV mode, so it may be similar on other cards). I've never seen a card which forces black borders when used on windows, so if you're getting black borders, I really do expect a driver/configuration problem, not a fundamental limit of the hardware. > > also, the scaling flags of vfcap.h shoudl be reviewed: merging HSWSCALE_UP > > and HWSCALE_DOWN, it ha sno sence to keep them separated. > > query_format() implementations can now check source and dest resolution so > > can decide if sw/hw scaling is possible or not. if they can do the scaling > > (or resize), they should change the want_w/h values. otherwise left > > unchanged. > hmm, I'm not sure about this. The bad thing about eg. XV is that you can tell > it to scale down in most cases but then if the adapter can't do it, it'll > simply crop away part of the image to get the desired size. So the idea was > to be able to specify if the card can do hw downsizing/upsizing using the > selected vo, so we can downsize/upsize by swscaler if needed whilst using the > faster hw scaler for upscaling/downscaling. But maybe I misunderstood you > Arpi. I think you just misunderstood. Arpi was saying that config would just return failure for scaling down, if the card didn't support it, rather than using a flag. Rich From atmosfear at users.sourceforge.net Mon Sep 15 18:52:09 2003 From: atmosfear at users.sourceforge.net (Felix Buenemann) Date: Mon, 15 Sep 2003 18:52:09 +0200 Subject: [MPlayer-G2-dev] [?] 
hit 3 flies - aspect ratio, resize, query_format In-Reply-To: <20030915163503.GY250@brightrain.aerifal.cx> References: <200308221948.h7MJmwZW017658@mail.mplayerhq.hu> <200309151615.20520.atmosfear@users.sourceforge.net> <20030915163503.GY250@brightrain.aerifal.cx> Message-ID: <200309151852.09495.atmosfear@users.sourceforge.net> On Monday 15 September 2003 18:35, D Richard Felker III wrote: > On Mon, Sep 15, 2003 at 04:15:20PM +0200, Felix Buenemann wrote: > > On Friday 22 August 2003 21:48, Arpi wrote: [...Arpis proposal...] > > hmm, I see something missing here: Where do you account for the > > aspect-discrepancy of Screen-Resolution-Aspect vs. > > Physical-Displaydevice-Aspect. Eg. think of the case where displaying > > video at 1280x1024 on a 4:3 19" CRT, which is a very common case. In this > > case we have to do slight aspect correction in order to retain correct > > aspect ratio. > > The vo simply sets the wanted display width/height based on monitor > aspect and disp_w/disp_h from filters. No problem there. However IMO > this whole system where window resizes propagate back through the > filter chain is a very very bad idea. Consider the following example > filter chains: > > 1) scale=480:480,(deinterlace/ivtc) > > If the user resizes the window, Arpi's proposal would have the first > scale filter get reconfigured for the new output size! If any vertical > resizing takes place, this will ruin the deinterlacing!! > > 2) rgb codec => scale,denoise3d > > If user resizes the window, the scale filter will resize the image > before denoising rather than just converting colorspace! This will > ruin the denoising process. > > I'm sure there are more examples too. In all these cases, the basic > problem is the same -- when resizes propagate back through the filter > chain, the video gets resized at the wrong point, and the output is > wrong. It's a similar problem to how mencoder skips frames at the > beginning of the filterchain rather than at the end. IMO any final > preparation for display like this needs to be done at the very end of > the filter chain. I'd even suggest putting swscaler support in vf_vo2 > rather than loading a filter for window resizing. That way the filter > chain doesn't have to be aware of any silly resize signals. IIRC Arpi > also considered putting swscaler in vf_vo2 when we were talking about > it on IRC. You are totally right, scaling at the wrong point in the chain can mess uo things badly. I, too, think it's a good idea to put a scaler in the video out filter. > > Another place is TV-Out, often the display-area from the graphics card > > doesn't fill the whole visible area of the TV's CRT, so that there are > > black areas above and below (sometimes also at the sides). With mplayer > > G1 id'd simply > > This means your TVout is horribly misconfigured! Try changing the > timings (with Matrox, sync pulse length is used to control the black > border size when in TV mode, so it may be similar on other cards). > I've never seen a card which forces black borders when used on > windows, so if you're getting black borders, I really do expect a > driver/configuration problem, not a fundamental limit of the hardware. Oh, I'm not talking about vanilla ice graphics cards, I'm talking about shit hardware, like Savage/MX Chip (my laptop), or Gefore 4 MX on Windows with cheapo TV-Codec (the one in a friends PC). 
> > Rich > -- Best Regards, Atmos ____________________________________________ - MPlayer Developer - http://mplayerhq.hu/ - ____________________________________________ From gsbarbieri at yahoo.com.br Mon Sep 15 22:32:51 2003 From: gsbarbieri at yahoo.com.br (=?iso-8859-1?q?Gustavo=20Sverzut=20Barbieri?=) Date: Mon, 15 Sep 2003 17:32:51 -0300 (ART) Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... In-Reply-To: <20030915161756.GX250@brightrain.aerifal.cx> Message-ID: <20030915203251.78909.qmail@web20904.mail.yahoo.com> --- D Richard Felker III escreveu: > On Mon, Sep 15, 2003 at 03:59:18PM +0200, Felix Buenemann wrote: > > On Friday 12 September 2003 06:17, Andriy N. Gritsenko wrote: > > > >BTW there's also the question of how to do filters that have > multiple > > > >outputs, and it's a little more complicated, but I think they > can be > > > >done as two filters sorta linked together. In any case, there > doesn't > > > >seem to be anything in my design that precludes filters with > multiple > > > >outputs, so I'm happy. > > > > > > I think that multiple-output filter is very rare case and I > even > > > cannot see real example but two-screen display of video or > cloning for > > > network streaming. :) > > Or think about a filter that splits up the picture to it's planes > for eg. > > dumping them to files, but that probably could be done inside the > filter > > without the need of further processsing the output. > > Or think about mplayer-g2-PVR, with simultaneous display and encoding > of video. Maybe you have something like: > > ,-> vf_madei -> vo > / > tvin -> vd_raw < > \ > `-> ve > > ...or... > > ,-> vo > / > tvin -> vd_raw -> vf_pullup < > \ > `-> vf_scale -> ve But here maybe we need something different, since would be cool if the VO could go back and forth, while VE keeps recording... (Tivo like) Anyway, this can be great to do Video Walls, so video is cropped and exported to different video heads, ie: .---> Top_Left | +---> Top_Right | vf_videowall(4) -+ | +---> Bottom_Left | `---> Bottom_Right Maybe something to crop each video display border out (Ie, using monitors, crop around 2") Gustavo _______________________________________________________________________ Desafio AntiZona: participe do jogo de perguntas e respostas que vai dar um Renault Clio, computadores, c?meras digitais, videogames e muito mais! www.cade.com.br/antizona From atmosfear at users.sourceforge.net Tue Sep 16 00:48:05 2003 From: atmosfear at users.sourceforge.net (Felix Buenemann) Date: Tue, 16 Sep 2003 00:48:05 +0200 Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... In-Reply-To: <20030915203251.78909.qmail@web20904.mail.yahoo.com> References: <20030915203251.78909.qmail@web20904.mail.yahoo.com> Message-ID: <200309160048.05307.atmosfear@users.sourceforge.net> On Monday 15 September 2003 22:32, Gustavo Sverzut Barbieri wrote: [...some ugly wrapped text...] > > > > Or think about mplayer-g2-PVR, with simultaneous display and encoding > > of video. Maybe you have something like: > > > > ,-> vf_madei -> vo > > / > > tvin -> vd_raw < > > \ > > `-> ve > > > > ...or... > > > > ,-> vo > > / > > tvin -> vd_raw -> vf_pullup < > > \ > > `-> vf_scale -> ve > > But here maybe we need something different, since would be cool if the > VO could go back and forth, while VE keeps recording... 
(Tivo like) It would probably be easier, if you record to file with one process and playback the currently recording file using another process =) > > Anyway, this can be great to do Video Walls, so video is cropped and > exported to different video heads, ie: > > .---> Top_Left > > +---> Top_Right > > vf_videowall(4) -+ > > +---> Bottom_Left > > `---> Bottom_Right > > Maybe something to crop each video display border out (Ie, using > monitors, crop around 2") hmm, do you remember this project where they animated a whole building using lights which represented grey-sahed pixels? ah, Blinkenlights Arcade, anyways there was an MPlayer plugin for it. -> http://www.blinkenlights.de/ > Gustavo -- Best Regards, Atmos ____________________________________________ - MPlayer Developer - http://mplayerhq.hu/ - ____________________________________________ From dalias at aerifal.cx Tue Sep 16 03:29:22 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 15 Sep 2003 21:29:22 -0400 Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... In-Reply-To: <200309160048.05307.atmosfear@users.sourceforge.net> References: <20030915203251.78909.qmail@web20904.mail.yahoo.com> <200309160048.05307.atmosfear@users.sourceforge.net> Message-ID: <20030916012922.GE250@brightrain.aerifal.cx> On Tue, Sep 16, 2003 at 12:48:05AM +0200, Felix Buenemann wrote: > On Monday 15 September 2003 22:32, Gustavo Sverzut Barbieri wrote: > [...some ugly wrapped text...] > > > > > > Or think about mplayer-g2-PVR, with simultaneous display and encoding > > > of video. Maybe you have something like: > > > > > > ,-> vf_madei -> vo > > > / > > > tvin -> vd_raw < > > > \ > > > `-> ve > > > > > > ...or... > > > > > > ,-> vo > > > / > > > tvin -> vd_raw -> vf_pullup < > > > \ > > > `-> vf_scale -> ve > > > > But here maybe we need something different, since would be cool if the > > VO could go back and forth, while VE keeps recording... (Tivo like) > > It would probably be easier, if you record to file with one process and > playback the currently recording file using another process =) It would also require more cpu power! Decoding 640x480 (or higher!) video takes a lot more than just copying it to video memory. IMO it would be worthwhile to have both approaches. Really what I described isn't like a PVR so much as just a simple video recorder that lets you watch while you record...for a full PVR you'd want to play from the file so you could seek during recording. > > Anyway, this can be great to do Video Walls, so video is cropped and > > exported to different video heads, ie: > > > > .---> Top_Left > > > > +---> Top_Right > > > > vf_videowall(4) -+ > > > > +---> Bottom_Left > > > > `---> Bottom_Right > > > > Maybe something to crop each video display border out (Ie, using > > monitors, crop around 2") > hmm, do you remember this project where they animated a whole building using > lights which represented grey-sahed pixels? ah, Blinkenlights Arcade, anyways > there was an MPlayer plugin for it. -> http://www.blinkenlights.de/ Yep. :) Rich From gsbarbieri at yahoo.com.br Wed Sep 17 00:14:48 2003 From: gsbarbieri at yahoo.com.br (=?iso-8859-1?q?Gustavo=20Sverzut=20Barbieri?=) Date: Tue, 16 Sep 2003 19:14:48 -0300 (ART) Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... 
In-Reply-To: <200309160048.05307.atmosfear@users.sourceforge.net> Message-ID: <20030916221448.85691.qmail@web20902.mail.yahoo.com> --- Felix Buenemann escreveu: > On Monday 15 September 2003 22:32, Gustavo Sverzut Barbieri wrote: > [...some ugly wrapped text...] > > > > > > Or think about mplayer-g2-PVR, with simultaneous display and > encoding > > > of video. Maybe you have something like: > > > > > > ,-> vf_madei -> vo > > > / > > > tvin -> vd_raw < > > > \ > > > `-> ve > > > > > > ...or... > > > > > > ,-> vo > > > / > > > tvin -> vd_raw -> vf_pullup < > > > \ > > > `-> vf_scale -> ve > > > > But here maybe we need something different, since would be cool if > the > > VO could go back and forth, while VE keeps recording... (Tivo like) > > It would probably be easier, if you record to file with one process > and > playback the currently recording file using another process =) > > > > > Anyway, this can be great to do Video Walls, so video is cropped > and > > exported to different video heads, ie: > > > > .---> Top_Left > > > > +---> Top_Right > > > > vf_videowall(4) -+ > > > > +---> Bottom_Left > > > > `---> Bottom_Right > > > > Maybe something to crop each video display border out (Ie, using > > monitors, crop around 2") > hmm, do you remember this project where they animated a whole > building using > lights which represented grey-sahed pixels? ah, Blinkenlights Arcade, > anyways > there was an MPlayer plugin for it. -> http://www.blinkenlights.de/ > Wow! Crazy! Gustavo _______________________________________________________________________ Desafio AntiZona: participe do jogo de perguntas e respostas que vai dar um Renault Clio, computadores, c?meras digitais, videogames e muito mais! www.cade.com.br/antizona From dalias at aerifal.cx Wed Sep 17 06:37:43 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Wed, 17 Sep 2003 00:37:43 -0400 Subject: [MPlayer-G2-dev] A new vf layer proposal... In-Reply-To: <20030911235301.GA24980@brightrain.aerifal.cx> References: <20030911235301.GA24980@brightrain.aerifal.cx> Message-ID: <20030917043743.GL250@brightrain.aerifal.cx> OK, as much as I appreciate the tangential discussion about branching filters and multi-monitor displays and stuff... What I was really looking for was comments on the design, whether there are any obvious mistakes or problems I'm overlooking, etc. It might also be nice to hear some thoughts on the way filter config and format negotiation should work, since I didn't address that at all, as well as how window resizing should be handled (special dynamically inserted scale filter, or swscaler code in vf_vo2) and how format conversion should be handled between filters... My original idea was to put in some minor, isolated hacks to allow a (swscaler) filter to be inserted dynamically between the called filter and the calling filter. But since then I've had a couple more ideas... 1) Put format conversion directly in the vf layer, like: vf_pull_image(vf_instance_t *vf_src, *vf_dest) { mpi = vf_src->pull_image(vf_src, vf_dest); if (mpi->imgfmt not usable by vf_dest) { mpi2 = vf_get_buffer(vf_dest, ...); swScale(mpi2, mpi, ...); /* Michael's coolJavaCaps! */ vf_release_buffer(mpi); return mpi2; } return mpi; } 2) Require filters to accept any input format (but their get_buffer can restrict which formats allow DR, and their query_format can report which formats are natively supported. Then, once a filter gets an image from vf_pull_image, it checks to see if it can use the format as-is. 
If not, it calls a conversion function in the vf api: mpi = vf_pull_image(vf->prev, vf); if (mpi->imgfmt not supported) mpi = vf_convert(vf, mpi, fmt); Don't assume I'm being sloppy in the way these functions are called for the sake of providing a simplified example. The idea is that the buffer lock count and ownership semantics would automatically "do the right thing" with code just about as simple as what I've written above. For instance, vf_convert woulc call vf_get_buffer(vf, ...), allowing direct rendering, and would decrement the lock count on the mpi passed into it (releasing this buffer as long as the current vf or the one that returned it hadn't established an extra lock with vf_lock_buffer). Actually, with this in mind, approaches (1) and (2) above are basically identical. It's just a matter of where the conversion code goes. After thinking about things more, I have various reasons of preferring approaches like the following instead of dynamically inserting a scale filter wherever there's a format mismatch. Please reply and make comments if you particularly like or dislike any of the approaches I've described. Rich From andrej at lucky.net Wed Sep 17 12:46:49 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Wed, 17 Sep 2003 13:46:49 +0300 Subject: [MPlayer-G2-dev] Re: A new vf layer proposal... In-Reply-To: <20030917043743.GL250@brightrain.aerifal.cx> References: <20030911235301.GA24980@brightrain.aerifal.cx> <20030917043743.GL250@brightrain.aerifal.cx> Message-ID: <20030917104649.GA14942@lucky.net> Hi, D Richard Felker III! Sometime (on Wednesday, September 17 at 7:26) I've received something... >OK, as much as I appreciate the tangential discussion about branching >filters and multi-monitor displays and stuff... [.......] >After thinking about things more, I have various reasons of preferring >approaches like the following instead of dynamically inserting a scale >filter wherever there's a format mismatch. Please reply and make >comments if you particularly like or dislike any of the approaches >I've described. I like both very much and I wish it'll be done! I've thought about some alike that already and you've described it well. :) With best wishes. Andriy. From r_togni at libero.it Tue Sep 23 22:49:13 2003 From: r_togni at libero.it (Roberto Togni) Date: Tue, 23 Sep 2003 22:49:13 +0200 Subject: [MPlayer-G2-dev] Native codecs and g2 Message-ID: <20030923204913.GA1068@tower2.myhome.qwe> Any news about codecs in g2? Some time ago A'rpi was thinking about moving to a get/release buffer method instead of get_image/mpi as g1. At the end, IIRC, he said he's going to stay with mpi. It is a final decision or it's still a work in progress item? I was thinking about moving most of the native codecs from libmpcodecs to ffmpeg/libavcodec since a long time. This could be the right time to do it, and it's even more true if codecs have to be modified to be used in g2. The codecs i'm thinking about are old QT codecs (rle, rpza, smc, 8bps, ...) and old vfw codecs (Cinepak, cvid, msrle, ...), but probably every native codec can be moved (lcl and lzo depends on external libs, i have to check how ffmpeg handles it). Some codecs are already available in libavcodec, even if MPlayer uses its own version (various adpcm audio codecs, cyuv, realaudio 1.0 and 2.0, and probably others i don't remember now). 
Pro: less code to port and mantain, more people will be able to use them and fix bugs Cons: MPlayer without libavcodec will be unable to play most files (unless you use binary codecs), but i think that using MPlayer without libavcodec is not a wise choice even now. What's your opinion about it? Ciao, Roberto From michaelni at gmx.at Tue Sep 23 23:08:58 2003 From: michaelni at gmx.at (Michael Niedermayer) Date: Tue, 23 Sep 2003 23:08:58 +0200 Subject: [MPlayer-G2-dev] Native codecs and g2 In-Reply-To: <20030923204913.GA1068@tower2.myhome.qwe> References: <20030923204913.GA1068@tower2.myhome.qwe> Message-ID: <200309232308.58755.michaelni@gmx.at> Hi On Tuesday 23 September 2003 22:49, Roberto Togni wrote: > Any news about codecs in g2? > > Some time ago A'rpi was thinking about moving to a get/release buffer > method instead of get_image/mpi as g1. At the end, IIRC, he said he's > going to stay with mpi. > It is a final decision or it's still a work in progress item? > > I was thinking about moving most of the native codecs from libmpcodecs > to ffmpeg/libavcodec since a long time. > This could be the right time to do it, and it's even more true if > codecs have to be modified to be used in g2. > > The codecs i'm thinking about are old QT codecs (rle, rpza, smc, > 8bps, ...) and old vfw codecs (Cinepak, cvid, msrle, ...), but probably > every native codec can be moved (lcl and lzo depends on external libs, > i have to check how ffmpeg handles it). > Some codecs are already available in libavcodec, even if MPlayer uses > its own version (various adpcm audio codecs, cyuv, realaudio 1.0 and > 2.0, and probably others i don't remember now). cinepack, msvidc, msrle have been ported to ffdshow/libavcodec by someone (milan cutka maybe), melanson said (a long time ago) that he would port them to ffmpeg/libavcodec [...] -- Michael level[i]= get_vlc(); i+=get_vlc(); (violates patent EP0266049) median(mv[y-1][x], mv[y][x-1], mv[y+1][x+1]); (violates patent #5,905,535) buf[i]= qp - buf[i-1]; (violates patent #?) for more examples, see http://mplayerhq.hu/~michael/patent.html stop it, see http://petition.eurolinux.org & http://petition.ffii.org/eubsa/en From dalias at aerifal.cx Tue Sep 23 23:47:17 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Tue, 23 Sep 2003 17:47:17 -0400 Subject: [MPlayer-G2-dev] Native codecs and g2 In-Reply-To: <20030923204913.GA1068@tower2.myhome.qwe> References: <20030923204913.GA1068@tower2.myhome.qwe> Message-ID: <20030923214717.GS2856@brightrain.aerifal.cx> On Tue, Sep 23, 2003 at 10:49:13PM +0200, Roberto Togni wrote: > Any news about codecs in g2? > > Some time ago A'rpi was thinking about moving to a get/release buffer > method instead of get_image/mpi as g1. At the end, IIRC, he said he's > going to stay with mpi. > It is a final decision or it's still a work in progress item? Read my recent vf proposals. IMO we much switch to get/release buffer rather than the g1-style system; otherwise G2 will suck and won't be able to do any DR with new multi-reference-frame codecs or temporal filters. > I was thinking about moving most of the native codecs from libmpcodecs > to ffmpeg/libavcodec since a long time. > This could be the right time to do it, and it's even more true if > codecs have to be modified to be used in g2. Agree. > The codecs i'm thinking about are old QT codecs (rle, rpza, smc, > 8bps, ...) 
and old vfw codecs (Cinepak, cvid, msrle, ...), but probably > every native codec can be moved (lcl and lzo depends on external libs, > i have to check how ffmpeg handles it). IMO fix the external lib dependencies (i.e. remove them :). > Some codecs are already available in libavcodec, even if MPlayer uses > its own version (various adpcm audio codecs, cyuv, realaudio 1.0 and > 2.0, and probably others i don't remember now). > > Pro: less code to port and mantain, more people will be able to use > them and fix bugs > Cons: MPlayer without libavcodec will be unable to play most files > (unless you use binary codecs), but i think that using MPlayer without > libavcodec is not a wise choice even now. IMO this is not a con. :) Rich From dalias at aerifal.cx Sun Sep 28 02:02:04 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Sat, 27 Sep 2003 20:02:04 -0400 Subject: [MPlayer-G2-dev] more on g2 & video filters Message-ID: <20030928000204.GA19033@brightrain.aerifal.cx> Here's an updated and more thorough g2 video layer design. For consistency and for the sake of having a name to talk about the system, I'll call it "video pipeline" (vp). This encompasses filters as well as decoders, encoders, vo's, and the glue layer between it all. First, the structure of connection between the pieces. Nodes of the video pipeline can (in theory) be connected in many different ways. A simple implementation for the time being could be entirely linear like G1's filter chain, but the design should not require this. Thus, we'll talk about the video pipeline as a collection of nodes and links, where a link consists of a source, a destination, and a link structure which ties the two together and assists in managing buffers. ---------------------------------------------------------------------- The first topic, and probably the most important, is buffer management. G1 did a remarkably good job compared to any other player at the time, but it still has some big limitations. In particular: 1. There's no clear rule on what a filter is allowed to do with an image after calling vf_put_image on it. Can it still read the contents? Can it write more? Can it call vf_put_image more than once on the same mpi without calling vf_get_image again? In general the answer is probably no, but several filters (including ones I wrote) do stuff like this, and it's just not clear what's ok and what's not. 2. A filter that gives out DR buffers from its get_image has no way of knowing when the caller is done with those buffers. In theory, put_image should be a good indication (but see (1) above), and even worse, if the previous filter/dec_video drops frames, then put_image will never be called. 3. A video decoder (or filter) has no way of telling the system how long it needs its buffers preserved (for prediction or whatever). This works ok with standard IP[B] type codecs, but with more complicated prediction models it might totally break. So here's the new buffer model, based on get_buffer/release_buffer and reference counts: When a node of the video pipeline wants a buffer to return as output from its pull_frame (see next section below), it has three options for the buffer type: export, indirect, and direct. The first two are always available, but direct it only available if the destination's get_buffer function is willing to allocate a buffer with the desired format and flags (similar to G1). All buffers are associated with the link structure. Export -- almost exactly like in G1, with a few improvements. 
In the export case, the source filter is considered the owner of the buffer. It will be notified when the buffer's reference count reaches zero, so that it can in turn release any buffer it might be re-exporting (for example, the source buffer of which vf_crop is exporting a cropped version).

Direct -- destination sets up a buffer structure so that source can render directly into it. In this case, the destination is considered the owner of the buffer, and is notified when the buffer's reference count reaches zero, so that it can in turn release any buffer it might be using (for example, the full destination buffer, a small part of which vf_expand is making available to the source).

Indirect -- allocated and managed by the link layer.

The new video pipeline design also has certain flags analogous to the old image types and flags in G1:

Readable -- the buffer cannot reside in write-only memory, slow video memory, or anywhere that makes reading it slow, difficult, or restricted. This should always be set correctly when requesting a buffer, even though it generally applies only to direct-type buffers.

Preserve -- source can rely on destination not to clobber the buffer as long as it is valid. If destination is the owner of the buffer (direct-type), then it is still of course free to clobber the buffer after the reference count reaches zero.

Reusable -- source is free to continue writing to the buffer even after passing it on to destination (assuming it maintains a reference count) and to pass the same buffer to destination multiple times if desired. Note that as long as the reusable flag is NOT set, destination can rely on source not to clobber the buffer after source returns (the analogue of the preserve flag, in the reverse direction).

One should be particularly aware that the preserve flag applies to ALL image types, not just direct and indirect. That means that, unless source sets the preserve flag on exported buffers, destination is free to clobber them. (One example where this is useful is for rendering OSD onto the exported buffer of a filter before copying to video memory, instead of having to alpha-blend OSD in video memory.)

Now an overview of how to convert old G1-model filters/codecs to the new model:

IP[B] codecs -- call vp_get_buffer with readable+preserve flags for I and P frames, no flags for B frames. Increment the reference count for I/P frames (vp_lock_buffer) before returning, then release them (vp_release_buffer) when they're no longer needed for prediction. For the standard IP model this just involves keeping one buffer pointer in the codec's private data area (the previous I/P frame).

Filters and codecs that used the "static" buffer type in G1 (a code sketch follows below) -- on the first frame, call vp_get_buffer with preserve+reusable (and optionally readable) flags to get a buffer, then establish a lock (vp_lock_buffer) before returning the image to the caller so that the reference count does not reach zero. When rendering subsequent frames, don't call vp_get_buffer again; just increment the reference count (vp_lock_buffer) before returning, so that destination has an extra reference to release without the count reaching zero.

I-only codecs and filters that use temp buffers -- call vp_get_buffer with no flags and return the buffer after drawing into it.

This pretty much covers the G1 cases. Of course there are many more possibilities in G2 which weren't allowed in G1 and thus don't correspond to any old buffer model.
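A quick sketch of the "static buffer" recipe in C-like pseudocode: vp_get_buffer, vp_lock_buffer, vp_release_buffer and the readable/preserve/reusable flags are from the text above, while the pull_frame signature, the flag constants and the drawing helpers are assumptions:

    static struct vp_buffer *static_filter_pull_frame(struct vp_node *vf,
                                                      struct vp_link *out)
    {
        struct priv *p = vf->priv;

        if (!p->buf) {
            /* first frame: one long-lived buffer we keep drawing into */
            p->buf = vp_get_buffer(out, VP_PRESERVE | VP_REUSABLE | VP_READABLE);
            draw_whole_picture(p->buf);        /* hypothetical drawing helper */
        } else {
            update_changed_regions(p->buf);    /* hypothetical: redraw only what changed */
        }

        /* Extra lock before returning, so that when the destination calls
         * vp_release_buffer the count drops back to 1 (our hold), not to 0. */
        vp_lock_buffer(p->buf);
        return p->buf;
    }

    /* ...and at uninit time the filter drops its own long-lived reference:
     *     vp_release_buffer(p->buf);
     */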
----------------------------------------------------------------------

The second topic is flow of execution.

From a final destination (vo/ve), the pipeline is called in reverse order, using a "pull" model for obtaining frames. The main relevant function is vp_pull_frame, which takes as its argument a pointer to a link structure, and calls the source's pull_frame function asking for a frame for destination. A filter/codec's pull_frame, in turn, is responsible for obtaining a buffer (via vp_get_buffer), filling it with the picture, and returning it to the caller.

The reader would be advised to read and study the following example:

Filter chain: VD --L1--> Filter A --L2--> Filter B --L3--> VO

Let's say filter A is crop, exporting images, and B is scale, direct rendering into VO's video memory. L1, L2, L3 are the link structures.

Flow of execution:

    vp_pull_frame(L3)
      B->pull_frame(L3)
        sbuf=vp_pull_frame(L2)
          A->pull_frame(L2)
            sbuf=vp_pull_frame(L1)
              VD->pull_frame(L1)
                figure out video format, dimensions, etc.
                A->query_format [*1]
                B->query_format
                A->config
                B->config
                VO->query_format [*2]
                VO->config
                dbuf=vp_get_buffer(L1)
                  A->get_buffer(L1)
                    dr fails, return NULL
                  setup and return indirect image
                VD decodes video into dbuf
                return dbuf
            dbuf=vp_get_buffer(L2,export)
            setup export strides/pointers
            dbuf->priv->source=sbuf [*3]
            return dbuf
        dbuf=vp_get_buffer(L3)
          VO->get_buffer(L3)
            setup dr buffer and return it
        scale image from sbuf to dbuf
        vp_release_buffer(sbuf)
          A->release_buffer
            vp_release_buffer(...->priv->source)
        return dbuf

Notes:

[*1] query_format is called to determine which formats the destination supports natively. If no acceptable native formats are found, config will be called with whatever format source prefers to use, and destination will be responsible for converting images after receiving them from vp_pull_frame.

[*2] Here filter B waits to query which formats the VO supports until it is configured. Since scale's input and output formats are independent of one another, there's no need to know during scale's query_format which formats the VO supports.

[*3] Notice here that filter A does not release the source buffer it obtained from L1 at this time. Instead it stores it in the private data area for its exported destination buffer, so that it can release the source after (and only after) that buffer is no longer in use.

----------------------------------------------------------------------

The next (and maybe most controversial) topic: automatic format conversion!

Believe it or not, it is possible with the above design to dynamically insert a filter between source and destination during source's pull_frame. It only requires very minor hacks in vp_pull_frame. But instead I would like to propose doing away with auto-insertion of the scale filter for format conversion, and instead require filters/vo to accept any image format. Then, we introduce a new function to the vp api, vp_convert. I'll explain it with pseudocode:

    vp_buffer *vp_convert(vp_buffer *in)
    {
        vp_buffer *out = vp_get_buffer(in->link);
        swScaler(in, out);
        vp_release_buffer(in);
        return out;
    }

Note that each buffer stores which link it's associated with (in->link here). Of course vp_convert would also have to keep the sws context somewhere; in->link would be an appropriate place. Also note that this will direct-render the conversion if the calling filter supports direct rendering. :)

Now let's see how this affects format negotiation...
Now let's see how this affects format negotiation...

G1's model was to have query_format only return true if this filter and
ALL the subsequent filters/VO support the requested format. Since G1
could really only auto-insert scale at a few places in the chain
(beginning or end...?) this made sense. But a side effect of this
behavior is that conversion tends to get forced as early as possible in
the filter chain. Consider the example:

RGB codec ----> crop ----> YUV VO

If crop's query_format returns false because the VO does not support RGB,
then RGB->YUV conversion will happen before cropping. But this is stupid
and wastes cpu time.

Now suppose we're using the above model, with no auto-insertion of
filters. The RGB codec sees that crop's query_format returns false for
RGB, but since it can't output anything except RGB, it returns an RGB
image anyway. Now crop gets the RGB image, and it is free to crop the
image in RGB space, since it knows how to do that, totally oblivious to
what the VO wants. Then the VO gets an RGB image and has to call
vp_convert, which will direct-render the converted image into video
memory if possible.

On the other hand, vf_expand might want to be more careful about which
formats its destination filter supports natively (using query_format), so
that it doesn't force the destination to convert lots of useless black
bars.

Finally, one other benefit of converting as late as possible is that a
filter which drops frames might be able to determine that it wants to
drop the next frame before calling vp_convert. This could save a lot of
cpu time. But the following plan for frame dropping makes the situation
even better:

----------------------------------------------------------------------

What happens if, in addition to vp_pull_frame, we also have
vp_skip_frame, which notifies the source filter that the destination
wants to "run the pipeline" for a frame but throw away the output?

The idea is that this call could propagate back as far as possible
through the filter chain. It gives us the same behavior as -framedrop in
G1, but also much better: if a filter knows it's going to drop the next
frame before even looking at it, it can use vp_skip_frame instead of
vp_pull_frame, and earlier filters can skip processing the frame
altogether. BUT, if there are filters in the chain which cannot deal with
missing frames (for example, inverse telecine), they're not obligated to
propagate the vp_skip_frame call, and they can implement their skip_frame
with the same function as pull_frame. If vp_skip_frame propagates all the
way back to the decoder, and the next frame is a B frame (or the file is
I-only), then the decoder can of course skip decoding entirely!

As for filters which voluntarily drop frames (vf_decimate)... pull_frame
is required to return a valid image unless:

1. A fatal error occurred.
2. The end of the movie has been reached.

So, if a filter wants to drop some frames, that's ok, but it can't just
return NULL from pull_frame. Instead it could do something like the
following:

sbuf = vp_pull_frame(prev);
if (skip) {
    vp_release_buffer(sbuf);
    sbuf = vp_pull_frame(prev);
}

Or, if it knows which frame it wants to skip without looking at the image
contents first, it could call vp_skip_frame instead to save some cpu
time!

One more thing to keep in mind: PTS in G2 propagates through the video
pipeline! So, if a filter drops a frame, it has to add the relative_pts
of that frame to the relative_pts of the next non-skipped frame before
returning it! Otherwise you'll ruin A/V sync!
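Putting the two rules together (never return NULL for a dropped frame,
and carry the dropped frame's relative_pts forward), a decimate-style
filter's pull_frame might look roughly like this. should_drop() and the
decimate_priv struct are hypothetical, and relative_pts as a buffer field
is an assumption about the buffer layout; the pull/release/pull pattern
and the PTS accumulation are exactly what's described above.

/* Sketch of a vf_decimate-style pull_frame. */
static vp_buffer *decimate_pull_frame(vp_link *prev, struct decimate_priv *p)
{
    double carried_pts = 0.0;
    vp_buffer *buf = vp_pull_frame(prev);

    while (buf && should_drop(p, buf)) {     /* hypothetical drop decision */
        carried_pts += buf->relative_pts;    /* remember the dropped pts   */
        vp_release_buffer(buf);
        buf = vp_pull_frame(prev);           /* fetch a replacement frame  */
    }
    if (!buf)
        return NULL;                         /* fatal error or end of movie */

    buf->relative_pts += carried_pts;        /* keep A/V sync intact */
    return buf;
}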
----------------------------------------------------------------------

OK, I think that's about all for now. I hope this serves as a (mostly)
complete G2 video pipeline design document. If there are no objections, I
may start coding this within the next few weeks. Speak up now if you want
anything changed!!

Rich

From dalias at aerifal.cx Tue Sep 30 17:16:05 2003
From: dalias at aerifal.cx (D Richard Felker III)
Date: Tue, 30 Sep 2003 11:16:05 -0400
Subject: [MPlayer-G2-dev] more on g2 & video filters
In-Reply-To: <20030928000204.GA19033@brightrain.aerifal.cx>
References: <20030928000204.GA19033@brightrain.aerifal.cx>
Message-ID: <20030930151605.GP2856@brightrain.aerifal.cx>

A few additional things that came up while talking to Ivan on IRC:

* I forgot about slices.
* I should include examples of buffer alloc/release for IPB codecs.

About slices... actually I think there are two different types of slices:

1) Simple slices -- source gets a dummy buffer structure with no actual
pointers in it, and sends the picture to dest one slice at a time via
draw_slice, passing a pointer to the block to copy.

2) Hybrid slices -- source has an actual buffer (indirect or export, or
perhaps even direct) which it obtained with a slices flag set, and it
calls draw_slice with rectangles within this buffer as they become ready.

So now I need to explain why having both is beneficial...

Type (1), simple slices, corresponds to the way libmpeg2 slice-renders B
frames, reusing the same small buffer for each slice to ease stress on
the cache. It could also be used for other slice rendering, but type (2),
hybrid slices, has a big advantage. Suppose you have the following filter
chain:

vd ----> expand ----> vo

and suppose the vo's buffers are in video memory, so the (IP-type) codec
can't direct render through expand. Also suppose the user wants expand to
render OSD. Now, vd draws with slices to improve performance. The expand
filter could either pass them on to the vo, or get a direct (dr) buffer
from the vo and draw the slices right into it. And here's where the
performance benefit comes in. Let's say expand does direct rendering to
the vo. Expand's pull_image has called vd's pull_image, which responds
with a sequence of draw_slice calls and then returns the buffer
structure. If we're using hybrid slices, this returned buffer actually
has valid pointers to the decoded picture in it, so expand can use them
as the source for alpha-blending OSD onto the dr buffer in video memory.
No reads from video memory are needed!

Actually, for the OSD/expand example here, it should be possible to do
the alpha-blending during the actual draw_slice calls, as long as the OSD
contents are already known at that time. But there could be other
situations where it would be useful to do some computations at
slice-rendering time (certain localized computations that don't modify
the image -- maybe edge or combing detection) while the data is still in
the cache, and then use the results later, once the whole frame is
available, for more large-scale or global filtering.

A couple of proposed rules for slices:

1. No two slices for the same frame may overlap.
2. With hybrid-type slices, source may NOT modify any region of the
buffer which has already been submitted via draw_slice.

I'm still a bit undecided on rule #2; it may be better to make this
behavior dependent upon the "reusable" buffer flag.
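For what it's worth, here is a very rough sketch of the expand/OSD
scenario above, just to show where the hybrid buffer's valid pointers pay
off. The slice-callback plumbing is omitted entirely, and expand_priv,
render_osd(), and the flag/argument choices are all assumptions; the only
point is that the OSD blend reads from the returned (cache-resident)
hybrid buffer, never from video memory.

/* Sketch only: expand direct-renders into the vo and receives hybrid
 * slices from the decoder.  Incoming draw_slice rectangles are copied
 * straight into the vo's buffer by a slice callback (not shown); once
 * the whole frame has arrived, the hybrid buffer returned by the decoder
 * still has readable pointers, which serve as the OSD blend source. */
static vp_buffer *expand_pull_frame(vp_link *out, vp_link *prev,
                                    struct expand_priv *p)
{
    /* direct buffer from the vo (video memory); we only ever write to it */
    p->dbuf = vp_get_buffer(out, 0 /* direct, write-only */);
    if (!p->dbuf)
        return NULL;

    /* pulling from the decoder triggers its draw_slice calls, which our
       slice callback forwards into p->dbuf as the data arrives */
    vp_buffer *sbuf = vp_pull_frame(prev);
    if (!sbuf) {
        vp_release_buffer(p->dbuf);
        return NULL;
    }

    /* hybrid slices: sbuf's planes are still valid and readable, so blend
       OSD with sbuf as source and the video-memory buffer as destination */
    render_osd(p->osd, /* src */ sbuf, /* dst */ p->dbuf);

    vp_release_buffer(sbuf);   /* decoder's frame is no longer needed */
    return p->dbuf;
}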
Updated buffer types list (indirect renamed):

Direct -- owned by dest, allows direct rendering
Indirect -- owned by dest, but no pointers (slices required)
Export -- owned by source
Auto[matic] -- allocated/owned by the vp link layer

API issues: I'm a bit at a loss as to how to make the hybrid slices API
clean and usable (so that a filter/vd can detect the availability of the
different methods and select the optimal one), but plain simple slices
is, well, simple. You get the indirect buffer with vp_get_buffer, then
call vp_draw_slice to draw into/through it, and eventually return the
indirect buffer (not necessarily in order; out-of-order rendering is
possible just like with DR) to the caller. The problem with hybrid slices
is that some filters may only accept hybrid slices (if they need to write
into nonreadable memory but also need to be able to read the source image
again later -- see my OSD example above), while some filters and decoders
(e.g. libmpeg2 rendering a B frame) will prefer simple slices, and might
only support simple slices. The situation gets more complicated if slices
propagate through several filters.

A thought for Ivan: slices and XVMC.

From what I understand, the current XVMC code uses slices to pass the
motion vectors & dct coefficients to the vo, so that a function in the vo
will get called in coded-frame order. But since slices are used rather
than direct rendering, this wastes an extra copy (data has to be copied
from mplayer's codec's slice buffer to the shared-mem/X buffer). If we
find a good way to do hybrid slices with direct-type buffers, then the
codec could DR into shared mem to begin with, and the draw_slice calls
would just notify the vo that the data is ready. If someone's using XVMC,
they probably have a system that's barely fast enough for DVD playback,
so eliminating a copy could make the difference that allows
full-framerate DVD.

"Enough of slices..." or "Examples with IPB codecs" (also for Ivan)

Let's say we have a codec with IPB frames, rendering to the VO via direct
rendering. Coded frame order is IPB and display order is IBP (for the
first three frames).

The first time the codec's pull_image is called, the decoder...

1. Gets a direct buffer from the vo via vp_get_buffer.
2. Decodes the first I frame into the buffer.
3. Adds a reference count to the buffer with vp_lock_buffer so it can be
   kept for predicting the next frame, and stores the pointer in its
   private data area.
4. Returns the buffer.

Simple enough. Now the next call. The decoder...

1. Gets a direct buffer from the vo via vp_get_buffer.
2. Looks up the pointer for the previous I frame from its private data
   area.
3. Decodes the P frame into the new buffer.
4. Also stores the pointer to the new buffer in the private area.
5. Gets another direct buffer from the vo.
6. Renders the B frame into the new buffer based on the I and P buffers.
7. Returns the pointer to the B frame buffer without locking it.

We've now decoded 3 frames and output 2. On the third call, the decoder
does the following:

1. Sees that it's time to output the P frame, so the old I frame is no
   longer useful for prediction.
2. Releases the old I buffer. As far as the codec is concerned now, that
   buffer no longer exists.
3. Locks the P buffer so it won't be lost when the vo releases it.
4. Returns the P buffer.

The same procedure works in principle for slices, except that the decoder
must keep both indirect buffers (from the vo, for the purpose of
returning them in order to show them) and automatic buffers (from the
link layer, for the purpose of prediction).
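As a rough sketch, the direct-rendering walkthrough above boils down to
the following bookkeeping. All names here (ipb_priv, the decode
placeholders) are assumptions, the flags are omitted (see the conversion
notes in the previous mail), and a real decoder would of course be driven
by the bitstream rather than hard-wired to three calls; only the
lock/release pattern is the point.

/* Sketch of the three-call walkthrough (coded order I,P,B; display order
 * I,B,P), showing only the buffer reference handling. */
struct ipb_priv {
    vp_buffer *ref;      /* current prediction reference (I, later P) */
    vp_buffer *next_out; /* decoded P frame waiting to be displayed   */
};

static vp_buffer *ipb_pull_image(vp_link *link, struct ipb_priv *p)
{
    /* Third call: output the stored P frame.  The old I frame is no
       longer needed for prediction, so release it; lock the P frame so
       the vo's eventual release doesn't free it while we still need it. */
    if (p->next_out) {
        vp_buffer *out = p->next_out;
        p->next_out = NULL;
        vp_release_buffer(p->ref);
        vp_lock_buffer(out);
        p->ref = out;
        return out;
    }

    vp_buffer *buf = vp_get_buffer(link, 0 /* flags omitted for brevity */);

    /* First call: decode the I frame, lock it as reference, return it. */
    if (!p->ref) {
        /* ... decode I frame into buf ... */
        vp_lock_buffer(buf);
        p->ref = buf;
        return buf;
    }

    /* Second call: decode the P frame and keep it for later, then decode
       the B frame into a fresh buffer and return that, without locking. */
    /* ... decode P frame into buf, predicting from p->ref ... */
    p->next_out = buf;

    vp_buffer *b = vp_get_buffer(link, 0);
    /* ... decode B frame into b, predicting from p->ref and buf ... */
    return b;
}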
Since the slices API is not yet finalized, it may turn out to be
preferable to merge these indirect/automatic buffer pairs into one.

Rich