From alex at fsn.hu Tue Dec 9 10:58:03 2003 From: alex at fsn.hu (Alex Beregszaszi) Date: Tue, 9 Dec 2003 10:58:03 +0100 Subject: [MPlayer-G2-dev] basic g2 vp (video pipeline) code In-Reply-To: <20031102070921.GY2856@brightrain.aerifal.cx> References: <20031102070921.GY2856@brightrain.aerifal.cx> Message-ID: <20031209105803.4c3ab8fb.alex@fsn.hu> Hi, any progress on this lately? -- Alex Beregszaszi (MPlayer Core Developer -- http://www.mplayerhq.hu/) From dalias at aerifal.cx Tue Dec 9 11:18:20 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Tue, 9 Dec 2003 05:18:20 -0500 Subject: [MPlayer-G2-dev] basic g2 vp (video pipeline) code In-Reply-To: <20031209105803.4c3ab8fb.alex@fsn.hu> References: <20031102070921.GY2856@brightrain.aerifal.cx> <20031209105803.4c3ab8fb.alex@fsn.hu> Message-ID: <20031209101820.GZ7833@brightrain.aerifal.cx> On Tue, Dec 09, 2003 at 10:58:03AM +0100, Alex Beregszaszi wrote: > Hi, > > any progress on this lately? Ask again in about a week. :) Rich From saschasommer at freenet.de Tue Dec 9 14:25:21 2003 From: saschasommer at freenet.de (Sascha Sommer) Date: Tue, 9 Dec 2003 14:25:21 +0100 Subject: [MPlayer-G2-dev] Concerning TODO item re-think parent-child connection, move to vf_vo2.c maybe Message-ID: <016d01c3be57$e64946e0$566954d9@oemcomputer> Ok, let's see what we have and what we need. As far as I know there are two kinds of vo modules: exclusive and nonexclusive or windowed drivers. Exclusive drivers are fbdev, vesa, svga, dga etc. and the windowed x11, xv, directx, gl. Exclusive drivers are not really interesting for GUIs and their setup won't change. The image will always stay at the same position and will always have the same size. Their main advantage is the ability to change resolution and depth. This is completely different for windowed drivers. Their image position and size will change but the screen setup won't. I think all of them need nothing more than a handle to the window in which they should display the window, its x, y, width and height and the colorkey. Therefore I propose that the UI creates the windows for them and passes its parameters to the vo when they change. The control calls for this are already there: VOCTRL_SET_WINDOW VOCTRL_SET_COLORKEY VOCTRL_RESIZE_DEST But what about vidix and the kernel mode drivers like mga_vid and tdfx_vid? In G1 these can be used either as windowed drivers via [x/win]vidix or as subdriver of an exclusive driver like vo vesa. Using them as windowed drivers in G2 is no problem as vo xvidix's job is already done by the GUI. One would only need to adjust their api to match the libvo2 api. To avoid the code duplication when using them in exclusive mode as it exists in G1 I would propose to make a special video filter that opens two vos. It would open the exclusive driver and would call its preinit and config and maybe paint the background with the colorkey. As the next step it could open a child driver like vidix and configure it for playback on top of the exclusive driver. This would allow us to use vidix and the kernel mode drivers together with exclusive vo drivers without adapting the latter to support them. Feel free to comment. Sascha
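The proposal above only names the control calls. As a rough illustration, a windowed vo driven this way could look like the sketch below; everything except the VOCTRL_* names (the struct layout, the vo_control() signature, the numeric request values) is a hypothetical placeholder, not the actual libvo2 interface.

/* Hypothetical sketch of the proposed UI -> vo handoff.  Only the VOCTRL_*
 * names come from the proposal above; struct, signature and values are
 * assumptions made for the example. */
#include <stdio.h>

enum {
    VOCTRL_SET_WINDOW = 100,   /* placeholder request values */
    VOCTRL_SET_COLORKEY,
    VOCTRL_RESIZE_DEST
};

typedef struct vo_dest {
    void *window;        /* native window handle created and owned by the UI */
    int x, y, w, h;      /* destination rectangle inside that window */
    unsigned colorkey;   /* colorkey the UI painted into the window */
} vo_dest_t;

/* stand-in for a windowed vo's control entry point */
static int vo_control(int request, void *data)
{
    switch (request) {
    case VOCTRL_SET_WINDOW:
        printf("vo: new window handle %p\n", *(void **)data);
        return 0;
    case VOCTRL_SET_COLORKEY:
        printf("vo: colorkey 0x%06x\n", *(unsigned *)data);
        return 0;
    case VOCTRL_RESIZE_DEST: {
        vo_dest_t *d = data;
        printf("vo: dest %dx%d at %d,%d\n", d->w, d->h, d->x, d->y);
        return 0;
    }
    }
    return -1;
}

/* what the UI would call whenever its video window is created, moved or resized */
static void ui_window_changed(vo_dest_t *d)
{
    vo_control(VOCTRL_SET_WINDOW, &d->window);
    vo_control(VOCTRL_SET_COLORKEY, &d->colorkey);
    vo_control(VOCTRL_RESIZE_DEST, d);
}

int main(void)
{
    vo_dest_t d = { NULL, 0, 0, 640, 480, 0x00ff00 };
    ui_window_changed(&d);
    return 0;
}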
From joey at nicewarrior.org Tue Dec 9 14:46:09 2003 From: joey at nicewarrior.org (Joey Parrish) Date: Tue, 9 Dec 2003 07:46:09 -0600 Subject: [MPlayer-G2-dev] Concerning TODO item re-think parent-child connection, move to vf_vo2.c maybe In-Reply-To: <016d01c3be57$e64946e0$566954d9@oemcomputer> References: <016d01c3be57$e64946e0$566954d9@oemcomputer> Message-ID: <20031209134609.GA7259@nicewarrior.org> On Tue, Dec 09, 2003 at 02:25:21PM +0100, Sascha Sommer wrote: > I think all of them need nothing more than a handle to the window > in which they should display the window, it's x, y, width and height > and the colorkey. Therefore I propose that the UI creates the windows > for them and passes its parameters to the vo when they change. Does this mean that each UI calls vo->create_window or some other vo function, or should the UI know how to create a window for each vo it plans to use? I don't think that vo-dependent code should be in the UI, so I hope you don't mean to do this. > But what about vidix and the kernel mode drivers like mga_vid and tdfx_vid? > In G1 these can be used either as windowed drivers via [x/win]vidix or as [snip] I like your suggestion for the kernel mode drivers, though. And everything else sounds good. --Joey -- "Of the seven dwarves, the only one who shaved was Dopey. That should tell us something about the wisdom of shaving." From saschasommer at freenet.de Tue Dec 9 14:50:21 2003 From: saschasommer at freenet.de (Sascha Sommer) Date: Tue, 9 Dec 2003 14:50:21 +0100 Subject: [MPlayer-G2-dev] Concerning TODO item re-think parent-childconnection, move to vf_vo2.c maybe References: <016d01c3be57$e64946e0$566954d9@oemcomputer> <20031209134609.GA7259@nicewarrior.org> Message-ID: <018901c3be5b$63cebb60$566954d9@oemcomputer> > On Tue, Dec 09, 2003 at 02:25:21PM +0100, Sascha Sommer wrote: > > I think all of them need nothing more than a handle to the window > > in which they should display the window, it's x, y, width and height > > and the colorkey. Therefore I propose that the UI creates the windows > > for them and passes its parameters to the vo when they change. > > Does this mean that each UI calls vo->create_window or some other vo > function, or should the UI know how to create a window for each vo it > plans to use? I don't think that vo-dependant code should be in the UI, > so I hope you don't mean to do this. > I think the vos only use two kinds of windows X11 and windows windows. If it is a linux gui it should know how to create a X11 window or call a helper function like the one in libvo2/x11_helper.c. Sascha From jiri.svoboda at seznam.cz Tue Dec 9 15:22:41 2003 From: jiri.svoboda at seznam.cz (Jiri Svoboda) Date: Tue, 9 Dec 2003 15:22:41 +0100 Subject: [MPlayer-G2-dev] Concerning TODO item re-thinkparent-childconnection, move to vf_vo2.c maybe In-Reply-To: <018901c3be5b$63cebb60$566954d9@oemcomputer> Message-ID: > > On Tue, Dec 09, 2003 at 02:25:21PM +0100, Sascha Sommer wrote: > > > I think all of them need nothing more than a handle to > the window in > > > which they should display the window, it's x, y, width and height > > > and the colorkey. Therefore I propose that the UI creates the > > > windows for them and passes its parameters to the vo when > they change. > > > > Does this mean that each UI calls vo->create_window or some > other vo > > function, or should the UI know how to create a window for > each vo it > > plans to use? I don't think that vo-dependant code should > be in the > > UI, so I hope you don't mean to do this.
> > > I think the vos only use two kinds of windows X11 and windows windows. > If it is a linux gui it should know how to create a X11 > window or call a helper function like the one in libvo2/x11_helper.c. And under DirectFB you can create windows too... So there should be a generic - not x11-dependent - way. I can imagine that there is a GUI written using gtk that uses an x11 based vo under xwindows and a different vo under directfb. JS From joey at nicewarrior.org Tue Dec 9 17:12:10 2003 From: joey at nicewarrior.org (Joey Parrish) Date: Tue, 9 Dec 2003 10:12:10 -0600 Subject: [MPlayer-G2-dev] Concerning TODO item re-thinkparent-childconnection, move to vf_vo2.c maybe In-Reply-To: References: <018901c3be5b$63cebb60$566954d9@oemcomputer> Message-ID: <20031209161210.GC7575@nicewarrior.org> On Tue, Dec 09, 2003 at 03:22:41PM +0100, Jiri Svoboda wrote: > > I think the vos only use two kinds of windows X11 and windows windows. > > If it is a linux gui it should know how to create a X11 > > window or call a helper function like the one in libvo2/x11_helper.c. > > And uder DirectFB You can create windows too... So there should generic - > not x11 dependant way. > I can imagine that there is an GUI written using gtk and uses x11 based vo > under xwindows and different vo under directfb. I agree. Even if directx and x11 are the only two at the moment, we can't rely on that forever. People will want to implement new things in MPlayer later, and we should have a design that introduces as few roadblocks as possible. That's why we're talking about G2, right? Because G1 isn't flexible enough in some places. --Joey -- All philosophy is naive. From saschasommer at freenet.de Tue Dec 9 18:18:04 2003 From: saschasommer at freenet.de (Sascha Sommer) Date: Tue, 9 Dec 2003 18:18:04 +0100 Subject: [MPlayer-G2-dev] Concerning TODO itemre-thinkparent-childconnection, move to vf_vo2.c maybe References: Message-ID: <008301c3be78$68d13580$ed6254d9@oemcomputer> > > > On Tue, Dec 09, 2003 at 02:25:21PM +0100, Sascha Sommer wrote: > > > > I think all of them need nothing more than a handle to > > the window in > > > > which they should display the window, it's x, y, width and height > > > > and the colorkey. Therefore I propose that the UI creates the > > > > windows for them and passes its parameters to the vo when > > they change. > > > > > > Does this mean that each UI calls vo->create_window or some > > other vo > > > function, or should the UI know how to create a window for > > each vo it > > > plans to use? I don't think that vo-dependant code should > > be in the > > > UI, so I hope you don't mean to do this. > > > > > > > I think the vos only use two kinds of windows X11 and windows windows. > > If it is a linux gui it should know how to create a X11 > > window or call a helper function like the one in libvo2/x11_helper.c. > > And uder DirectFB You can create windows too... So there should generic - > not x11 dependant way. > I can imagine that there is an GUI written using gtk and uses x11 based vo > under xwindows and different vo under directfb. > JS > This is not a problem as vo_directfb can still create windows itself. My proposal is only for the vos that share their window handling code. Look at the ontop/fullscreen/eventhandling/windowcreation code in vo x11, xv, xvidix, xover, gl, gl2 and on the windows side gl2, directx, winvidix. Those are imho things that do not belong in the vos. The vos ideally should only export their buffers and display the buffer content based on given x,y,width,height and colorkey.
At least when it comes to the windowed vos... Sascha From joey at nicewarrior.org Tue Dec 9 20:15:08 2003 From: joey at nicewarrior.org (Joey Parrish) Date: Tue, 9 Dec 2003 13:15:08 -0600 Subject: [MPlayer-G2-dev] Concerning TODO itemre-thinkparent-childconnection, move to vf_vo2.c maybe In-Reply-To: <008301c3be78$68d13580$ed6254d9@oemcomputer> References: <008301c3be78$68d13580$ed6254d9@oemcomputer> Message-ID: <20031209191508.GA8182@nicewarrior.org> On Tue, Dec 09, 2003 at 06:18:04PM +0100, Sascha Sommer wrote: > > I can imagine that there is an GUI written using gtk and uses x11 based vo > > under xwindows and different vo under directfb. > > This is not a problem as vo_directfb can still create windows itself. My > proposal is only > for the vos that share their window handling code. Look at the > ontop/fullscreen > /eventhandling/windowcreation code in vo x11, xv, xvidix, xover,gl, gl2 and > on the windows > side gl2, directx, winvidix. That are imho things that do not belong in the > vos. The vos ideally > should only export their buffers and display the buffer content based on > given x,y,widht,height and > colorkey. At least when it comes to the windowed vos... No, it does not belong in vo, but not in ui either. Window creation on different windowing platforms should be it's own issue. --Joey -- "Living in the complex world of the future is somewhat like having bees live in your head. But, there they are." From kinali at gmx.net Wed Dec 10 22:42:31 2003 From: kinali at gmx.net (Attila Kinali) Date: Wed, 10 Dec 2003 22:42:31 +0100 Subject: [MPlayer-G2-dev] Concerning TODO item re-think parent-child connection, move to vf_vo2.c maybe In-Reply-To: <016d01c3be57$e64946e0$566954d9@oemcomputer> References: <016d01c3be57$e64946e0$566954d9@oemcomputer> Message-ID: <20031210224231.34f8fa5f.kinali@gmx.net> On Tue, 9 Dec 2003 14:25:21 +0100 "Sascha Sommer" wrote: > Ok lets see what we have and what we need. > As far as I know there are two kinds of vo modules. Exclusive and > nonexclusive or > windowed drivers. Exclusive drivers are fbdev, vesa, svga, dga etc. and the > windowed > x11, xv, directx, gl. Sofar, all drivers beside x11 and Xv on multiple screens are exclusive. Exclusive in that sense that you can only open one instance at a time. I think what you mean is that you can/cannot open another (x11) window for the GUI. Now, i don't think that we should overcomplicate G2 by adding a distinction for exclusive/nonexclusive drivers. G2 is only a lib to be used by frontends. Thus a frontend should know what it can do with which vo module. > VOCTRL_SET_WINDOW > VOCTRL_SET_COLORKEY > VOCTRL_RESIZE_DEST Those controls are imho enough... i've not read all the vo code yet, so maybe there is another one needed but i dont think so. > But what about vidix and the kernel mode drivers like mga_vid and tdfx_vid? > In G1 these can be used either as windowed drivers via [x/win]vidix or as > subdriver > of an exclusive driver like vo vesa. mga_vid and tdfx_vid only give you an abstraction on the hardware, just like vesa or Xv, nothing more. I dont think we have to change the drivers themselfs but (if there is any need to change ofcourse) the vo modules.. > Using them as windowed drivers in G2 is no problem as vo xvidix's job is > already done > by the GUI. One would only need to adjust their api to match the libvo2 api. > To avoid the code duplication when using them in exclusive mode as it exist > in G1 I would propose to make a special video filter that opens two vos. 
> It would open the exclusive driver and would call its preinit and config and > maybe paint the background with > the colorkey.As the next step it could open a child driver like vidix and > configurate it for playback ontop of the exclusive driver. This would allow > to use vidix and the kernel mode drivers toghether with exclusive vo drivers > without adapting the later ones to support them. > Feel free to comment. Afaik there was a diskussion about the xover system, where drivers could say "i can work in a x11 env with an overlay" and iirc this is already implemented. Though this is neither very generic nor portable to other systems, it's imho the way how it should be done. Everything else i can currently think of would be overcomplicated just for being able to handle special cases that most probably never occur. Attila Kinali -- egp ist vergleichbar mit einem ikea bausatz fuer flugzeugtraeger -- reeler in +kaosu From dalias at aerifal.cx Sun Dec 14 21:17:10 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Sun, 14 Dec 2003 15:17:10 -0500 Subject: [MPlayer-G2-dev] buffer management and libvo2... Message-ID: <20031214201709.GA13399@brightrain.aerifal.cx> Hi again. It's almost time for more vp coding (actually I've already done a little), but I'm running into a problem interfacing with libvo2. With the old mpi system, the destination vf/vo you were direct rendering into knew exactly how many buffers of each type it would have to provide. This was based on assumptions about standard IPB codec behavior, which we want to do away with in G2 to allow direct rendering by more advanced codecs with multiple reference frames and such. Even in the absence of such codecs, it's very useful for a filter to be able to keep multiple reference frames for temporal filtering without copying. Thus, I really don't want to sacrifice this flexibility. So, what's the problem? Suppose your vo module can only provide 3 buffers. Suppose all three are already in use (either displayed, if in video memory, or still locked by the filter/codec). Then the filter chain gives you an export-type or automatic-type (allocated) image for the vo to display. Where does it go?? Before you object that filters should not grab many dr buffers at a time, let's look at a few examples... 1. vf_tfields. Let's say we've optimized tfields so that it does slice rendering. Thus, it grabs 2 dr buffers from the vo, and whenever it gets a slice, it draws the deinterlaced top field into the first buffer, and the deinterlaced bottom field into the second buffer. Think 2 buffers is no problem? Now consider what happens when the source codec doing the slice-rendering through tfields uses B frames (i.e. out of order rendering)... :))) 2. Codecs with multiple reference frames (h264, vp3, theora, ...). The codec might request and lock two readable buffers and also request another buffer for B-type (non-reference) frames. 3. vf_pullup. Right now it allocates its own buffers for deep buffering, but it would improve performance to use dr buffers from the vo when possible. It requires at least 3 buffers, and probably 4 or more. Cases 2 and 3 apply when the vo's buffers are readable, which _probably_ means they're in system memory and the vo can allocate as many as it wants. But I don't think it's good to rely on this assumption. Case 1 is definitely a problem, since it occurs with vo's that are letting you write directly into video memory (and therefore which have a very limited number of buffers!). 
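As a toy illustration of the arithmetic in case 1 (the pool size and the function name below are made up for the example, not the real libvo2 interface), here is what happens when a two-buffers-per-frame filter meets a three-buffer vo:

/* Toy accounting for case 1: a vo exporting a fixed pool of video-memory
 * buffers, and vf_tfields asking for two of them per decoded frame.
 * VO_NUM_BUFFERS and vo_get_buffer() are invented for this sketch. */
#include <stdio.h>

#define VO_NUM_BUFFERS 3          /* e.g. what vo_mga currently offers */

static int buffers_locked = 0;

static int vo_get_buffer(void)    /* returns a buffer id, -1 if exhausted */
{
    return buffers_locked < VO_NUM_BUFFERS ? buffers_locked++ : -1;
}

int main(void)
{
    /* P frame arrives: tfields wants both output fields as DR buffers.
     * They cannot be displayed (and thus released) yet, because the B
     * frame that follows in decode order comes first in display order. */
    int p_top = vo_get_buffer();
    int p_bot = vo_get_buffer();

    /* B frame arrives next: two more buffers are needed... */
    int b_top = vo_get_buffer();
    int b_bot = vo_get_buffer();  /* -1: the pool is already exhausted */

    printf("%d %d %d %d\n", p_top, p_bot, b_top, b_bot);   /* 0 1 2 -1 */
    return 0;
}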
Let's see how case 1 can actually lead to a deadlock: A. Codec is decoding a sequence IBP. The I frame was already finished. Now it's time to decode the P and then the B. B. Codec requests indirect buffer (slices) from tfields (for P frame). C. tfields gets 2 dr buffers from the vo to render slices into. D. ... P frame gets rendered ... E. Codec requests indirect buffer (for B frame). F. tfields tries to get dr buffers, but all buffers are exhausted, so it has to fall back to using auto (allocated) buffers. G. tfields returns the first field extracted from the B frame, which happens to be in an automatic buffer. H. vo tries to display the image, but all buffers are in use!!! I. Deadlock! What can be done about it? Solution 1. Never allow dr to the last buffer. Pros: Deadlock impossible. Simple. Cons: Sometimes we can't know how many buffers are available. Even if we can, refusing to use one buffer may prevent DR entirely!! Solution 2. When we run into deadlock, shuffle buffers. I.e. allocate new buffer in system memory and move the contents of one dr buffer there to free up space. Pros: Deadlock impossible. All buffers can be used. Cons: Very slow if dr buffers are in video memory! Also requires horrible hacks to change a dr-type mpi into an auto-type one. Solution 3. Require codecs/filters to report how many buffers they will need at config-time. Pros: Deadlock impossible. All buffers can be used. No ugly hacks. Cons: Very confusing! Requires considerably more logic in filters. For example, in case 1 above, tfields has to figure out it will need 4 buffers based on the info the codec gives it. How to do this?! Solution 4. ??? Any better ideas? I don't like any of my 3 proposals at all! BTW, vo_mga should definitely be updated to allow the full max of 4 buffers rather than just 3! Rich From dalias at aerifal.cx Mon Dec 15 07:27:21 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 15 Dec 2003 01:27:21 -0500 Subject: [MPlayer-G2-dev] buffer management and libvo2... In-Reply-To: <20031214201709.GA13399@brightrain.aerifal.cx> References: <20031214201709.GA13399@brightrain.aerifal.cx> Message-ID: <20031215062721.GO7833@brightrain.aerifal.cx> On Sun, Dec 14, 2003 at 03:17:10PM -0500, D Richard Felker III wrote: > What can be done about it? > > Solution 1. Never allow dr to the last buffer. > Pros: Deadlock impossible. Simple. > Cons: Sometimes we can't know how many buffers are available. Even if > we can, refusing to use one buffer may prevent DR entirely!! > > Solution 2. When we run into deadlock, shuffle buffers. I.e. allocate > new buffer in system memory and move the contents of one dr buffer > there to free up space. > Pros: Deadlock impossible. All buffers can be used. > Cons: Very slow if dr buffers are in video memory! Also requires > horrible hacks to change a dr-type mpi into an auto-type one. > > Solution 3. Require codecs/filters to report how many buffers they > will need at config-time. > Pros: Deadlock impossible. All buffers can be used. No ugly hacks. > Cons: Very confusing! Requires considerably more logic in filters. For > example, in case 1 above, tfields has to figure out it will need 4 > buffers based on the info the codec gives it. How to do this?! > > Solution 4. ??? I think I have a solution #4 that might work. This is a little bit hackish, but I can't think of anything better. * When requesting a DR or indirect buffer, a codec/filter MUST flag whether it will want any more buffers before releasing the one it requests. 
* If the "will want more buffers" flag is set, the destination filter/vo MUST not give up its last buffer. Thus, if only one buffer remains, the destination MUST reject direct rendering/slices. Sound ok? The only remaining question is how to know if a buffer is the last one. For vo's with a fixed number of buffers this is trivial, but if buffers are allocated on demand, maybe we have to attempt to allocate one buffer ahead...? Rich From dalias at aerifal.cx Mon Dec 15 10:07:28 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 15 Dec 2003 04:07:28 -0500 Subject: [MPlayer-G2-dev] vp layer and config Message-ID: <20031215090728.GA13461@brightrain.aerifal.cx> Despite it already being somewhat ugly and complicated, I'd actually like to propose adding some more config-time negotiations: * Strides for export & direct buffers. * Sample aspect ratio. * Time base. Time base doesn't seem controversial. Maybe some people want float time, but IMO using rational time base makes things easier for filters, and more precise. It's also very helpful for mencoder-g2, since you get a good default for the output framerate for fixed-framerate formats. Next up is sample aspect ratio (SAR). Using SAR is MUCH BETTER than using DAR, because it doesn't have to be adjusted by most filters. That means no messy rational arithmetic and trying to reduce fractions with huge denominators. Also, this means the beginning of the filter chain (codec end) doesn't need to know anything about the monitor's aspect ratio. The SAR from the source file can just pass all the way down the chain (except through filters like scale, halfpack, il, ... which will have to modify it) and get used for hardware scaling at the very end. Now finally the controversial one: strides. There are two examples in MPlayer G1 which provided the motivation for this: One is a nasty conflict between lavc and mga_vid. Due to hardware limitations, mga_vid requires stride to be aligned at 32-byte boundaries. Due to software limitations (maybe a performance issue too?) lavc requires stride not to change while decoding. Now, suppose you try to play a 720x480 movie with lavc on mga_vid with direct rendering of B frames. For the first (I) frame, MPlayer allocates a 720x480 buffer in readable system memory and lavc decodes into this buffer. Same for the following P frame. Then lavc tries to decode the B frame, so it requests a DR buffer from vo_mga. But this time, the stride is aligned to 736, so lavc fails to decode the frame!! Not only does DR fail to work, but all B frames get dropped entirely. This is not acceptable. Motivation #2 is a little more subtle, but still relevant. Recently Michael wrote vf_fil, a filter which interleaves or deinterleaves the two fields of an image by halving or doubling the stride. This is very useful, but it can't work properly (unless the user gets lucky!) because it needs to know the stride of the source images it will be accepting when it configures the next filter (output width depends on input stride). One workaround would be to delay configuring the rest of the filter chain until it gets the first image (much like how codecs delay config until they begin decoding the first frame), but I don't think it's a very clean solution. So the question that remains is _how_ to negotiate stride. Here's a rather elaborate proposal that should do the job... * Define a structure for representing stride restrictions. 
This structure will contain flags for various restriction types, as well as custom alignment values or explicit values the strides must take. * Provide a function for comparing stride restrictions, so that it becomes easy to test whether a source and destination filter are compatible, and whether the source is compatible with the destination's DR buffers. * When calling vp_config to configure the next filter, the source MUST pass its stride restrictions. It is ONLY allowed to pass a NULL structure if it can draw into a buffer of ANY stride. * When a filter's config function is called, it MUST verify that it can accept input from buffers meeting the source filter's stride restrictions. It may ONLY ignore the source stride restrictions if it can read from a buffer of ANY stride. * If a filter is going to provide direct-rendering buffers, it MUST verify during its config function that it will be able to provide DR buffers meeting the source filter's stride restrictions. If not, it MUST disable direct rendering. Furthermore, the filter SHOULD store the stride it will use for DR buffers in the link structure before returning. If the filter does not store such a stride, then it MUST use whatever stride the vp layer stores in the link structure after config returns. * After a filter's config function returns, the vp layer will choose an appropriate stride and store it in the link structure, if one was not already selected by the filter. * When allocating automatic-type buffers, the vp layer will always use the stride stored in the link. * When using export-type buffers, a source filter MUST ensure that they meet the stride restrictions of the destination. However, the source filter is not required to use the stride stored in the link structure, so long as it meets all the requirements. Now a list of possible stride restrictions: * Byte aligned: Stride must be aligned on the given byte boundary. * Pixel aligned: Stride must be aligned on the given pixel boundary. * Exact value: Stride must be the exact value specified. * Positive: All strides must be positive numbers. * Common stride: Stride for U and V planes are both equal to the stride of the Y plane, shifted by horizontal chroma shift. * Common plane: Planes follow one another immediately in memory. * Static: Once a particular stride has been used, buffers with any other stride are not permitted. Anything I'm missing? Rich From arpi at thot.banki.hu Mon Dec 15 10:13:02 2003 From: arpi at thot.banki.hu (Arpi) Date: Mon, 15 Dec 2003 10:13:02 +0100 Subject: [MPlayer-G2-dev] vp layer and config In-Reply-To: <20031215090728.GA13461@brightrain.aerifal.cx> Message-ID: <200312150913.hBF9D2aD001616@mail.mplayerhq.hu> Hi, > Despite it already being somewhat ugly and complicated, I'd actually > like to propose adding some more config-time negotiations: Instead of hacking and adding more, i would suggest to drop g1's vf coimpletely and re-design from scratch for g2. Yes i know i was the one against this way, but i've changed my mind :) some issues to solve: - runtime re-configuration (aspect ratio, size, stride, colorspace(?) 
changes) - aspect ratio negotation through the vf layer to vo (pass thru size & aspect to the vo layer, as some vo (directx, xv) doesnt like all resolutions) - window resizing issue (user resizes vo window, then reconfigure scale expand etc filters to produce image in new size) - better buffer management (get/put_buffer method) - split mp_image to colorspace descriptor (see thread on this list) and buffer descriptor (stride, pointers), maybe a 3rd part containing frame descriptor (frame/field flags, timestamp, etc so info related to the visual content of the image, not the phisical buffer itself, so linear converters (colorspace conf, scale, expand etc) could simply passthru this info and change buffer desc only) - correct support for slices (note there are 2 kind of strides: one when you call next filter's draw_slice after each slice rendering to next vf's buffer completed, and the other type is when you have own small buffer where one slice overwrites the previous one) - somehow solve framedropping support (now its near impossible in g2, as you hav eto decode and pass a frame through the vf layer to get its timestamp, to be used to decide if you drop it, but then it's already too late to drop) i think the new vf layer is the key of near everything. A'rpi / Astral & ESP-team -- Developer of MPlayer G2, the Movie Framework for all - http://www.MPlayerHQ.hu From dalias at aerifal.cx Mon Dec 15 11:09:26 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 15 Dec 2003 05:09:26 -0500 Subject: [MPlayer-G2-dev] vp layer and config In-Reply-To: <200312150913.hBF9D2aD001616@mail.mplayerhq.hu> References: <20031215090728.GA13461@brightrain.aerifal.cx> <200312150913.hBF9D2aD001616@mail.mplayerhq.hu> Message-ID: <20031215100926.GR7833@brightrain.aerifal.cx> On Mon, Dec 15, 2003 at 10:13:02AM +0100, Arpi wrote: > Hi, > > > Despite it already being somewhat ugly and complicated, I'd actually > > like to propose adding some more config-time negotiations: > > Instead of hacking and adding more, i would suggest to drop g1's > vf coimpletely and re-design from scratch for g2. > Yes i know i was the one against this way, but i've changed my mind :) That's basically what I'm doing already... :) > some issues to solve: > - runtime re-configuration (aspect ratio, size, stride, colorspace(?) changes) Some filters will naturally be able to support changes like this, but many won't without some sort of reset/discontinuity in output. Pretty much all temporal filters will at least show momentary artefacts when reconfiguring; this is inevitable. Some codecs (even prominent ones like lavc!!) will not allow you to change stride after the first frame is decoded. I had to add a new stride restriction type just for this. Anyway, here's a model I see for runtime reconfiguration: * User changes parameters for a filter through some sort of interface (gui widgets, osd, lirc, slavemode, keyboard, whatever). * New config gets passed to the filter via the cfg layer. * The filter then calls vp_config on its output link to renegotiate the image size, format, stride restrictions, etc. it will need. Similar idea if a new filter gets inserted at runtime, or if one gets removed. > - aspect ratio negotation through the vf layer to vo > (pass thru size & aspect to the vo layer, as some vo (directx, xv) doesnt > like all resolutions) Hm? I think what I said about just passing SAR instead of DAR pretty much covers it. No negotiation is needed. 
As long as the correct SAR is available at the end of the filter chain, the calling app and/or vo stuff can use the SAR to generate an appropriate and compatible display size. > - window resizing issue (user resizes vo window, then reconfigure scale > expand etc filters to produce image in new size) Let me propose a solution. As we discussed before, having resize events propagate back through the filter chain is very bad -- in many cases you'll get bogus output. How about this instead: if software zoom is enabled, something in the vo module inserts a scale filter and keeps a reference to it, so it can at a later time reconfigure this filter. Same idea for expand. Thus we reduce it to the same problem as runtime reconfiguration by the user, except that it's controlled by the vo module instead of by the gui/osd/whatever widgets the user is interacting with. > - better buffer management (get/put_buffer method) Already doing it. > - split mp_image to colorspace descriptor (see thread on this list) > and buffer descriptor (stride, pointers), maybe a 3rd part containing > frame descriptor (frame/field flags, timestamp, etc so info related to > the visual content of the image, not the phisical buffer itself, so > linear converters (colorspace conf, scale, expand etc) could simply > passthru this info and change buffer desc only) Agree, nice ideas! IMO it's not ok to just point to the original frame's descriptor (since they might not have the same lifetime!) so the info will have to be copied instead, but that's still easy as long as we put it in a nice struct without multiple levels of pointers inside. I'll make these changes to my working copies of vp.[ch]. > - correct support for slices (note there are 2 kind of strides: one > when you call next filter's draw_slice after each slice rendering > to next vf's buffer completed, and the other type is when you have > own small buffer where one slice overwrites the previous one) Hmm, I said this too in a post a while back, but then I worried that it was too complicated... Do you have a proposal for how it should work? > - somehow solve framedropping support > (now its near impossible in g2, as you hav eto decode and pass a > frame through the vf layer to get its timestamp, to be used to > decide if you drop it, but then it's already too late to drop) No, it's very easy with my design! :)) All frames, even dropped frames pass through the vf chain. But if you set the drop flag when calling pull_image, the source codec/vo isn't required to output any valid image, just the metadata. In fact we could create a new buffer type "DUMMY" for it, where there are no actual buffers. The benefit of this system is that the pull_image call still propagates all the way back through the chain to the codec. The only difference is that the drop flag is set. So all filters naturally know if they're missing frames, and if they really need to see every frame, they can refuse to propagate the drop flag any further. (IMO there should be some policy for what they're required to do, and perhaps several levels of dropflag, the highest of which must always be honored.) > i think the new vf layer is the key of near everything. Agree totally! Maybe new af layer too...? 
:) Rich From dalias at aerifal.cx Mon Dec 15 11:49:51 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 15 Dec 2003 05:49:51 -0500 Subject: [MPlayer-G2-dev] vp layer and config In-Reply-To: <200312150913.hBF9D2aD001616@mail.mplayerhq.hu> References: <20031215090728.GA13461@brightrain.aerifal.cx> <200312150913.hBF9D2aD001616@mail.mplayerhq.hu> Message-ID: <20031215104951.GS7833@brightrain.aerifal.cx> On Mon, Dec 15, 2003 at 10:13:02AM +0100, Arpi wrote: > - split mp_image to colorspace descriptor (see thread on this list) > and buffer descriptor (stride, pointers), maybe a 3rd part containing > frame descriptor (frame/field flags, timestamp, etc so info related to > the visual content of the image, not the phisical buffer itself, so > linear converters (colorspace conf, scale, expand etc) could simply > passthru this info and change buffer desc only) I've been working on implementing this, but there's one element of mp_image_t I'm not sure where to put. Actually this has been bothering me for a while now. The exported quant_store (qscale). In G1 the pointer just gets copied when passing it on through filters, but this is probably between mildly and seriously incorrect, especially with out-of-order rendering. IMO storing quant table in the framedesc isn't a good idea, since quantizers are only valid for the original buffer arrangement. Actually, I tend to think they belong in the buffer descriptor, almost like a fourth plane. But who should be responsible for allocating and freeing the quant plane? IMO the only way it can really work properly is to have the same code that allocates the ordinary planes be responsible for the quant plane too.. This would mean: When exporting buffers, you just set quant to point at whatever you like (as long as it won't be destroyed until the buffer is released). When using automatic buffers, the vp layer would allocate quant plane for you (but how does it know the right size?) and you have to fill it (or else don't mark it as valid). When using direct rendering, the target filter has to allocate the quant plane (again, how does it determine the size?). Somehow this sounds awkward. Any better ideas? Rich From michaelni at gmx.at Mon Dec 15 12:10:48 2003 From: michaelni at gmx.at (Michael Niedermayer) Date: Mon, 15 Dec 2003 12:10:48 +0100 Subject: [MPlayer-G2-dev] vp layer and config In-Reply-To: <20031215104951.GS7833@brightrain.aerifal.cx> References: <20031215090728.GA13461@brightrain.aerifal.cx> <200312150913.hBF9D2aD001616@mail.mplayerhq.hu> <20031215104951.GS7833@brightrain.aerifal.cx> Message-ID: <200312151210.48523.michaelni@gmx.at> Hi On Monday 15 December 2003 11:49, D Richard Felker III wrote: > On Mon, Dec 15, 2003 at 10:13:02AM +0100, Arpi wrote: > > - split mp_image to colorspace descriptor (see thread on this list) > > and buffer descriptor (stride, pointers), maybe a 3rd part containing > > frame descriptor (frame/field flags, timestamp, etc so info related to > > the visual content of the image, not the phisical buffer itself, so > > linear converters (colorspace conf, scale, expand etc) could simply > > passthru this info and change buffer desc only) > > I've been working on implementing this, but there's one element of > mp_image_t I'm not sure where to put. Actually this has been bothering > me for a while now. The exported quant_store (qscale). In G1 the > pointer just gets copied when passing it on through filters, but this > is probably between mildly and seriously incorrect, especially with > out-of-order rendering. 
> > IMO storing quant table in the framedesc isn't a good idea, since > quantizers are only valid for the original buffer arrangement. > Actually, I tend to think they belong in the buffer descriptor, almost > like a fourth plane. But who should be responsible for allocating and > freeing the quant plane? IMO the only way it can really work properly > is to have the same code that allocates the ordinary planes be > responsible for the quant plane too.. btw, we could also pass other stuff like motion vectors around, these maybe usefull for fast transcoding > > This would mean: > > When exporting buffers, you just set quant to point at whatever you > like (as long as it won't be destroyed until the buffer is released). > > When using automatic buffers, the vp layer would allocate quant plane > for you (but how does it know the right size?) and you have to fill it > (or else don't mark it as valid). > > When using direct rendering, the target filter has to allocate the > quant plane (again, how does it determine the size?). the quant plane is always (width+15)/16 x (height+15)/16 big, but we could use something like enum PlaneType{ Y_PLANE, CB_PLANE, CR_PLANE, ALPHA_PLANE, QUANT_PLANE, FORWARD_MOTION_PLANE, BACKWARD_MOTION_PLANE, } struct PlaneDescriptor{ int bpp; //bits per pixel (32bit for 2x16bit motion vectors) int log2_subsample[2]; // like chroma_w/h_shift int offset[2]; //x/y offsets of the 0,0 sample relative to the luma plane in 1/2 sample precission } [...] -- Michael level[i]= get_vlc(); i+=get_vlc(); (violates patent EP0266049) median(mv[y-1][x], mv[y][x-1], mv[y+1][x+1]); (violates patent #5,905,535) buf[i]= qp - buf[i-1]; (violates patent #?) for more examples, see http://mplayerhq.hu/~michael/patent.html stop it, see http://petition.eurolinux.org & http://petition.ffii.org/eubsa/en From andrej at lucky.net Mon Dec 15 12:50:15 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Mon, 15 Dec 2003 13:50:15 +0200 Subject: [MPlayer-G2-dev] Re: vp layer and config In-Reply-To: <20031215100926.GR7833@brightrain.aerifal.cx> References: <20031215090728.GA13461@brightrain.aerifal.cx> <200312150913.hBF9D2aD001616@mail.mplayerhq.hu> <20031215100926.GR7833@brightrain.aerifal.cx> Message-ID: <20031215115015.GB58730@lucky.net> Hi, D Richard Felker III! Sometime (on Monday, December 15 at 11:57) I've received something... >Anyway, here's a model I see for runtime reconfiguration: >* User changes parameters for a filter through some sort of interface > (gui widgets, osd, lirc, slavemode, keyboard, whatever). >* New config gets passed to the filter via the cfg layer. >* The filter then calls vp_config on its output link to renegotiate > the image size, format, stride restrictions, etc. it will need. >Similar idea if a new filter gets inserted at runtime, or if one gets >removed. Let me insert my 2c. ;) Agree with that above. I really thought the same way a while ago (and I even wrote here about that but in other words). :) With best wishes. Andriy. 
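To make the quant-plane geometry mentioned above concrete (one value per 16x16 macroblock, hence a (width+15)/16 x (height+15)/16 plane), here is a minimal sketch; the helper names and the idea of keeping the plane alongside the buffer descriptor are assumptions for illustration only.

/* Minimal sketch of a per-macroblock quant plane, assuming one byte per
 * 16x16 macroblock as stated above.  Names are illustrative only. */
#include <stdlib.h>

typedef struct quant_plane {
    int mb_w, mb_h;        /* plane dimensions in macroblocks */
    unsigned char *q;      /* one quantizer value per macroblock */
} quant_plane_t;

quant_plane_t *quant_plane_alloc(int width, int height)
{
    quant_plane_t *p = malloc(sizeof(*p));
    if (!p)
        return NULL;
    p->mb_w = (width  + 15) >> 4;   /* (width+15)/16  */
    p->mb_h = (height + 15) >> 4;   /* (height+15)/16 */
    p->q = calloc(p->mb_w * p->mb_h, 1);
    if (!p->q) { free(p); return NULL; }
    return p;
}

/* quantizer of the macroblock covering pixel (x,y) */
int quant_at(const quant_plane_t *p, int x, int y)
{
    return p->q[(y >> 4) * p->mb_w + (x >> 4)];
}

void quant_plane_free(quant_plane_t *p)
{
    if (p) { free(p->q); free(p); }
}

In the terms of the PlaneDescriptor sketch above this would simply be bpp=8 with log2_subsample[0]=log2_subsample[1]=4.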
From dalias at aerifal.cx Mon Dec 15 13:06:49 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 15 Dec 2003 07:06:49 -0500 Subject: [MPlayer-G2-dev] vp layer and config In-Reply-To: <200312151210.48523.michaelni@gmx.at> References: <20031215090728.GA13461@brightrain.aerifal.cx> <200312150913.hBF9D2aD001616@mail.mplayerhq.hu> <20031215104951.GS7833@brightrain.aerifal.cx> <200312151210.48523.michaelni@gmx.at> Message-ID: <20031215120649.GT7833@brightrain.aerifal.cx> On Mon, Dec 15, 2003 at 12:10:48PM +0100, Michael Niedermayer wrote: > Hi > > On Monday 15 December 2003 11:49, D Richard Felker III wrote: > > On Mon, Dec 15, 2003 at 10:13:02AM +0100, Arpi wrote: > > > - split mp_image to colorspace descriptor (see thread on this list) > > > and buffer descriptor (stride, pointers), maybe a 3rd part containing > > > frame descriptor (frame/field flags, timestamp, etc so info related to > > > the visual content of the image, not the phisical buffer itself, so > > > linear converters (colorspace conf, scale, expand etc) could simply > > > passthru this info and change buffer desc only) > > > > I've been working on implementing this, but there's one element of > > mp_image_t I'm not sure where to put. Actually this has been bothering > > me for a while now. The exported quant_store (qscale). In G1 the > > pointer just gets copied when passing it on through filters, but this > > is probably between mildly and seriously incorrect, especially with > > out-of-order rendering. > > > > IMO storing quant table in the framedesc isn't a good idea, since > > quantizers are only valid for the original buffer arrangement. > > Actually, I tend to think they belong in the buffer descriptor, almost > > like a fourth plane. But who should be responsible for allocating and > > freeing the quant plane? IMO the only way it can really work properly > > is to have the same code that allocates the ordinary planes be > > responsible for the quant plane too.. > btw, we could also pass other stuff like motion vectors around, these maybe > usefull for fast transcoding And MB types too?? :) There's lots of stuff we _could_ pass around. The problem is doing it in a sane way that doesn't overcomplicate things for codecs/filters. > the quant plane is always (width+15)/16 x (height+15)/16 big, but we could use > something like This is true for mpeg1/2/4. But is it the same for mpeg2 with 4:2:2 sampling? And what about strange codecs like svq3? > enum PlaneType{ > Y_PLANE, > CB_PLANE, > CR_PLANE, > ALPHA_PLANE, > QUANT_PLANE, > FORWARD_MOTION_PLANE, > BACKWARD_MOTION_PLANE, > } > struct PlaneDescriptor{ > int bpp; //bits per pixel (32bit for 2x16bit motion vectors) > int log2_subsample[2]; // like chroma_w/h_shift > int offset[2]; //x/y offsets of the 0,0 sample relative to the luma plane > in 1/2 sample precission > } I think something in between what you proposed and the old "dumb" way of doing it is probably appropriate. If you have too much flexibility, then filters and codecs have to handle all the cases that allows. :( If we do want to define extra planes like this (quant & palette are definitely needed, and maybe your mv's too) they should be at fixed indices in the planes array (maybe that's what you meant by using the planetype enum). Maybe there should be a separate array of 'extradata planes'...? Quant, mv's, etc. are definitely extra, but IMO palette isn't! So I'm not sure how to best sort it all. 
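One possible arrangement, purely as a sketch (the names and field choices are guesses, not a settled layout): keep the four image planes at fixed indices as today, and hang the optional metadata planes off a separate array so filters that only shuffle pixels can ignore it entirely.

/* Sketch of "fixed indices plus a separate extradata planes array".
 * All identifiers are illustrative, not an agreed G2 layout. */
#define MP_PLANE_Y      0
#define MP_PLANE_CB     1
#define MP_PLANE_CR     2
#define MP_PLANE_ALPHA  3    /* image planes stay where existing code expects them;
                              * a palette, if present, would also live with these */

enum mp_extra_plane {
    MP_EXTRA_QUANT = 0,      /* one byte per 16x16 macroblock           */
    MP_EXTRA_FWD_MV,         /* 2x16bit vectors, forward prediction     */
    MP_EXTRA_BWD_MV,         /* 2x16bit vectors, backward prediction    */
    MP_EXTRA_NUM
};

typedef struct mp_buffer {
    unsigned char *planes[4];             /* always Y/Cb/Cr/Alpha order   */
    int stride[4];
    unsigned char *extra[MP_EXTRA_NUM];   /* NULL when a plane isn't filled */
    int extra_stride[MP_EXTRA_NUM];
} mp_buffer_t;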
Rich From michaelni at gmx.at Mon Dec 15 13:22:24 2003 From: michaelni at gmx.at (Michael Niedermayer) Date: Mon, 15 Dec 2003 13:22:24 +0100 Subject: [MPlayer-G2-dev] vp layer and config In-Reply-To: <20031215120649.GT7833@brightrain.aerifal.cx> References: <20031215090728.GA13461@brightrain.aerifal.cx> <200312151210.48523.michaelni@gmx.at> <20031215120649.GT7833@brightrain.aerifal.cx> Message-ID: <200312151322.24285.michaelni@gmx.at> Hi On Monday 15 December 2003 13:06, D Richard Felker III wrote: > On Mon, Dec 15, 2003 at 12:10:48PM +0100, Michael Niedermayer wrote: > > Hi > > > > On Monday 15 December 2003 11:49, D Richard Felker III wrote: > > > On Mon, Dec 15, 2003 at 10:13:02AM +0100, Arpi wrote: > > > > - split mp_image to colorspace descriptor (see thread on this list) > > > > and buffer descriptor (stride, pointers), maybe a 3rd part > > > > containing frame descriptor (frame/field flags, timestamp, etc so > > > > info related to the visual content of the image, not the phisical > > > > buffer itself, so linear converters (colorspace conf, scale, expand > > > > etc) could simply passthru this info and change buffer desc only) > > > > > > I've been working on implementing this, but there's one element of > > > mp_image_t I'm not sure where to put. Actually this has been bothering > > > me for a while now. The exported quant_store (qscale). In G1 the > > > pointer just gets copied when passing it on through filters, but this > > > is probably between mildly and seriously incorrect, especially with > > > out-of-order rendering. > > > > > > IMO storing quant table in the framedesc isn't a good idea, since > > > quantizers are only valid for the original buffer arrangement. > > > Actually, I tend to think they belong in the buffer descriptor, almost > > > like a fourth plane. But who should be responsible for allocating and > > > freeing the quant plane? IMO the only way it can really work properly > > > is to have the same code that allocates the ordinary planes be > > > responsible for the quant plane too.. > > > > btw, we could also pass other stuff like motion vectors around, these > > maybe usefull for fast transcoding > > And MB types too?? :) yes > > There's lots of stuff we _could_ pass around. The problem is doing it > in a sane way that doesn't overcomplicate things for codecs/filters. > > > the quant plane is always (width+15)/16 x (height+15)/16 big, but we > > could use something like > > This is true for mpeg1/2/4. But is it the same for mpeg2 with 4:2:2 > sampling? And what about strange codecs like svq3? its true for svq3 & mpeg2 4:*:* too AFAIK > > > enum PlaneType{ > > Y_PLANE, > > CB_PLANE, > > CR_PLANE, > > ALPHA_PLANE, > > QUANT_PLANE, > > FORWARD_MOTION_PLANE, > > BACKWARD_MOTION_PLANE, > > } > > struct PlaneDescriptor{ > > int bpp; //bits per pixel (32bit for 2x16bit motion vectors) > > int log2_subsample[2]; // like chroma_w/h_shift > > int offset[2]; //x/y offsets of the 0,0 sample relative to the luma > > plane in 1/2 sample precission > > } > > I think something in between what you proposed and the old "dumb" way > of doing it is probably appropriate. If you have too much flexibility, > then filters and codecs have to handle all the cases that allows. 
:( we could enforce some restrictions and still keep it flexible, allthough we probably must then put some checks in the code to ensure that noone violates them cuz of lack of RTFM (or lack of WTFM) > > If we do want to define extra planes like this (quant & palette are > definitely needed, and maybe your mv's too) they should be at fixed > indices in the planes array (maybe that's what you meant by using the > planetype enum). yes, that was exactly the reason [...] -- Michael level[i]= get_vlc(); i+=get_vlc(); (violates patent EP0266049) median(mv[y-1][x], mv[y][x-1], mv[y+1][x+1]); (violates patent #5,905,535) buf[i]= qp - buf[i-1]; (violates patent #?) for more examples, see http://mplayerhq.hu/~michael/patent.html stop it, see http://petition.eurolinux.org & http://petition.ffii.org/eubsa/en From dalias at aerifal.cx Wed Dec 17 06:33:09 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Wed, 17 Dec 2003 00:33:09 -0500 Subject: [MPlayer-G2-dev] Re: slices in g2 Message-ID: <20031217053309.GS7833@brightrain.aerifal.cx> [Intentionally removed this header to start a new thread :] In-Reply-To: <200312150913.hBF9D2aD001616 at mail.mplayerhq.hu> On Mon, Dec 15, 2003 at 10:13:02AM +0100, Arpi wrote: > - correct support for slices (note there are 2 kind of strides: one > when you call next filter's draw_slice after each slice rendering > to next vf's buffer completed, and the other type is when you have > own small buffer where one slice overwrites the previous one) Arpi (and others): I'd like some advice on designing the new slice support for g2. It's a delicate balancing act of functionality and complexity, because if using slices is too complicated, no one will use slices in their filters and they'll be no help, but if there isn't enough functionality, they'll also be useless... I'm thinking about various conditions the source and destination filter might want to impose on slices... * x=0 and w=fullwidth * Rendering slices in order (top to bottom) * Rendering slices in order but bottom-to-top (dumb codecs) * Some specified amount of context around the slice (for filtering) * Direct rendering with notification as slices are finished * Ability to examine the entire source buffer slices are being drawn from, including already-completed parts... * Alignment? Certainly we should have (x%8)==0... And y should be divisible by the chroma subsampling... Here are some possible scenarios using slices, just for things to think about: 1. Video codec is using internal buffers, rendering into a filter via slices. The filter will do most of its processing while receiving slices (maybe just taking some metrics from the image), but it also wants to see the final picture. 2. SwScale has (hypothetically) been replaced by separate horizontal and vertical scale filters and colorspace pre/post-converters. And we want to render slices through the while damn thing... 3. Tfields wants to improve cache performance by splitting fields a slice at a time, but it needs a few pixels of context to do the filtering for quarterpixel translate. Now some further explanation of the issues and questions I have: With the first scenario, let's simplify by saying that the filter is just going to pass the image through unchanged. Thus, there's a very clear need for it to have the final picture available; otherwise it would have to waste time making a copy while processing slices. 
Having an arrangement where a codec draws into a buffer and notifies the filter as slices are completed is fairly straightforward when the buffer is a DR buffer provided by the filter, because the filter is the _owner_ of this buffer and already has the private data areas and locks to keep track of it. But if the buffer is AUTO(-allocated) or EXPORTed from the codec, some sort of mechanism is going to be needed to inform the filter about its existence and attach it to slice rendering. In scenario 3, there are several ways to ensure tfields has the necessary context. One way is for it to store a few lines from the previous slice, but this is incredibly PAINFUL for the filter author if slices can come in any order, any shape, and any sizes! Another possible solution is forcing the source that's sending the slices to assure that there's a border around the slice with sufficient context, but that makes things painful for the source filter author. As for scenario 2, I don't even want to think about it... :) All the above concerns about alignment, rendering order, extra lines of context, etc. etc. etc. come into play. So in summary... The basic question is, what degree of freedom should codecs/filters rendering with slices have, and what sort of restrictions should be replaced on them? If the restrictions are going to be allowed to vary from filter to filter, how should they be negotiated? I'll post a rough proposal soon. Rich From dalias at aerifal.cx Wed Dec 17 08:18:33 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Wed, 17 Dec 2003 02:18:33 -0500 Subject: [MPlayer-G2-dev] Re: slices in g2 In-Reply-To: <20031217053309.GS7833@brightrain.aerifal.cx> References: <20031217053309.GS7833@brightrain.aerifal.cx> Message-ID: <20031217071833.GT7833@brightrain.aerifal.cx> OK, here's the proposal. Arpi says the two types of slices are: > when you call next filter's draw_slice after each slice rendering > to next vf's buffer completed, and the other type is when you have > own small buffer where one slice overwrites the previous one) So let's start there. First, a refresher on the vp (video pipeline) layer and buffer types for mp_images... Video pipeline consists of a collection of decoders, filters, output devices, and encoders (vd/vf/vo/ve), collectively referred to as nodes. All can either serve as a source of images, or a destination, and some (filters) can serve as both. The nodes of the pipeline are connected by links. A link maintains the negotiated parameters for passing images between nodes: image format, dimensions, aspect, strides, etc., and manages the buffers that will pass between its endpoints. Most filters have exactly one input and one output link, but the possibility is allowed for a filter to have multiple inputs or multiple outputs. Again, the link serves as the broker for images that will pass over it. Image structures (still called mp_image_t for now) belong to the link they originate with, and cannot be passed around to other parts of the pipeline. However, via EXPORT and DIRECT type images, the same buffers can be passed on in either direction in the chain. Buffer types for images are as follows: AUTO(matic): allocated and managed by the vp layer. Automatically uses the negotiated strides for the link. No owner. DIRECT(rendering): buffer pointers are filled in by the destination node's get_buffer function. The destination node becomes the owner, and will be informed via release_buffer when the reference count reaches zero. 
EXPORT: buffer pointers are filled in by the source node after requesting the image. Keep in mind that they need not point to a codec-internal buffer. They might point to the buffers from an AUTO-type image earlier in the pipeline, or (in some very fancy multiple-vo setups :) a DIRECT-type buffer from another branch of the pipeline. The source node which obtains the EXPORT-type buffer becomes its owner and will be informed when it's released. [Note that both EXPORT and DIRECT buffers are very useful for avoiding excess copying between filters. EXPORT should not be thought of as a backwards-compatibility type for old codecs, because it can do a lot more than just that!] INDIRECT: the image has no buffer pointers. Instead, it must be drawn into via slices. The image structure, which is owned by the destination node, exists only to carry meta-information about the frame (pts, etc.). It should also carry in its private data area some way for the destination node to identify which hidden buffer it corresponds to, if there is more than one such buffer. DUMMY: used for dropped frames. No buffers whatsoever, only pts and perhaps some other metainformation. OK, now for the new stuff: If you recall, we were considering Arpi's point that there are two types of slices. For now (but not in the final design) we'll call them smart slices (notification as parts are completed) and dumb slices (source buffer for slices becomes invalid after use). First, dumb slices. I'd like to propose that drawing via dumb slices can _always_ be done into DIRECT or INDIRECT type buffers. The first thing that should come to mind here is that, if a node supports INDIRECT buffers, it _must_ accept whatever slices it's given and it _must_ be able to function without any additional source context. This means that INDIRECT buffers are appropriate for vo drivers to use, and for filters that just operate locally on pixels (for instance, equalizer, format/colorspace converter, non-filtered field splitter). However, filters that can't just operate locally should not provide INDIRECT buffers, and should instead use the smart slices method described below. Note that for the DIRECT case, the vp layer just copies the region and then notifies the destination node, so in effect you get smart slices for free! Now, smart slices! :) After obtaining an AUTO or EXPORT image, a source node which supports slices should register with the link layer that it wishes to do smart slices. This gives the destination node an opportunity to report whether it supports slices, and to request any buffers it needs from the next node in the pipeline, etc. If the destination node accepts the request for slices, the source node is _required_ to call commit_slice [exactly?] once for each region of the image. Now for some API fun... Functions provided by the vp link layer: vp_attach_slices: takes an image (AUTO or EXPORT type) and attempts to initiate smart slice rendering through it. vp_commit_slice: takes a slice-capable image as a source and notifies the destination node that the specified slice has been completed. vp_draw_slice: takes a source from arbitrary pointers and draws the specified slice to a DIRECT or INDIRECT image, passed as its destination. Functions implemented by vp nodes: attach_slices: Called to request that the destination node prepare for smart slice rendering from the image passed as an argument. The destination node should store any references it needs to keep in the appropriate private data area of the image structure. 
commit_slice: The destination node's commit_slice is called by the vp layer when the source node has finished rendering a slice. draw_slice: Similar to commit_slice, but only called when rendering into an INDIRECT buffer or if the destination node does not implement commit_slice. get_buffer: Used to obtain DIRECT and INDIRECT buffers from a node. release_buffer: Called when the reference count on a DIRECT, INDIRECT, or EXPORT image reaches zero. detach_slices: Called when the reference count on an image used for smart slices reaches zero. Source node implementation: A source should _never_ request an INDIRECT buffer unless there is a performance benefit, e.g. reusing a small buffer that will remain in the processor's cache, or unless it is wrapping an external codec or filter that only supports dumb style slices. Instead, sources should use vp_attach_slices when possible. If the destination node supports INDIRECT buffers but not attach_slices, then the vp link layer will emulate attach_slices by obtaining an INDIRECT buffer (i.e. smart slices can be emulated with dumb slices). Destination node implementation: In general, a destination node should not support both INDIRECT buffers and attach_slices; to do so is redundant. Supporting both may be useful in (very rare) cases where the filter can enjoy better performance by using smart slices instead of dumb slices. Again, destination nodes should not support INDIRECT buffers unless they only perform pixel-localized filterring, stride arithmetic, or other tasks that do not require the context of nearby pixels. The destination node should check the buffer type when receiving commit_slice or draw_slice calls. Often different types call for different reactions. OK, I think that's about it for now. Keep in mind that I at least roughly know what I'm talking about here, since I'm writing the code as I write the specs. (To make sure what I say is possible in code. :) Sometime soon I'll post some code, but if I did that now I'd be drinking cola 'til after the newyear... :)) Rich From dalias at aerifal.cx Wed Dec 17 23:41:03 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Wed, 17 Dec 2003 17:41:03 -0500 Subject: [MPlayer-G2-dev] g2 & sh_video Message-ID: <20031217224103.GY7833@brightrain.aerifal.cx> Looking at the sh_video structure (which I must admit I never understood very well in G1 ;) it seems like a lot of it is obsolete with the new video layer. Aside from holding some codec parameters obtained from the demuxer layer, its main purpose seems to have been providing a point for the player to monitor and control a/v sync, decoding, and seeking. But now: 1. The entry point for decoding has been moved to the video pipeline, and decoding is driven from the output end rather than the decoder end. And the vd wrapper no longer exists; decoders are native nodes in the video pipeline that use the same api. 2. A/V sync is performed on the final output frames (possibly with adjusted pts) rather than on the original decoded frames. 3. NEW: I recommend that seeking also be performed through the vp layer, as a control that passes up the pipeline to the codec and demuxer. This way editlists can be implemented as a filter that remaps pts and seek requests, allowing full interactive seeking in an editlist'ed video. With these changes (or even with just #1 and 2), most of the sh_video structure is useless. But there's a little bit that's still useful -- the actual "stream header" data from the demuxer, i.e. 
the stuff that sh_video was probably meant for in the beginning. Should we keep sh_video, and just strip it down to this minimal level, or should the data be moved to some other (demuxer?) structures? Rich From dalias at aerifal.cx Thu Dec 18 01:04:49 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Wed, 17 Dec 2003 19:04:49 -0500 Subject: [MPlayer-G2-dev] vp layer and config In-Reply-To: <200312150913.hBF9D2aD001616@mail.mplayerhq.hu> References: <20031215090728.GA13461@brightrain.aerifal.cx> <200312150913.hBF9D2aD001616@mail.mplayerhq.hu> Message-ID: <20031218000449.GZ7833@brightrain.aerifal.cx> Time to address what's been solved and what hasn't: On Mon, Dec 15, 2003 at 10:13:02AM +0100, Arpi wrote: > some issues to solve: > - runtime re-configuration (aspect ratio, size, stride, colorspace(?) changes) There are two issues here: 1. New cfg-system configuration from the user/calling app. 2. New config() from the previous filter/codec. Actually from an implementation standpoint, both are fairly similar. A given node in the pipeline should be able to support either, both, or neither type of reconfiguration. If a node does not support reconfiguration, the vp layer should close and reopen it with the new configuration. If a reopen will cause a discontinuity in the video (think temporal filters) then the vp layer should instead insert conversion filters to avoid reconfiguration, if possible. Summary: NOT done. Some of this is outside the realm of the vp layer, and more related to config layer, so I'm willing to ignore it for now. > - aspect ratio negotation through the vf layer to vo > (pass thru size & aspect to the vo layer, as some vo (directx, xv) doesnt > like all resolutions) I'm not quite sure what this one means. IMO passing the sample aspect ratio all the way through the chain handles all problems. The vo driver is free to display the video at wrong aspect (if it's not capable of scaling) or reject the config request and force software scaler to get loaded to fix aspect. > - window resizing issue (user resizes vo window, then reconfigure scale > expand etc filters to produce image in new size) Addressed in former post. Basic idea is that the app/interface should insert filters and control them directly itself, rather than trying to pass messages by chain. > - better buffer management (get/put_buffer method) Done, mostly. ;) There are some nasty issues of how to clean up when closing a node while another node still has locks on its buffers, which need to be worked out. > - split mp_image to colorspace descriptor (see thread on this list) > and buffer descriptor (stride, pointers), maybe a 3rd part containing > frame descriptor (frame/field flags, timestamp, etc so info related to > the visual content of the image, not the phisical buffer itself, so > linear converters (colorspace conf, scale, expand etc) could simply > passthru this info and change buffer desc only) Done. > - correct support for slices (note there are 2 kind of strides: one > when you call next filter's draw_slice after each slice rendering > to next vf's buffer completed, and the other type is when you have > own small buffer where one slice overwrites the previous one) Mostly done. See slices thread regarding various issues on slice restrictions which we may need to consider. 
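(For illustration, roughly how a source node might drive that smart-slice path -- vp_attach_slices and vp_commit_slice are the names from the slices proposal, everything else below is invented and the exact signatures are guesses:)

/* sketch only, not real G2 code */
static void source_render_frame(vp_link_t *out, codec_t *c, int w, int h)
{
    mp_image_t *img = vp_get_image(out, BUFTYPE_AUTO);    /* or an EXPORT image */
    int use_slices = (vp_attach_slices(out, img) == 0);   /* destination may refuse */

    for (int y = 0; y < h; y += 16) {
        int band = (h - y < 16) ? h - y : 16;
        codec_decode_band(c, img, y, band);                /* render 16 lines into img */
        if (use_slices)
            vp_commit_slice(img, 0, y, w, band);           /* notify the destination */
    }
    vp_pass_image(out, img);                               /* hand the finished frame on */
}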
> - somehow solve framedropping support > (now its near impossible in g2, as you hav eto decode and pass a > frame through the vf layer to get its timestamp, to be used to > decide if you drop it, but then it's already too late to drop) If duration is valid, you have an easy solution! Present frame's duration is next frame's rel_pts! Otherwise you're pretty much out of luck. It would be possible to put a framedropping filter at an arbitrary place in the video pipeline and have the calling app control it, but I expect bad results. IMO, the best solution is to use duration when it's valid, and otherwise, wait to start dropping frames until we're already behind schedule. So, I consider this issue closed. :) > i think the new vf layer is the key of near everything. I still like this line. :)))))))) Rich From joey at nicewarrior.org Thu Dec 18 02:39:54 2003 From: joey at nicewarrior.org (Joey Parrish) Date: Wed, 17 Dec 2003 19:39:54 -0600 Subject: [MPlayer-G2-dev] vp layer and config In-Reply-To: <20031218000449.GZ7833@brightrain.aerifal.cx> References: <20031215090728.GA13461@brightrain.aerifal.cx> <200312150913.hBF9D2aD001616@mail.mplayerhq.hu> <20031218000449.GZ7833@brightrain.aerifal.cx> Message-ID: <20031218013953.GA3042@nicewarrior.org> On Wed, Dec 17, 2003 at 07:04:49PM -0500, D Richard Felker III wrote: > > i think the new vf layer is the key of near everything. > > I still like this line. :)))))))) I agree fully. The vp layer is what I'm waiting for before I begin G2 development. It's hard for me to think about writing a UI or a VO or a simple codec when I don't know what the right api will end up being. But with vp finished, I can start clean ports of VO drivers, all that GIF code nobody else uses, etc. :) Most of what I want to accomplish in MPlayer can't be done in G1. So my windows packages accumulate one cheap hack after another, I write more and more code that should _NEVER_ end up in CVS, etc. If G2 were stable, I don't think I'd give G1 another thought. In short, finish vp soon, please. :) Love and cola, --Joey -- "I know Kung Fu." --Darth Vader From dalias at aerifal.cx Thu Dec 18 05:35:52 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Wed, 17 Dec 2003 23:35:52 -0500 Subject: [MPlayer-G2-dev] vp layer and config In-Reply-To: <20031218013953.GA3042@nicewarrior.org> References: <20031215090728.GA13461@brightrain.aerifal.cx> <200312150913.hBF9D2aD001616@mail.mplayerhq.hu> <20031218000449.GZ7833@brightrain.aerifal.cx> <20031218013953.GA3042@nicewarrior.org> Message-ID: <20031218043552.GB7833@brightrain.aerifal.cx> On Wed, Dec 17, 2003 at 07:39:54PM -0600, Joey Parrish wrote: > On Wed, Dec 17, 2003 at 07:04:49PM -0500, D Richard Felker III wrote: > > > i think the new vf layer is the key of near everything. > > > > I still like this line. :)))))))) > > I agree fully. The vp layer is what I'm waiting for before I begin > G2 development. It's hard for me to think about writing a UI or a VO > or a simple codec when I don't know what the right api will end up > being. But with vp finished, I can start clean ports of VO drivers, > all that GIF code nobody else uses, etc. :) > > Most of what I want to accomplish in MPlayer can't be done in G1. > So my windows packages accumulate one cheap hack after another, > I write more and more code that should _NEVER_ end up in CVS, etc. > If G2 were stable, I don't think I'd give G1 another thought. Could you please explain, either here or in a new thread on this list, what some of the major limitations you've found in G1 are? 
I want to make sure we avoid the same limitations in G2, and although I've been thinking a lot about all the stuff that could go wrong, I'm sure I've missed a few things. > In short, finish vp soon, please. :) Yes, I'd like to... It keeps getting bigger and bigger. When it started I was just looking at replacing the _filter_ layer, but then I realized that the decoder and vo layers were a huge mess too and could be simplified a lot by using the common vp api. And they needed to be rewritten anyway to get rid of the horrible old buffer system and replace it with the new one. > Love and cola, :))) > --Joey Rich From joey at nicewarrior.org Thu Dec 18 19:16:50 2003 From: joey at nicewarrior.org (Joey Parrish) Date: Thu, 18 Dec 2003 12:16:50 -0600 Subject: [MPlayer-G2-dev] G1 limitations Message-ID: <20031218181650.GA5472@nicewarrior.org> Hello, My thoughts on G1 limitations... Well, the biggest ones to me are the lack of certain runtime features. For example, switching audio/video/subtitle tracks on the fly, or seeking in DVD's by title/chapter. Another great thing would be DVD menu support. I wrote a hack to switch DVD audio and subtitle at runtime, but was told that this will never be CVS because G1 can't reinit audio codecs. Luckily, none of my DVDs have audio tracks in different codecs. :) I also want to design a clean native windows GUI for MPlayer, but digging through G1's mplayer.c is very overwhelming. G2 makes this easier as well. I would like to change -vo gif89a into an encoder for MEncoder. It makes much more sense there. Why would I want to transcode from one format to another in MPlayer's vo? Because I believe MEncoder is too much of a hassle to add a new container format. Hopefully G2 will provide a solution to this too. I don't think that any of my complaints on G1 are new or have not been discussed already about G2. It's just basically a list of everything I ever tried to do in G1 but couldn't figure out, as well as everything I did in G1 but was told "This will never be in CVS, because there's no right way to do it," or "Wait for G2." :) Oh, and one more thing on my G2 wishlist. Please provide complete documentation for every API very early on. (As soon as they are mostly stable, perhaps.) It will be extremely encouraging to potential developers. --Joey -- "Living in the complex world of the future is somewhat like having bees live in your head. But, there they are." From dalias at aerifal.cx Thu Dec 18 21:03:40 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Thu, 18 Dec 2003 15:03:40 -0500 Subject: [MPlayer-G2-dev] G1 limitations In-Reply-To: <20031218181650.GA5472@nicewarrior.org> References: <20031218181650.GA5472@nicewarrior.org> Message-ID: <20031218200340.GA21908@brightrain.aerifal.cx> On Thu, Dec 18, 2003 at 12:16:50PM -0600, Joey Parrish wrote: > Hello, > > My thoughts on G1 limitations... Well, the biggest ones to me are the > lack of certain runtime features. For example, switching > audio/video/subtitle tracks on the fly, or seeking in DVD's by > title/chapter. This shouldn't be too difficult, once the new _audio_ layer is done... :)))))) Seriously, though, I think the new audio layer can wait for a little bit. Part of the implementation philosophy for G2 is to replace components one at a time until it's all new and all the broken G1 stuff is gone. Video is much more complicated and much more important; the audio redesign should be fairly easy when it comes time. > Another great thing would be DVD menu support. 
I wrote > a hack to switch DVD audio and subtitle at runtime, but was told that > this will never be CVS because G1 can't reinit audio codecs. Luckily, > none of my DVDs have audio tracks in different codecs. :) I'm still not sure how to do this correctly, but IMO it needs to be there. Actually I don't like DVD menus at all, but some people do, and (here's the real reason ;) I'd also like MPlayer to be able to play flash movies, which require similar user interaction. Unfortunately this involves a few considerations in the vp layer, mainly the ability to "freeze" the chain, notifying filters and the calling app that no new images will be available until further user interaction. Sort of a "temporary eof" condition. > I also want to design a clean native windows GUI for MPlayer, but > digging through G1's mplayer.c is very overwhelming. G2 makes this > easier as well. Agree totally. > I would like to change -vo gif89a into an encoder for MEncoder. It > makes much more sense there. Why would I want to transcode from one > format to another in MPlayer's vo? Because I believe MEncoder is too > much of a hassle to add a new container format. Hopefully G2 will > provide a solution to this too. I agree here too. I always found the output-to-file vo's in mplayer rather silly, but like you say, mencoder was too ugly and broken to add new features... > I don't think that any of my complaints on G1 are new or have not been > discussed already about G2. It's just basically a list of everything I > ever tried to do in G1 but couldn't figure out, as well as everything > I did in G1 but was told "This will never be in CVS, because there's no > right way to do it," or "Wait for G2." :) OK, well thanks anyway for spelling it out clearly. > Oh, and one more thing on my G2 wishlist. Please provide complete > documentation for every API very early on. (As soon as they are mostly > stable, perhaps.) It will be extremely encouraging to potential > developers. Yes. Unfortunately there are lots of APIs for vp layer, but you can get started with just the basics if you don't want to do fancy direct rendering/slices setups. I'll write up a doc based on my emails to this list and my own notes as soon as it's finished. I intend for it to be _very_ detailed and complete. Rich From dalias at aerifal.cx Fri Dec 19 09:44:48 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Fri, 19 Dec 2003 03:44:48 -0500 Subject: [MPlayer-G2-dev] Limitations in vo2 api :( Message-ID: <20031219084448.GG7833@brightrain.aerifal.cx> Hey, Arpi. Libvo2 is your baby, so I need some help overcoming limitations in libvo2 so that I can write the vp wrapper for it. The problem is that the vo driver just exports an array of buffers without any information about which ones are in use internally and expects the wrapper to manage that information. This is _somewhat_ possible, except that the wrapper doesn't have enough info to go on. For example, vo_mga with triple buffering has three buffer states: unused, displayed, and pending-display. Since a maximum of only four buffers are available (and currently only 3 are allowed!!), we need to make sure we make very good use of them! I believe it's possible to tell from the hardware (dunno whether mga_vid module implements this tho) whether the pending-display buffer has actually been displayed yet. Knowing this is important since it means we can reuse the previously displayed buffer. 
Without this knowledge, we're either forced to give up direct rendering (and maybe slices too), or else we run the risk of _tearing_ and out-of-order display if the cpu is too fast! Consider the following example: A: displayed B: pending-display C: free The codec decodes a P-frame, slice-rendering it into buffer C, then needs to decode a B-frame to display before it. Provided the cpu isn't too fast and buffer B has already been displayed, we can put the B-frame in buffer A. But if buffer A is still visible, very bad things will happen! There are even worse things that can happen if the movie fps is greater than the video refresh rate, but anyone running the player in that situation is stupid to begin with... The above example can be solved by having the wrapper layer remember the last 2 buffers it showed, and refusing to use those. However, this will disable direct rendering of B-frames unless you have 4 buffers available. So, what kind of solution would I like?? I'm not really sure. :( I tend to think that the vp layer should call the vo layer's get_buffer and release_buffer every time it wants to use buffers, rather than just getting them all at the beginning. This way, the vo driver would be free to decide when it can or can't give you buffers. But right now, the vo drivers are NOT written to support that sort of use. For example, the x11 driver allocates an image when you call get_buffer, and frees it when you call release_buffer -- very inefficient! Please comment on what you'd like to do. IMO this isn't a fundamental design issue, but it will probably require changes to the vp/vo wrapper interface to be made at some point. Rich From kinali at gmx.net Fri Dec 19 12:08:43 2003 From: kinali at gmx.net (Attila Kinali) Date: Fri, 19 Dec 2003 12:08:43 +0100 Subject: [MPlayer-G2-dev] Limitations in vo2 api :( In-Reply-To: <20031219084448.GG7833@brightrain.aerifal.cx> References: <20031219084448.GG7833@brightrain.aerifal.cx> Message-ID: <20031219120843.191ee139.kinali@gmx.net> Heyo, (First: dont assume that i understand anything about the vp/vo layer at all. This here is just a random blurb) On Fri, 19 Dec 2003 03:44:48 -0500 D Richard Felker III wrote: > I tend to think that the vp layer should call the vo layer's > get_buffer and release_buffer every time it wants to use buffers, > rather than just getting them all at the beginning. This way, the vo > driver would be free to decide when it can or can't give you buffers. > But right now, the vo drivers are NOT written to support that sort of > use. For example, the x11 driver allocates an image when you call > get_buffer, and frees it when you call release_buffer -- very > inefficient! Imho this can be easily solved on x11's side by allocating some buffers (X11images) before hand and manage them with get/release_buffer. Every X11 programming book tells you anyways to allocate everything you need at the beginning of your programm to allow caching at the server side. Also, as i already said on irc, we should IMHO integrate the vo api into the vp and drop it. It would get us one api less to document/learn and also remove some duplicated work as a vo modules is much like a vf w/o an output. 
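(A minimal sketch of that kind of pre-allocated pool -- purely illustrative, the helper names are invented, error handling and XShm are left out:)

#include <stdlib.h>
#include <X11/Xlib.h>

#define POOL_SIZE 4

static XImage *pool[POOL_SIZE];
static int     busy[POOL_SIZE];

/* allocate all the XImages once, at config time */
static void pool_init(Display *dpy, Visual *vis, int depth, int w, int h)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        char *data = malloc((size_t)w * h * 4);   /* assumes a 32bpp format */
        pool[i] = XCreateImage(dpy, vis, depth, ZPixmap, 0, data, w, h, 32, 0);
        busy[i] = 0;
    }
}

static XImage *x11_get_buffer(void)
{
    for (int i = 0; i < POOL_SIZE; i++)
        if (!busy[i]) { busy[i] = 1; return pool[i]; }
    return NULL;                                  /* pool exhausted, caller must cope */
}

static void x11_release_buffer(XImage *img)
{
    for (int i = 0; i < POOL_SIZE; i++)
        if (pool[i] == img) busy[i] = 0;
}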
Attila Kinali -- egp ist vergleichbar mit einem ikea bausatz fuer flugzeugtraeger -- reeler in +kaosu From dalias at aerifal.cx Fri Dec 19 13:38:58 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Fri, 19 Dec 2003 07:38:58 -0500 Subject: [MPlayer-G2-dev] VP layer progress Message-ID: <20031219123858.GH7833@brightrain.aerifal.cx> In the interest of promoting discussion, I'd like to open up my in-progress VP code for public viewing. Keep in mind this is under constant revision, and probably doesn't compile, and would earn me crates full of cola if it were actually used in some sort of release. :)))) http://brightrain.aerifal.cx/~dalias/vp-in-progress/ I would recommend looking mainly at the API docs, vp.h, vf_pullup.c (essentially finished), and vd_ffmpeg.c (almost finished). If you're interested in the internals, you could look at vp.c too. The vo2 wrapper is mostly incomplete, and might change completely anyway. Please send comments (or questions, if you don't understand) to the list. Naturally the code by itself doesn't tell you what you can _do_ with it. Rich From andrej at lucky.net Fri Dec 19 15:49:11 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Fri, 19 Dec 2003 16:49:11 +0200 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031219120843.191ee139.kinali@gmx.net> References: <20031219084448.GG7833@brightrain.aerifal.cx> <20031219120843.191ee139.kinali@gmx.net> Message-ID: <20031219144911.GB70845@lucky.net> Hi, Attila Kinali! Sometime (on Friday, December 19 at 13:19) I've received something... >(First: dont assume that i understand anything about the vp/vo layer >at all. This here is just a random blurb) >On Fri, 19 Dec 2003 03:44:48 -0500 >D Richard Felker III wrote: >> I tend to think that the vp layer should call the vo layer's >> get_buffer and release_buffer every time it wants to use buffers, >> rather than just getting them all at the beginning. This way, the vo >> driver would be free to decide when it can or can't give you buffers. >> But right now, the vo drivers are NOT written to support that sort of >> use. For example, the x11 driver allocates an image when you call >> get_buffer, and frees it when you call release_buffer -- very >> inefficient! >Imho this can be easily solved on x11's side by allocating some >buffers (X11images) before hand and manage them with get/release_buffer. >Every X11 programming book tells you anyways to allocate everything >you need at the beginning of your programm to allow caching at the >server side. >Also, as i already said on irc, we should IMHO integrate the >vo api into the vp and drop it. It would get us one api less >to document/learn and also remove some duplicated work as >a vo modules is much like a vf w/o an output. In addition to that we will have unified API for muxer too, i.e. muxer will be "connected" to vp API the same way as vo and this will help a lot. :) Encoder software will just make that "connection" and do a-v sync. I think it may be done alike mplayer do it but instead of real time for sync it must be audio PTS. So add my vote for unifying vo and vp APIs. :) With best wishes. Andriy. 
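(Roughly the per-frame check that implies -- like the usual player sync, but against the audio PTS instead of the wall clock; ao_get_pts and the other names are invented, this is only a sketch:)

/* audio pts is the master clock */
double delay = next_frame_pts - ao_get_pts(ao);    /* how early this frame is */
if (delay > 0)
    usleep((unsigned)(delay * 1e6));               /* wait until it's time to show it */
else if (delay < -frame_duration)
    drop_this_frame = 1;                           /* more than a frame late: drop */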
From dalias at aerifal.cx Fri Dec 19 21:10:53 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Fri, 19 Dec 2003 15:10:53 -0500 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031219144911.GB70845@lucky.net> References: <20031219084448.GG7833@brightrain.aerifal.cx> <20031219120843.191ee139.kinali@gmx.net> <20031219144911.GB70845@lucky.net> Message-ID: <20031219201053.GJ7833@brightrain.aerifal.cx> On Fri, Dec 19, 2003 at 04:49:11PM +0200, Andriy N. Gritsenko wrote: > >Also, as i already said on irc, we should IMHO integrate the > >vo api into the vp and drop it. It would get us one api less > >to document/learn and also remove some duplicated work as > >a vo modules is much like a vf w/o an output. > > In addition to that we will have unified API for muxer too, i.e. > muxer will be "connected" to vp API the same way as vo and this will > help a lot. :) Encoder software will just make that "connection" and > do a-v sync. I think it may be done alike mplayer do it but instead of > real time for sync it must be audio PTS. > So add my vote for unifying vo and vp APIs. :) I'm still not sure about this. There are two sides to it: On the one hand, unified api means fewer apis to learn and document, less wrapper code, and simpler implementation for the player app. On the other, having separate apis means you can take the layers apart and use one without the other. For instance, demuxer and muxer layer could be used without any codecs or video processing to repair broken files, or to move data from mkv container to nut container (hey, I guess that also qualifies as repairing broken files :)))). Actually I'm somewhat inclined to integrate more into the vp layer directly, but certainly not until after the audio subsystem has been overhauled too. And I _don't_ want a repeat of the "let's make everything part of the config layer!!" fiasco... IMO the vp/vo/muxer[/demuxer?] integration is only appropriate if it can be done at a level where the parts share a common api, but don't in any way depend on one another -- so they can still all be used independently. Rich From dalias at aerifal.cx Sat Dec 20 06:56:32 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Sat, 20 Dec 2003 00:56:32 -0500 Subject: [MPlayer-G2-dev] PTS and AVI files, bleh! Message-ID: <20031220055632.GM7833@brightrain.aerifal.cx> Could someone who understands it (probably this means Arpi, sorry to keep asking you questions!) explain to me how one obtains PTS from AVI files? I'm reading the G2 source, but it's very confusing. I'm working on interfacing codecs with the VP layer, and trying to write the code for handling the PTS from the demuxers... Rich From dalias at aerifal.cx Sat Dec 20 09:18:40 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Sat, 20 Dec 2003 03:18:40 -0500 Subject: [MPlayer-G2-dev] PTS and AVI files, bleh! In-Reply-To: <20031220055632.GM7833@brightrain.aerifal.cx> References: <20031220055632.GM7833@brightrain.aerifal.cx> Message-ID: <20031220081840.GN7833@brightrain.aerifal.cx> On Sat, Dec 20, 2003 at 12:56:32AM -0500, D Richard Felker III wrote: > Could someone who understands it (probably this means Arpi, sorry to > keep asking you questions!) explain to me how one obtains PTS from AVI > files? I'm reading the G2 source, but it's very confusing. I'm working > on interfacing codecs with the VP layer, and trying to write the code > for handling the PTS from the demuxers... 
Hmm, I'm starting to understand it more, and I must say AVI is the most horrible abomination of a file format ever created!! It's quite miraculous that MPlayer is able to play these things at all. Wait, I take that back....I just read demux_mkv.cpp. ;))))) Rich From dalias at aerifal.cx Sat Dec 20 09:34:34 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Sat, 20 Dec 2003 03:34:34 -0500 Subject: [MPlayer-G2-dev] Recommendations for DEMUXER layer Message-ID: <20031220083434.GO7833@brightrain.aerifal.cx> I've been reading some demuxer code to figure out how pts is computed for various demuxers, in order to understand how it needs to be handled by the new video (and eventually new audio!) layer. In the process, I've come up with a few recommendations for changes. 1. Some demuxers, such as AVI, seek into the middle of an audio chunk without understanding audio packet boundaries at all (because the container format sucks too much to distinguish packets), forcing the decoder to recover. This also means (a) the demuxer will output a broken packet, which is bad if you just want to remux without using any codecs, and (b) pts is no longer exact, only approximate, which IMO sucks really bad. My recommendation would be to _always_ seek to a boundary the demuxer understands. That way you have exact pts, and no broken packets for the decoder or muxer to deal with. The demuxer can skip video frames up to the next keyframe (the point you were trying to seek to) and the audio pipeline can skip the audio _after_ decoding it so that it can keep track of the exact number of samples. (Since audio decoding is very fast, this should not impact performance when seeking.) 2. After seeking, demuxers call resync_audio_stream, which depends on there being an audio decoder! I found this problem a long time ago while adding seeking support to mencoder: it was crashing with -oac copy! It's bad because it makes the demuxer layer dependent on the codec layer. My recommendation is to eliminate resync_audio_stream, and instead just report a discontinuity the next time the demuxer stream is read. That way the codec, if one exists, can decide what to do when it reads from the demuxer, without having to use a callback from the demuxer layer to the codec. Also, resync should become unnecessary for most codecs if my above seeking recommendation is implemented. Please comment on whether you think these changes are acceptible. Rich From dalias at aerifal.cx Sat Dec 20 10:15:27 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Sat, 20 Dec 2003 04:15:27 -0500 Subject: [MPlayer-G2-dev] Recommendations for DEMUXER layer In-Reply-To: <20031220083434.GO7833@brightrain.aerifal.cx> References: <20031220083434.GO7833@brightrain.aerifal.cx> Message-ID: <20031220091526.GP7833@brightrain.aerifal.cx> OK, one more... 3. PTS handling is really bogus in some demuxers. Sometimes ds->pts is scaled by rate_d, sometimes not. My recommendation is for pts to always be in units of rate_d/rate_m. This is consistent with the design of NUT, which was made for non-idiotic handling of pts. If a demuxer really needs units of 1/rate_m, it should set rate_d to 1 (like mpeg does). In any case, the vp layer uses time bases like NUT. To pick a good time base, it needs to know _correct_ values for rate_d and rate_m. Not only is this important for filters that want to adjust timestamps; it also matters so that mencoder-g2's muxers can auto-select a time base that won't waste bits! 
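(In code, assuming rate_m/rate_d are the rational rate in the AVI dwRate/dwScale sense, so that pts counts ticks of rate_d/rate_m seconds each -- a sketch, not actual G2 code:)

#include <stdint.h>

static inline double pts_to_seconds(int64_t pts, unsigned rate_m, unsigned rate_d)
{
    return (double)pts * rate_d / rate_m;          /* one tick = rate_d/rate_m seconds */
}

/* rescale into another time base of tb_num/tb_denom seconds per tick,
   rounding to nearest, so a muxer can pick whatever base wastes fewest bits */
static inline int64_t pts_rescale(int64_t pts, unsigned rate_m, unsigned rate_d,
                                  unsigned tb_num, unsigned tb_denom)
{
    int64_t num = pts * (int64_t)rate_d * tb_denom;
    int64_t den = (int64_t)rate_m * tb_num;
    return (num + den / 2) / den;
}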
If you insist on leaving rate_d and rate_m as-is, I can just divide pts by rate_d myself....but this will give inaccurate timestamps for audio with AVI when -nobps is used! BTW, I think AVI is the only demuxer that uses rate_d!=1, so this is a fairly trivial change. Rich From andrej at lucky.net Sat Dec 20 11:26:23 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Sat, 20 Dec 2003 12:26:23 +0200 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031219201053.GJ7833@brightrain.aerifal.cx> References: <20031219084448.GG7833@brightrain.aerifal.cx> <20031219120843.191ee139.kinali@gmx.net> <20031219144911.GB70845@lucky.net> <20031219201053.GJ7833@brightrain.aerifal.cx> Message-ID: <20031220102623.GA7397@lucky.net> Hi, D Richard Felker III! Sometime (on Friday, December 19 at 21:58) I've received something... >On Fri, Dec 19, 2003 at 04:49:11PM +0200, Andriy N. Gritsenko wrote: >> >Also, as i already said on irc, we should IMHO integrate the >> >vo api into the vp and drop it. It would get us one api less >> >to document/learn and also remove some duplicated work as >> >a vo modules is much like a vf w/o an output. >> >> In addition to that we will have unified API for muxer too, i.e. >> muxer will be "connected" to vp API the same way as vo and this will >> help a lot. :) Encoder software will just make that "connection" and >> do a-v sync. I think it may be done alike mplayer do it but instead of >> real time for sync it must be audio PTS. >> So add my vote for unifying vo and vp APIs. :) >I'm still not sure about this. There are two sides to it: >On the one hand, unified api means fewer apis to learn and document, >less wrapper code, and simpler implementation for the player app. >On the other, having separate apis means you can take the layers apart >and use one without the other. For instance, demuxer and muxer layer >could be used without any codecs or video processing to repair broken >files, or to move data from mkv container to nut container (hey, I >guess that also qualifies as repairing broken files :)))). >Actually I'm somewhat inclined to integrate more into the vp layer >directly, but certainly not until after the audio subsystem has been >overhauled too. And I _don't_ want a repeat of the "let's make >everything part of the config layer!!" fiasco... >IMO the vp/vo/muxer[/demuxer?] integration is only appropriate if it >can be done at a level where the parts share a common api, but don't >in any way depend on one another -- so they can still all be used >independently. Fully agree with that. There must be common (stream-independent) part of the API - that part will be used by any layer and it must contain such things as stream/codec type, PTS, duration, some frame/buffer pointer, control and config proc pointers, and may be some others. Layer-specific data (such as audio, video, subs, menu, etc. specific) must be specified only in that layer. This way we could manipulate "connections" from streams to muxer in some generic way and be free to have any number of audios/videos/subs in resulting file. Also when we have some common part then wrapper may use only that common part and it'll be as simple as possible and player/encoder don't need to know layer internals and will be simpler too. This includes your example above about muxer/demuxer without any codecs too. :) Also when we have common part in some common API then we'll document that common part only once and specific parts also once so it'll reduce API documentation too. :) With best wishes. Andriy. 
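(A rough sketch of what such a common part could look like -- none of these names are real G2 structures, the fields just mirror the list above:)

typedef struct stream_common stream_common_t;

struct stream_common {
    int      media_type;     /* video / audio / sub / menu ... */
    unsigned codec_fourcc;   /* stream/codec type */
    double   pts;            /* presentation time of the current unit */
    double   duration;
    void    *frame;          /* opaque frame/buffer pointer */
    int    (*control)(stream_common_t *s, int cmd, void *arg);
    int    (*config)(stream_common_t *s, void *params);
};

/* a layer-specific stream would then embed it, e.g.
   struct video_stream { stream_common_t common; int w, h, stride; ... }; */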
From dalias at aerifal.cx Sat Dec 20 12:11:42 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Sat, 20 Dec 2003 06:11:42 -0500 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031220102623.GA7397@lucky.net> References: <20031219084448.GG7833@brightrain.aerifal.cx> <20031219120843.191ee139.kinali@gmx.net> <20031219144911.GB70845@lucky.net> <20031219201053.GJ7833@brightrain.aerifal.cx> <20031220102623.GA7397@lucky.net> Message-ID: <20031220111142.GQ7833@brightrain.aerifal.cx> On Sat, Dec 20, 2003 at 12:26:23PM +0200, Andriy N. Gritsenko wrote: > >IMO the vp/vo/muxer[/demuxer?] integration is only appropriate if it > >can be done at a level where the parts share a common api, but don't > >in any way depend on one another -- so they can still all be used > >independently. > > Fully agree with that. I don't think you were agreeing with what I said, but with something different... > There must be common (stream-independent) part > of the API - that part will be used by any layer and it must contain such > things as stream/codec type, PTS, duration, some frame/buffer pointer, > control and config proc pointers, and may be some others. Layer-specific > data (such as audio, video, subs, menu, etc. specific) must be specified > only in that layer. Eh? Are you saying to use a common transport/pipeline api for audio, video, and subs??? This is nonsense!! They have _very_ different requirements. That's not to say you can't interconnect them; you could have a module that serves both as a video and audio node. The most common case for this would be a 'visualization' effect for music. Another more esoteric but fun example is:

vd ----> vf_split ----------------------------------------> vo
          \                                   [video layer]
           `---> vf/sf_ocr
                  \                        [subtitle layer]
                   `--> sf/af_speechsynth
                         \
                          \                   [audio layer]
                           `--> af_merge -------------> ao
                                  /`
ad ----------------------------------'

While this looks nice in my pretty ascii diagram, the truth of the matter is that audio, video, and text are nothing alike, and can't be passed around via the same interfaces. For example, video requires ultra-efficient buffering to avoid wasted copies, and comes in discrete frames. Text/subtitles are very small and can be passed around in any naive way you want. Audio should be treated as a continuous stream, with automatic buffering between filters. > This way we could manipulate "connections" from streams to muxer in > some generic way and be free to have any number of audios/videos/subs in > resulting file. The idea is appealing, but I don't think it can be done... If you have a suggestion that's not a horrible hack, please make it. > Also when we have some common part then wrapper may use only that > common part and it'll be as simple as possible and player/encoder don't > need to know layer internals and will be simpler too. This includes your > example above about muxer/demuxer without any codecs too. :) > Also when we have common part in some common API then we'll document > that common part only once and specific parts also once so it'll reduce > API documentation too. :) And now we can be slow like Xine!! :)))) Rich From attila at kinali.ch Sat Dec 20 12:00:26 2003 From: attila at kinali.ch (Attila Kinali) Date: Sat, 20 Dec 2003 12:00:26 +0100 Subject: [MPlayer-G2-dev] RCS for G2 Message-ID: <20031220120026.5a0ee320.attila@kinali.ch> Heyo People, I think it's time to put G2 onto a RCS.
Currently there are a few developments going on, afaik Arpi, Rich and Alex are working on G2 but very little of their work is seen because there is no place they can put it. IMHO it would also help to attract other developers to review the already existing code and to contribute on the current development. So, the question is what shall we use, currently there are a few different systems around: cvs, bk, svn, tla, darcs. cvs has some known problems (which we currently workaround) and limited abilities in mirroring or distributed development. bk has proven its quality in the kernel development, but its biggest disadvantage is its non-free license that'll force some developers out of the project. svn seems to solve most problems with cvs, is free, allows simple mirroring, branching and distributed development. The client usage is quite similar to cvs which helps in getting used to it (didnt need more than 15m for me) But its server is quite bloated imho. tla has some nice concepts and aims in the same direction as svn, but it is still unfinished and has IMHO some design problems that can leed to problems in the future. i can't say much about darcs as i just heard about it yesterday and didn't had but a short look at the docs. It seems to be similar to svn and tla. Currently i vote for using svn as it seems to be superior to all others and it's quit mature. I'm also using it for my personal projects for a few weeks and didnt run in any problems so far. Attila Kinali -- egp ist vergleichbar mit einem ikea bausatz fuer flugzeugtraeger -- reeler in +kaosu From dalias at aerifal.cx Sat Dec 20 12:40:54 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Sat, 20 Dec 2003 06:40:54 -0500 Subject: [MPlayer-G2-dev] RCS for G2 In-Reply-To: <20031220120026.5a0ee320.attila@kinali.ch> References: <20031220120026.5a0ee320.attila@kinali.ch> Message-ID: <20031220114054.GS7833@brightrain.aerifal.cx> On Sat, Dec 20, 2003 at 12:00:26PM +0100, Attila Kinali wrote: > Heyo People, > > I think it's time to put G2 onto a RCS. Not quite yet, IMO. No sense in lots of people committing code until the new cores are ready. > Currently there are a few developments going on, > afaik Arpi, Rich and Alex are working on G2 but very > little of their work is seen because there is no place > they can put it. Actually it's just not done... Even if we had RCS, I wouldn't commit this code yet since it doesn't work. Only commit _after_ the code works... :) > cvs has some known problems (which we currently workaround) > and limited abilities in mirroring or distributed development. I don't think we care about those. The main problem with cvs is that it can't handle keeping revision history when a file is moved (without ugly hacks). > bk has proven its quality in the kernel development, but > its biggest disadvantage is its non-free license that'll > force some developers out of the project. I don't see how it's proven quality. It's just proven that people sold-out to commercial interests like it... > svn seems to solve most problems with cvs, is free, allows > simple mirroring, branching and distributed development. > The client usage is quite similar to cvs which helps in > getting used to it (didnt need more than 15m for me) > But its server is quite bloated imho. The use of Berkeley DB as a backend is seen as a serious negative by some developers. > tla has some nice concepts and aims in the same direction > as svn, but it is still unfinished and has IMHO some design > problems that can leed to problems in the future. 
Not familiar with it. > i can't say much about darcs as i just heard about it > yesterday and didn't had but a short look at the docs. > It seems to be similar to svn and tla. IMO it's worth checking into a bit. > Currently i vote for using svn as it seems to be superior > to all others and it's quit mature. I'm also using it > for my personal projects for a few weeks and didnt > run in any problems so far. My vote is for cvs, but I'm not opposed to svn. I am _strongly_ opposed to bk!!! Rich From moritz at bunkus.org Sat Dec 20 13:07:42 2003 From: moritz at bunkus.org (Moritz Bunkus) Date: Sat, 20 Dec 2003 13:07:42 +0100 Subject: [MPlayer-G2-dev] PTS and AVI files, bleh! In-Reply-To: <20031220081840.GN7833@brightrain.aerifal.cx> References: <20031220055632.GM7833@brightrain.aerifal.cx> <20031220081840.GN7833@brightrain.aerifal.cx> Message-ID: <20031220120742.GP6694@bunkus.org> Hi, > Wait, I take that back....I just read demux_mkv.cpp. ;))))) ;) -- If Darl McBride was in charge, he'd probably make marriage unconstitutional too, since clearly it de-emphasizes the commercial nature of normal human interaction, and probably is a major impediment to the commercial growth of prostitution. - Linus Torvalds From dalias at aerifal.cx Sat Dec 20 13:25:50 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Sat, 20 Dec 2003 07:25:50 -0500 Subject: [MPlayer-G2-dev] PTS and AVI files, bleh! In-Reply-To: <20031220120742.GP6694@bunkus.org> References: <20031220055632.GM7833@brightrain.aerifal.cx> <20031220081840.GN7833@brightrain.aerifal.cx> <20031220120742.GP6694@bunkus.org> Message-ID: <20031220122550.GT7833@brightrain.aerifal.cx> On Sat, Dec 20, 2003 at 01:07:42PM +0100, Moritz Bunkus wrote: > Hi, > > > Wait, I take that back....I just read demux_mkv.cpp. ;))))) > > ;) Hey, 26 #includes and 3157 lines of code?! I have to believe it would be simpler to implement a Matroska demuxer _without_ their libs... > -- > If Darl McBride was in charge, he'd probably make marriage > unconstitutional too, since clearly it de-emphasizes the commercial > nature of normal human interaction, and probably is a major impediment > to the commercial growth of prostitution. - Linus Torvalds :)) Rich From moritz at bunkus.org Sat Dec 20 13:26:38 2003 From: moritz at bunkus.org (Moritz Bunkus) Date: Sat, 20 Dec 2003 13:26:38 +0100 Subject: [MPlayer-G2-dev] PTS and AVI files, bleh! In-Reply-To: <20031220122550.GT7833@brightrain.aerifal.cx> References: <20031220055632.GM7833@brightrain.aerifal.cx> <20031220081840.GN7833@brightrain.aerifal.cx> <20031220120742.GP6694@bunkus.org> <20031220122550.GT7833@brightrain.aerifal.cx> Message-ID: <20031220122638.GR6694@bunkus.org> Heya, > Hey, 26 #includes and 3157 lines of code?! I have to believe it would > be simpler to implement a Matroska demuxer _without_ their libs... Quite likely. Ronlad Bultje from the gstreamer project has written (not yet completed, but nearly so) libs for EBML/Matroska in C, and when I get around to implementing a demuxer for G2 I'll base it upon them. At least the number of includes will go down then ;) Mosu -- If Darl McBride was in charge, he'd probably make marriage unconstitutional too, since clearly it de-emphasizes the commercial nature of normal human interaction, and probably is a major impediment to the commercial growth of prostitution. - Linus Torvalds From andrej at lucky.net Sat Dec 20 13:31:42 2003 From: andrej at lucky.net (Andriy N. 
Gritsenko) Date: Sat, 20 Dec 2003 14:31:42 +0200 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031220111142.GQ7833@brightrain.aerifal.cx> References: <20031219084448.GG7833@brightrain.aerifal.cx> <20031219120843.191ee139.kinali@gmx.net> <20031219144911.GB70845@lucky.net> <20031219201053.GJ7833@brightrain.aerifal.cx> <20031220102623.GA7397@lucky.net> <20031220111142.GQ7833@brightrain.aerifal.cx> Message-ID: <20031220123142.GB96070@lucky.net> Hi, D Richard Felker III! Sometime (on Saturday, December 20 at 12:58) I've received something... >On Sat, Dec 20, 2003 at 12:26:23PM +0200, Andriy N. Gritsenko wrote: >> >IMO the vp/vo/muxer[/demuxer?] integration is only appropriate if it >> >can be done at a level where the parts share a common api, but don't >> >in any way depend on one another -- so they can still all be used >> >independently. >> >> Fully agree with that. >I don't think you were agreeing with what I said, but with something >different... It may be. First of all my native language is too different from English so may be (and it seems they are) some misunderstandings there. And also each person has own way of thinking. :) But I hope we anyways aren't too far one from other. At least we can speak out all and find the best. :) >> There must be common (stream-independent) part >> of the API - that part will be used by any layer and it must contain such >> things as stream/codec type, PTS, duration, some frame/buffer pointer, >> control and config proc pointers, and may be some others. Layer-specific >> data (such as audio, video, subs, menu, etc. specific) must be specified >> only in that layer. >Eh? Are you saying to use a common transport/pipeline api for audio, >video, and subs??? This is nonsense!! They have _very_ different >requirements. Hmm, it seems I've told some misunderstandable again, sorry. I meant not common transport but common API only, i.e. muxer must don't know about video/audio/subs/etc. - encoder program must pass some common API structure to it and so on. ;) I.e. I meant the muxer has no right to know about _any_ layer and must be a fully independent layer. And I think it's possible for the demuxer too. I didn't dig into the demuxer but I don't see any problem there - the demuxer just splits the input file/stream into streams, gets the stream type and PTS (if there is one in the container), and passes all streams to the application. Distinguishing between stream types isn't the demuxer's job but the application's, and the application will pass each stream to the appropriate decoder. The decoder must be layer-specific, so it has to have layer-specific API and be first section of some chunk (audio, video, or some other). The last section of that chunk will be either an ao/vo/sub driver or a codec to encode. The whole chunk must use the same API between all of its sections. That's the only place that is layer-specific. Let's illustrate it in diagramm:

           ,---------> vd ---------> vf ---------> vo
          /   [common]     [video]       [video]
file --> demuxer
          \
           `---------> ad ---------> af ---------> ao
              [common]     [audio]       [audio]

           ,---------> vd -------> vf -------> ovc -.  [common]
          /   [common]     [video]     [video]       \
file --> demuxer                                   muxer --> file
          \                                           /
           `---------> ad -------> af -------> oac -'
              [common]     [audio]     [audio]    [common]

[video] [audio] [common] -- are API structures of video layer, audio layer, and common respectively. I hope now I said it clean enough to be understandable. :) > That's not to say you can't interconnect them; you could >have a module that serves both as a video and audio node.
The most >common case for this would be a 'visualization' effect for music. >Another more esoteric but fun example is: >vd ----> vf_split ----------------------------------------> vo > \ [video layer] > `---> vf/sf_ocr > \ [subtitle layer] > `--> sf/af_speechsynth > \ > \ [audio layer] > `--> af_merge -------------> ao > /` >ad ----------------------------------' >While this looks nice in my pretty ascii diagram, the truth of the >matter is that audio, video, and text are nothing alike, and can't be >passed around via the same interfaces. For example, video requires >ultra-efficient buffering to avoid wasted copies, and comes in >discrete frames. Text/subtitles are very small and can be passed >around in any naive way you want. Audio should be treated as a >continuous stream, with automatic buffering between filters. Yes, I think the same. What I said is just API from vd to vo must be unified (let's say - video stream API) but that API must be inside video layer. Application must call video layer API for "connection" all from vd to vo in some chunk and that's all. Application will know only common data structures of members of the chunk. And here goes the same for audio chunk(s) and any other. Common data structures means application's developers may learn only common API structure and API calls for layers and it's all. Let's make it simpler. ;) >> This way we could manipulate "connections" from streams to muxer in >> some generic way and be free to have any number of audios/videos/subs in >> resulting file. >The idea is appealing, but I don't think it can be done... If you have >a suggestion that's not a horrible hack, please make it. I have that idea in thoughts and I've tried to explain it above. If something isn't clear yet, feel free to ask. I'll glad to explain it. >> Also when we have some common part then wrapper may use only that >> common part and it'll be as simple as possible and player/encoder don't >> need to know layer internals and will be simpler too. This includes your >> example above about muxer/demuxer without any codecs too. :) >> Also when we have common part in some common API then we'll document >> that common part only once and specific parts also once so it'll reduce >> API documentation too. :) >And now we can be slow like Xine!! :)))) I'm not sure about that. :) With best wishes. Andriy. From andrej at lucky.net Sat Dec 20 13:37:01 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Sat, 20 Dec 2003 14:37:01 +0200 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031220123142.GB96070@lucky.net> References: <20031219084448.GG7833@brightrain.aerifal.cx> <20031219120843.191ee139.kinali@gmx.net> <20031219144911.GB70845@lucky.net> <20031219201053.GJ7833@brightrain.aerifal.cx> <20031220102623.GA7397@lucky.net> <20031220111142.GQ7833@brightrain.aerifal.cx> <20031220123142.GB96070@lucky.net> Message-ID: <20031220123701.GC96070@lucky.net> OOPS, my bad English... :/ >layer-specific API and be first section of some chunk (audio, video, or ^chain and elsewhere the same. I'm sorry. Andriy. 
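(A purely hypothetical illustration of the application-level calls such a chunk/chain API might boil down to -- every identifier below is invented:)

/* build the video chunk; the video layer hides everything between vd and vo */
chain_t *vc = chain_new(MEDIA_VIDEO);
chain_append(vc, node_open("vd_ffmpeg", dec_args));   /* decoder                */
chain_append(vc, node_open("vf_scale",  vf_args));    /* filter(s)              */
chain_append(vc, node_open("vo_xv",     vo_args));    /* output, or ovc + muxer */
chain_config(vc);                     /* formats are negotiated inside the layer */

/* the audio chunk is built with the same few calls; everything
   audio-specific stays behind the audio layer's API */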
From dalias at aerifal.cx Sat Dec 20 15:05:22 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Sat, 20 Dec 2003 09:05:22 -0500 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031220123142.GB96070@lucky.net> References: <20031219084448.GG7833@brightrain.aerifal.cx> <20031219120843.191ee139.kinali@gmx.net> <20031219144911.GB70845@lucky.net> <20031219201053.GJ7833@brightrain.aerifal.cx> <20031220102623.GA7397@lucky.net> <20031220111142.GQ7833@brightrain.aerifal.cx> <20031220123142.GB96070@lucky.net> Message-ID: <20031220140522.GX7833@brightrain.aerifal.cx> On Sat, Dec 20, 2003 at 02:31:42PM +0200, Andriy N. Gritsenko wrote: > >I don't think you were agreeing with what I said, but with something > >different... > > It may be. First of all my native language is too different from > English so may be (and it seems they are) some misunderstandings there. No problem. You write in english rather well, actually. > >> There must be common (stream-independent) part > >> of the API - that part will be used by any layer and it must contain such > >> things as stream/codec type, PTS, duration, some frame/buffer pointer, > >> control and config proc pointers, and may be some others. Layer-specific > >> data (such as audio, video, subs, menu, etc. specific) must be specified > >> only in that layer. > > >Eh? Are you saying to use a common transport/pipeline api for audio, > >video, and subs??? This is nonsense!! They have _very_ different > >requirements. > > Hmm, it seems I've told some misunderstandable again, sorry. I meant > not common transport but common API only, i.e. muxer must don't know > about video/audio/subs/etc. - encoder program must pass some common API > structure to it and so on. ;) *snip* > Let's illustrate it in diagramm: *snip* > [video] [audio] [common] -- are API structures of video layer, audio > layer, and common respectively. > > I hope now I said it clean enough to be understandable. :) Yeah, IMO it's a bit simpler than that even, though. For example, the vp (and eventually ap :) layer just uses the demuxer api to get packets from the demuxer. The link is much simpler than links inside the actual video pipeline: the vd or demuxer-wrapper (not sure which we'll use yet) just has a pointer to the demuxer stream, which was given to it by the calling app. > >While this looks nice in my pretty ascii diagram, the truth of the > >matter is that audio, video, and text are nothing alike, and can't be > >passed around via the same interfaces. For example, video requires > >ultra-efficient buffering to avoid wasted copies, and comes in > >discrete frames. Text/subtitles are very small and can be passed > >around in any naive way you want. Audio should be treated as a > >continuous stream, with automatic buffering between filters. > > Yes, I think the same. What I said is just API from vd to vo must be > unified (let's say - video stream API) but that API must be inside video > layer. Application must call video layer API for "connection" all from vd > to vo in some chunk and that's all. Application will know only common > data structures of members of the chunk. And here goes the same for audio > chunk(s) and any other. Common data structures means application's > developers may learn only common API structure and API calls for layers > and it's all. Let's make it simpler. 
;) That's actually an interesting idea: using the same node/link structure for all the different types of pipelines, so that the app only has to know one api for connecting and configuring them. I'm inclined to try something like that, except that it would mean more layers of structures and more levels of indirection... In any case, I'll try to make the api look similar for all three pipelines (video/audio/sub), but I'm just not sure if it's reasonable to use the same structures and functions for setting them up. > >The idea is appealing, but I don't think it can be done... If you have > >a suggestion that's not a horrible hack, please make it. > > I have that idea in thoughts and I've tried to explain it above. If > something isn't clear yet, feel free to ask. I'll glad to explain it. Basically the problem is this. Video pipeline nodes have a structure very specific to the needs of video processing. The links between the vp nodes have structures very specific to image/buffer pool management. As for the audio pipeline, I envision its links having a different sort of nature, managing dynamic shrinking/growing fifo-type buffers between nodes and keeping track of several positions in the buffer. I see two approaches to having a unified api for building the pipelines: 1. Storing the linked-list pointers at the beginning of the structures for each type, and type casting so that they can all be treated the same. IMO this is a very ugly hack. 2. Having a second layer on top of the underlying pipeline layers for the app to use in building the pipelines. This seems somewhat wasteful, but perhaps it's not. > >> Also when we have common part in some common API then we'll document > >> that common part only once and specific parts also once so it'll reduce > >> API documentation too. :) > > >And now we can be slow like Xine!! :)))) > > I'm not sure about that. :) OK, sorry, that was a really cheap flame... :) > With best wishes. > Andriy. Rich From attila at kinali.ch Sat Dec 20 15:19:53 2003 From: attila at kinali.ch (Attila Kinali) Date: Sat, 20 Dec 2003 15:19:53 +0100 Subject: [MPlayer-G2-dev] Information about video coding Message-ID: <20031220151953.3be01369.attila@kinali.ch> Heyo, As i keep asking more and more questions about video coding the last few days, i'd like to ask whether there are any good resources (webpages, papers, books) about video coding, especialy in practical applications not only academic stuff. Thanks Attila Kinali -- egp ist vergleichbar mit einem ikea bausatz fuer flugzeugtraeger -- reeler in +kaosu From dalias at aerifal.cx Sat Dec 20 15:49:22 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Sat, 20 Dec 2003 09:49:22 -0500 Subject: [MPlayer-G2-dev] Information about video coding In-Reply-To: <20031220151953.3be01369.attila@kinali.ch> References: <20031220151953.3be01369.attila@kinali.ch> Message-ID: <20031220144922.GC7833@brightrain.aerifal.cx> On Sat, Dec 20, 2003 at 03:19:53PM +0100, Attila Kinali wrote: > Heyo, > > As i keep asking more and more questions about video coding > the last few days, i'd like to ask whether there are any > good resources (webpages, papers, books) about video coding, > especialy in practical applications not only academic stuff. Actually I don't know any. :) But what are you looking for? There are good (but expensive) resources about video compression and filtering techniques for enhancement and special effects. 
But I seriously doubt you'll find any good documentation on efficient implementation by avoiding unnecessary copies and optimizing the cpu cache (MPlayer basically wrote the book on this, except there is no book, unfortunately, just code :). I'd also be really surprised if you managed to find any decent documents on inverse-telecine. Even good deinterlacing docs are probably rare. IMO the best way to learn is to RTFS, talk to the developers, or flame us to WTFM... Rich From andrej at lucky.net Sat Dec 20 15:58:15 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Sat, 20 Dec 2003 16:58:15 +0200 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031220140522.GX7833@brightrain.aerifal.cx> References: <20031219084448.GG7833@brightrain.aerifal.cx> <20031219120843.191ee139.kinali@gmx.net> <20031219144911.GB70845@lucky.net> <20031219201053.GJ7833@brightrain.aerifal.cx> <20031220102623.GA7397@lucky.net> <20031220111142.GQ7833@brightrain.aerifal.cx> <20031220123142.GB96070@lucky.net> <20031220140522.GX7833@brightrain.aerifal.cx> Message-ID: <20031220145815.GD96070@lucky.net> Hi, D Richard Felker III! Sometime (on Saturday, December 20 at 15:52) I've received something... [...skipped...] >I see two approaches to having a unified api for building the >pipelines: >1. Storing the linked-list pointers at the beginning of the structures > for each type, and type casting so that they can all be treated the > same. IMO this is a very ugly hack. >2. Having a second layer on top of the underlying pipeline layers for > the app to use in building the pipelines. This seems somewhat > wasteful, but perhaps it's not. 3. Have some common structure type stream_t. Then, for example, video stream structure will be: typedef struct { stream_t st; ....... /* rest are video-specific */ } video_stream_t; Then just assuming from application point of view some *vidstr1 is stream_t*, from video filter (demuxer, vo, etc.) point of view it's video_stream_t*. Anyway that video_stream_t may (and must) be created only by video layer API so there isn't any problem with that. :) >> >> Also when we have common part in some common API then we'll document >> >> that common part only once and specific parts also once so it'll reduce >> >> API documentation too. :) >> >And now we can be slow like Xine!! :)))) >> I'm not sure about that. :) >OK, sorry, that was a really cheap flame... :) No problem, I've understood that was a joke. I liked it. :))) With best wishes. Andriy. From pnis at coder.hu Sat Dec 20 16:05:15 2003 From: pnis at coder.hu (Balatoni Denes) Date: Sat, 20 Dec 2003 16:05:15 +0100 Subject: [MPlayer-G2-dev] Information about video coding In-Reply-To: <20031220151953.3be01369.attila@kinali.ch> References: <20031220151953.3be01369.attila@kinali.ch> Message-ID: <200312201605.15380.pnis@coder.hu> Hi! http://www.vcodex.fsnet.co.uk/resources.html h.264 and mpeg4 (detailed) overview On 2003. december 20. 15.19, Attila Kinali wrote: > Heyo, > > As i keep asking more and more questions about video coding > the last few days, i'd like to ask whether there are any > good resources (webpages, papers, books) about video coding, > especialy in practical applications not only academic stuff. 
> > > Thanks > > Attila Kinali > -- > egp ist vergleichbar mit einem ikea bausatz fuer flugzeugtraeger > -- reeler in +kaosu > > _______________________________________________ > MPlayer-G2-dev mailing list > MPlayer-G2-dev at mplayerhq.hu > http://mplayerhq.hu/mailman/listinfo/mplayer-g2-dev From dalias at aerifal.cx Sat Dec 20 16:21:13 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Sat, 20 Dec 2003 10:21:13 -0500 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031220145815.GD96070@lucky.net> References: <20031219084448.GG7833@brightrain.aerifal.cx> <20031219120843.191ee139.kinali@gmx.net> <20031219144911.GB70845@lucky.net> <20031219201053.GJ7833@brightrain.aerifal.cx> <20031220102623.GA7397@lucky.net> <20031220111142.GQ7833@brightrain.aerifal.cx> <20031220123142.GB96070@lucky.net> <20031220140522.GX7833@brightrain.aerifal.cx> <20031220145815.GD96070@lucky.net> Message-ID: <20031220152113.GE7833@brightrain.aerifal.cx> On Sat, Dec 20, 2003 at 04:58:15PM +0200, Andriy N. Gritsenko wrote: > Hi, D Richard Felker III! > > Sometime (on Saturday, December 20 at 15:52) I've received something... > > [...skipped...] > > >I see two approaches to having a unified api for building the > >pipelines: > > >1. Storing the linked-list pointers at the beginning of the structures > > for each type, and type casting so that they can all be treated the > > same. IMO this is a very ugly hack. > > >2. Having a second layer on top of the underlying pipeline layers for > > the app to use in building the pipelines. This seems somewhat > > wasteful, but perhaps it's not. > > 3. Have some common structure type stream_t. Then, for example, video > stream structure will be: > > typedef struct { > stream_t st; > ....... /* rest are video-specific */ > } video_stream_t; > > Then just assuming from application point of view some *vidstr1 is > stream_t*, from video filter (demuxer, vo, etc.) point of view it's > video_stream_t*. Anyway that video_stream_t may (and must) be created > only by video layer API so there isn't any problem with that. :) This is the same as what I said in approach 1. And here's where it gets messy: In order to be useful to the common api layer, the common structure at the beginning needs to contain the pointers for the node's inputs and outputs. Otherwise it won't help the app build chains. So inside here, you have pointers to links. But to what kind of links? They have to just be generic links, not audio or video links. And this means every time a video filter wants to access its links, it has to cast the pointers!! :( Very, very ugly... Rich From andrej at lucky.net Sat Dec 20 17:35:04 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Sat, 20 Dec 2003 18:35:04 +0200 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031220152113.GE7833@brightrain.aerifal.cx> References: <20031219084448.GG7833@brightrain.aerifal.cx> <20031219120843.191ee139.kinali@gmx.net> <20031219144911.GB70845@lucky.net> <20031219201053.GJ7833@brightrain.aerifal.cx> <20031220102623.GA7397@lucky.net> <20031220111142.GQ7833@brightrain.aerifal.cx> <20031220123142.GB96070@lucky.net> <20031220140522.GX7833@brightrain.aerifal.cx> <20031220145815.GD96070@lucky.net> <20031220152113.GE7833@brightrain.aerifal.cx> Message-ID: <20031220163504.GE96070@lucky.net> Hi, D Richard Felker III! Sometime (on Saturday, December 20 at 17:40) I've received something... [...skipped...] >This is the same as what I said in approach 1. 
And here's where it >gets messy: In order to be useful to the common api layer, the common >structure at the beginning needs to contain the pointers for the >node's inputs and outputs. Otherwise it won't help the app build >chains. So inside here, you have pointers to links. But to what kind >of links? They have to just be generic links, not audio or video >links. And this means every time a video filter wants to access its >links, it has to cast the pointers!! :( Very, very ugly... Hmm. And how about to put these pointers in layer-specific part of the structure (outside of common part) while any layer has it's own type? I don't think application anyway will want these pointers since they aren't need for sync and other application-level stuff. If node has two or more inputs or outputs and we'll need to change some partial link then it's sufficient to pass two pointers to members of chain to level's API. I don't see any example when two the same filters may have more than one connection in the same chain so it's easy. API just will have some proc alike (assume that stream_t is common node structure, I don't make smth better yet): stream_t *open_video_filter(char *name); int link_video_chain(stream_t *prev, stream_t *next); int unlink_video_chain(stream_t *prev, stream_t *next); so only application will track all changes within that chain. :) If you thinks about frame processing then it's not a problem at all. Since we leave only pull-way getting frames then each last processing unit (vo/ao/muxer) will have some proc alike: int vo_process_frame(stream_t *vos, double pts, double duration); I understand it's defferent from G1's API but it's simpler and more clean IMHO. I think it also will help a lot to do A-V sync. With best wishes. Andriy. From joey at nicewarrior.org Sat Dec 20 18:17:04 2003 From: joey at nicewarrior.org (Joey Parrish) Date: Sat, 20 Dec 2003 11:17:04 -0600 Subject: [MPlayer-G2-dev] RCS for G2 In-Reply-To: <20031220114054.GS7833@brightrain.aerifal.cx> References: <20031220120026.5a0ee320.attila@kinali.ch> <20031220114054.GS7833@brightrain.aerifal.cx> Message-ID: <20031220171703.GA11990@nicewarrior.org> On Sat, Dec 20, 2003 at 06:40:54AM -0500, D Richard Felker III wrote: > Not quite yet, IMO. No sense in lots of people committing code until > the new cores are ready. I agree, no sense in publishing code when there's not even a stable core API yet. > My vote is for cvs, but I'm not opposed to svn. I am _strongly_ > opposed to bk!!! I second that. I already have and use CVS, and the problems it has are already being solved to some degree by MPHQ. I wouldn't mind looking into svn. But for bitkeeper, I refuse. I have not had to pay for software in years, and I'm not about to start doing it, especially for the sake of open-source development. --Joey -- "I know Kung Fu." --Darth Vader From joey at nicewarrior.org Sat Dec 20 18:19:45 2003 From: joey at nicewarrior.org (Joey Parrish) Date: Sat, 20 Dec 2003 11:19:45 -0600 Subject: [MPlayer-G2-dev] Information about video coding In-Reply-To: <20031220151953.3be01369.attila@kinali.ch> References: <20031220151953.3be01369.attila@kinali.ch> Message-ID: <20031220171945.GB11990@nicewarrior.org> On Sat, Dec 20, 2003 at 03:19:53PM +0100, Attila Kinali wrote: > Heyo, > > As i keep asking more and more questions about video coding > the last few days, i'd like to ask whether there are any > good resources (webpages, papers, books) about video coding, > especialy in practical applications not only academic stuff. 
Similar topic, I've just been reminded... I've wanted to encode a video composed of lots of scrolling ascii-art text. Anyone know of a codec that is designed for stupid things like this without insanely high bitrates? (Very simple images, but very little in common between frames.) --Joey -- "Of the seven dwarves, the only one who shaved was Dopey. That should tell us something about the wisdom of shaving." From dalias at aerifal.cx Sat Dec 20 19:16:50 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Sat, 20 Dec 2003 13:16:50 -0500 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031220163504.GE96070@lucky.net> References: <20031219120843.191ee139.kinali@gmx.net> <20031219144911.GB70845@lucky.net> <20031219201053.GJ7833@brightrain.aerifal.cx> <20031220102623.GA7397@lucky.net> <20031220111142.GQ7833@brightrain.aerifal.cx> <20031220123142.GB96070@lucky.net> <20031220140522.GX7833@brightrain.aerifal.cx> <20031220145815.GD96070@lucky.net> <20031220152113.GE7833@brightrain.aerifal.cx> <20031220163504.GE96070@lucky.net> Message-ID: <20031220181650.GH7833@brightrain.aerifal.cx> On Sat, Dec 20, 2003 at 06:35:04PM +0200, Andriy N. Gritsenko wrote: > Hi, D Richard Felker III! > > Sometime (on Saturday, December 20 at 17:40) I've received something... > [...skipped...] > > >This is the same as what I said in approach 1. And here's where it > >gets messy: In order to be useful to the common api layer, the common > >structure at the beginning needs to contain the pointers for the > >node's inputs and outputs. Otherwise it won't help the app build > >chains. So inside here, you have pointers to links. But to what kind > >of links? They have to just be generic links, not audio or video > >links. And this means every time a video filter wants to access its > >links, it has to cast the pointers!! :( Very, very ugly... > > Hmm. And how about to put these pointers in layer-specific part of > the structure (outside of common part) while any layer has it's own type? > I don't think application anyway will want these pointers since they These pointers are _exactly_ the thing the app will want to see, so it can build the pipeline. How can you build the pipeline if you can't connect pieces or tell when pieces are connected? :) > I don't see any example when two the same filters may have more than one > connection in the same chain so it's easy. Hrrm, that's the whole point. The speech synth thing was just for fun. Normally multiple inputs/outputs WILL be in the same chain, e.g. for merging video from multiple sources, processing subimages of the video, displaying output on multiple devices at the same time, pvr-style encoding+watching at the same time (!!) etc. In any case, I was talking about all the links including primary input and output, not just the extras. > API just will have some proc alike (assume that stream_t is common > node structure, I don't make smth better yet): > > stream_t *open_video_filter(char *name); > int link_video_chain(stream_t *prev, stream_t *next); > int unlink_video_chain(stream_t *prev, stream_t *next); If these functions have _video_ in their name, there's no use in having a generic "stream" structure. vp_node_t is just as good! > so only application will track all changes within that chain. :) If you > thinks about frame processing then it's not a problem at all.
Since we > leave only pull-way getting frames then each last processing unit > (vo/ao/muxer) will have some proc alike: > > int vo_process_frame(stream_t *vos, double pts, double duration); > > I understand it's defferent from G1's API but it's simpler and more clean > IMHO. I think it also will help a lot to do A-V sync. A/V sync doesn't need help here. The only place there's a problem right now is handling broken file formats and codecs with bogus pts. Timestamps (exact, no float mess) already pass all the way down the pipeline, and then the calling app regulates a/v sync and telling the vo when to display the next frame. Rich From michaelni at gmx.at Sat Dec 20 19:14:22 2003 From: michaelni at gmx.at (Michael Niedermayer) Date: Sat, 20 Dec 2003 19:14:22 +0100 Subject: [MPlayer-G2-dev] RCS for G2 In-Reply-To: <20031220114054.GS7833@brightrain.aerifal.cx> References: <20031220120026.5a0ee320.attila@kinali.ch> <20031220114054.GS7833@brightrain.aerifal.cx> Message-ID: <200312201914.22799.michaelni@gmx.at> Hi On Saturday 20 December 2003 12:40, D Richard Felker III wrote: [...] > > Currently i vote for using svn as it seems to be superior > > to all others and it's quit mature. I'm also using it > > for my personal projects for a few weeks and didnt > > run in any problems so far. > > My vote is for cvs, but I'm not opposed to svn. I am _strongly_ > opposed to bk!!! just for the record, i am also strongly opposed to bk [...] -- Michael level[i]= get_vlc(); i+=get_vlc(); (violates patent EP0266049) median(mv[y-1][x], mv[y][x-1], mv[y+1][x+1]); (violates patent #5,905,535) buf[i]= qp - buf[i-1]; (violates patent #?) for more examples, see http://mplayerhq.hu/~michael/patent.html stop it, see http://petition.eurolinux.org & http://petition.ffii.org/eubsa/en From andrej at lucky.net Sat Dec 20 20:01:17 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Sat, 20 Dec 2003 21:01:17 +0200 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031220181650.GH7833@brightrain.aerifal.cx> References: <20031219144911.GB70845@lucky.net> <20031219201053.GJ7833@brightrain.aerifal.cx> <20031220102623.GA7397@lucky.net> <20031220111142.GQ7833@brightrain.aerifal.cx> <20031220123142.GB96070@lucky.net> <20031220140522.GX7833@brightrain.aerifal.cx> <20031220145815.GD96070@lucky.net> <20031220152113.GE7833@brightrain.aerifal.cx> <20031220163504.GE96070@lucky.net> <20031220181650.GH7833@brightrain.aerifal.cx> Message-ID: <20031220190117.GF96070@lucky.net> Hi, D Richard Felker III! Sometime (on Saturday, December 20 at 20:04) I've received something... >On Sat, Dec 20, 2003 at 06:35:04PM +0200, Andriy N. Gritsenko wrote: >> Hi, D Richard Felker III! >> >> Sometime (on Saturday, December 20 at 17:40) I've received something... >> [...skipped...] >> >> >This is the same as what I said in approach 1. And here's where it >> >gets messy: In order to be useful to the common api layer, the common >> >structure at the beginning needs to contain the pointers for the >> >node's inputs and outputs. Otherwise it won't help the app build >> >chains. So inside here, you have pointers to links. But to what kind >> >of links? They have to just be generic links, not audio or video >> >links. And this means every time a video filter wants to access its >> >links, it has to cast the pointers!! :( Very, very ugly... >> >> Hmm. And how about to put these pointers in layer-specific part of >> the structure (outside of common part) while any layer has it's own type? 
>> I don't think application anyway will want these pointers since they >These pointers are _exactly_ the thing the app will want to see, so it >can build the pipeline. How can you build the pipeline if you can't >connect pieces or tell when pieces are connected? :) Here goes the example: typedef struct { int (*add_out) (struct vp_node_t *node, struct vp_node_t *next, .......); int (*rm_out) (struct vp_node_t *node, struct vp_node_t *next, .......); vp_frame_t *(*pull_frame) (struct vp_node_t *node, .......); ....... } vp_funcs; typedef struct vp_node_t { node_t n; struct vp_node_t *prev; struct vp_node_t *next; vp_funcs *func; ....... } vp_node_t; So when we call link_video_chain(node,next) it will at first test if node->func->add_out() exists and call it, otherwise if node->next was filled then return error, else set node->next. After that do the same for node next. If there was no errors then we assume nodes are linked. For example, on pull_frame(node) we could pull frame from previous node by node->prev->pull_frame. Calling unlink_video_chain(node,next) we will do the same thing as on link_video_chain(). Since node_t is part of vp_node_t and pointed to the same then both structures above may be only in video_internal.h - application will know nothing about it but it will work anyway. :) >> I don't see any example when two the same filters may have more than one >> connection in the same chain so it's easy. >Hrrm, that's the whole point. The speech synth thing was just for fun. >Normally multiple inputs/outputs WILL be in the same chain, e.g. for >merging video from multiple sources, processing subimages of the >video, displaying output on multiple devices at the same time, >pvr-style encoding+watching at the same time (!!) etc. Multiple sources are really multiple subchains so we have: -----> vf_aaa -. [node 1] [subchain1] \ [node 3] vf_mix -----> / -----> vd_bbb -` [node 2] [subchain2] ,----> vo1 [node 2] / [subchain1] -----> vf_split [node 1] \ [subchain2] `----> vo2 [node 3] As I said before chains must be supported by application only so it's not layer's care to keep all subchains in mind. :) About muliple links between nodes - you are really suppose that may be something alike that: /----------\ [subchain one] -------> vf_a [node 1] vf_b [node 2] -------> \----------/ [subchain two] All other causes will be just partial subchains which will be handled by filters. :) >In any case, I was talking about all the links including primary input >and output, not just the extras. See my explanations above. :) >> API just will have some proc alike (assume that stream_t is common >> node structure, I don't make smth better yet): >> >> stream_t *open_video_filter(char *name); >> int link_video_chain(stream_t *prev, stream_t *next); >> int unlink_video_chain(stream_t *prev, stream_t *next); >If these functions have _video_ in their name, there's no use in >having a generic "stream" structure. vp_node_t is just as good! But what I said before is just simple structure for application's developers so we prevent all level-specific data from touching by application and application's developers will learn only common API. :) With best wishes. Andriy. From andrej at lucky.net Sat Dec 20 20:22:25 2003 From: andrej at lucky.net (Andriy N. 
Gritsenko) Date: Sat, 20 Dec 2003 21:22:25 +0200 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031220190117.GF96070@lucky.net> References: <20031219201053.GJ7833@brightrain.aerifal.cx> <20031220102623.GA7397@lucky.net> <20031220111142.GQ7833@brightrain.aerifal.cx> <20031220123142.GB96070@lucky.net> <20031220140522.GX7833@brightrain.aerifal.cx> <20031220145815.GD96070@lucky.net> <20031220152113.GE7833@brightrain.aerifal.cx> <20031220163504.GE96070@lucky.net> <20031220181650.GH7833@brightrain.aerifal.cx> <20031220190117.GF96070@lucky.net> Message-ID: <20031220192225.GA8311@lucky.net> Hi again. Additional comments here... Sometime (on Saturday, December 20 at 21:02) I've written something... >typedef struct vp_node_t { > node_t n; > struct vp_node_t *prev; > struct vp_node_t *next; > vp_funcs *func; > ....... >} vp_node_t; >So when we call link_video_chain(node,next) it will at first test if >node->func->add_out() exists and call it, otherwise if node->next was >filled then return error, else set node->next. After that do the same for >node next. If there was no errors then we assume nodes are linked. For >example, on pull_frame(node) we could pull frame from previous node by >node->prev->pull_frame. Calling unlink_video_chain(node,next) we will >do the same thing as on link_video_chain(). Since node_t is part of >vp_node_t and pointed to the same then both structures above may be only >in video_internal.h - application will know nothing about it but it will >work anyway. :) Just keep in mind some filter may have multiple _equal_ inputs or outputs so there isn't some "primary" input or output node. Also if we will eventually delete some node which is "primary" but leave "secondary" ones then it may be full a mess since both application and filters must reconsider which of "secondary" nodes must be set "primary" after that. So filters with multiple inputs must handle it internally and node->prev will be left NULL in that case. So we have no right to application to manipulate any prev or next pointers - it must be done only by functions link_video_chain() and unlink_video_chain(). :) With best wishes. Andriy. From andrej at lucky.net Sat Dec 20 20:27:07 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Sat, 20 Dec 2003 21:27:07 +0200 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031220190117.GF96070@lucky.net> References: <20031219201053.GJ7833@brightrain.aerifal.cx> <20031220102623.GA7397@lucky.net> <20031220111142.GQ7833@brightrain.aerifal.cx> <20031220123142.GB96070@lucky.net> <20031220140522.GX7833@brightrain.aerifal.cx> <20031220145815.GD96070@lucky.net> <20031220152113.GE7833@brightrain.aerifal.cx> <20031220163504.GE96070@lucky.net> <20031220181650.GH7833@brightrain.aerifal.cx> <20031220190117.GF96070@lucky.net> Message-ID: <20031220192707.GB8311@lucky.net> Hi again. Another late notes. ;) Sometime (on Saturday, December 20 at 21:02) I've written something... >About muliple links between nodes - you are really suppose that may be >something alike that: > > /----------\ [subchain one] >-------> vf_a [node 1] vf_b [node 2] -------> > \----------/ [subchain two] And if there will be something like that then we could insert in subchain two some "null" filter and it'll solve that problem. :) With best wishes. Andriy. 
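To make the linking proposal above concrete, here is a minimal sketch of how link_video_chain()/unlink_video_chain() could sit on top of the proposed vp_node_t/vp_funcs layout. It is only an illustration: the two-argument callback signatures, the add_in()/rm_in() counterparts and the error handling are assumptions, not anything from the actual G2 code.

    #include <stddef.h>

    /* compact stand-ins for the structures proposed above; the real ones
       would live in video_internal.h */
    struct vp_node;

    typedef struct {
        int (*add_out)(struct vp_node *node, struct vp_node *next);
        int (*rm_out) (struct vp_node *node, struct vp_node *next);
        int (*add_in) (struct vp_node *node, struct vp_node *prev); /* assumed counterparts */
        int (*rm_in)  (struct vp_node *node, struct vp_node *prev);
    } vp_funcs;

    typedef struct vp_node {
        struct vp_node *prev, *next;
        vp_funcs *func;
    } vp_node_t;

    int link_video_chain(vp_node_t *prev, vp_node_t *next)
    {
        /* output side: a multi-output filter manages its own links,
           otherwise fall back to the single next pointer */
        if (prev->func->add_out) {
            if (prev->func->add_out(prev, next) < 0) return -1;
        } else if (prev->next) {
            return -1;                  /* single output already taken */
        } else {
            prev->next = next;
        }
        /* "do the same for node next" -- the input side
           (a real implementation should undo the output link on failure) */
        if (next->func->add_in) {
            if (next->func->add_in(next, prev) < 0) return -1;
        } else if (next->prev) {
            return -1;                  /* single input already taken */
        } else {
            next->prev = prev;
        }
        return 0;
    }

    int unlink_video_chain(vp_node_t *prev, vp_node_t *next)
    {
        if (prev->func->rm_out) prev->func->rm_out(prev, next);
        else if (prev->next == next) prev->next = NULL;
        if (next->func->rm_in) next->func->rm_in(next, prev);
        else if (next->prev == prev) next->prev = NULL;
        return 0;
    }

Under this scheme the application never touches prev/next itself; it only ever calls the two helpers, which is exactly the point being argued.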
From dalias at aerifal.cx Sat Dec 20 21:03:29 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Sat, 20 Dec 2003 15:03:29 -0500 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031220190117.GF96070@lucky.net> References: <20031219201053.GJ7833@brightrain.aerifal.cx> <20031220102623.GA7397@lucky.net> <20031220111142.GQ7833@brightrain.aerifal.cx> <20031220123142.GB96070@lucky.net> <20031220140522.GX7833@brightrain.aerifal.cx> <20031220145815.GD96070@lucky.net> <20031220152113.GE7833@brightrain.aerifal.cx> <20031220163504.GE96070@lucky.net> <20031220181650.GH7833@brightrain.aerifal.cx> <20031220190117.GF96070@lucky.net> Message-ID: <20031220200329.GL7833@brightrain.aerifal.cx> On Sat, Dec 20, 2003 at 09:01:17PM +0200, Andriy N. Gritsenko wrote: > Hi, D Richard Felker III! > > Sometime (on Saturday, December 20 at 20:04) I've received something... > >On Sat, Dec 20, 2003 at 06:35:04PM +0200, Andriy N. Gritsenko wrote: > >> Hi, D Richard Felker III! > >> > >> Sometime (on Saturday, December 20 at 17:40) I've received something... > >> [...skipped...] > >> > >> >This is the same as what I said in approach 1. And here's where it > >> >gets messy: In order to be useful to the common api layer, the common > >> >structure at the beginning needs to contain the pointers for the > >> >node's inputs and outputs. Otherwise it won't help the app build > >> >chains. So inside here, you have pointers to links. But to what kind > >> >of links? They have to just be generic links, not audio or video > >> >links. And this means every time a video filter wants to access its > >> >links, it has to cast the pointers!! :( Very, very ugly... > >> > >> Hmm. And how about to put these pointers in layer-specific part of > >> the structure (outside of common part) while any layer has it's own type? > >> I don't think application anyway will want these pointers since they > > >These pointers are _exactly_ the thing the app will want to see, so it > >can build the pipeline. How can you build the pipeline if you can't > >connect pieces or tell when pieces are connected? :) > > Here goes the example: > > typedef struct { > int (*add_out) (struct vp_node_t *node, struct vp_node_t *next, .......); > int (*rm_out) (struct vp_node_t *node, struct vp_node_t *next, .......); > vp_frame_t *(*pull_frame) (struct vp_node_t *node, .......); > ....... > } vp_funcs; > > typedef struct vp_node_t { > node_t n; > struct vp_node_t *prev; > struct vp_node_t *next; > vp_funcs *func; > ....... > } vp_node_t; > > So when we call link_video_chain(node,next) it will at first test if > node->func->add_out() exists and call it, otherwise if node->next was > filled then return error, else set node->next. After that do the same for > node next. If there was no errors then we assume nodes are linked. For > example, on pull_frame(node) we could pull frame from previous node by > node->prev->pull_frame. Calling unlink_video_chain(node,next) we will > do the same thing as on link_video_chain(). Since node_t is part of > vp_node_t and pointed to the same then both structures above may be only > in video_internal.h - application will know nothing about it but it will > work anyway. :) This is all the exact design (well actually it's more elaborate since you have the link layer in between nodes) I've been describing for months. :) All I was saying is that we can't use the same structs/functions for video as we do for audio, etc. Each needs its own api, even if they are all similar from the calling app's perspective. 
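As a rough picture of what "similar api, separate structs" could mean in practice, the two headers might declare parallel but unrelated types along these lines. Everything below is invented for illustration (only vp_node_t/vp_link_t exist in the current vp code; the ap_* side and all the members are guesses):

    /* video pipeline: links carry image format, stride and buffer-pool state */
    typedef struct vp_node vp_node_t;
    typedef struct vp_link vp_link_t;

    struct vp_link {
        vp_node_t *src, *dst;
        int w, h, fmt;
        int stride_flags;          /* e.g. a STRIDE_STATIC-style restriction */
        void *buffer_pool;         /* images recycled between the two nodes */
    };

    /* audio pipeline: links are growing/shrinking fifos between nodes */
    typedef struct ap_node ap_node_t;
    typedef struct ap_link ap_link_t;

    struct ap_link {
        ap_node_t *src, *dst;
        int rate, channels, sample_fmt;
        void *fifo;                /* dynamic buffer with read/write positions */
    };

    /* the calling app sees the same shape of api for both layers... */
    vp_link_t *vp_link_nodes(vp_node_t *src, vp_node_t *dst);
    ap_link_t *ap_link_nodes(ap_node_t *src, ap_node_t *dst);

    /* ...but the two implementations share no structures and need no casts */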
> >> I don't see any example when two the same filters may have more than one > >> connection in the same chain so it's easy. > > >Hrrm, that's the whole point. The speech synth thing was just for fun. > >Normally multiple inputs/outputs WILL be in the same chain, e.g. for > >merging video from multiple sources, processing subimages of the > >video, displaying output on multiple devices at the same time, > >pvr-style encoding+watching at the same time (!!) etc. > > Multiple sources are really multiple subchains so we have: > > -----> vf_aaa -. [node 1] > [subchain1] \ [node 3] > vf_mix -----> > / > -----> vd_bbb -` [node 2] > [subchain2] > > ,----> vo1 [node 2] > / [subchain1] > -----> vf_split [node 1] > \ [subchain2] > `----> vo2 [node 3] > > As I said before chains must be supported by application only so it's not > layer's care to keep all subchains in mind. :) Actually the layer needs to know about the whole thing to make rendering work. It has to walk the whole chain to get source frames as filters need them. Also you forgot to consider the most fun example: -------> vf_subfilter ------> | /|\ \|/ | your_favorite_filters_here Where vf_subfilter is a filter that processes a subsection of the image with whatever filters you hook up to its auxiliary output/input. This example shows why it's not good to think in subchains/subtrees: the filter pipeline can have loops! > About muliple links between nodes - you are really suppose that may be > something alike that: > > /----------\ [subchain one] > -------> vf_a [node 1] vf_b [node 2] -------> > \----------/ [subchain two] > > All other causes will be just partial subchains which will be handled by > filters. :) I agree it's probably stupid, but there's no reason you can't have something like that with the current design. > >If these functions have _video_ in their name, there's no use in > >having a generic "stream" structure. vp_node_t is just as good! > > But what I said before is just simple structure for application's > developers so we prevent all level-specific data from touching by > application and application's developers will learn only common API. :) They have to be aware of the nodes anyway, since they're loading them, providing gui panels to configure them, etc.... Rich From dalias at aerifal.cx Sat Dec 20 21:05:29 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Sat, 20 Dec 2003 15:05:29 -0500 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031220192225.GA8311@lucky.net> References: <20031220102623.GA7397@lucky.net> <20031220111142.GQ7833@brightrain.aerifal.cx> <20031220123142.GB96070@lucky.net> <20031220140522.GX7833@brightrain.aerifal.cx> <20031220145815.GD96070@lucky.net> <20031220152113.GE7833@brightrain.aerifal.cx> <20031220163504.GE96070@lucky.net> <20031220181650.GH7833@brightrain.aerifal.cx> <20031220190117.GF96070@lucky.net> <20031220192225.GA8311@lucky.net> Message-ID: <20031220200529.GM7833@brightrain.aerifal.cx> On Sat, Dec 20, 2003 at 09:22:25PM +0200, Andriy N. Gritsenko wrote: > Hi again. > > Additional comments here... > > Sometime (on Saturday, December 20 at 21:02) I've written something... > > >typedef struct vp_node_t { > > node_t n; > > struct vp_node_t *prev; > > struct vp_node_t *next; > > vp_funcs *func; > > ....... > >} vp_node_t; > > >So when we call link_video_chain(node,next) it will at first test if > >node->func->add_out() exists and call it, otherwise if node->next was > >filled then return error, else set node->next. After that do the same for > >node next. 
If there was no errors then we assume nodes are linked. For > >example, on pull_frame(node) we could pull frame from previous node by > >node->prev->pull_frame. Calling unlink_video_chain(node,next) we will > >do the same thing as on link_video_chain(). Since node_t is part of > >vp_node_t and pointed to the same then both structures above may be only > >in video_internal.h - application will know nothing about it but it will > >work anyway. :) > > Just keep in mind some filter may have multiple _equal_ inputs or > outputs so there isn't some "primary" input or output node. Also if we I created the idea of primary input/output on purpose. A filter that doesn't want to distinguish is free to ignore the difference, or to use only secondary links if it prefers. > So we have no right to application to > manipulate any prev or next pointers - it must be done only by functions > link_video_chain() and unlink_video_chain(). :) Something like that. Rich From diego at biurrun.de Sat Dec 20 21:04:51 2003 From: diego at biurrun.de (Diego Biurrun) Date: Sat, 20 Dec 2003 21:04:51 +0100 Subject: [MPlayer-G2-dev] RCS for G2 In-Reply-To: <200312201914.22799.michaelni@gmx.at> References: <20031220120026.5a0ee320.attila@kinali.ch> <20031220114054.GS7833@brightrain.aerifal.cx> <200312201914.22799.michaelni@gmx.at> Message-ID: <20031220200451.GA16170@pool.informatik.rwth-aachen.de> On Sat, Dec 20, 2003 at 07:14:22PM +0100, Michael Niedermayer wrote: > On Saturday 20 December 2003 12:40, D Richard Felker III wrote: > [...] > > > Currently i vote for using svn as it seems to be superior > > > to all others and it's quit mature. I'm also using it > > > for my personal projects for a few weeks and didnt > > > run in any problems so far. > > > > My vote is for cvs, but I'm not opposed to svn. I am _strongly_ > > opposed to bk!!! > just for the record, i am also strongly opposed to bk Me too. No matter how good bk may (or may not) be, alienating core developers is far too high a price to pay. Apart from that I am open to all suggestions. But please let us evaluate svn (and the others) based on whatever merit they have for us developers as _users_. Esthetical considerations (bloated or not, written in C++, etc) should be secondary IMHO. Diego From attila at kinali.ch Sat Dec 20 22:00:11 2003 From: attila at kinali.ch (Attila Kinali) Date: Sat, 20 Dec 2003 22:00:11 +0100 Subject: [MPlayer-G2-dev] RCS for G2 In-Reply-To: <20031220200451.GA16170@pool.informatik.rwth-aachen.de> References: <20031220120026.5a0ee320.attila@kinali.ch> <20031220114054.GS7833@brightrain.aerifal.cx> <200312201914.22799.michaelni@gmx.at> <20031220200451.GA16170@pool.informatik.rwth-aachen.de> Message-ID: <20031220220011.3d5e5437.attila@kinali.ch> On Sat, 20 Dec 2003 21:04:51 +0100 Diego Biurrun wrote: > Apart from that I am open to all suggestions. But please let us evaluate > svn (and the others) based on whatever merit they have for us developers as > _users_. Esthetical considerations (bloated or not, written in C++, etc) > should be secondary IMHO. As i already wrote, svn's usage is pretty much the same as cvs. The only big difference is that svn keeps a lokal copy of the unmodified code to compare with, ie most operations (including diff) act only localy- tla has a quite different philosophy and thus has a different usage. I dont know whether it's harder to learn than svn/cvs for someone who never used a RCS, but it's definitly a new thing for someone who already knows cvs. Beside i think that tla overcomplicates things a bit. 
I still haven't had a look at darcs. But maybe Rich can tell us more Attila Kinali -- egp ist vergleichbar mit einem ikea bausatz fuer flugzeugtraeger -- reeler in +kaosu From attila at kinali.ch Sat Dec 20 22:08:47 2003 From: attila at kinali.ch (Attila Kinali) Date: Sat, 20 Dec 2003 22:08:47 +0100 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031220111142.GQ7833@brightrain.aerifal.cx> References: <20031219084448.GG7833@brightrain.aerifal.cx> <20031219120843.191ee139.kinali@gmx.net> <20031219144911.GB70845@lucky.net> <20031219201053.GJ7833@brightrain.aerifal.cx> <20031220102623.GA7397@lucky.net> <20031220111142.GQ7833@brightrain.aerifal.cx> Message-ID: <20031220220847.36028970.attila@kinali.ch> On Sat, 20 Dec 2003 06:11:42 -0500 D Richard Felker III wrote: > > vd ----> vf_split ----------------------------------------> vo > \ [video layer] > `---> vf/sf_ocr > \ [subtitle layer] > `--> sf/af_speechsynth > \ > \ [audio layer] > `--> af_merge -------------> ao > /` > ad ----------------------------------' Just a little question here, do you have an idea how you'll handle frame delays in the chain. Ie if the chain is split into two parts who are merged later again. One is a direct connection (0 delay) and the other does some fancy computation where it needs a few frames in advance to compute a frame (x frames delay). Ie here you'd have not only to store x frames on the 0 delay chain but also to realize that these two chainse have different delays and that you have to pass more frames to one side to get one frame out at the merge point. It gets even complicater if you have 2 different sources providing 2 chains with different delays which are merged together at the end. Or is this already handled ? Attila Kinali -- egp ist vergleichbar mit einem ikea bausatz fuer flugzeugtraeger -- reeler in +kaosu From dalias at aerifal.cx Sun Dec 21 07:04:37 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Sun, 21 Dec 2003 01:04:37 -0500 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031220220847.36028970.attila@kinali.ch> References: <20031219084448.GG7833@brightrain.aerifal.cx> <20031219120843.191ee139.kinali@gmx.net> <20031219144911.GB70845@lucky.net> <20031219201053.GJ7833@brightrain.aerifal.cx> <20031220102623.GA7397@lucky.net> <20031220111142.GQ7833@brightrain.aerifal.cx> <20031220220847.36028970.attila@kinali.ch> Message-ID: <20031221060437.GO7833@brightrain.aerifal.cx> On Sat, Dec 20, 2003 at 10:08:47PM +0100, Attila Kinali wrote: > On Sat, 20 Dec 2003 06:11:42 -0500 > D Richard Felker III wrote: > > > > > vd ----> vf_split ----------------------------------------> vo > > \ [video layer] > > `---> vf/sf_ocr > > \ [subtitle layer] > > `--> sf/af_speechsynth > > \ > > \ [audio layer] > > `--> af_merge -------------> ao > > /` > > ad ----------------------------------' > > Just a little question here, do you have an idea how you'll handle > frame delays in the chain. Ie if the chain is split into two parts > who are merged later again. One is a direct connection (0 delay) > and the other does some fancy computation where it needs a few > frames in advance to compute a frame (x frames delay). > Ie here you'd have not only to store x frames on the 0 delay > chain but also to realize that these two chainse have different > delays and that you have to pass more frames to one side to get > one frame out at the merge point. 
> It gets even complicater if you have 2 different sources > providing 2 chains with different delays which are merged > together at the end. > > Or is this already handled ? It's naturally handled just by the way the system works. Suppose in the above example, the ao wants data in advance, to buffer. Well then af_merge will be requesting data from both its inputs, which will propagate back, causing vf_split to request input too. Thus the video will be decoded in advance too. vf_split is responsible for delivering frames whenever one of its outputs asks for them, so since the vo won't be asking for them as soon, vf_split is going to have to implement some sort of queue for output frames. IMO this isn't a design issue with the layer itself, only an implementation issue for filters with multiple outputs. Rich From alex at fsn.hu Sun Dec 21 14:08:04 2003 From: alex at fsn.hu (Alex Beregszaszi) Date: Sun, 21 Dec 2003 14:08:04 +0100 Subject: [MPlayer-G2-dev] RCS for G2 In-Reply-To: <20031220114054.GS7833@brightrain.aerifal.cx> References: <20031220120026.5a0ee320.attila@kinali.ch> <20031220114054.GS7833@brightrain.aerifal.cx> Message-ID: <20031221140804.14ac2839.alex@fsn.hu> Hi, > > Currently i vote for using svn as it seems to be superior > > to all others and it's quit mature. I'm also using it > > for my personal projects for a few weeks and didnt > > run in any problems so far. > > My vote is for cvs, but I'm not opposed to svn. I am _strongly_ > opposed to bk!!! Agree! -- Alex Beregszaszi (MPlayer Core Developer -- http://www.mplayerhq.hu/) From andrej at lucky.net Sun Dec 21 14:52:49 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Sun, 21 Dec 2003 15:52:49 +0200 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031220200329.GL7833@brightrain.aerifal.cx> References: <20031220102623.GA7397@lucky.net> <20031220111142.GQ7833@brightrain.aerifal.cx> <20031220123142.GB96070@lucky.net> <20031220140522.GX7833@brightrain.aerifal.cx> <20031220145815.GD96070@lucky.net> <20031220152113.GE7833@brightrain.aerifal.cx> <20031220163504.GE96070@lucky.net> <20031220181650.GH7833@brightrain.aerifal.cx> <20031220190117.GF96070@lucky.net> <20031220200329.GL7833@brightrain.aerifal.cx> Message-ID: <20031221135249.GA12754@lucky.net> Hi, D Richard Felker III! Sometime (on Saturday, December 20 at 21:53) I've received something... >> As I said before chains must be supported by application only so it's not >> layer's care to keep all subchains in mind. :) >Actually the layer needs to know about the whole thing to make >rendering work. It has to walk the whole chain to get source frames as >filters need them. Each filter has all it's own previous nodes so it'll just pull out frames from those and it's all. It even don't need to know if previous nodes are filters or decoders. So layer has no needs to know all chain structure. :) > Also you forgot to consider the most fun example: >-------> vf_subfilter ------> > | /|\ > \|/ | > your_favorite_filters_here >Where vf_subfilter is a filter that processes a subsection of the >image with whatever filters you hook up to its auxiliary output/input. >This example shows why it's not good to think in subchains/subtrees: >the filter pipeline can have loops! Filter vf_subfilter may have loops only if it may mix main input with looped input so we free to call link_video_chain(vf_instance,vf_instance) i.e. prev node == next node. It will work anyway since all I said doesn't deny it. 
:) >> But what I said before is just simple structure for application's >> developers so we prevent all level-specific data from touching by >> application and application's developers will learn only common API. :) >They have to be aware of the nodes anyway, since they're loading them, >providing gui panels to configure them, etc.... Exactly what I said - application must be aware about common part of nodes, not about layer-specific part. It will know nodes pointers itself and keep all link structure in some own form (it may be some tree and so on in GUI or anything), independently from nodes structure. Why it must be so - tree of chains may contain not only nodes pointers but some GUI info and may be much of other. And layer-specific part (that including node-to-node pointers) must not be touched by application. But on other hand, full structure (parallel chains, crossed chains, mixed or splitted chains, sources and destinations) has no need to be known to layers. So it's all I think. I hope I wrote it understandable. :) With best wishes. Andriy. From andrej at lucky.net Mon Dec 22 12:51:16 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Mon, 22 Dec 2003 13:51:16 +0200 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031220200529.GM7833@brightrain.aerifal.cx> References: <20031220111142.GQ7833@brightrain.aerifal.cx> <20031220123142.GB96070@lucky.net> <20031220140522.GX7833@brightrain.aerifal.cx> <20031220145815.GD96070@lucky.net> <20031220152113.GE7833@brightrain.aerifal.cx> <20031220163504.GE96070@lucky.net> <20031220181650.GH7833@brightrain.aerifal.cx> <20031220190117.GF96070@lucky.net> <20031220192225.GA8311@lucky.net> <20031220200529.GM7833@brightrain.aerifal.cx> Message-ID: <20031222115116.GB27868@lucky.net> Hi, D Richard Felker III! Sometime (on Saturday, December 20 at 21:56) I've received something... >> Just keep in mind some filter may have multiple _equal_ inputs or >> outputs so there isn't some "primary" input or output node. Also if we >I created the idea of primary input/output on purpose. A filter that >doesn't want to distinguish is free to ignore the difference, or to >use only secondary links if it prefers. Yes, I thought the same - main (layer-specific) node structure have only one link pointer and if filter may have more that one link here it will use own private array anyway and it may use or may use not that link pointer. Otherwise we will have some add_input() stub (default proc) that will use that only link ptr. :) >> So we have no right to application to >> manipulate any prev or next pointers - it must be done only by functions >> link_video_chain() and unlink_video_chain(). :) >Something like that. I'm glad we agree with that. :) So as resume - we don't need to have prev/next pointers in main (common,public,etc.) node structure since no application may use it. Summary of arguments to have layer-independent node structure: 1. Application will not see layer-specific data. 2. Application cannot alternate links pointers so we get rid of bad developers who may want to manipulate it but they will use (un)link_* functions instead. 3. Application's developers will see less docs so it will be simpler for they. 4. We will have only one node structure instead of 5 (1: compressed (or undefined) stream between demuxer and decoder or between codec and muxer, 2: video, 3: audio, 4: text/sub, 5: menu/etc.) so will have less of mess. Do you see any arguments against it? With best wishes. Andriy. 
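For reference, the "common node structure" idea being debated in this subthread boils down to the usual embed-and-cast pattern. A minimal sketch (all names and members invented for illustration) looks like this; the cast at the end is the part called "very ugly" earlier in the thread:

    /* the layer-independent part every node would share */
    typedef struct node {
        int type;                      /* NODE_VIDEO, NODE_AUDIO, NODE_SUB, ... */
        const char *name;
        /* control/config pointers, pts bookkeeping, ... */
    } node_t;

    #define NODE_VIDEO 1               /* invented constant */

    struct vp_link;                    /* video-specific link type */

    /* a video node embeds the common part as its first member... */
    typedef struct vp_node {
        node_t n;
        struct vp_link *in, *out;      /* video-specific link pointers */
        /* ... */
    } vp_node_t;

    /* ...so "generic" code passes node_t* around and the video layer
       has to cast it back before it can touch its own links */
    static void walk_node(node_t *generic)
    {
        if (generic->type == NODE_VIDEO) {
            vp_node_t *vn = (vp_node_t *)generic;
            /* use vn->in / vn->out here */
            (void)vn;
        }
    }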
From dalias at aerifal.cx Mon Dec 22 19:03:48 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 22 Dec 2003 13:03:48 -0500 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031222115116.GB27868@lucky.net> References: <20031220123142.GB96070@lucky.net> <20031220140522.GX7833@brightrain.aerifal.cx> <20031220145815.GD96070@lucky.net> <20031220152113.GE7833@brightrain.aerifal.cx> <20031220163504.GE96070@lucky.net> <20031220181650.GH7833@brightrain.aerifal.cx> <20031220190117.GF96070@lucky.net> <20031220192225.GA8311@lucky.net> <20031220200529.GM7833@brightrain.aerifal.cx> <20031222115116.GB27868@lucky.net> Message-ID: <20031222180347.GG7833@brightrain.aerifal.cx> On Mon, Dec 22, 2003 at 01:51:16PM +0200, Andriy N. Gritsenko wrote: > >I created the idea of primary input/output on purpose. A filter that > >doesn't want to distinguish is free to ignore the difference, or to > >use only secondary links if it prefers. > > Yes, I thought the same - main (layer-specific) node structure have Uhg... > >> So we have no right to application to > >> manipulate any prev or next pointers - it must be done only by functions > >> link_video_chain() and unlink_video_chain(). :) > > >Something like that. > > I'm glad we agree with that. :) So as resume - we don't need to have > prev/next pointers in main (common,public,etc.) node structure since no > application may use it. > > Summary of arguments to have layer-independent node structure: > 1. Application will not see layer-specific data. > 2. Application cannot alternate links pointers so we get rid of bad > developers who may want to manipulate it but they will use (un)link_* > functions instead. > 3. Application's developers will see less docs so it will be simpler for > they. > 4. We will have only one node structure instead of 5 (1: compressed (or > undefined) stream between demuxer and decoder or between codec and muxer, > 2: video, 3: audio, 4: text/sub, 5: menu/etc.) so will have less of mess. > > Do you see any arguments against it? Yes, it's stupid. We are not C++ coders. Go take your ideas to avifile or something. 1. An application doesn't see internals unless it tries to. 2. I don't care if an app stupidly tries to use internals. It will crash when we change stuff and then the author will look stupid. 3. Non sequitur. 4. There are only 3, not five: video/audio/sub. WTF are you smoking? A menu chain?!?! Anyway, Andriy, I'm really not interested in talking with you much more about this. You refuse to RTFS and RTFp (p=proposal) and instead pretend you're coming up with new ideas when you say what's been there for months. Second, you refuse to understand the details, and only think like a stupid C++ class hierarchy designer rather than a coder. And third, you keep suggesting nonsense wrappers and obfuscation that just slow things down (to code and to run). Maybe you're well-meaning, but it's not coming across that way. If you want to make further comments, _please_ RTFS in detail and respond to it, rather than writing these general OOP-fantasy emails. Rich From pozsy at uhulinux.hu Mon Dec 22 19:29:18 2003 From: pozsy at uhulinux.hu (Pozsar Balazs) Date: Mon, 22 Dec 2003 19:29:18 +0100 (CET) Subject: [MPlayer-G2-dev] RCS for G2 In-Reply-To: <20031220120026.5a0ee320.attila@kinali.ch> Message-ID: On Sat, 20 Dec 2003, Attila Kinali wrote: > I think it's time to put G2 onto a RCS. [...] 
I would strongly recommend svn, and even offer help to set it up :) I think it is always a good idea to use an rcs _even_ if the code is changing rapidly or only very few people are allowed to commit. It lets other people to track development much much easier, and the more people review the code the better. -- pozsy From dalias at aerifal.cx Tue Dec 23 02:10:26 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 22 Dec 2003 20:10:26 -0500 Subject: [MPlayer-G2-dev] Information about video coding In-Reply-To: <20031220171945.GB11990@nicewarrior.org> Message-ID: <20031223011026.GN7833@brightrain.aerifal.cx> [this bounced once, resending] On Sat, Dec 20, 2003 at 11:19:45AM -0600, Joey Parrish wrote: > On Sat, Dec 20, 2003 at 03:19:53PM +0100, Attila Kinali wrote: > > Heyo, > > > > As i keep asking more and more questions about video coding > > the last few days, i'd like to ask whether there are any > > good resources (webpages, papers, books) about video coding, > > especialy in practical applications not only academic stuff. > > Similar topic, I"ve just been reminded... > I've wanted to encode a video composed of lots of scrolling > ascii-art text. ?nyone know of a codec that is designed for > stupid things like this without insanely high bitrates? > (Very simple images, but very little in common between frames.) Yes, it's called "text file with ansi escapes"... :))))) Rich From dalias at aerifal.cx Tue Dec 23 02:11:35 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 22 Dec 2003 20:11:35 -0500 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031221135249.GA12754@lucky.net> Message-ID: <20031223011135.GO7833@brightrain.aerifal.cx> [bounded once, resending] On Sun, Dec 21, 2003 at 03:52:49PM +0200, Andriy N. Gritsenko wrote: > Hi, D Richard Felker III! > > Sometime (on Saturday, December 20 at 21:53) I've received something... > >> As I said before chains must be supported by application only so it's not > >> layer's care to keep all subchains in mind. :) > > >Actually the layer needs to know about the whole thing to make > >rendering work. It has to walk the whole chain to get source frames as > >filters need them. > > Each filter has all it's own previous nodes so it'll just pull out > frames from those and it's all. It even don't need to know if previous > nodes are filters or decoders. Of course not. Isn't that obvious? > > Also you forgot to consider the most fun example: > > > >-------> vf_subfilter ------> > > | /|\ > > \|/ | > > your_favorite_filters_here > > >Where vf_subfilter is a filter that processes a subsection of the > >image with whatever filters you hook up to its auxiliary output/input. > >This example shows why it's not good to think in subchains/subtrees: > >the filter pipeline can have loops! > > Filter vf_subfilter may have loops only if it may mix main input with > looped input so we free to call link_video_chain(vf_instance,vf_instance) > i.e. prev node == next node. This is nonsense!! The whole point is that there are _different_ inputs and outputs of each node; they're not interchangible! So link_video_chain() api is dumb! If you want to know how it actually works in g2 vp code, you could RTFS. Normally for filters you call vp_insert_filter on an existing link, and the link forks to insert the filter in the middle. This also works fine on something like vf_subfilter, which would by default have a looped link to itself. For split/merge type filters, different api is needed to manipulate it interactively. 
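As a sketch of how that is meant to be used: the prototypes below are guesses, not the ones in the vp-in-progress code, and vp_link_nodes() is an invented helper; only vp_insert_filter() itself is named above.

    typedef struct vp_node vp_node_t;
    typedef struct vp_link vp_link_t;

    /* guessed prototypes -- check the actual vp-in-progress source */
    vp_link_t *vp_link_nodes(vp_node_t *src, vp_node_t *dst);
    vp_node_t *vp_insert_filter(vp_link_t *link, const char *name, char *args);

    /* splice a scale filter into an existing vd -> vo link; the link
       "forks" so the new node ends up in the middle of it */
    void build_example_chain(vp_node_t *vd, vp_node_t *vo)
    {
        vp_link_t *link = vp_link_nodes(vd, vo);
        vp_insert_filter(link, "scale", "640:480");
    }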
And of course it has to protect against idiotic infinite loops and such. Designing the api for this isn't a priority right now since I just want to get g2 working... :) > It will work anyway since all I said doesn't deny it. :) Last I checked you weren't writing this, so what you say doesn't really have anything to do with whether it works. > Exactly what I said - application must be aware about common part of > nodes, not about layer-specific part. OK, by "layer-specific" part, you mean the internals. To me, layer-specific would mean the things that hold for one layer and not another, e.g. audio vs video. So there's where our misunderstanding was. > It will know nodes pointers itself > and keep all link structure in some own form (it may be some tree and so > on in GUI or anything), independently from nodes structure. Why it must > be so - tree of chains may contain not only nodes pointers but some GUI > info and may be much of other. And layer-specific part (that including > node-to-node pointers) must not be touched by application. But on other > hand, full structure (parallel chains, crossed chains, mixed or splitted > chains, sources and destinations) has no need to be known to layers. So The internal vp code has to be able to walk the pipeline for rendering, so it needs to know something... > it's all I think. I hope I wrote it understandable. :) Better now. Rich From dalias at aerifal.cx Tue Dec 23 01:25:42 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 22 Dec 2003 19:25:42 -0500 Subject: [MPlayer-G2-dev] VP layer progress [update!] In-Reply-To: <20031219123858.GH7833@brightrain.aerifal.cx> References: <20031219123858.GH7833@brightrain.aerifal.cx> Message-ID: <20031223002542.GK7833@brightrain.aerifal.cx> [resent because mphq lost it] I've made a lot more progress on the vp layer. There are still some key things left to be done, but I'm posting my TODO list here and reminding you of the source-browse url: http://brightrain.aerifal.cx/~dalias/vp-in-progress/ If anyone has objections to it (even silly stuff like bad names for the functions or struct members) please post and we can discuss. Please keep in mind that the side you should be critiquing now is the internal/filter-implementation/codec-implementation side, not the external api for apps to setup rendering chains, which doesn't exist yet. (In fact that probably won't exist for quite a while! First I'll just make a simple -vf list parser to build a chain...) TODO follows. Cheers, Rich TODO as of 2003-12-22 vo2 layer ========= * buffer pool system to avoid dumb alloc/free * get rid of crud in query_format * improve config/query system so we can negotiate mode * report which buffers are potentially-visible vo2-vp wrapper ============== * remove and make vo drivers vp-native (?????) * handle out-of-buffers deadlock!!! * wait to blit auto/export images if buffer is visible (??) vp layer ======== * design config negotiation system * implement stride restrictions * implement working query functions * provide api for building and examining the pipeline/links * work out safely closing nodes (!!) * find a way to keep track of buffer age (lavc performance!!) * support for palette-based images * support for metadata planes (skip, qp, mv, etc...) * do we need restrictions on slices? * vp_reget_image???? demuxers ======== * convert to use rate_d/rate_m based pts rather than 1/rate_d * clean up nonsense pts flags * pts==0 is used to indicate "not available", this is broken. change! 
* always seek to a point with known exact pts! * remove resync_audio_stream * instead have seek be driven from vp/ap layer demuxer-codec interface ======================= * figure out how to get lavc to decode from arpi's mpeg packets :) * specify requirements so that usable pts is _always_ available * figure out if codec wrapper is needed/useful or not codecs ====== * adapt the whole codecs.conf/selection system to work with vp... :( test-play.c =========== * use new vp api :) From dalias at aerifal.cx Tue Dec 23 01:26:03 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 22 Dec 2003 19:26:03 -0500 Subject: [MPlayer-G2-dev] basic g2 vp (video pipeline) code Message-ID: <20031223002603.GL7833@brightrain.aerifal.cx> [resent because mphq lost it] [cut to keep from digging up an old thread with mostly outdated info] In-Reply-To: <200311021858.hA2IwT9B025362 at mail.mplayerhq.hu> On Sun, Nov 02, 2003 at 07:58:29PM +0100, Arpi wrote: > > Maybe stride should be negotiated at config-time? That would help > > nope, its impossiblee imho in many cases, and adds yet another > limitation... I decided to negotiate stride at config-time anyway. :))))) So now it's time to address these issues... > dont forget that config() happens befor ethe first frame is passed, > so in case of 3rd party codecs, or anything special, you wont know > the wanted stride before processing first frame >From what I can tell, there are two cases: 1. The codec is rendering into its own internal buffer, which we'll export. In that case, we can wait to call config until it decodes the first frame, and then we're safe! In fact, in this case the only stride restrictions will be that the output filter has to accept the stride that the codec exports. 2. The codec is using a buffer provided by MPlayer, either automatic or direct rendering. So we already had to allocate an image before decoding the first frame! And thus we must have already known (somehow!) what stride we wanted. Actually there's a slices case too, but we can handle that just like with vd_ffmpeg. If there are any problems with this, we can put stride restrictions in the codecs.conf data. In fact, one is already there. The old "flip" flag is dead, replaced by negative stride. This makes it so every MPlayer-native filter can support flipped images, without having to write special support for them. So, I don't see any obstacles to config-time stride decision. Now I just have to design a system of negotiation to decide where to put the scale/expand/etc. filters to maximize performance... Rich From arpi at thot.banki.hu Tue Dec 23 11:33:49 2003 From: arpi at thot.banki.hu (Arpi) Date: Tue, 23 Dec 2003 11:33:49 +0100 (CET) Subject: [MPlayer-G2-dev] basic g2 vp (video pipeline) code In-Reply-To: <20031223002603.GL7833@brightrain.aerifal.cx> Message-ID: <20031223103349.3AE332071F@mail.mplayerhq.hu> Hi, > [resent because mphq lost it] > > [cut to keep from digging up an old thread with mostly outdated info] > In-Reply-To: <200311021858.hA2IwT9B025362 at mail.mplayerhq.hu> > > On Sun, Nov 02, 2003 at 07:58:29PM +0100, Arpi wrote: > > > Maybe stride should be negotiated at config-time? That would help > > > > nope, its impossiblee imho in many cases, and adds yet another > > limitation... > > I decided to negotiate stride at config-time anyway. :))))) > So now it's time to address these issues... BAD! 
A'rpi / Astral & ESP-team -- Developer of MPlayer G2, the Movie Framework for all - http://www.MPlayerHQ.hu From saschasommer at freenet.de Tue Dec 23 11:44:23 2003 From: saschasommer at freenet.de (Sascha Sommer) Date: Tue, 23 Dec 2003 11:44:23 +0100 Subject: [MPlayer-G2-dev] basic g2 vp (video pipeline) code References: <20031223103349.3AE332071F@mail.mplayerhq.hu> Message-ID: <003b01c3c941$bbb0f6e0$766f54d9@oemcomputer> > Hi, > > > [resent because mphq lost it] > > > > [cut to keep from digging up an old thread with mostly outdated info] > > In-Reply-To: <200311021858.hA2IwT9B025362 at mail.mplayerhq.hu> > > > > On Sun, Nov 02, 2003 at 07:58:29PM +0100, Arpi wrote: > > > > Maybe stride should be negotiated at config-time? That would help > > > > > > nope, its impossiblee imho in many cases, and adds yet another > > > limitation... > > > > I decided to negotiate stride at config-time anyway. :))))) > > So now it's time to address these issues... > > BAD! > ;) Stride may and does change between mutliple lock/unlocks. At least with directx when the buffers are in videoram. And imho it should only be treaded valid between frame_start/frame_done. Of course planes[] may change, too. Sascha From andrej at lucky.net Tue Dec 23 12:14:06 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Tue, 23 Dec 2003 13:14:06 +0200 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031223011135.GO7833@brightrain.aerifal.cx> References: <20031221135249.GA12754@lucky.net> <20031223011135.GO7833@brightrain.aerifal.cx> Message-ID: <20031223111406.GA67067@lucky.net> Hi, D Richard Felker III! I've already decided to don't continue that thread but it seems I must to resolve last possible misunderstandings. First of all, Rich, thank you for all criticism. :) I would never think if my sincere wish to avoid manipulating node structure by application is something like C++ (since I don't like C++ very much). OK, I've stopped on that. I've seen your VP layer progress. I don't have any objections. And since you said it's internal side for now it looks fine. Only thing I want ask you is to keep in mind (when you'll do application-side API) two moments: 1. muxer must get frames in some unified form independent from it's type. It's why I wanted layer-independent node structure in the beginning. 2. since filters may put links in the node structure as they wish then application cannot touch any of vp_link_t pointers in vp_node_t so for application-side API I want to have some functions to manipulate (set and remove) these pointers. So it's all. :) Sometime (on Tuesday, December 23 at 12:08) I've received something... >The internal vp code has to be able to walk the pipeline for >rendering, so it needs to know something... It seems I don't understand something. We construct pipeline to get frames via pull_frame(), don't we? So if pipeline is something alike --> vp_node1 --> vp_node2 --> vp_node3 then vp_node3 will pull frame from vp_node2 but vp_node3 has no needs to know if vp_node1 exists or not. vp_node2 will give frame to vp_node3 from pending buffer or after pulling it out from vp_node1. So no filter in pipeline need to know _all_ pipeline but _only_ own previous and next nodes, i.e. own links. If I'm wrong then kick me there, please. ;) With best wishes. Thank you for your work. Andriy. From andrej at lucky.net Tue Dec 23 12:35:12 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Tue, 23 Dec 2003 13:35:12 +0200 Subject: [MPlayer-G2-dev] Re: VP layer progress [update!] 
In-Reply-To: <20031223002542.GK7833@brightrain.aerifal.cx> References: <20031219123858.GH7833@brightrain.aerifal.cx> <20031223002542.GK7833@brightrain.aerifal.cx> Message-ID: <20031223113512.GB67067@lucky.net> Hi, D Richard Felker III! Sometime (on Tuesday, December 23 at 12:09) I've received something... >I've made a lot more progress on the vp layer. There are still some >key things left to be done, but I'm posting my TODO list here and >reminding you of the source-browse url: > http://brightrain.aerifal.cx/~dalias/vp-in-progress/ >If anyone has objections to it (even silly stuff like bad names for >the functions or struct members) please post and we can discuss. Are you sure vp_node_t must have vp_link_t** pointers? As I may see some filter may want it as NULL-terminated list, some as counted list (but there is no counter in the structure so filter will have it in vp_priv_s), some may want to sort list in own order so have some own structure for that. So I think you have to remove these xin and xout pointers from vp_node_t structure since it may be just a waste. If any filter may have more than one input or output link then that filter may always have these pointers in vp_priv_s structure. :) With best wishes. Andriy. From dalias at aerifal.cx Tue Dec 23 16:50:29 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Tue, 23 Dec 2003 10:50:29 -0500 Subject: [MPlayer-G2-dev] basic g2 vp (video pipeline) code In-Reply-To: <003b01c3c941$bbb0f6e0$766f54d9@oemcomputer> References: <20031223103349.3AE332071F@mail.mplayerhq.hu> <003b01c3c941$bbb0f6e0$766f54d9@oemcomputer> Message-ID: <20031223155029.GA257@brightrain.aerifal.cx> On Tue, Dec 23, 2003 at 11:44:23AM +0100, Sascha Sommer wrote: > > Hi, > > > > > [resent because mphq lost it] > > > > > > [cut to keep from digging up an old thread with mostly outdated info] > > > In-Reply-To: <200311021858.hA2IwT9B025362 at mail.mplayerhq.hu> > > > > > > On Sun, Nov 02, 2003 at 07:58:29PM +0100, Arpi wrote: > > > > > Maybe stride should be negotiated at config-time? That would help > > > > > > > > nope, its impossiblee imho in many cases, and adds yet another > > > > limitation... > > > > > > I decided to negotiate stride at config-time anyway. :))))) > > > So now it's time to address these issues... > > > > BAD! > > > > ;) > Stride may and does change between mutliple lock/unlocks. At least with > directx > when the buffers are in videoram. And imho it should only be treaded valid > between > frame_start/frame_done. Of course planes[] may change, too. If you'd RTFS'd, you'd see that for DIRECT and EXPORT type buffers, the creator of the buffer is always allowed to select the stride, except in the case where STRIDE_STATIC restriction is in place. The negotiated stride will just be used by AUTO buffers, and _should_ be used by DIRECT buffers whenever possible. Examples of when STRIDE_STATIC should be used are: libavcodec, vf_fil. Arpi, do you have concrete objections rather than "BAD!"?? Rich From dalias at aerifal.cx Tue Dec 23 16:54:14 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Tue, 23 Dec 2003 10:54:14 -0500 Subject: [MPlayer-G2-dev] Re: VP layer progress [update!] In-Reply-To: <20031223113512.GB67067@lucky.net> References: <20031219123858.GH7833@brightrain.aerifal.cx> <20031223002542.GK7833@brightrain.aerifal.cx> <20031223113512.GB67067@lucky.net> Message-ID: <20031223155414.GB257@brightrain.aerifal.cx> On Tue, Dec 23, 2003 at 01:35:12PM +0200, Andriy N. Gritsenko wrote: > Hi, D Richard Felker III! 
> > Sometime (on Tuesday, December 23 at 12:09) I've received something... > >I've made a lot more progress on the vp layer. There are still some > >key things left to be done, but I'm posting my TODO list here and > >reminding you of the source-browse url: > > > http://brightrain.aerifal.cx/~dalias/vp-in-progress/ > > >If anyone has objections to it (even silly stuff like bad names for > >the functions or struct members) please post and we can discuss. > > Are you sure vp_node_t must have vp_link_t** pointers? As I may see > some filter may want it as NULL-terminated list, some as counted list > (but there is no counter in the structure so filter will have it in > vp_priv_s), some may want to sort list in own order so have some own > structure for that. So I think you have to remove these xin and xout > pointers from vp_node_t structure since it may be just a waste. If any > filter may have more than one input or output link then that filter may > always have these pointers in vp_priv_s structure. :) Impossible. If the pointers are hidden, vp_pull_frame can't walk the chain! Originally when I was planning for recursive calling, I intended to do something like this, but now I think there has to be some unified structure or else the vp layer and the program that set up the pipeline can lose track of what's in it! Rich From dalias at aerifal.cx Tue Dec 23 16:56:07 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Tue, 23 Dec 2003 10:56:07 -0500 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031223111406.GA67067@lucky.net> References: <20031221135249.GA12754@lucky.net> <20031223011135.GO7833@brightrain.aerifal.cx> <20031223111406.GA67067@lucky.net> Message-ID: <20031223155607.GC257@brightrain.aerifal.cx> On Tue, Dec 23, 2003 at 01:14:06PM +0200, Andriy N. Gritsenko wrote: > >The internal vp code has to be able to walk the pipeline for > >rendering, so it needs to know something... > > It seems I don't understand something. We construct pipeline to get > frames via pull_frame(), don't we? So if pipeline is something alike > --> vp_node1 --> vp_node2 --> vp_node3 then vp_node3 will pull frame > from vp_node2 but vp_node3 has no needs to know if vp_node1 exists or > not. vp_node2 will give frame to vp_node3 from pending buffer or after > pulling it out from vp_node1. So no filter in pipeline need to know _all_ > pipeline but _only_ own previous and next nodes, i.e. own links. If I'm > wrong then kick me there, please. ;) No individual filter needs to know. Rather, vp_pull_frame does. For various reasons, the recursive calling was abandoned in favor of a "walk-the-list" approach. Rich From andrej at lucky.net Tue Dec 23 17:13:42 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Tue, 23 Dec 2003 18:13:42 +0200 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031223155607.GC257@brightrain.aerifal.cx> References: <20031221135249.GA12754@lucky.net> <20031223011135.GO7833@brightrain.aerifal.cx> <20031223111406.GA67067@lucky.net> <20031223155607.GC257@brightrain.aerifal.cx> Message-ID: <20031223161342.GA29405@lucky.net> Hi, D Richard Felker III! Sometime (on Tuesday, December 23 at 17:44) I've received something... >On Tue, Dec 23, 2003 at 01:14:06PM +0200, Andriy N. Gritsenko wrote: >> >The internal vp code has to be able to walk the pipeline for >> >rendering, so it needs to know something... >> It seems I don't understand something. We construct pipeline to get >> frames via pull_frame(), don't we? 
So if pipeline is something alike >> --> vp_node1 --> vp_node2 --> vp_node3 then vp_node3 will pull frame >> from vp_node2 but vp_node3 has no needs to know if vp_node1 exists or >> not. vp_node2 will give frame to vp_node3 from pending buffer or after >> pulling it out from vp_node1. So no filter in pipeline need to know _all_ >> pipeline but _only_ own previous and next nodes, i.e. own links. If I'm >> wrong then kick me there, please. ;) >No individual filter needs to know. Rather, vp_pull_frame does. For >various reasons, the recursive calling was abandoned in favor of a >"walk-the-list" approach. Hmm, so you want to deny direct pulling frames by filters? If it's so then how you suppose to handle such filters as "interleaving mixer" (two frames from one then three frames from second then one frame from third, and again from one, for example) or "time-scaling" (i.e. scaling clip from 0.2s long to 1.15s)? Or if you allow direct pulling frames then how your vf_pull_frame() will count them? I wonder if that possible without direct pulling frames by filter because only filter may know how much portion of video it wants and only filter may know if it wish just drop frame or change it's PTS/duration. With best wishes. Andriy. From andrej at lucky.net Tue Dec 23 17:29:01 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Tue, 23 Dec 2003 18:29:01 +0200 Subject: [MPlayer-G2-dev] Re: VP layer progress [update!] In-Reply-To: <20031223155414.GB257@brightrain.aerifal.cx> References: <20031219123858.GH7833@brightrain.aerifal.cx> <20031223002542.GK7833@brightrain.aerifal.cx> <20031223113512.GB67067@lucky.net> <20031223155414.GB257@brightrain.aerifal.cx> Message-ID: <20031223162901.GB29405@lucky.net> Hi, D Richard Felker III! Sometime (on Tuesday, December 23 at 17:42) I've received something... >On Tue, Dec 23, 2003 at 01:35:12PM +0200, Andriy N. Gritsenko wrote: >> Are you sure vp_node_t must have vp_link_t** pointers? As I may see >> some filter may want it as NULL-terminated list, some as counted list >> (but there is no counter in the structure so filter will have it in >> vp_priv_s), some may want to sort list in own order so have some own >> structure for that. So I think you have to remove these xin and xout >> pointers from vp_node_t structure since it may be just a waste. If any >> filter may have more than one input or output link then that filter may >> always have these pointers in vp_priv_s structure. :) >Impossible. If the pointers are hidden, vp_pull_frame can't walk the >chain! Originally when I was planning for recursive calling, I >intended to do something like this, but now I think there has to be >some unified structure or else the vp layer and the program that set >up the pipeline can lose track of what's in it! How do you plan walk the chain when chain is combined from mixing and splitting filters and some filters are not-plain mixer but something with interleaving? And if inputs may be added/deleted/stopped/resumed on the fly? Just keep in mind that G2 may be used for some video editor, for example. If you don't allow such mixed chains then you just limiting G2 without real needs. ;) About "program that set up the pipeline can lose track of what's in it" I'm sure it may happen only if application's developers are too lazy to handle that pipeline. :) I'm still sure pipeline must be controlled only by application. Don't overcomplicate layers, please. :) With best wishes. Andriy. 
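The structure being argued over in these messages, in rough form: only the names vp_node_t, vp_link_t, in/out, xin/xout and vp_priv_s appear in the thread itself; the field types and the helper below are assumptions for illustration, not the actual vp-in-progress code.

/* Sketch only: vp_node_t, vp_link_t, in/out, xin/xout and vp_priv_s
 * are names from the thread; everything else here is assumed. */
#include <stddef.h>

typedef struct vp_link vp_link_t;
typedef struct vp_node vp_node_t;

struct vp_link {
    vp_node_t *src;            /* node the frames come from */
    vp_node_t *dst;            /* node the frames go to */
};

struct vp_node {
    vp_link_t *in, *out;       /* primary input/output link */
    vp_link_t **xin, **xout;   /* NULL-terminated lists of extra links */
    struct vp_priv_s *priv;    /* filter-private data */
};

/* Keeping the links in the shared node struct (rather than hiding them
 * in priv) is what lets generic code walk the primary chain: */
void visit_upstream(vp_node_t *end, void (*visit)(vp_node_t *))
{
    vp_node_t *n;
    for (n = end; n; n = n->in ? n->in->src : NULL)
        visit(n);
}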
From joey at nicewarrior.org Tue Dec 23 18:17:47 2003 From: joey at nicewarrior.org (Joey Parrish) Date: Tue, 23 Dec 2003 11:17:47 -0600 Subject: [MPlayer-G2-dev] Information about video coding In-Reply-To: <20031223011026.GN7833@brightrain.aerifal.cx> References: <20031220171945.GB11990@nicewarrior.org> <20031223011026.GN7833@brightrain.aerifal.cx> Message-ID: <20031223171746.GA22179@nicewarrior.org> On Mon, Dec 22, 2003 at 08:10:26PM -0500, D Richard Felker III wrote: > > Similar topic, I've just been reminded... > > I've wanted to encode a video composed of lots of scrolling > > ascii-art text. Anyone know of a codec that is designed for > > stupid things like this without insanely high bitrates? > > (Very simple images, but very little in common between frames.) > > Yes, it's called "text file with ansi escapes"... :))))) Oh, damn. But that's my source! :) Thanks anyways. --Joey -- All philosophy is naive. From dalias at aerifal.cx Wed Dec 24 07:48:54 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Wed, 24 Dec 2003 01:48:54 -0500 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031223161342.GA29405@lucky.net> References: <20031221135249.GA12754@lucky.net> <20031223011135.GO7833@brightrain.aerifal.cx> <20031223111406.GA67067@lucky.net> <20031223155607.GC257@brightrain.aerifal.cx> <20031223161342.GA29405@lucky.net> Message-ID: <20031224064854.GJ257@brightrain.aerifal.cx> On Tue, Dec 23, 2003 at 06:13:42PM +0200, Andriy N. Gritsenko wrote: > >No individual filter needs to know. Rather, vp_pull_frame does. For > >various reasons, the recursive calling was abandoned in favor of a > >"walk-the-list" approach. > > Hmm, so you want to deny direct pulling frames by filters? If it's so > then how you suppose to handle such filters as "interleaving mixer" (two > frames from one then three frames from second then one frame from third, > and again from one, for example) or "time-scaling" (i.e. scaling clip > from 0.2s long to 1.15s)? Or if you allow direct pulling frames then how > your vf_pull_frame() will count them? I wonder if that possible without > direct pulling frames by filter because only filter may know how much > portion of video it wants and only filter may know if it wish just drop > frame or change it's PTS/duration. Again, please _read_ the fine documentation and (maybe not-so-fine) source. :) Nodes of the pipeline call vp_request_frame() on a link when they want to receive a frame over it. So it's still entirely pull based, just not carried out by call-recursion. Originally I thought call recursion was the most graceful and simple way to run the pipeline, but it requires ugly hacks for auto-inserting filters (which Arpi didn't like, and I eventually came to agree with), and it makes it impossible to bail-out early (for example, to return to the caller once a certain amount of time has passed, so the caller can process user input, and then resume filtering where we left off later). The new system is not _quite_ as graceful, but it's not ugly or awkward either. I think Ivan prefers the new way a lot too, even though it's not quite what he had in mind. Rich From dalias at aerifal.cx Wed Dec 24 07:56:08 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Wed, 24 Dec 2003 01:56:08 -0500 Subject: [MPlayer-G2-dev] Re: VP layer progress [update!] 
In-Reply-To: <20031223162901.GB29405@lucky.net> References: <20031219123858.GH7833@brightrain.aerifal.cx> <20031223002542.GK7833@brightrain.aerifal.cx> <20031223113512.GB67067@lucky.net> <20031223155414.GB257@brightrain.aerifal.cx> <20031223162901.GB29405@lucky.net> Message-ID: <20031224065608.GK257@brightrain.aerifal.cx> On Tue, Dec 23, 2003 at 06:29:01PM +0200, Andriy N. Gritsenko wrote: > Hi, D Richard Felker III! > > Sometime (on Tuesday, December 23 at 17:42) I've received something... > >On Tue, Dec 23, 2003 at 01:35:12PM +0200, Andriy N. Gritsenko wrote: > >> Are you sure vp_node_t must have vp_link_t** pointers? As I may see > >> some filter may want it as NULL-terminated list, some as counted list > >> (but there is no counter in the structure so filter will have it in > >> vp_priv_s), some may want to sort list in own order so have some own > >> structure for that. So I think you have to remove these xin and xout > >> pointers from vp_node_t structure since it may be just a waste. If any > >> filter may have more than one input or output link then that filter may > >> always have these pointers in vp_priv_s structure. :) > > >Impossible. If the pointers are hidden, vp_pull_frame can't walk the > >chain! Originally when I was planning for recursive calling, I > >intended to do something like this, but now I think there has to be > >some unified structure or else the vp layer and the program that set > >up the pipeline can lose track of what's in it! > > How do you plan walk the chain when chain is combined from mixing and > splitting filters and some filters are not-plain mixer but something with > interleaving? RRRR TTTTTT FFFFFF SSSSSS !! RR RR TT FF SS !! RRRR TT FFFF SSSSSS !! RR RR TT FF SS !! RR RR TT FF SSSSSS !! The code is already there, and it's very short and simple. Walking starts from an endpoint where you want to get a frame out, and goes everywhere it needs to. Splitting filters are responsible for buffering whatever they need so that they can later output to other outputs when the pull occurrs. > And if inputs may be added/deleted/stopped/resumed on the fly? No problem. > Just keep in mind that G2 may be used for some video editor, for > example. Arrggg!!!!!!!!!! RTFMLA!!! This has been my intent from the very beginning, and why I insisted on replacing vf layer entirely rather than just slightly improving it like Arpi originally wanted. > If you don't allow such mixed chains then you just limiting G2 > without real needs. ;) WTF is a "mixed chain"? > About "program that set up the pipeline can lose track of what's in > it" I'm sure it may happen only if application's developers are too lazy > to handle that pipeline. :) I'm still sure pipeline must be controlled > only by application. Don't overcomplicate layers, please. :) No, in fact it MUST be controlled by the internal vp code, for auto-inserting filters when there's a format conflict. Please, before making any more nonsense posts.... READ THE FUCKING SOURCE!! **AND** all the relevant proposal docs to the mailing list (tho some are outdated) **AND** the working docs in my vp-in-progress dir.... 
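To illustrate the difference between the two calling styles, a minimal model of the walk-the-list pull follows. This is not the vp-in-progress code; vp_request_frame()/vp_pull_frame() are the names used in the draft, but the types and control flow below are simplified assumptions.

/* Toy model of a non-recursive, pull-based pipeline walk.  The real
 * draft uses vp_node_t/vp_link_t and vp_request_frame()/vp_pull_frame();
 * this sketch only shows why no filter ever calls back into its
 * upstream neighbour. */
typedef struct node {
    struct node *prev;               /* upstream neighbour, NULL at the source */
    int has_output;                  /* frame ready for the downstream side?   */
    void (*process)(struct node *);  /* consume prev's frame (if any), try to
                                        produce one, set has_output            */
} node_t;

/* Driven from the endpoint: instead of process() recursing into prev,
 * the walker itself steps upstream to the first runnable node, runs it,
 * and starts over -- so it can stop between any two steps (user input,
 * auto-inserted filters, ...) and resume later. */
void pull_one_frame(node_t *end)
{
    while (!end->has_output) {
        node_t *n = end;
        while (n->prev && !n->prev->has_output)
            n = n->prev;             /* this node is still waiting for input */
        n->process(n);               /* sources always have input available  */
    }
    end->has_output = 0;             /* frame handed to the caller */
}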
Rich From dalias at aerifal.cx Wed Dec 24 07:57:33 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Wed, 24 Dec 2003 01:57:33 -0500 Subject: [MPlayer-G2-dev] Information about video coding In-Reply-To: <20031223171746.GA22179@nicewarrior.org> References: <20031220171945.GB11990@nicewarrior.org> <20031223011026.GN7833@brightrain.aerifal.cx> <20031223171746.GA22179@nicewarrior.org> Message-ID: <20031224065733.GL257@brightrain.aerifal.cx> On Tue, Dec 23, 2003 at 11:17:47AM -0600, Joey Parrish wrote: > On Mon, Dec 22, 2003 at 08:10:26PM -0500, D Richard Felker III wrote: > > > Similar topic, I've just been reminded... > > > I've wanted to encode a video composed of lots of scrolling > > > ascii-art text. Anyone know of a codec that is designed for > > > stupid things like this without insanely high bitrates? > > > (Very simple images, but very little in common between frames.) > > > > Yes, it's called "text file with ansi escapes"... :))))) > > Oh, damn. But that's my source! :) > > Thanks anyways. Feel free to write vd_ansi.c! :) Then you can mux video (with fourcc==ANSI :) in .avi files and play them with MPlayer.. ;) Rich From andrej at lucky.net Wed Dec 24 10:54:06 2003 From: andrej at lucky.net (Andriy N. Gritsenko) Date: Wed, 24 Dec 2003 11:54:06 +0200 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031224064854.GJ257@brightrain.aerifal.cx> References: <20031221135249.GA12754@lucky.net> <20031223011135.GO7833@brightrain.aerifal.cx> <20031223111406.GA67067@lucky.net> <20031223155607.GC257@brightrain.aerifal.cx> <20031223161342.GA29405@lucky.net> <20031224064854.GJ257@brightrain.aerifal.cx> Message-ID: <20031224095406.GA63682@lucky.net> Hi, D Richard Felker III! Sometime (on Wednesday, December 24 at 8:37) I've received something... >Again, please _read_ the fine documentation and (maybe not-so-fine) >source. :) >Nodes of the pipeline call vp_request_frame() on a link when they want >to receive a frame over it. So it's still entirely pull based, just >not carried out by call-recursion. I've read your source. You assume that ->out and ->in are always filled and ->xout and ->xin are NULL-terminated lists. So you will deny filters and/or application don't have "primary" link. It's bad. Many filters may not have any priority and may want to have all links in one list. Beside of that that "primary" link may be deleted while "secondary" ones will still work so it may be a problem (as I already said). To allow only layer manipulate links list is also impossible - only filter may know if it's possible to add second (third, fourth, etc.) link to list since it may depend on filter's current state. So adding and deleting links to(from) list must be controlled by filter itself. Isn't it an another problem for your current implementation? Again my questions. :) If some filter wants to get a frame from one of links it must call vp_request_frame() then call vp_pull_frame(), doesn't it? It seems overcomplicated for me but that time wasting isn't a big problem for current CPUs anyway so OK. Another thing is when filter from which we want a frame must drop a frame - will pull_frame() return 1 then but with img=NULL? And another thing - must each filter implement push_frame()? I'm not sure if it's always possible. Yet one question. Application may want get frames by something like got_frame=request_frame(start_pts,max_duration,*new_pts) and that frame may consume sources: three frames from first, no frames from second and one frame from third - just for example. 
Does your new VP implementation allow that? :) Anyway, don't be mad on me, please. I just want G2 to handle all needs of video editor (such as Adobe Premiere for example). I'm sure G2 may be power enough to hahdle that task. Although one of my friends has told me G2 based video editor will not be finished while he's alive. ;) May be I'm naive but remember that Adam Rice said: "paradigm shifts often require a certain naivety". I liked that. :))) With best wishes. Andriy. From dalias at aerifal.cx Wed Dec 24 22:28:30 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Wed, 24 Dec 2003 16:28:30 -0500 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031224095406.GA63682@lucky.net> References: <20031221135249.GA12754@lucky.net> <20031223011135.GO7833@brightrain.aerifal.cx> <20031223111406.GA67067@lucky.net> <20031223155607.GC257@brightrain.aerifal.cx> <20031223161342.GA29405@lucky.net> <20031224064854.GJ257@brightrain.aerifal.cx> <20031224095406.GA63682@lucky.net> Message-ID: <20031224212830.GN257@brightrain.aerifal.cx> On Wed, Dec 24, 2003 at 11:54:06AM +0200, Andriy N. Gritsenko wrote: > Hi, D Richard Felker III! > > Sometime (on Wednesday, December 24 at 8:37) I've received something... > > >Again, please _read_ the fine documentation and (maybe not-so-fine) > >source. :) > > >Nodes of the pipeline call vp_request_frame() on a link when they want > >to receive a frame over it. So it's still entirely pull based, just > >not carried out by call-recursion. > > I've read your source. You assume that ->out and ->in are always > filled and ->xout and ->xin are NULL-terminated lists. So you will deny > filters and/or application don't have "primary" link. It's bad. Many > filters may not have any priority and may want to have all links in one > list. Beside of that that "primary" link may be deleted while "secondary" > ones will still work so it may be a problem (as I already said). Then reorder the links. It doesn't matter. This is NOT an issue worth wasting time and complexity over. The 2 or 3 filters that actually want to use multiple inputs or outputs can manage their lists. > To allow only layer manipulate links list is also impossible - only > filter may know if it's possible to add second (third, fourth, etc.) link > to list since it may depend on filter's current state. So adding and > deleting links to(from) list must be controlled by filter itself. Isn't > it an another problem for your current implementation? No. It's not implemented because I am trying to make something that works in this lifetime. That is sharply different from being a problem. My design goal for multiple inputs/outputs in G2 is to make sure there are no fundamental _obstacles_ to their implementation. But I don't care about having them work anytime soon, since what I want is a working mplayer-g2 and mencoder-g2. BTW, of course the filter is involved in adding/removing links in some way. We can decide on the exact method by which this is done MUCH LATER, once playing movies and backing up DVDs works. > Again my questions. :) If some filter wants to get a frame from one > of links it must call vp_request_frame() then call vp_pull_frame(), > doesn't it? No, vp_pull_frame may NEVER be called by filers. To do so would be recursive calling. If you need frames before you can process, you must use vp_request_frame then RETURN. > It seems overcomplicated for me but that time wasting isn't a > big problem for current CPUs anyway so OK. 
Another thing is when filter > from which we want a frame must drop a frame - will pull_frame() return 1 > then but with img=NULL? No, an image MUST be returned because it contains pts. If a frame is going to be dropped, you are free to return an image without filling in the picture buffers. You can even use BUF_TYPE_DUMMY. > And another thing - must each filter implement push_frame()? I'm not > sure if it's always possible. If a filter doesn't want to, it can use the default implementation, which will just store the image. In fact, filters that just process each frame's contents independently are _encouraged_ to use the simplified process_frame api, which is provided by leaving the default implementations of push_frame and pull_frame in place. > Yet one question. Application may want get frames by something like > got_frame=request_frame(start_pts,max_duration,*new_pts) and that frame WTF? This is nonsense. When you get a frame, you always get the next frame, ordered by time. If it doesn't have the pts you want, you pull another. And another. Until you have the one you want. If you want your silly request-by-pts api, you can make a wrapper to do this. > may consume sources: three frames from first, no frames from second and > one frame from third - just for example. Does your new VP implementation > allow that? :) Huh?? The notion of 'source' no longer exists at this point. The application pulls from the end of the chain. > Anyway, don't be mad on me, please. I just want G2 to handle all > needs of video editor (such as Adobe Premiere for example). I'm sure G2 > may be power enough to hahdle that task. Although one of my friends has > told me G2 based video editor will not be finished while he's alive. ;) > May be I'm naive but remember that Adam Rice said: "paradigm shifts often > require a certain naivety". I liked that. :))) Then let someone naive go make shitware like Adobe premier that only supports a few colorspaces, doesn't support variable-fps formats with exact rational timestamps, doesn't let filters retime the frames, copies the image buffers ten times as often as needed so it's incredibly slow, etc. etc. etc. Meanwhile I'll be writing something that doesn't suck... Rich From joey at nicewarrior.org Wed Dec 24 22:29:49 2003 From: joey at nicewarrior.org (Joey Parrish) Date: Wed, 24 Dec 2003 15:29:49 -0600 Subject: [MPlayer-G2-dev] Information about video coding In-Reply-To: <20031224065733.GL257@brightrain.aerifal.cx> References: <20031220171945.GB11990@nicewarrior.org> <20031223011026.GN7833@brightrain.aerifal.cx> <20031223171746.GA22179@nicewarrior.org> <20031224065733.GL257@brightrain.aerifal.cx> Message-ID: <20031224212949.GC27282@nicewarrior.org> On Wed, Dec 24, 2003 at 01:57:33AM -0500, D Richard Felker III wrote: > > > Yes, it's called "text file with ansi escapes"... :))))) > > Oh, damn. But that's my source! :) > > Feel free to write vd_ansi.c! :) Then you can mux video (with > fourcc==ANSI :) in .avi files and play them with MPlayer.. ;) Don't tempt me. :) It's just the sort of pointless crap I haven't been doing enough of lately. *cough cough GIF cough hack* Although I've already written a script that takes raw text and raw frames, renders those frames as ascii-art using that raw text, then converts this colored ascii-art into pngs, which are fed into mencoder. I'm practically there, but it's just not practical. :) --Joey -- "I know Kung Fu." --Darth Vader From andrej at lucky.net Thu Dec 25 10:25:35 2003 From: andrej at lucky.net (Andriy N. 
Gritsenko) Date: Thu, 25 Dec 2003 11:25:35 +0200 Subject: [MPlayer-G2-dev] Re: Limitations in vo2 api :( In-Reply-To: <20031224212830.GN257@brightrain.aerifal.cx> References: <20031221135249.GA12754@lucky.net> <20031223011135.GO7833@brightrain.aerifal.cx> <20031223111406.GA67067@lucky.net> <20031223155607.GC257@brightrain.aerifal.cx> <20031223161342.GA29405@lucky.net> <20031224064854.GJ257@brightrain.aerifal.cx> <20031224095406.GA63682@lucky.net> <20031224212830.GN257@brightrain.aerifal.cx> Message-ID: <20031225092535.GA73192@lucky.net> Hi, D Richard Felker III! Sometime (on Wednesday, December 24 at 23:16) I've received something... >> Again my questions. :) If some filter wants to get a frame from one >> of links it must call vp_request_frame() then call vp_pull_frame(), >> doesn't it? >No, vp_pull_frame may NEVER be called by filers. To do so would be >recursive calling. If you need frames before you can process, you must >use vp_request_frame then RETURN. I don't understand that. Do you mean (node)->pull_frame() does not process frames until it have pending ones already but all process will be done by (node)->push_frame()? But then I don't know how filter could request two or three frames in a row? For example if filter has found that one frame isn't sufficient for it and must be requested another one? Since your vp_request_frame does nothing but sets flags only. Explain it, please. I don't know now how filters that don't do frame-to-frame will work. With best wishes. Andriy. From ivan at cacad.com Sat Dec 27 00:01:22 2003 From: ivan at cacad.com (Ivan Kalvachev) Date: Sat, 27 Dec 2003 01:01:22 +0200 (EET) Subject: [MPlayer-G2-dev] vo3 Message-ID: <2710.212.116.154.213.1072479682.squirrel@mail.cacad.com> Hi, here is some of my ideas, i'm afraid that there already too late to be implemented, as dalias is coding him pipeline system, while i have not finished the drafts already, but feel free to send coments... Ivan Kalvachev iive -------------- next part -------------- A non-text attachment was scrubbed... Name: Vo3.pdf Type: application/pdf Size: 116259 bytes Desc: not available URL: From dalias at aerifal.cx Sun Dec 28 01:31:36 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Sat, 27 Dec 2003 19:31:36 -0500 Subject: [MPlayer-G2-dev] vo3 In-Reply-To: <2710.212.116.154.213.1072479682.squirrel@mail.cacad.com> References: <2710.212.116.154.213.1072479682.squirrel@mail.cacad.com> Message-ID: <20031228003136.GA257@brightrain.aerifal.cx> On Sat, Dec 27, 2003 at 01:01:22AM +0200, Ivan Kalvachev wrote: > Hi, here is some of my ideas, > i'm afraid that there already too late to be implemented, as > dalias is coding him pipeline system, while i have not finished the drafts > already, but feel free to send coments... Time to pour on the kerosine and light the flames... :) First comment: putting everything in a PDF file makes it very difficult to quote and reply to individual parts of your draft. I'll try my best... [all following quotes are from the PDF] > Here are few features that I'm trying to achieve: > ? decreasing memcpy by using one and same buffer by all filters that > can do it (already done in g1 as Direct Rendering method 1) > ? support of partial rendering (slices and DR m2) These are obviously necessary for any system that's not going to be completely unusably sucky. And they're already covered in G1 and G2 VP. > ? 
support for get/release buffer (ability to release > buffers when they are no longer needed) This is not so obvious at first, but absolutely necessary for overcoming bugs in G1 that prevented all but the simplest filters from using buffer sharing/DR. It's also already covered in G2 VP -- in fact it was one of the two key design points. > ? out of order rendering ? ability to move the data through the video > filters if there is no temporal dependency > ? display_order rendering ? this is for filters that need to use > temporal dependences Ivan and I disagree greatly on the nature of these goals. To me, they're a simple consequence of a natural way of thinking about frame passing and slice rendering. To him, out-of-order is the fundamental frame passing protocol, and special care is required for handling frames in order. > ? ability to keep as many incoming images are needed and to output as > many images as filter may need to (e.g. in case of motion blur we > will have e.g. 6 incoming images and 6 outgoing at once) > ? support for PTS. These were the primary motivation behind G2 VP. > ? ability to quickly reconfigure and if possible - to reuse data that > is already processed (e.g. we have scale and the user resizes the > image, - only images after scale will be redone), In my design, this makes no sense. The final scale filter for resizing would not pass any frames to the vo until time to display them. > safe seeking, auto-insertion of filters. What is safe-seeking? Auto-insertion is of course covered. > ? ability to have more complicated graph (than simple chain) for > processing. This is definitely a desirable goal. > ? simple structure and flexible design. IMNSHO the out-of-order stuff in Ivan's design is anything but simple. > In short the ideas used are : > ? common buffer and separate mpi ? already exist in g1 in some form > ? counting buffer usage by mpi and freeing after not used ? huh, > sound like java :O No. Reference counting is good. GC is idiotic. And you should never free buffers anyway until close, just repool them. > ? allocating all mpi&buffer before starting drawing (look obvious, > doesn't it?) ? in G1 filters had to copy frames in its own buffers > or play hazard by using buffers out of their scope Yes, maybe G1 was broken. All the codecs/filters I know allocate mpi/buffers before drawing, though. > ? using flag IN_ORDER, to indicate that these frames are "drawn" and > there won't come frames with "earlier" PTS. I find this really ugly. > ? using common function for processing frame and slices ? to make > slice support more easier This can easily be done at the filter implementation level, if possible. In many cases, it's not. Processing the image _contents_ and the _frame_ are two distinct tasks. > ? emulating complicated graph in a simple linked list. This sounds like an ugly hack. > ? messaging system for dropping/rebuilding MPI's.(not yet finished) Very bad. > ? having prepared simple type filters (like non temporal ? one > input/one output, processing the frame as it came, without carre for > buffer management) (not documented) Also provided for in G2 VP. > [...] > So, the frame is split on 2 parts, one I will call mpi and the other > I will call mp_buffer. The mp_buffer part contains the memory > buffer, usage count, buffer common width and height, maybe stride. > The mp_buffer->count is the number of MPI-s that point to that > buffer. Probably we may allow buffer to contain more than one piace > if memory (e.g. 3 memory blocks for Y,U,V planes). 
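In code, the split being described above is roughly the following. This is a sketch based only on the quoted description; apart from the mpi/mp_buffer names and the usage count, the field names and the buffer_repool() helper are hypothetical.

/* Sketch of the proposed mpi/mp_buffer split.  Only the two names and
 * the usage count come from the quoted draft; every field and the
 * buffer_repool() helper are hypothetical. */
typedef struct mp_buffer {
    unsigned char *planes[3];   /* the actual pixel memory (Y,U,V)      */
    int stride[3];
    int width, height;          /* common buffer geometry               */
    int count;                  /* number of MPIs pointing at this buf  */
} mp_buffer_t;

typedef struct mp_image {
    mp_buffer_t *buf;           /* shared, usage-counted storage        */
    double pts;                 /* per-frame state stays in the mpi     */
    int flags;
} mp_image_t;

void buffer_repool(mp_buffer_t *buf);   /* hypothetical: return to pool */

/* Releasing an mpi drops one reference; the storage is repooled (not
 * freed) once the last mpi referencing it is released. */
void mpi_release(mp_image_t *mpi)
{
    if (--mpi->buf->count == 0)
        buffer_repool(mpi->buf);
}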
An idea like this was already suggested by Arpi and adopted in G2 VP, but not as extreme. The reason for not making such a sharp division is that the owner of the buffer will often _need_ to know about the buffer's status as the contents of a given frame, not just which buffer it is. One thing omitted in G2 so far is allowing for mixed buffer types, where different planes are allocated by different parties. For example, exporting U and V planes unchanged and direct rendering a new Y plane. I'm not sure if it's worth supporting this, since it would be excessively complicated. However, it would greatly speed up certain filters such as equalizer. > [...] > This scheme also allows to get rid of the static buffer type. Simply > the decoder will never release it's mpi, but will pass it to the > filter chain, multiple times (like ffmpeg's reuse). On the other > side static buffers should always be in the main memory, otherwise > they can take the only display buffer and stale displaying (e.g. vo > with one buffer, and decoder with 2 static buffers) This is the same principle as the REUSABLE flag in G2 VP, except that DR buffers are also allowed to be reusable in my design. > Dalias already pointed that processing may not be strictly top from > bottom, may not be line, slice, or blocks based. This question is > still open for discussion. Anyway the most flexible x,y,w,h way > proved to be also the most hardier and totally painful. Just take a > look of crop or expand filters in G1. More over the current G1 > scheme have some major flaws: > ? the drawn rectangles may overlap (it depends only on decoder) No, my spec says that draw_slice/commit_slice must be called exactly once for each pixel. If your codec is broken and does not honor this, you must wrap it or else not use slices. > ? drawing could be done in any order. This makes it very hard to say > in what part of the image is already processed I agree, it's very ugly. IMO there should at least be certain minimal restrictions on slice structure, but I don't know what they should be. In any case, I don't like Ivan's idea of restricting slices to macroblock-high horizontal strips drawn in-order from top to bottom... Certainly broken codecs like VP3 will want to draw bottom-to-top. > ? skipped_blocks processing is very hard. Theoretically it is > possible to draw only non- skipped blocks, but then the above > problem raise. I would _really_ like a clean solution to skipped_blocks processing. It's the final key to speed which we haven't solved... :( > The main problem is the out-of-order rendering. The filters should > be able to process, the frames in the order they came. On another > side there are some filters that can operate only in display order. > So what is the solution? > > By design the new video system requires PTS (picture time stamp). I PTS stands for PRESENTATION time stamp, not picture time stamp. > add new flag that I call IN_ORDER. This flag indicates that all > frames before this one are already available in the > in-coming/out-coming area. Lets make an example with MPEG IPB order. > > We have deciding order IPB and display IBP. > First we have I frame. We decode it first and we output it to the > filters. This frame is in order so the flag should be set for it > (while processing). Then we have P-Frame. We decode it, but we do > not set the flag (yet). We process the P-Frame too. Then we decode > an B-Frame that depends on the previous I and P Frames. This B-Frame > is in order when we process it. 
After we finish with the B-Frame(s) > the first P-Frame is in order. This idea is totally broken, as explained by Michael on ffmpeg-devel. It makes it impossible for anything except an insanely fast computer to play files with B frames!! Here's the problem: 1. You decode first I frame, IN_ORDER. 2. You display the I frame. 3. You decode the P frame. Not IN_ORDER. 4. You decode the B frame. IN_ORDER. 5. You display the B frame, but only after wasting >frametime seconds, thus causing A/V desync!! 6. The P frame becomes IN_ORDER. 7. You display the P frame. 8. Process repeats. The only solution is to always impose one-frame delay at the _decoder_ end when decoding files with B frames. In Ivan's design, this can be imposed by waiting to set the IN_ORDER flag for an I/P frame until the next B frame is decoded. > As you can see it is very easy for the decoders to set the IN_ORDER > flag, it could be done om G1's decode() end, when the frames are in > order. Actually, this is totally false. Libavcodec does _not_ export any information which allows the caller to know if the frames are being decoded in order or not. :( Yes, this means lavc is horribly broken... > If an MPI is freed without setting IN_ORDER then we could guess that > it have been skipped. Frame sources cannot be allowed to skip frames. Only the destination requesting frames can skip them. > Skipping/Rebuilding This entire section should be trashed. It's very bad design. > Now the skipping issue is rising. I propose 2 flags, that should be > added like IN_ORDER flag, I call them SKIPPED and REBUILD. I thought > about one common INVALID, but it would have different meening > depending from the array it resides (incoming or outgoing) > > SKIPPED is requared when a get_image frame is gotten but the > processing is not performed. The first filter sets this flag in the > outgoing mpi, and when next filter process the date, if should free > the mpi (that is now in the incoming). If the filter had allocated > another frame, where the skipped frame should have been draw, then > it can free it by setting it as SKIPPED. Turn things around in the only direction that works, and you don't need an image flag for SKIPPED at all. The filter _requesting_ the image knows if it intends to use the contents or not, so if not, it just ignores what's there. There IS NO CORRECT WAY to frameskip from the source side. > E.g. if we have this chain > -vf crop=720:540,spp=5:4,scale=512:384 > This chain should give quite a trill to 2GHz processor. Now imagine > that scale is auto inserted and that the vo is some window RGB only > device (vo_x11). If a user change the window size, scale parameters > change too. Scale should rebuild all frames that are processed, but > now shown. Scale filter can safely SKIP all frames in the outgoing. Bad point 1: manually created filters which have been given parameters MUST NEVER auto-reconfigure. In my design, if the user enabled dynamic window rescaling, another scale filter controlled by the UI layer would get inserted, and activated only when the window size was non-default. Bad point 2: your "rebuild" idea is not possible. Suppose the scale filter has stored its output in video memory, and its input has already been freed/overwritten. If you don't allow for this, performance will suck. > [...] > -vf spp=5,scale=512:384,osd > [...] > Now the user turns off OSD that have been already rendered into a > frame. Then vf_osd set REBUILD for all affected frames in the > incoming array. 
The scale filter will draw the frame again, but it > won't call spp again. And this gives a big win because vf_spp could > be extremly slow. This is stupid. We have a much better design for osd: as it slice-renders its output, it makes backups (in very efficient form) of the data that's destroyed by overwriting/alphablending. It can then undo the process at any time, without ever reading from its old input buffers or output buffers. In fact, it can handle slices of any shape and size, too! > On another side, there is one big problem ? the mpi could already be > freed by the previous filter. To workaround it we may need to keep > all buffers until the image is shown (something like > control(FLIP,pts) for all filters). Same thing may be used on seek, > to flush the buffers. This is an insurmountible problem. The buffers will very likely no longer exist. Forcing them to be kept will destroy performance. > Problems remaining! Lots more than you itemize! > 1. Interlacing ? should the second field have its own PTS? In principle, definitely yes. IMO the easiest way to handle it is to require codecs that output interlaced video to set the duration field, and then pts of the second field is just pts+duration/2. > P.S. > I absolutely forbid this document to be published anywhere. It is > only for mplayer developers' eyes. And please somebody to remove the > very old vo2 drafts, from the g1 CVS. Then don't send it to public mailing lists... :) Sorry but IMO it's impossible to properly respond/comment without quoting large sections. So, despite all the flames, I think there _are_ a few realy good ideas here, at least as far as deficiencies in G1 (or even G2 VP) which we need to resolve. But I don't like Ivan's push-based out-of-order rendering pipeline at all. It's highly non-intuitive, and maybe even restrictive. Actually, the name (VO3) reflects what I don't like about it: Ivan's design is an api for the codec to _output_ slices, thus calling it video output. (In fact, all filter execution is initiated from within the codec's slice callback!) On the other hand, I'm looking for an API for _obtaining_ frames to show on a display, which might come from anywhere -- not just a codec. For instance they might even be generated by visualization plugins from audio data, or even from /dev/urandom! My design makes the source of the video totally transparent, rather than making the source the entry point for everything! And, my design separates image content processing (which might be able to happen out-of-order) from frame processing (which always happens in order). So, Ivan. I'll try to take the best parts of what you've proposed and incorporate them into the code for G2. Maybe we'll be able to find something we're both happy with. With kind flames, Rich From ivan at cacad.com Sun Dec 28 03:51:23 2003 From: ivan at cacad.com (Ivan Kalvachev) Date: Sun, 28 Dec 2003 04:51:23 +0200 (EET) Subject: [MPlayer-G2-dev] vo3 In-Reply-To: <20031228003136.GA257@brightrain.aerifal.cx> References: <2710.212.116.154.213.1072479682.squirrel@mail.cacad.com> <20031228003136.GA257@brightrain.aerifal.cx> Message-ID: <1367.212.116.154.213.1072579883.squirrel@mail.cacad.com> D Richard Felker III said: > On Sat, Dec 27, 2003 at 01:01:22AM +0200, Ivan Kalvachev wrote: >> Hi, here is some of my ideas, >> i'm afraid that there already too late to be implemented, as >> dalias is coding him pipeline system, while i have not finished the drafts >> already, but feel free to send coments... 
> > Time to pour on the kerosine and light the flames... :) Your are very bad in making good points, you always fall in flame. I noticed that you like flames, even if you burn in them sometimes. > > First comment: putting everything in a PDF file makes it very > difficult to quote and reply to individual parts of your draft. I'll try my best... [all following quotes are from the PDF] Yep, The original is in OOo format. And You told me that you don't want to install "that bloat" > >> Here are few features that I'm trying to achieve: >> ? decreasing memcpy by using one and same buffer by all filters that can do it (already done in g1 as Direct Rendering method 1) >> ? support of partial rendering (slices and DR m2) > > These are obviously necessary for any system that's not going to be completely unusably sucky. And they're already covered in G1 and G2 VP. flame. I know that, you know that, everybody know that. > >> ? support for get/release buffer (ability to release buffers when they are no longer needed) > > This is not so obvious at first, but absolutely necessary for > overcoming bugs in G1 that prevented all but the simplest filters from using buffer sharing/DR. It's also already covered in G2 VP -- in fact it was one of the two key design points. yep > >> ? out of order rendering ? ability to move the data through the video filters if there is no temporal dependency >> ? display_order rendering ? this is for filters that need to use temporal dependences > > Ivan and I disagree greatly on the nature of these goals. To me, they're a simple consequence of a natural way of thinking about frame passing and slice rendering. To him, out-of-order is the fundamental frame passing protocol, and special care is required for handling frames in order. YES! > >> ? ability to keep as many incoming images are needed and to output as many images as filter may need to (e.g. in case of motion blur we will have e.g. 6 incoming images and 6 outgoing at once) >> ? support for PTS. > > These were the primary motivation behind G2 VP. yes. > >> ? ability to quickly reconfigure and if possible - to reuse data that is already processed (e.g. we have scale and the user resizes the image, - only images after scale will be redone), > > In my design, this makes no sense. The final scale filter for resizing would not pass any frames to the vo until time to display them. final scale filter?!!! How many scale filters do you have? > >> safe seeking, auto-insertion of filters. > > What is safe-seeking? When seeking filters that have stored frames should flush them For example now both mpeg2 decoders don't do that, causing garbage in B-Frames decoding after seek. Same apply for any temporal filter. In G1 there is control(SEEK,...), but usually it is not used. > > Auto-insertion is of course covered. I'm not criticizing your system. This comment is not for me. Or I hear irony? > >> ? ability to have more complicated graph (than simple chain) for processing. > > This is definitely a desirable goal. > >> ? simple structure and flexible design. > > IMNSHO the out-of-order stuff in Ivan's design is anything but simple. Take the red pill and you will see the truth :)) > > >> In short the ideas used are : >> ? common buffer and separate mpi ? already exist in g1 in some form ? counting buffer usage by mpi and freeing after not used ? huh, sound like java :O > > No. Reference counting is good. GC is idiotic. And you should never free buffers anyway until close, just repool them. I didn't say that. 
A free buffer is a buffer that is not busy and can be reused. Moreover, I don't like the way frames are locked in your code. It doesn't seem obvious. The goal in my design is for all the work of freeing/querying to be moved to vf/vp functions. So in my design you will see only get_image/release_image, but never lock/unlock. Because having a buffer means that you need it. (well, most of the time) > >> ? allocating all mpi&buffer before starting drawing (look obvious, doesn't it?) ? in G1 filters had to copy frames in its own buffers or play hazard by using buffers out of their scope > > Yes, maybe G1 was broken. All the codecs/filters I know allocate mpi/buffers before drawing, though. In G1, if you draw_slice out-of-order it is possible to reach a filter that hasn't yet allocated a buffer for this frame - some frames are allocated only on put_frame. That's also the reason to have one common process()! > >> ? using flag IN_ORDER, to indicate that these frames are "drawn" and there won't come frames with "earlier" PTS. > > I find this really ugly. It's the only sane way to do it, if you really do out-of-order processing. > >> ? using common function for processing frame and slices ? to make slice support more easier > > This can easily be done at the filter implementation level, if > possible. In many cases, it's not. Processing the image _contents_ and the _frame_ are two distinct tasks. Not so easy. Very few filters in G1 support slices, mainly because it is a separate chain. > >> ? emulating complicated graph in a simple linked list. > > This sounds like an ugly hack. Hack - yes; ugly - dunno?? > >> ? messaging system for dropping/rebuilding MPI's.(not yet finished) > > Very bad. > >> ? having prepared simple type filters (like non temporal ? one >> input/one output, processing the frame as it came, without carre for buffer management) (not documented) > > Also provided for in G2 VP. > >> [...] >> So, the frame is split on 2 parts, one I will call mpi and the other I will call mp_buffer. The mp_buffer part contains the memory >> buffer, usage count, buffer common width and height, maybe stride. The mp_buffer->count is the number of MPI-s that point to that >> buffer. Probably we may allow buffer to contain more than one piace if memory (e.g. 3 memory blocks for Y,U,V planes). > > An idea like this was already suggested by Arpi, but not as extreme. The reason for not making such a sharp division is that the owner of the buffer will often _need_ to know about the buffer's status as the contents of a given frame, not just which buffer it is. Huh? Buffer parameters are constants; MPI parameters are variables. > > One thing omitted in G2 so far is allowing for mixed buffer types, where different planes are allocated by different parties. For example, exporting U and V planes unchanged and direct rendering a new Y plane. I'm not sure if it's worth supporting this, since it would be excessively complicated. However, it would greatly speed up certain filters such as equalizer. Yes, I was thinking about such hacks, but they are definitely not worth implementing. The Matrox YUV mode needs such a hack, but it could be done at the vo level. > >> [...] >> This scheme also allows to get rid of the static buffer type. Simply the decoder will never release it's mpi, but will pass it to the filter chain, multiple times (like ffmpeg's reuse). On the other side static buffers should always be in the main memory, otherwise they can take the only display buffer and stale displaying (e.g.
vo with one buffer, and decoder with 2 static buffers) > > This is the same principle as the REUSABLE flag in G2 VP, except that DR buffers are also allowed to be reusable in my design. I don't use flag. Anyway, it is not something major. >> Dalias already pointed that processing may not be strictly top from bottom, may not be line, slice, or blocks based. This question is still open for discussion. Anyway the most flexible x,y,w,h way proved to be also the most hardier and totally painful. Just take a look of crop or expand filters in G1. More over the current G1 >> scheme have some major flaws: >> ? the drawn rectangles may overlap (it depends only on decoder) > > No, my spec says that draw_slice/commit_slice must be called exactly once for each pixel. If your codec is broken and does not honor this, you must wrap it or else not use slices. The problem may arrase in filter slices too! Imagine rounding errors;) > >> ? drawing could be done in any order. This makes it very hard to say in what part of the image is already processed > > I agree, it's very ugly. IMO there should at least be certain minimal restrictions on slice structure, but I don't know what they should be. In any case, I don't like Ivan's idea of restricting slices to > macroblock-high horizontal strips drawn in-order from top to bottom... Certainly broken codecs like VP3 will want to draw bottom-to-top. I have never said anything about macroblock-high strips. As for VP3 we may flip it in the beggning and the flip it before processing ;) yeh, ugly ,) j/k > >> ? skipped_blocks processing is very hard. Theoretically it is >> possible to draw only non- skipped blocks, but then the above >> problem raise. > > I would _really_ like a clean solution to skipped_blocks processing. It's the final key to speed which we haven't solved... :( > >> The main problem is the out-of-order rendering. The filters should be able to process, the frames in the order they came. On another side there are some filters that can operate only in display order. So what is the solution? >> By design the new video system requires PTS (picture time stamp). I > > PTS stands for PRESENTATION time stamp, not picture time stamp. I thought I had fixed that. > >> add new flag that I call IN_ORDER. This flag indicates that all frames before this one are already available in the >> in-coming/out-coming area. Lets make an example with MPEG IPB order. We have deciding order IPB and display IBP. >> First we have I frame. We decode it first and we output it to the filters. This frame is in order so the flag should be set for it (while processing). Then we have P-Frame. We decode it, but we do not set the flag (yet). We process the P-Frame too. Then we decode an B-Frame that depends on the previous I and P Frames. This B-Frame is in order when we process it. After we finish with the B-Frame(s) the first P-Frame is in order. > > This idea is totally broken, as explained by Michael on ffmpeg-devel. It makes it impossible for anything except an insanely fast computer to play files with B frames!! Here's the problem: > > 1. You decode first I frame, IN_ORDER. > 2. You display the I frame. > 3. You decode the P frame. Not IN_ORDER. > 4. You decode the B frame. IN_ORDER. > 5. You display the B frame, but only after wasting >frametime seconds, > thus causing A/V desync!! > 6. The P frame becomes IN_ORDER. > 7. You display the P frame. > 8. Process repeats. > > The only solution is to always impose one-frame delay at the _decoder_ end when decoding files with B frames. 
Again you are saying things that I haven't said. Actually I (or you) may have missed one of the points. Well, I will add it to the goals: buffering ahead. Now, I said that IN_ORDER is a replacement for draw_frame()!!!! This means that in the above example the I-frame won't be IN_ORDER. Your problem is solved. Anyway, IN_ORDER doesn't force us to display the frame; there is no need to start displaying frames the moment they are completed. I agree that there may be some problems for a vo with one buffer. So far you have one (good) point.
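For concreteness, one conservative reading of the scheme as clarified here can be sketched in a few lines of C: every frame is pushed downstream as soon as it is decoded, B frames are flagged immediately, and a reference frame is only flagged once the next reference frame arrives, so nothing is marked IN_ORDER while an earlier-PTS frame could still show up. In this push model the already-pushed frame is still sitting in the next filter's incoming queue, so flipping its flag later is visible there. All names below are hypothetical, not an existing API:

    #define MP_IN_ORDER 0x01                 /* hypothetical flag bit */
    void push_frame(struct mp_image *mpi);   /* hypothetical: hand the mpi downstream */

    static struct mp_image *held_ref;        /* last I/P frame, pushed but not yet flagged */

    void push_decoded(struct mp_image *mpi, int is_b_frame)
    {
        if (is_b_frame) {
            /* everything that displays before this B has already been decoded */
            mpi->flags |= MP_IN_ORDER;
            push_frame(mpi);
        } else {
            /* push the I/P now (filters may start on it out of order), but delay
             * its flag: a later B frame could still have an earlier PTS */
            push_frame(mpi);
            if (held_ref)
                held_ref->flags |= MP_IN_ORDER;   /* previous reference frame is now safe */
            held_ref = mpi;
        }
    }

    void decoder_flush(void)                 /* end of stream, or flushing on seek */
    {
        if (held_ref)
            held_ref->flags |= MP_IN_ORDER;
        held_ref = NULL;
    }

With decode order I0 P3 B1 B2 P6, this gives: I0 flagged when P3 is decoded (the one-frame delay), B1 and B2 flagged immediately, P3 flagged when P6 arrives.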
> >> As you can see, it is very easy for the decoders to set the IN_ORDER flag; it could be done at G1's decode() end, when the frames are in order.
> > Actually, this is totally false. Libavcodec does _not_ export any information which allows the caller to know if the frames are being decoded in order or not. :( Yes, this means lavc is horribly broken...

avcodec always displays frames in order, unless you manually set flags like _OUT_OF_ORDER or _LOW_DELAY ;)

> >> If an MPI is freed without setting IN_ORDER, then we can guess that it has been skipped.
> > Frame sources cannot be allowed to skip frames. Only the destination requesting frames can skip them.

If this rule is removed then IN_ORDER doesn't have any meaning. Usually a filter that makes such frames is broken. A filter that wants to remove duplicated frames may set the flag SKIPPED (well, if such a flag exists ;) SKIPPED/INVALID is required because there are always 2 mpi's that point to one buffer (vf1->out and vf2->in).

> >> Skipping/Rebuilding
> > This entire section should be trashed. It's very bad design.

Did I say somewhere - not finished?

> >> Now the skipping issue arises. I propose 2 flags that should be added like the IN_ORDER flag; I call them SKIPPED and REBUILD. I thought about one common INVALID, but it would have a different meaning depending on the array it resides in (incoming or outgoing). SKIPPED is required when a get_image frame is obtained but the processing is not performed. The first filter sets this flag in the outgoing mpi, and when the next filter processes the data, it should free the mpi (which is now in the incoming). If the filter had allocated another frame, where the skipped frame should have been drawn, then it can free it by setting it as SKIPPED.
> > Turn things around in the only direction that works, and you don't need an image flag for SKIPPED at all. The filter _requesting_ the image knows if it intends to use the contents or not, so if not, it just ignores what's there. There IS NO CORRECT WAY to frameskip from the source side.

I'm not talking about skipping frames to maintain A-V sync. And decoders are on the source side, and they DO skip frames. And in this section I use SKIPPED in the meaning of INVALID, as you can see from the quote.

> >> E.g. if we have this chain
> >> -vf crop=720:540,spp=5:4,scale=512:384
> >> This chain should give quite a thrill to a 2GHz processor. Now imagine that scale is auto-inserted and that the vo is some windowed RGB-only device (vo_x11). If the user changes the window size, the scale parameters change too. Scale should rebuild all frames that are processed but not yet shown. The scale filter can safely SKIP all frames in the outgoing.
> > Bad point 1: manually created filters which have been given parameters MUST NEVER auto-reconfigure. In my design, if the user enabled dynamic window rescaling, another scale filter controlled by the UI layer would get inserted, and activated only when the window size was non-default.

I just gave an example of how the filter chain will look WHEN scale is auto-inserted. Read carefully!! and don't hurry to flame. And don't forget that the front-end will have full control of the config() parameters. How many scaling filters are you planning to have? Don't you know that the scale filter is slow?

> > Bad point 2: your "rebuild" idea is not possible. Suppose the scale filter has stored its output in video memory, and its input has already been freed/overwritten. If you don't allow for this, performance will suck.

If you had read carefully you would have seen that I pointed out that problem too (with a solution I don't like very much). That's the main reason this section is not completed.

> >> [...]
> >> -vf spp=5,scale=512:384,osd
> >> [...]
> >> Now the user turns off OSD that has already been rendered into a frame. Then vf_osd sets REBUILD for all affected frames in the incoming array. The scale filter will draw the frame again, but it won't call spp again. And this gives a big win because vf_spp can be extremely slow.
> > This is stupid. We have a much better design for osd: as it slice-renders its output, it makes backups (in very efficient form) of the data that's destroyed by overwriting/alphablending. It can then undo the process at any time, without ever reading from its old input buffers or output buffers. In fact, it can handle slices of any shape and size, too!

OSD is only an EXAMPLE, not the real case. Well, then I gave a bad example. In fact REBUILD is necessary when a filter uses a buffer that was requested by the previous filter. Also, if the vo invalidates the buffer for some reason, this is the only way it can signal the rest of the filters. Yeah, these issues are raised by the way I handle mpi/buffer, but I have not seen any such system so far. Usually in such a situation all filters would get something like a reset and would start from the next frame. Of course this could be a lot of pain in an out-of-order scheme!

> >> On the other side, there is one big problem - the mpi could already be freed by the previous filter. To work around it we may need to keep all buffers until the image is shown (something like control(FLIP,pts) for all filters). The same thing may be used on seek, to flush the buffers.
> > This is an insurmountable problem. The buffers will very likely no longer exist. Forcing them to be kept will destroy performance.

You mean it will consume a lot of memory? huh?

> >> Problems remaining!
> > Lots more than you itemize!

> >> 1. Interlacing - should the second field have its own PTS?
> > In principle, definitely yes. IMO the easiest way to handle it is to require codecs that output interlaced video to set the duration field, and then the pts of the second field is just pts+duration/2.

Why? Just because you like it that way? Simply give examples.
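For reference, the suggestion being questioned here is only a line of arithmetic; a sketch with assumed field names (mpi->pts and mpi->duration are not guaranteed to exist in this form):

    /* per-field PTS derived from the frame's pts and duration */
    double first_field_pts  = mpi->pts;
    double second_field_pts = mpi->pts + mpi->duration / 2;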
> >> P.S. I absolutely forbid this document to be published anywhere. It is only for mplayer developers' eyes. And please, somebody remove the very old vo2 drafts from the g1 CVS.
> > Then don't send it to public mailing lists... :)

The author is never limited by the license; I own full copyright of this document and I may set any rules on it.

> > Sorry but IMO it's impossible to properly respond/comment without quoting large sections.
> > So, despite all the flames, I think there _are_ a few really good ideas here, at least as far as deficiencies in G1 (or even G2 VP) which we need to resolve. But I don't like Ivan's push-based out-of-order rendering pipeline at all. It's highly non-intuitive, and maybe even restrictive.

Huh, I'm happy to hear that there are good ideas. You didn't point out anything good. I see only critics & flames.

> > Actually, the name (VO3) reflects what I don't like about it: Ivan's design is an api for the codec to _output_ slices, thus calling it video output. (In fact, all filter execution is initiated from within the codec's slice callback!)

This is one of the possible ways. In the vo2 drafts I wanted to implement something called automatic slicing - forcing filters to use slices even when the decoder doesn't support slicing. (I can nearly imagine the flames you are thinking of at the moment ;) Anyway, my API makes all filters codecs. That's why the scheme looks so complicated, and that's why the simple filter is so necessary. The full beauty of the API will be seen only by people who make temporal filters and add/remove frames. That means by you :O

> > On the other hand, I'm looking for an API for _obtaining_ frames to show on a display, which might come from anywhere -- not just a codec. For instance they might even be generated by visualization plugins from audio data, or even from /dev/urandom!

Oh, could you explain why my API cannot be used for these things?

> > My design makes the source of the video totally transparent, rather than making the source the entry point for everything! And, my design separates image content processing (which might be able to happen out-of-order) from frame processing (which always happens in order).
> > So, Ivan. I'll try to take the best parts of what you've proposed and incorporate them into the code for G2. Maybe we'll be able to find something we're both happy with.

Wrong. We need something that we are both equally unhappy with :))) But as long as you are the one writing the code, it is natural for you to implement your ideas.

> > With kind flames,
> > Rich

Deep water is dangerous
Ivan Kalvachev iive

From dalias at aerifal.cx Sun Dec 28 06:18:30 2003
From: dalias at aerifal.cx (D Richard Felker III)
Date: Sun, 28 Dec 2003 00:18:30 -0500
Subject: [MPlayer-G2-dev] vo3
In-Reply-To: <1367.212.116.154.213.1072579883.squirrel@mail.cacad.com>
References: <2710.212.116.154.213.1072479682.squirrel@mail.cacad.com> <20031228003136.GA257@brightrain.aerifal.cx> <1367.212.116.154.213.1072579883.squirrel@mail.cacad.com>
Message-ID: <20031228051830.GG257@brightrain.aerifal.cx>

> User-Agent: SquirrelMail/1.4.1

Ivan, if you expect a reply, please get a mailer that understands how to wrap lines at 80 columns and preserve formatting when quoting. Your broken SquirrelMail converted all the quotes into multi-hundred-column unreadable gibberish.

On Sun, Dec 28, 2003 at 04:51:23AM +0200, Ivan Kalvachev wrote:
> D Richard Felker III said:
> > On Sat, Dec 27, 2003 at 01:01:22AM +0200, Ivan Kalvachev wrote:
> >> Hi, here are some of my ideas. I'm afraid it's already too late for them to be implemented, as dalias is coding his pipeline system while I have not finished the drafts yet, but feel free to send comments...
> >
> > Time to pour on the kerosene and light the flames... :)
> You are very bad at making good points; you always fall into flames. I noticed that you like flames, even if you burn in them sometimes.

This line was a joke.
Maybe if you'd taken it as such you would have been more thoughtful in your responses to what followed. > > First comment: putting everything in a PDF file makes it very > > difficult to quote and reply to individual parts of your draft. > > I'll try my best... [all following quotes are from the PDF] > Yep, The original is in OOo format. And You told me that you don't want to > install "that bloat" Well yeah, PDF was better than OOo. > >> Here are few features that I'm trying to achieve: > >> ? decreasing memcpy by using one and same buffer by all filters > >> that can do it (already done in g1 as Direct Rendering method 1) > >> ? support of partial rendering (slices and DR m2) > > > > These are obviously necessary for any system that's not going to be > > completely unusably sucky. And they're already covered in G1 and G2 VP. > flame. I know that, you know that, everybody know that. Huh? Who am I flaming? I don't follow. > >> ? ability to quickly reconfigure and if possible - to reuse data that > >> is already processed (e.g. we have scale and the user resizes the > >> image, - only images after scale will be redone), > > > > In my design, this makes no sense. The final scale filter for resizing > > would not pass any frames to the vo until time to display them. > final scale filter?!!! > How many scale filters do you have? Normally only one. But suppose you did something like the following: -vf scale=640:480,pullup with output to vo_x11. The idea is that since you'll be resizing DVD to square pixels anyway, you might as well do it before the inverse telecine and save some cycles. Now the user resizes the window... In yours (and Arpi's old) very bad way, the scale filter gets reconfigured, ruining the fields. If you don't like this example (there are other ways to handle interlacing) then just consider something like denoising with scaling. My principle in response is that the player should NEVER alter configuration for any filters inserted manually by the user. Instead, it should create its own scale filter for dynamic window resizing and final vo colorspace conversion. > >> safe seeking, auto-insertion of filters. > > > > What is safe-seeking? > When seeking filters that have stored frames should flush them > For example now both mpeg2 decoders don't do that, causing garbage in > B-Frames decoding after seek. Same apply for any temporal filter. > In G1 there is control(SEEK,...), but usually it is not used. OK, understood perfectly. > > Auto-insertion is of course covered. > I'm not criticizing your system. This comment is not for me. > Or I hear irony? And I wasn't criticizing yours, here. I was just saying it's not a problem for either system. > >> In short the ideas used are : > >> ? common buffer and separate mpi ? already exist in g1 in some form ? > >> counting buffer usage by mpi and freeing after not used ? huh, sound > >> like java :O > > > > No. Reference counting is good. GC is idiotic. And you should never free > > buffers anyway until close, just repool them. > I didn't say that. Free buffer is buffer that is not busy and could be > reused. > Moreover I don't like the way frames are locked in your code. It doesn't > seem obvious. In VP, you will _only_ lock frames if you need to keep them after passing them on to the next filter. Normally you shouldn't be doing this. > The goal in my design is all the work of freeing/quering to > be moved to vf/vp functions. So in my design you will see only > get_image/release_image, but never lock/unlock. 
Becouse having buffer meen > that you need it. (well most of the time) Keep in mind there's no unlock function. It's just get/lock/release. Perhaps you should spend some time thinking about the needs of various filters and codecs. The reason I have a lock function is that the _normal_ case is passing your image on to the next filter without keeping it. Images could start out with 2 locks (1 for source, 1 for dest) and then the source would have to explicitly release it when finished, but IMHO this just adds complexity since most filters should never think about a frame again once sending it out. > >> ? allocating all mpi&buffer before starting drawing (look obvious, > >> doesn't it?) ? in G1 filters had to copy frames in its own buffers or > >> play hazard by using buffers out of their scope > > > > Yes, maybe G1 was broken. All the codecs/filters I know allocate > > mpi/buffers before drawing, though. > In G1 if you draw_slice out-of-order it is possible to go to a filter that > haven't yet allocated buffer for this frame - some frames are allocated on > put_frame. This is because G1 is horribly broken. Slices should not be expected to work at all in G1 except direct VD->VO. > That's also the reason to have one common process()! I disagree. > >> ? using flag IN_ORDER, to indicate that these frames are "drawn" and > >> there won't come frames with "earlier" PTS. > > > > I find this really ugly. > It's the only sane way to do it, if you really do out-of-order processing. No, the draw_slice/commit_slice recursion with frames getting pulled in order works just fine. And it's much more intuitive. > >> ? using common function for processing frame and slices ? to make slice > >> support more easier > > > > This can easily be done at the filter implementation level, if > > possible. In many cases, it's not. Processing the image _contents_ and > > the _frame_ are two distinct tasks. > Not so easy. Very few filters in G1 support slices, mainly becouse it is > separate chain. No, mainly because the api is _incorrect_ and cannot work. Slices in G1 will inevitably sig11. > > One thing omitted in G2 so far is allowing for mixed buffer types, where > > different planes are allocated by different parties. For > > example, exporting U and V planes unchanged and direct rendering a new Y > > plane. I'm not sure if it's worth supporting this, since it would be > > excessively complicated. However, it would greatly speed up certain > > filters such as equalizer. > Yes I was thinking about such hacks. But definitly they are not worth > implementig. Matrox YUV mode need such hack, but it could be done in vo > level. Actually it doesn't. The YV12->NV12 converter can just allow direct rendering, with passthru to the VO's Y plane and its own U/V planes. Then, on draw_slice, the converter does nothing with Y and packs U/V into place in the VO's DR buffer. This is all perfectly valid and no effort to implement in my design. The difficult case is when you want to export some planes and DR others... > >> Dalias already pointed that processing may not be strictly top from > >> bottom, may not be line, slice, or blocks based. This question is still > >> open for discussion. Anyway the most flexible x,y,w,h way proved to be > >> also the most hardier and totally painful. Just take a look of crop or > >> expand filters in G1. More over the current G1 > >> scheme have some major flaws: > >> ? 
the drawn rectangles may overlap (it depends only on decoder) > > > > No, my spec says that draw_slice/commit_slice must be called exactly > > once for each pixel. If your codec is broken and does not honor this, > > you must wrap it or else not use slices. > The problem may arrase in filter slices too! Imagine rounding errors;) Huh? Rounding? WTF? You can't render half a pixel. If a filter is doing slices+resizing (e.g. scale, subpel translate, etc.) it has to deal with the hideous boundary conditions itself... > >> add new flag that I call IN_ORDER. This flag indicates that all frames > before this one are already available in the > >> in-coming/out-coming area. Lets make an example with MPEG IPB order. We > have deciding order IPB and display IBP. > >> First we have I frame. We decode it first and we output it to the > filters. This frame is in order so the flag should be set for it (while > processing). Then we have P-Frame. We decode it, but we do not set the > flag (yet). We process the P-Frame too. Then we decode an B-Frame that > depends on the previous I and P Frames. This B-Frame is in order when > we process it. After we finish with the B-Frame(s) the first P-Frame is > in order. > > > > This idea is totally broken, as explained by Michael on ffmpeg-devel. It > makes it impossible for anything except an insanely fast computer to > play files with B frames!! Here's the problem: > > > > 1. You decode first I frame, IN_ORDER. > > 2. You display the I frame. > > 3. You decode the P frame. Not IN_ORDER. > > 4. You decode the B frame. IN_ORDER. > > 5. You display the B frame, but only after wasting >frametime seconds, > > thus causing A/V desync!! > > 6. The P frame becomes IN_ORDER. > > 7. You display the P frame. > > 8. Process repeats. > > > > The only solution is to always impose one-frame delay at the _decoder_ > end when decoding files with B frames. In Ivan's design, this can be > imposed by waiting to set the IN_ORDER flag for an I/P frame until the > next B frame is decoded. > Again you say things that i haven't. Actually I (or you ) may have missed > one of the points. Well I will add it to the goals. Buffering ahead. Now. > I said that IN_ORDER is replacement for the draw_frame()!!!! Ahh, that's a _very_ helpful way to think about it. Thanks! I still don't like it, but at least I don't think your system is total nonsense anymore. > This meen that in the above example I-frame won't be IN_ORDER. Your > problem solved. Yes, I agree totally. That's solved. > Anyway the IN_ORDER doesn't force us to display the frame. > There is no need to start displaying frame in the moment they are > compleated. Yes, but it's hard to know when to display unless you're using threads (or the cool pageflip-from-slice-callback hack :)) > I agree that there may be some problems for vo with one buffer. > So far you have one (good) point. I never said anything about vo with one buffer. IMO it sucks so much it shouldn't even be supported, but then Arpi would get mad. > >> As you can see it is very easy for the decoders to set the IN_ORDER > >> flag, it could be done om G1's decode() end, when the frames are in order. > > > > Actually, this is totally false. Libavcodec does _not_ export any > > information which allows the caller to know if the frames are being > > decoded in order or not. :( Yes, this means lavc is horribly broken... > avcodec always display frames in order, unless you set manually flags like > _OUT_OF_ORDER or _LOW_DELAY ;) No. 
Keep in mind that your chain will be running from the draw_horiz_band callback... (in which case, it will be out of order) I would expect you to set the LOW_DELAY flag under these circumstances, but maybe you wouldn't. > >> If an MPI is freed without setting IN_ORDER then we could guess that it > >> have been skipped. > > > > Frame sources cannot be allowed to skip frames. Only the destination > > requesting frames can skip them. > If this rule is removed then IN_ORDER don't have any meening. Usually > filter that makes such frames is broken. If a filter that wants to remove > dublicated frames may set flag SKIPPED (well if such flag exists;) > SKIPPED/INVALID is requared becouse there are always 2 mpi's that point to > one buffer (vf1->out and vf_2->in ) I misunderstood IN_ORDER. SKIPPED makes sense now, it's just not quite the way I would implement it. > >> Skipping/Rebuilding > > > > This entire section should be trashed. It's very bad design. > did i said somewhere - not finished? Yes. IMO it just shouldn't exist, though. It's unnecessary complexity and part requires sacrificing performance. > >> Now the skipping issue is rising. I propose 2 flags, that should be > >> added like IN_ORDER flag, I call them SKIPPED and REBUILD. I thought > >> about one common INVALID, but it would have different meening > >> depending from the array it resides (incoming or outgoing) > >> SKIPPED is requared when a get_image frame is gotten but the > >> processing is not performed. The first filter sets this flag in the > >> outgoing mpi, and when next filter process the date, if should free the > >> mpi (that is now in the incoming). If the filter had allocated another > >> frame, where the skipped frame should have been draw, then it can free > >> it by setting it as SKIPPED. > > > > Turn things around in the only direction that works, and you don't need > > an image flag for SKIPPED at all. The filter _requesting_ the image > > knows if it intends to use the contents or not, so if not, it just > > ignores what's there. There IS NO CORRECT WAY to frameskip from the > > source side. > I'm not talking about skipping of frame to maintain A-V sync. OK, misunderstood. > And decoders are from the source side, they DO skip frames. And in this > section I use SKIPPED in menning of INVALID, as you can see from the > quote. I couldn't tell. If the codec skipped a frame at user-request, it would also be invalid... > how many scaling filters are you planing to have? don't you know that > scale filter is slow? Yes, it's slow. vo_x11 sucks. My point is that the player should _never_ automatically do stuff that gives incorrect output. > > Bad point 2: your "rebuild" idea is not possible. Suppose the scale > > filter has stored its output in video memory, and its input has > > already been freed/overwritten. If you don't allow for this, > > performance will suck. > If you had read carefully you would see that I had pointed that problem > too (with solution I don't like very much). That's the main reason this > section is not compleated. :(( > > > > >> [...] > >> -vf spp=5,scale=512:384,osd > >> [...] > >> Now the user turns off OSD that have been already rendered into a > frame. Then vf_osd set REBUILD for all affected frames in the > >> incoming array. The scale filter will draw the frame again, but it > won't call spp again. And this gives a big win because vf_spp could be > extremly slow. > > > > This is stupid. 
We have a much better design for osd: as it > > slice-renders its output, it makes backups (in very efficient form) of > the data that's destroyed by overwriting/alphablending. It can then undo > the process at any time, without ever reading from its old input buffers > or output buffers. In fact, it can handle slices of any shape and size, > too! > OSD is only EXAMPLE. not the real case. > Well then I had gave bad example. In fact REBUILD is nessesery then filter > uses a buffer that is requested by the previous filter. Also if vo > invalidate the buffer by some reason, this is the only way it could signal > the rest of the filters. Invalidating buffers is a problem... > Yeh, these issues are raised by he way i handle mpi/buffer, but I have not > seen any such system so far. Usually in such situation all filters will > get something like reset and will start from next frame. Of cource this > could be a lot of pain in out-of-order scheme! It's not really too bad. Although ideally it should be possible to make small changes to the filter chain _without_ any discontinuity in the output video... > > This is an insurmountible problem. The buffers will very likely no > > longer exist. Forcing them to be kept will destroy performance. > You meen will consume a lot of memory? > huh? No. You might have to _copy_ them, which kills performance. Think of export-type buffers, which are NOT just for obsolete codecs! Or reusable/static-type buffers! > >> 1. Interlacing ? should the second field have its own PTS? > > > > In principle, definitely yes. IMO the easiest way to handle it is to > require codecs that output interlaced video to set the duration field, > and then pts of the second field is just pts+duration/2. > Why? Just becouse you like it that way? Yes. Any other way is fine too. Unfortunately it's impossible to detect whether the source video is interlaced or not (stupid flags are always wrong), so some other methods such as always treating fields independently are troublesome... > > Then don't send it to public mailing lists... :) > The author is never limited by the license, I own full copyright of this > document and I may set any rules on it. Yes but you already published it in a public place. :) > > So, despite all the flames, I think there _are_ a few realy good ideas > > here, at least as far as deficiencies in G1 (or even G2 VP) which we > > need to resolve. But I don't like Ivan's push-based out-of-order > > rendering pipeline at all. It's highly non-intuitive, and maybe even > > restrictive. > Huh, I'm happy to hear that there are good ideas. You didn't point > anything good. I see only critics&flames. Sorry, I wasn't at all clear. The best ideas from my standpoint were the ones that highlighted deficiencies in my design, e.g. the buffers-from-multiple-sources thing. Even though I flame them, I also sort of line your slice ideas, but basically every way of doing slices sucks... :( Another thing is the rebuild idea. Even though I don't see any way it can be done correctly with your proposal, it would be nice to be able to regenerate the current frame. Think of a screenshot function, for example. > > Actually, the name (VO3) reflects what I don't like about it: Ivan's > > design is an api for the codec to _output_ slices, thus calling it video > > output. (In fact, all filter execution is initiated from within the > > codec's slice callback!) > This is one of the possible ways. 
> In the vo2 drafts I wanted to implement something called automatic slicing - forcing filters to use slices even when the decoder doesn't support slicing. (I can nearly imagine the flames you are thinking of at the moment ;)

I understand what you're saying. I'm just strongly opposed to the main entry point being at the codec end. In particular, it does not allow cpu-saving frame dropping. Only in a pull-based system, where you wait to decode/process a frame until the next filter wants it, can you skip (expensive!) processing (or even decoding, for B frames!) based on whether the output is destined for the encoder/monitor or the bitbucket...

Ultimately, slice _processing_ isn't very friendly to this goal. The more we discuss it, the more I'm doubting that slice processing is even useful. On the one hand it's very nice for optimizing cache usage, but on the other, it forces you to process frames before you even want them. This is a _big_ obstacle to framedropping, and to smooth playback, since displaying certain frames might require no processing, and displaying others might require processing 2 or 3 frames first... :((

> Anyway my API makes all filters codecs. That's why the scheme looks so complicated, and that's why the simple filter is so necessary. The full beauty of the API will be seen only by people who make temporal filters and add/remove frames. That means by you :O

Perhaps you could port vf_pullup to pseudocode using your api and see if you could convince me?

> > On the other hand, I'm looking for an API for _obtaining_ frames to show on a display, which might come from anywhere -- not just a codec. For instance they might even be generated by visualization plugins from audio data, or even from /dev/urandom!
> Oh, could you explain why my API cannot be used for these things?

It's _called_ from the codec's draw_slice! Not very good at all for multiple video sources, e.g. music + music video + overlaid visualization.

> > So, Ivan. I'll try to take the best parts of what you've proposed and incorporate them into the code for G2. Maybe we'll be able to find something we're both happy with.
> Wrong. We need something that we are both equally unhappy with :)))

Yes...

> But as long as you are the one writing the code, it is natural for you to implement your ideas.

Yes again.

So, now let me make some general remarks (yes, this is long already...) After this email, I understand your proposal a lot better. The big difference between our approaches is that I treat buffers (including "indirect" buffers) as objects which filters obtain and hold onto internally and which they only "pass along" when it's time to display them, while you treat buffers as entities which are carefully managed in a queue between each pair of filters, which can be processed immediately, and which are only "activated" (IN_ORDER flag) when it's actually their time.

Here are some things I like better about your approach:
- It's very easy to cancel buffers when unloading/resetting filters.
- Buffer management can't be 'hidden' inside the filters, meaning that we're less likely to have leaks/crashes from buggy filters.
- Processing can be done in decoding order even when slices aren't supported (dunno whether this actually happens).
- Slices are fairly restricted, easing implementation.

And here are some things I really don't like about your approach:
- It's push-based rather than pull-based. Thus:
- No good way to handle (intentional) frame dropping. This will be a problem _whenever_ you do out-of-order processing, so it happens with my design too. But the only time I do OOO is for slices...
- Slices are fairly restricted, limiting their usefulness.
- Having the chain run from the decoder's callback sucks. :(
- It doesn't allow "dumb slices" (small reused buffer).
- It doesn't have a way to handle buffer age/skipped blocks (I know, my design doesn't solve this either...)
- My YV12->NV12 conversion might not be possible with your buffer management system...?

Now that my faith in slices has been shaken, I think it would be really beneficial to see some _benchmarks_. Particularly, a comparison of slice-rendering through scale for colorspace conversion (and possibly also scaling) versus non-slice. If slices don't help (or hurt) for such complex processing (due to cache thrashing from the scale process itself), then I would be inclined to throw away slices for everything except copying to write-only DR buffers while decoding... On the other hand, if they help, then they're at least good for in-order rendering (non-B codecs).

And Ivan, remember: a (pseudocode?) port of vf_pullup to your layer might be very useful in convincing me of its merits or demerits. Also feel free to criticize the way pullup.c does all its own internal buffer management -- perhaps you'd prefer it obtain buffers from the next filter. :) This can be arranged if it will help.

As much as I dislike some of your ideas, I'm open to changing things to be more like what you propose. I want G2 to be the best possible tool for video! And that matters more than ego/flames/eliteness/etc. Maybe you'll even get me to write your (modified) design for you, if you come up with convincing proposals... :))

Rich
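For illustration, the pull model referred to throughout this exchange can be pictured in a few lines of pseudo-C. Everything here - the node structure, the drop flag, the function names - is invented for the sketch and is not the real G2 VP API:

    struct vp_node {
        struct vp_node *prev;                               /* upstream filter, or NULL for the source */
        int drop_next;                                      /* destination asked to drop one frame */
        struct mp_image *(*decode_one)(struct vp_node *);   /* only set on the source (decoder) node */
        struct mp_image *(*process)(struct vp_node *, struct mp_image *);
    };

    void release_image(struct mp_image *mpi);               /* as in the earlier refcount sketch */

    /* Each node asks its upstream neighbour for a frame only when its own
     * consumer wants one; work nobody wants is never performed, which is
     * what makes intentional frame dropping nearly free. */
    struct mp_image *vp_pull_frame(struct vp_node *node)
    {
        struct mp_image *in;

        if (!node->prev)                    /* source node: decode exactly one frame */
            return node->decode_one(node);

        in = vp_pull_frame(node->prev);     /* recurse toward the decoder */
        if (!in)
            return NULL;

        if (node->drop_next) {              /* destination said "skip this one" */
            node->drop_next = 0;
            release_image(in);              /* no filtering, no conversion */
            return NULL;
        }
        return node->process(node, in);     /* filter obtains/holds buffers internally */
    }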
From diego at biurrun.de Sun Dec 28 20:02:28 2003
From: diego at biurrun.de (Diego Biurrun)
Date: Sun, 28 Dec 2003 16:02:28 -0300
Subject: [MPlayer-G2-dev] vo3
In-Reply-To: <20031228051830.GG257@brightrain.aerifal.cx>
References: <2710.212.116.154.213.1072479682.squirrel@mail.cacad.com> <20031228003136.GA257@brightrain.aerifal.cx> <1367.212.116.154.213.1072579883.squirrel@mail.cacad.com> <20031228051830.GG257@brightrain.aerifal.cx>
Message-ID: <16367.10436.766096.507637@gargle.gargle.HOWL>

D Richard Felker III writes:
> > User-Agent: SquirrelMail/1.4.1
>
> Ivan, if you expect a reply, please get a mailer that understands how to wrap lines at 80 columns and preserve formatting when quoting. Your broken SquirrelMail converted all the quotes into multi-hundred-column unreadable gibberish.

100% agree. That reply was _very_ hard to follow.

> > > Then don't send it to public mailing lists... :)
> > The author is never limited by the license, I own full copyright of this document and I may set any rules on it.
>
> Yes but you already published it in a public place. :)

Right. Adding that note at the _end_ of the document does not make much sense either. If you want to keep it confidential, you have to mark it as such up front. Anyway, this is getting OT.

> Another thing is the rebuild idea. Even though I don't see any way it can be done correctly with your proposal, it would be nice to be able to regenerate the current frame. Think of a screenshot function, for example.

Or things like changing window size or switching to fullscreen while the movie is paused. Currently MPlayer unpauses to do these things. Being able to do this is on the wishlist.

Diego

From arpi at thot.banki.hu Mon Dec 29 09:19:01 2003
From: arpi at thot.banki.hu (Arpi)
Date: Mon, 29 Dec 2003 09:19:01 +0100 (CET)
Subject: [MPlayer-G2-dev] Recommendations for DEMUXER layer
In-Reply-To: <20031220083434.GO7833@brightrain.aerifal.cx>
Message-ID: <20031229081901.F01D42084B@mail.mplayerhq.hu>

Hi,

> I've been reading some demuxer code to figure out how pts is computed for various demuxers, in order to understand how it needs to be handled by the new video (and eventually new audio!) layer. In the process, I've come up with a few recommendations for changes.
>
> 1. Some demuxers, such as AVI, seek into the middle of an audio chunk without understanding audio packet boundaries at all (because the container format sucks too much to distinguish packets), forcing the decoder to recover. This also means (a) the demuxer will output a broken packet, which is bad if you just want to remux without using any codecs, and (b) pts is no longer exact, only approximate, which IMO sucks really bad.

Agree, but you're wrong. The AVI demuxer (we're talking about g1, as g2 avi has no seeking yet) does seek to frame boundaries, using a packet size of nBlockAlign. Although for some codecs/encoders it's set to 1, so it can seek to any position; the most common case is cbr mp3, where it used to be 1. Anyway the pts is still exact, as pts is calculated from the samplerate (dwRate/dwScale) multiplied by the block (nBlockAlign-sized!) number. So for AVI files this is not an issue. Anyway, there may be formats where it can be. My "favourite" one is the quicktime mov, where the demuxer cannot work without knowing the compression ratio (actually the compressed and uncompressed frame/block size), as mov audio chunk headers contain the uncompressed(!) size of the block, while it contains compressed data. How dumb they were when they created this mess... Some newer files (qt4 and above?) contain an 'extended audio header' with this info, but for older files/codecs you HAVE TO KNOW it from the codec fourcc... It can be tricky for codecs like MACE, where the block size also depends on other codec parameters, like the number of channels and the samplerate... i.e. you have to either a) hardcode those evil codec fourccs and their blocksizes into the demuxer, or b) have some loopback/talkback from the decoder to the demuxer (the g1 way) -> this is why framecopy-ing audio (or just -dumpaudio) from mov sometimes fails with mencoder...

> My recommendation would be to _always_ seek to a boundary the demuxer understands. That way you have exact pts, and no broken packets for the decoder or muxer to deal with. The demuxer can skip video frames up to the next keyframe (the point you were trying to seek to) and the audio pipeline can skip the audio _after_ decoding it so that it can keep track of the exact number of samples. (Since audio decoding is very fast, this should not impact performance when seeking.)

The framer api - we were talking about it yesterday - should solve this.

> 2. After seeking, demuxers call resync_audio_stream, which depends on there being an audio decoder! I found this problem a long time ago while adding seeking support to mencoder: it was crashing with -oac copy! It's bad because it makes the demuxer layer dependent on the codec layer.
>
> My recommendation is to eliminate resync_audio_stream, and instead just report a discontinuity the next time the demuxer stream is read.
> That way the codec, if one exists, can decide what to do when it reads > from the demuxer, without having to use a callback from the demuxer > layer to the codec. Also, resync should become unnecessary for most > codecs if my above seeking recommendation is implemented. i like this idea! that resync* shit was always an ugly hack :( A'rpi / Astral & ESP-team -- Developer of MPlayer G2, the Movie Framework for all - http://www.MPlayerHQ.hu From arpi at thot.banki.hu Mon Dec 29 09:20:24 2003 From: arpi at thot.banki.hu (Arpi) Date: Mon, 29 Dec 2003 09:20:24 +0100 (CET) Subject: [MPlayer-G2-dev] Recommendations for DEMUXER layer In-Reply-To: <20031220091526.GP7833@brightrain.aerifal.cx> Message-ID: <20031229082024.7817120845@mail.mplayerhq.hu> Hi, > OK, one more... > > 3. PTS handling is really bogus in some demuxers. Sometimes ds->pts is > scaled by rate_d, sometimes not. WTF? > My recommendation is for pts to always be in units of rate_d/rate_m. it _IS_ if it isn't, it's a bug then!! A'rpi / Astral & ESP-team -- Developer of MPlayer G2, the Movie Framework for all - http://www.MPlayerHQ.hu From dalias at aerifal.cx Mon Dec 29 18:21:21 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 29 Dec 2003 12:21:21 -0500 Subject: [MPlayer-G2-dev] Recommendations for DEMUXER layer In-Reply-To: <20031229082024.7817120845@mail.mplayerhq.hu> References: <20031220091526.GP7833@brightrain.aerifal.cx> <20031229082024.7817120845@mail.mplayerhq.hu> Message-ID: <20031229172121.GN257@brightrain.aerifal.cx> On Mon, Dec 29, 2003 at 09:20:24AM +0100, Arpi wrote: > Hi, > > > OK, one more... > > > > 3. PTS handling is really bogus in some demuxers. Sometimes ds->pts is > > scaled by rate_d, sometimes not. > > WTF? AVI demuxer outputs rate_m*framenumber for pts. NUT demuxer outputs ticknumber for pts. MPEG and ASF also use ticknumber, but it's the same since their rate_m is implicitly 1 (1/90000 and 1/1000). > > My recommendation is for pts to always be in units of rate_d/rate_m. > > it _IS_ > > if it isn't, it's a bug then!! Then you agree that AVI should just output chunk number for pts, at least for video? Right now pts advances by rate_m for each frame, which makes no sense to me... For audio, the demuxer should probably also output chunk number (along with rate_m/rate_d values, of course) and the framer could transform that to sample numbers (i.e. in units of samplerate) or leave it in chunk numbers for fixed framesize audio codecs. Rich From dalias at aerifal.cx Mon Dec 29 18:26:53 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 29 Dec 2003 12:26:53 -0500 Subject: [MPlayer-G2-dev] Recommendations for DEMUXER layer In-Reply-To: <20031229081901.F01D42084B@mail.mplayerhq.hu> References: <20031220083434.GO7833@brightrain.aerifal.cx> <20031229081901.F01D42084B@mail.mplayerhq.hu> Message-ID: <20031229172653.GO257@brightrain.aerifal.cx> On Mon, Dec 29, 2003 at 09:19:01AM +0100, Arpi wrote: > Hi, > > > I've been reading some demuxer code to figure out how pts is computed > > for various demuxers, in order to understand how it needs to be > > handled by the new video (and eventually new audio!) layer. In the > > process, I've come up with a few recommendations for changes. > > > > 1. Some demuxers, such as AVI, seek into the middle of an audio chunk > > without understanding audio packet boundaries at all (because the > > container format sucks too much to distinguish packets), forcing > > the decoder to recover. 
This also means (a) the demuxer will output > > a broken packet, which is bad if you just want to remux without > > using any codecs, and (b) pts is no longer exact, only approximate, > > which IMO sucks really bad. > > agree, but you're wrong. > AVI demuxer (we're talking about g1, as g2 avi has no seeking yet) > does seek to frame boundaries, using packet size of nBlockAlign. > although for some codec/encoders, it's set to 1, so it can seek to > any position. most common case is cbr mp3, where it used to be 1. My idea was for the demuxer to always seek only to the beginning of a chunk -- or are encoded audio frames sometimes split across chunks?! :( > anywya the pts is still exact, as pts is calculated by samplerate > (drRate/dwScale) multiplied by block (nBlockAlign size!) number. > so, for AVI files this is not an issue. anywya there may be formats > where it can be. Well, Suppose you want to seek to pts X in a file, and you do so by this method. But, the resulting byte position happens to be 10 bytes after the start of the audio frame. So you lose this whole frame, and begin framing/decoding at the next one, which is maybe 1000 bytes later. This seems bad for perfect a/v sync. IMO it would be better for the demuxer to seek to a point where it knows valid frames begin (if this is always possible) and let the framer pick the exact frame to start using. > my "favourite" one is the quicktime mov, where the demuxer cannot > work without knowing the compression ratio (actually compressed and > uncompressed frame/block size), as mov audio chunk headers contain > the uncompressed(!) size of block, while it contains compressed data. > how dumb they were when created this mess... Quicktime is idiotic... > > My recommendation would be to _always_ seek to a boundary the demuxer > > understands. That way you have exact pts, and no broken packets for > > the decoder or muxer to deal with. The demuxer can skip video frames > > up to the next keyframe (the point you were trying to seek to) and the > > audio pipeline can skip the audio _after_ decoding it so that it can > > keep track of the exact number of samples. (Since audio decoding is > > very fast, this should not impact performance when seeking.) > > the framer api -we're talking about yesterday- should solve this. :)))) > > 2. After seeking, demuxers call resync_audio_stream, which depends on > > there being an audio decoder! I found this problem a long time ago > > while adding seeking support to mencoder: it was crashing with -oac > > copy! It's bad because it makes the demuxer layer dependent on the > > codec layer. > > > > My recommendation is to eliminate resync_audio_stream, and instead > > just report a discontinuity the next time the demuxer stream is read. > > That way the codec, if one exists, can decide what to do when it reads > > from the demuxer, without having to use a callback from the demuxer > > layer to the codec. Also, resync should become unnecessary for most > > codecs if my above seeking recommendation is implemented. > > i like this idea! > that resync* shit was always an ugly hack :( :))) BTW same thing works for video, to prevent misdecoding of B frames after seeking and flush inverse telecine buffers after seeking too! 
:)

Rich

From arpi at thot.banki.hu Mon Dec 29 20:53:40 2003
From: arpi at thot.banki.hu (Arpi)
Date: Mon, 29 Dec 2003 20:53:40 +0100 (CET)
Subject: [MPlayer-G2-dev] Recommendations for DEMUXER layer
In-Reply-To: <20031229172121.GN257@brightrain.aerifal.cx>
Message-ID: <20031229195340.9A0AB1FF0D@mail.mplayerhq.hu>

Hi,

> > > OK, one more...
> > >
> > > 3. PTS handling is really bogus in some demuxers. Sometimes ds->pts is scaled by rate_d, sometimes not.
> >
> > WTF?
>
> AVI demuxer outputs rate_m*framenumber for pts.

huh?
Ah, I think I remember now. There was a more clever idea behind it:
to get the frame number: frameno = ptsvalue/rate_d
to get the time position: seconds = ptsvalue/rate_m
and ptsvalue is always an integer and can produce accurate values for both things.

So, demuxers for formats storing a frame/chunk number (avi) will scale by rate_d, and demuxers for timestamp-based formats (asf) will scale by rate_m. So for both kinds of formats only an integer*integer multiply is done, so no bit loss :)

> > > My recommendation is for pts to always be in units of rate_d/rate_m.
> >
> > it _IS_

ok, it isn't... i was wrong :)

A'rpi / Astral & ESP-team

--
Developer of MPlayer G2, the Movie Framework for all - http://www.MPlayerHQ.hu

From arpi at thot.banki.hu Mon Dec 29 21:04:05 2003
From: arpi at thot.banki.hu (Arpi)
Date: Mon, 29 Dec 2003 21:04:05 +0100 (CET)
Subject: [MPlayer-G2-dev] Recommendations for DEMUXER layer
In-Reply-To: <20031229172653.GO257@brightrain.aerifal.cx>
Message-ID: <20031229200405.DC11020793@mail.mplayerhq.hu>

Hi,

> > agree, but you're wrong. AVI demuxer (we're talking about g1, as g2 avi has no seeking yet) does seek to frame boundaries, using packet size of nBlockAlign. although for some codec/encoders, it's set to 1, so it can seek to any position. most common case is cbr mp3, where it used to be 1.
>
> My idea was for the demuxer to always seek only to the beginning of a chunk -- or are encoded audio frames sometimes split across chunks?! :(

It's possible. It's also possible (do you want sample files?) that the whole audio of a whole film is stored in a _single_ ~100mb chunk. What a mess! But it's quite usual that 10 sec (or even more) of audio is stored together in a single chunk, containing many (100+) frames of mp3 or ac3. It's lame to burst-decode it any time you seek in the file...

Let's forget about avi chunks, it's an internal problem of the avi demuxer. The avi header contains the audio block size, which is the elementary size of independent, aligned audio blocks. You want those blocks, not the raw chunks. Believe me...

> > anyway the pts is still exact, as pts is calculated by samplerate (dwRate/dwScale) multiplied by block (nBlockAlign size!) number. so, for AVI files this is not an issue. anyway there may be formats where it can be.
>
> Well, suppose you want to seek to pts X in a file, and you do so by this method. But the resulting byte position happens to be 10 bytes after the start of the audio frame. So you lose this whole frame, and begin framing/decoding at the next one, which is maybe 1000 bytes later. This seems bad for perfect a/v sync. IMO it would be better for the demuxer to seek to a point where it knows valid frames begin (if this is always possible) and let the framer pick the exact frame to start using.

Go and rtfs g1's avi demuxer. It does:
1. find the video frame we want to seek to
2. find the nearest video keyframe
3. find the audio block boundary right at or behind the keyframe selected at 2.
4. find the chunk containing that audio block
5. if the position from 4. is below the position from 2., then find how many video frames will be read from the stream before the keyframe, and set skip_video_frames
6. skip N audio blocks from the audio chunk, to get to the position selected at 3.
7. calculate the time difference between the start of that audio block and the keyframe, and put the value into audio_delay to compensate for the delay.

Yes, it's not easy; this is why I spent so much time on that mess.

> > my "favourite" one is the quicktime mov, where the demuxer cannot work without knowing the compression ratio (actually the compressed and uncompressed frame/block size), as mov audio chunk headers contain the uncompressed(!) size of the block, while it contains compressed data. how dumb they were when they created this mess...
>
> Quicktime is idiotic...

Wonder!!! We agree on something! :)

> > > My recommendation would be to _always_ seek to a boundary the demuxer understands. That way you have exact pts, and no broken packets for the decoder or muxer to deal with. The demuxer can skip video frames up to the next keyframe (the point you were trying to seek to) and the audio pipeline can skip the audio _after_ decoding it so that it can keep track of the exact number of samples. (Since audio decoding is very fast, this should not impact performance when seeking.)
> >
> > the framer api - we were talking about it yesterday - should solve this.
>
> :))))

and I should write that mail down...

> > > 2. After seeking, demuxers call resync_audio_stream, which depends on there being an audio decoder! I found this problem a long time ago while adding seeking support to mencoder: it was crashing with -oac copy! It's bad because it makes the demuxer layer dependent on the codec layer.
> > >
> > > My recommendation is to eliminate resync_audio_stream, and instead just report a discontinuity the next time the demuxer stream is read. That way the codec, if one exists, can decide what to do when it reads from the demuxer, without having to use a callback from the demuxer layer to the codec. Also, resync should become unnecessary for most codecs if my above seeking recommendation is implemented.
> >
> > i like this idea!
> > that resync* shit was always an ugly hack :(
>
> :)))
>
> BTW same thing works for video, to prevent misdecoding of B frames after seeking and to flush inverse telecine buffers after seeking too! :)

Sure. Michael said it (the B-frame bug after seeking in IPB mpeg) a long time ago but I didn't believe it... Actually this bug has existed since mpg12player v0.001 and was reported many times even in the first days of mplayer... (video starts with a green frame, green macroblocks etc.)

A'rpi / Astral & ESP-team

--
Developer of MPlayer G2, the Movie Framework for all - http://www.MPlayerHQ.hu

From dalias at aerifal.cx Mon Dec 29 22:02:48 2003
From: dalias at aerifal.cx (D Richard Felker III)
Date: Mon, 29 Dec 2003 16:02:48 -0500
Subject: [MPlayer-G2-dev] Recommendations for DEMUXER layer
In-Reply-To: <20031229195340.9A0AB1FF0D@mail.mplayerhq.hu>
References: <20031229172121.GN257@brightrain.aerifal.cx> <20031229195340.9A0AB1FF0D@mail.mplayerhq.hu>
Message-ID: <20031229210248.GV257@brightrain.aerifal.cx>

On Mon, Dec 29, 2003 at 08:53:40PM +0100, Arpi wrote:
> Hi,
> > > > OK, one more...
> > > >
> > > > 3. PTS handling is really bogus in some demuxers. Sometimes ds->pts is scaled by rate_d, sometimes not.
> > >
> > > WTF?
> > > > AVI demuxer outputs rate_m*framenumber for pts. > > huh? > ah, i think i remember now. it was more clever idea behind it. > to get frame number: frameno=ptsvalue/rate_d > to get time position: seconds=ptsvalue/rate_m > and ptsvalue is always integer and can produce accurate values for > both things. > > so, demuxers for formats storing frame/chunk number (avi) will scale by > rate_d, demuxers for timestamp-based formats (asf) will scale by rate_m > so for both kind of formats, only integer*integer multiply is done, > so no bit loss :) Timestamps are not necessarily in units of 1/rate_d!! They can (and SHOULD, to avoid wasting space) be in units of rate_m/rate_d, e.g. 1001/24000. With your method, pts would increase by 1001 per frame for avi or ntsc nut, and the output encoder/muxer would have no way of knowing that it should use 1001/24000 time units rather than 1/24000. Either method (yours or mine) allows you to get exact values for frame[/tick] number or timestamp. In my method, frame/tick/sample number is "pts" and timestamp is "pts*rate_m/rate_d". In yours, frame/tick/sample number is "pts/rate_m" and timestamp is "pts/rate_d". Either works (yours is just scaled by rate_m, but still always divisible by rate_m), but the reason I prefer my method is that you keep time units as large as possible in the output, allowing the encoder/muxer (for mencoder g2) to be as efficient as possible without the user having to manually override time units. > > > > My recommendation is for pts to always be in units of rate_d/rate_m. > > > > > > it _IS_ > > ok, it isn't... i was wrong :) :) Rich From dalias at aerifal.cx Mon Dec 29 22:07:19 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Mon, 29 Dec 2003 16:07:19 -0500 Subject: [MPlayer-G2-dev] Recommendations for DEMUXER layer In-Reply-To: <20031229200405.DC11020793@mail.mplayerhq.hu> References: <20031229172653.GO257@brightrain.aerifal.cx> <20031229200405.DC11020793@mail.mplayerhq.hu> Message-ID: <20031229210719.GW257@brightrain.aerifal.cx> On Mon, Dec 29, 2003 at 09:04:05PM +0100, Arpi wrote: > Hi, > > > > agree, but you're wrong. > > > AVI demuxer (we're talking about g1, as g2 avi has no seeking yet) > > > does seek to frame boundaries, using packet size of nBlockAlign. > > > although for some codec/encoders, it's set to 1, so it can seek to > > > any position. most common case is cbr mp3, where it used to be 1. > > > > My idea was for the demuxer to always seek only to the beginning of a > > chunk -- or are encoded audio frames sometimes split across chunks?! > > :( > > its possible. > also possible (do you want sample files?) tat the whole audio of > a whole film is stored in a _single_, ~100mb chunk. what a mess! OK, thanks for making it that blatently clear that my idea sux. :) > but it's quite usual that 10 sec (or even more) of audio is stored > together in a single chunk, containing many (100+) frames of mp3 or ac3. > it's lame to burst-decode it any time you seek in the file... I was thinking of burst-frame, not necessarily burst-decode. But that's lame too if there can be so much... > lets forget about avi chunks, its' an internal problem of the avi > demuxer. the avi header contains the audio block size, which is the > elementary size of independent, aligned audio blocks. you want those > blocks, not the raw chunks. believe me... OK, fair enough. > > > anywya the pts is still exact, as pts is calculated by samplerate > > > (drRate/dwScale) multiplied by block (nBlockAlign size!) number. 
> > > so, for AVI files this is not an issue. anywya there may be formats > > > where it can be. > > > > Well, Suppose you want to seek to pts X in a file, and you do so by > > this method. But, the resulting byte position happens to be 10 bytes > > after the start of the audio frame. So you lose this whole frame, and > > begin framing/decoding at the next one, which is maybe 1000 bytes > > later. This seems bad for perfect a/v sync. IMO it would be better for > > the demuxer to seek to a point where it knows valid frames begin (if > > this is always possible) and let the framer pick the exact frame to > > start using. > > go and rtfs g1's avi demuxer > it does: > 1. find vidoe frame we want to seek to > 2. find nearest video keyframe > 3. find audio block boundary right at or behind the keyframe selected at 2. > 4. find the chunk containing that audio block > 5. if position from 4. is bellow pos from 2., then find how many video > frames you will read from stream before the keyframe, set skip_video_frames > 6. skip N audio blocks from the audio chunk, to get to the position > selected at 3. > 7. calculate teh time difference between start of that audio block and > the keyframe, put the value to audio_delay to compensate delay. > > yes its not easy, this is why i spent so much time on that mess. I did RTFS it... :( That's why I was hoping we might be able to simplify the design for G2. Rich From ivan at cacad.com Tue Dec 30 20:06:55 2003 From: ivan at cacad.com (Ivan Kalvachev) Date: Tue, 30 Dec 2003 21:06:55 +0200 (EET) Subject: [MPlayer-G2-dev] Recommendations for DEMUXER layer In-Reply-To: <20031229210719.GW257@brightrain.aerifal.cx> References: <20031229172653.GO257@brightrain.aerifal.cx><20031229200405.DC11020793@mail.mplayerhq.hu> <20031229210719.GW257@brightrain.aerifal.cx> Message-ID: <1324.212.116.154.211.1072811215.squirrel@mail.cacad.com> Just one quick question. Here you talk about PTS - presentation time stamps. Are you sure that they are not DTS- decoder time stamps? Or maybe we need both? They differs a little. Ivan Kalvachev iive D Richard Felker III said: > On Mon, Dec 29, 2003 at 09:04:05PM +0100, Arpi wrote: >> Hi, >> >> > > agree, but you're wrong. >> > > AVI demuxer (we're talking about g1, as g2 avi has no seeking yet) >> > > does seek to frame boundaries, using packet size of nBlockAlign. >> > > although for some codec/encoders, it's set to 1, so it can seek to >> > > any position. most common case is cbr mp3, where it used to be 1. >> > >> > My idea was for the demuxer to always seek only to the beginning of a >> > chunk -- or are encoded audio frames sometimes split across chunks?! >> > :( >> >> its possible. >> also possible (do you want sample files?) tat the whole audio of >> a whole film is stored in a _single_, ~100mb chunk. what a mess! > > OK, thanks for making it that blatently clear that my idea sux. :) > >> but it's quite usual that 10 sec (or even more) of audio is stored >> together in a single chunk, containing many (100+) frames of mp3 or ac3. >> it's lame to burst-decode it any time you seek in the file... > > I was thinking of burst-frame, not necessarily burst-decode. But > that's lame too if there can be so much... > >> lets forget about avi chunks, its' an internal problem of the avi >> demuxer. the avi header contains the audio block size, which is the >> elementary size of independent, aligned audio blocks. you want those >> blocks, not the raw chunks. believe me... > > OK, fair enough. 
> >> > > anywya the pts is still exact, as pts is calculated by samplerate >> > > (drRate/dwScale) multiplied by block (nBlockAlign size!) number. >> > > so, for AVI files this is not an issue. anywya there may be formats >> > > where it can be. >> > >> > Well, Suppose you want to seek to pts X in a file, and you do so by >> > this method. But, the resulting byte position happens to be 10 bytes >> > after the start of the audio frame. So you lose this whole frame, and >> > begin framing/decoding at the next one, which is maybe 1000 bytes >> > later. This seems bad for perfect a/v sync. IMO it would be better for >> > the demuxer to seek to a point where it knows valid frames begin (if >> > this is always possible) and let the framer pick the exact frame to >> > start using. >> >> go and rtfs g1's avi demuxer >> it does: >> 1. find vidoe frame we want to seek to >> 2. find nearest video keyframe >> 3. find audio block boundary right at or behind the keyframe selected at >> 2. >> 4. find the chunk containing that audio block >> 5. if position from 4. is bellow pos from 2., then find how many video >> frames you will read from stream before the keyframe, set >> skip_video_frames >> 6. skip N audio blocks from the audio chunk, to get to the position >> selected at 3. >> 7. calculate teh time difference between start of that audio block and >> the keyframe, put the value to audio_delay to compensate delay. >> >> yes its not easy, this is why i spent so much time on that mess. > > I did RTFS it... :( That's why I was hoping we might be able to > simplify the design for G2. > > > Rich > > _______________________________________________ > MPlayer-G2-dev mailing list > MPlayer-G2-dev at mplayerhq.hu > http://mplayerhq.hu/mailman/listinfo/mplayer-g2-dev > From attila at kinali.ch Tue Dec 30 20:35:08 2003 From: attila at kinali.ch (Attila Kinali) Date: Tue, 30 Dec 2003 20:35:08 +0100 Subject: [MPlayer-G2-dev] Recommendations for DEMUXER layer In-Reply-To: <1324.212.116.154.211.1072811215.squirrel@mail.cacad.com> References: <20031229172653.GO257@brightrain.aerifal.cx> <20031229200405.DC11020793@mail.mplayerhq.hu> <20031229210719.GW257@brightrain.aerifal.cx> <1324.212.116.154.211.1072811215.squirrel@mail.cacad.com> Message-ID: <20031230203508.72939d78.attila@kinali.ch> On Tue, 30 Dec 2003 21:06:55 +0200 (EET) "Ivan Kalvachev" wrote: > Just one quick question. > Here you talk about PTS - presentation time stamps. > Are you sure that they are not DTS- decoder time stamps? > Or maybe we need both? While we are at it, could some explain why dts is needed ? Doesn't pts already cover everything ? Attila Kinali -- egp ist vergleichbar mit einem ikea bausatz fuer flugzeugtraeger -- reeler in +kaosu From dalias at aerifal.cx Tue Dec 30 22:24:45 2003 From: dalias at aerifal.cx (D Richard Felker III) Date: Tue, 30 Dec 2003 16:24:45 -0500 Subject: [MPlayer-G2-dev] Recommendations for DEMUXER layer In-Reply-To: <1324.212.116.154.211.1072811215.squirrel@mail.cacad.com> References: <20031229210719.GW257@brightrain.aerifal.cx> <1324.212.116.154.211.1072811215.squirrel@mail.cacad.com> Message-ID: <20031230212445.GH257@brightrain.aerifal.cx> On Tue, Dec 30, 2003 at 09:06:55PM +0200, Ivan Kalvachev wrote: > Just one quick question. > Here you talk about PTS - presentation time stamps. > Are you sure that they are not DTS- decoder time stamps? > Or maybe we need both? > > They differs a little. Yes, they differ. IMO DTS is dumb; a good container should provide PTS. 
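A toy, self-contained illustration of how the two stamps diverge once B
frames reorder the stream; the frame pattern and numbers are invented for
the example and do not come from any real container:

/* PTS vs DTS with B frames.
 * Display order: I0 B1 B2 P3; coded (decode) order: I0 P3 B1 B2.
 * DTS advances in decode order; PTS follows display order, shifted by
 * one frame of reorder delay so no frame is presented before it is
 * decoded. Purely illustrative numbers. */
#include <stdio.h>

int main(void)
{
    const double T = 1001.0 / 24000.0;        /* one frame period, ~41.7 ms */
    const struct { char type; int disp; } coded[4] =
        { {'I', 0}, {'P', 3}, {'B', 1}, {'B', 2} };
    int i;

    for (i = 0; i < 4; i++) {
        double dts = i * T;                   /* when the decoder gets it */
        double pts = (coded[i].disp + 1) * T; /* when it is shown         */
        printf("%c  dts=%.4f  pts=%.4f\n", coded[i].type, dts, pts);
    }
    return 0;
}

For an in-order stream the two columns collapse to the same values, which
is exactly why containers designed before B frames often carry only one of
them.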
If we decide to make a framer layer like Arpi proposed, the framer
will always provide PTS (converting from DTS when the container
sucks).

Rich

From dalias at aerifal.cx  Tue Dec 30 22:25:52 2003
From: dalias at aerifal.cx (D Richard Felker III)
Date: Tue, 30 Dec 2003 16:25:52 -0500
Subject: [MPlayer-G2-dev] Recommendations for DEMUXER layer
In-Reply-To: <20031230203508.72939d78.attila@kinali.ch>
References: <20031229172653.GO257@brightrain.aerifal.cx>
	<20031229200405.DC11020793@mail.mplayerhq.hu>
	<20031229210719.GW257@brightrain.aerifal.cx>
	<1324.212.116.154.211.1072811215.squirrel@mail.cacad.com>
	<20031230203508.72939d78.attila@kinali.ch>
Message-ID: <20031230212552.GI257@brightrain.aerifal.cx>

On Tue, Dec 30, 2003 at 08:35:08PM +0100, Attila Kinali wrote:
> On Tue, 30 Dec 2003 21:06:55 +0200 (EET)
> "Ivan Kalvachev" wrote:
>
> > Just one quick question.
> > Here you talk about PTS - presentation time stamps.
> > Are you sure that they are not DTS- decoder time stamps?
> > Or maybe we need both?
>
> While we are at it, could some explain why dts is needed ?
> Doesn't pts already cover everything ?

Yes. But some stupidly designed containers only provide DTS (usually
because they were only meant to support in-order codecs (where
DTS=PTS) and then some windows idiots hacked them to add B-frame
support).

Rich

From arpi at thot.banki.hu  Wed Dec 31 16:41:53 2003
From: arpi at thot.banki.hu (Arpi)
Date: Wed, 31 Dec 2003 16:41:53 +0100 (CET)
Subject: [MPlayer-G2-dev] framer API, demuxer chaining
Message-ID: <20031231154153.557CF2009F@mail.mplayerhq.hu>

Hi,

I'm introducing a new thing for g2, called Framer.
It's a layer between the demuxer and the VP layer
(decoder/filter/vo/encoder). The goal of this thing is to frame raw
data into frames/packets.

Why?
Some container formats store audio/video data in raw form, i.e. the
container has no information about the frame/packet boundaries, it
just keeps the streams separated. Some examples:
- raw mp3, raw ac3
- raw mpeg-es, raw h263, raw .264
- raw mjpeg
- audio and video streams of mpeg-ps

In these cases, we cannot ask the demuxer to 'give me a video frame',
as it doesn't know how long it is or even where it starts.

Some other containers cleanly separate a/v packets, for example:
- avi
- rm
- asf
- mov
In these, you can read a demuxer packet, and it will contain a single
video or audio frame. (ok, in the case of avi and mov, audio is tricky)

Also, some codecs can handle frames only (most win32 decoders), some
can handle raw data only (libmpeg2), and some can handle both
(libavcodec).

This is not new anyway, in mplayer-g1 it was present, and implemented
in libmpdemux/video.c, in a very messy, ugly way. I thought i could
skip it in g2, by doing raw parsing inside the demuxers and/or the
codecs, but that is also ugly and sometimes redundant code, so the old
idea came back.

Also, when we do streamcopy to a framed container from a raw one (for
example, mpeg-ps to avi) we need to parse the stream anyway, without
actual decoding. So placing the framer into the decoder is a bad idea.

I think it could be implemented as the top object of the new VP layer,
i.e. it would replace the current codecs, and codecs would be just
filters. The other way to go is implementing it as an independent
layer. (codecs as conversion filters is not a new idea; it could help
in cases of raw rgb video / pcm audio, hardware decoders (ac3/mpeg
passthrough), framecopy encoding, transcoding (think of a div3->mpeg4
one), etc.)
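To make the proposal a bit more concrete, here is one possible shape for
such a framer object; all names below are invented for this sketch
(nothing here is an existing g2 interface), and the real thing would have
to fit whatever the new VP layer ends up looking like:

/* Hypothetical framer: sits between a demuxer stream delivering raw
 * bytes and a codec/filter that wants whole frames. Invented names. */
#include <stddef.h>
#include <stdint.h>

typedef struct frame_packet {
    uint8_t *data;      /* exactly one video frame / audio frame        */
    size_t   len;
    double   pts;       /* presentation time, filled in by the framer   */
    int      keyframe;
} frame_packet_t;

typedef struct framer framer_t;
struct framer {
    const char *format;                    /* "mpeg-es", "mp3", "h264"... */
    void *priv;                            /* per-format parser state     */

    /* pull raw data from the demuxer stream, find the next frame
     * boundary, return one complete frame (NULL at end of stream)       */
    frame_packet_t *(*get_frame)(framer_t *f);

    /* throw away any buffered partial frame, e.g. after a demuxer seek  */
    void (*reset)(framer_t *f);
};

For containers that already deliver one frame per packet (avi, asf, mov,
rm), get_frame() would simply pass the demuxer packet through, which is
one argument for putting the framer at the head of the VP chain instead of
special-casing it inside every codec; it also covers the streamcopy case
where no codec is loaded at all.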
Since i don't know the new VP thing yet, i can't decide which one is
better, but i hope it's cleaner to put it into VP.

Yet another thing: demuxer chaining.
(we've also discussed it on irc, i.e. it's messy and automatic chaining
should be solved somehow without ugly hacks)

So i've found the solution today: add yet another stream type to the
demuxer layer, besides audio, video and subtitles: the muxedstream type.

In case of multi-layer muxed streams (like ogg-in-avi, rawdv-in-avi,
mpegps-in-mov, mpegps-in-mpegts etc) the top-level demuxer (in the case
of ogg-in-avi, the avi demuxer) could export the 2nd-level demuxed
stream as a new stream, with type muxedstream, and set the format to the
2nd-level demuxer name (if known, anyway it should be known...).
So, in the case of ogg-in-avi, the avi demuxer opens the stream, and
exports stream#1=video(format=div3) and stream#2=muxed(format=ogg),
then the caller of demux_open (the main app) will see that we have
muxed substreams, and it can create a new virtual stream (type=ds)
from it and call demux_open again with this stream as input.
This way we can avoid calling demux_open and such from demuxers,
and still keep the whole mess simple. Also, if the user doesn't want the
muxed stream (for example, -nosound for ogg-in-avi) we don't need
to work with that stream.

A'rpi / Astral & ESP-team

--
Developer of MPlayer G2, the Movie Framework for all - http://www.MPlayerHQ.hu

From dalias at aerifal.cx  Wed Dec 31 17:11:24 2003
From: dalias at aerifal.cx (D Richard Felker III)
Date: Wed, 31 Dec 2003 11:11:24 -0500
Subject: [MPlayer-G2-dev] framer API, demuxer chaining
In-Reply-To: <20031231154153.557CF2009F@mail.mplayerhq.hu>
References: <20031231154153.557CF2009F@mail.mplayerhq.hu>
Message-ID: <20031231161124.GP257@brightrain.aerifal.cx>

On Wed, Dec 31, 2003 at 04:41:53PM +0100, Arpi wrote:
> I think it could be implemented as the top object of the new VP layer.
> ie it would replace current codecs, and codecs would be just filters.
> The other way to go, is implementing it as independent layer.
> (codecs as conversion filters is not a new idea, it could help in cases
> of raw rgb video / pcm audio, hardware decoders (ac3/mpeg passthrough),
> framecopy encoding, transcoding (think of an div3->mpeg4 one) etc).
>
> Since i don't know the new VP thing yet, i cantd ecide which one is
> the beter, but i hope it's cleaner to put into VP.

What are the choices you're trying to decide between? I can't really
tell from what you wrote, but I think it makes sense for the framer to
be the beginning of the vp chain (and eventually the ap chain) with
codecs as filters. We need to decide on a sane way for storing encoded
images in mpi, and there might be some pts/dts issues to think about
(although IMO the framer should be able to provide pts for every frame
it gives).

> Yet another thing: demuxer chaining.
> (we've also discussed it on irc, ie its messy and automatic chaining
> should be solved somehow without ugly hacks)
>
> So i've found the solution today: add yet another stream type to demuxer
> layer, besides audio, video and subtitles: the muxedstream type.
>
> In case of multi-layer muxed streams (like ogg-in-avi, rawdv-in-avi,
> mpegps-in-mov, mpegps-in-mpegts etc) the top level demuxer (in case
> of ogg-in-avi the avi demuxer) could export the 2nd level demuxed
> steram as a new stream, with type muxedstream, and set format to the
> 2ndlevel demuxer name (if known, anyway it should be known...).
> So, in case of ogg-in-avi, teh avi demuxer opens the stram, and > exports stream#1=video(format=div3) and stream#2=muxed(format=ogg), > the the caller of demux_open (the main app) will see that we have > muxed substreams, and it can create new virtual stream (type=ds) > from it and call demux_open again with this stream as input. > This way we can avoid callong demux_open and such from demuxers, > and still have the whole mess simple. Also, if user doesnt want the > muxed stream (for example, -nosound for ogg-in-avi) we dont need > to work with that stream. Question: If we don't open up all the demuxers at startup, how will runtime stream selection work? Think of something idiotic like ogg-inside-avi with two separate vorbis streams (2 languages) inside the ogg. Bleh... IMO some api should be provided for recursively opening all demuxers. That way players don't have to implement this themselves unless the specifically want to. Rich From joey at nicewarrior.org Wed Dec 31 17:06:52 2003 From: joey at nicewarrior.org (Joey Parrish) Date: Wed, 31 Dec 2003 10:06:52 -0600 Subject: [MPlayer-G2-dev] framer API, demuxer chaining In-Reply-To: <20031231154153.557CF2009F@mail.mplayerhq.hu> References: <20031231154153.557CF2009F@mail.mplayerhq.hu> Message-ID: <20031231160652.GA17610@nicewarrior.org> On Wed, Dec 31, 2003 at 04:41:53PM +0100, Arpi wrote: > In case of multi-layer muxed streams (like ogg-in-avi, rawdv-in-avi, > mpegps-in-mov, mpegps-in-mpegts etc) the top level demuxer (in case > of ogg-in-avi the avi demuxer) could export the 2nd level demuxed > steram as a new stream, with type muxedstream, and set format to the > 2ndlevel demuxer name (if known, anyway it should be known...). > So, in case of ogg-in-avi, teh avi demuxer opens the stram, and > exports stream#1=video(format=div3) and stream#2=muxed(format=ogg), > the the caller of demux_open (the main app) will see that we have > muxed substreams, and it can create new virtual stream (type=ds) > from it and call demux_open again with this stream as input. > This way we can avoid callong demux_open and such from demuxers, > and still have the whole mess simple. Also, if user doesnt want the > muxed stream (for example, -nosound for ogg-in-avi) we dont need > to work with that stream. So correct me if I'm wrong, but would something like this happen? : UI says open this stream, then sees that it has one video stream and one muxed stream. How do we know if that muxed stream contains audio or video without parsing it? Imagine a UI that loads a file, then presents a dialog to say "which streams should I show you?" I would never create such a things, but it's the sort of thing a UI might want to know for other reasons I can't think of. (Maybe intelligent auto-selection of default streams to playback?) So if this is correct and UI sees one video, one muxed, then should the stream layer do the parsing of muxed stream automatically, or should the UI request this muxed stream to be further demuxed before playback begins? Or do I just have a fundamental misunderstanding of the interface between G2 UI and stream layer? :) --Joey -- "I know Kung Fu." 
--Darth Vader From attila at kinali.ch Wed Dec 31 17:11:37 2003 From: attila at kinali.ch (Attila Kinali) Date: Wed, 31 Dec 2003 17:11:37 +0100 Subject: [MPlayer-G2-dev] framer API, demuxer chaining In-Reply-To: <20031231160652.GA17610@nicewarrior.org> References: <20031231154153.557CF2009F@mail.mplayerhq.hu> <20031231160652.GA17610@nicewarrior.org> Message-ID: <20031231171137.521376f0.attila@kinali.ch> On Wed, 31 Dec 2003 10:06:52 -0600 Joey Parrish wrote: > So correct me if I'm wrong, but would something like this happen? : > UI says open this stream, then sees that it has one video stream and one > muxed stream. How do we know if that muxed stream contains audio or > video without parsing it? You are too slow :) We discussed this already on irc: All streams need to be opened. See also Richs mail. Attila Kinali -- egp ist vergleichbar mit einem ikea bausatz fuer flugzeugtraeger -- reeler in +kaosu From nsabbi at tiscali.it Wed Dec 31 19:33:20 2003 From: nsabbi at tiscali.it (Nico) Date: Wed, 31 Dec 2003 19:33:20 +0100 Subject: [MPlayer-G2-dev] framer API, demuxer chaining In-Reply-To: <20031231154153.557CF2009F@mail.mplayerhq.hu> References: <20031231154153.557CF2009F@mail.mplayerhq.hu> Message-ID: <3FF31670.1070504@tiscali.it> Arpi wrote: >Hi, > >I'm introducing a new thing for g2, called Framer. >It's a layer between the demuxer and the VP layer (decoder/filter/vo/encoder). >The goal of this thing to frame raw data to frames/packets. > >So i've found the solution today: add yet another stream type to demuxer >layer, besides audio, video and subtitles: the muxedstream type. > > > 100% agree >In case of multi-layer muxed streams (like ogg-in-avi, rawdv-in-avi, >mpegps-in-mov, mpegps-in-mpegts etc) > eh? maybe you mean pes-in-ps and pes-in-ts >the top level demuxer (in case >of ogg-in-avi the avi demuxer) could export the 2nd level demuxed >steram as a new stream, with type muxedstream, and set format to the >2ndlevel demuxer name (if known, anyway it should be known...). >So, in case of ogg-in-avi, teh avi demuxer opens the stram, and >exports stream#1=video(format=div3) and stream#2=muxed(format=ogg), >the the caller of demux_open (the main app) will see that we have >muxed substreams, and it can create new virtual stream (type=ds) >from it and call demux_open again with this stream as input. >This way we can avoid callong demux_open and such from demuxers, >and still have the whole mess simple. Also, if user doesnt want the >muxed stream (for example, -nosound for ogg-in-avi) we dont need >to work with that stream. > > >A'rpi / Astral & ESP-team > > > This new layer gives the chance to introduce clean fixup functions after seeking (e.g. sync to the next GOP boundary in case of mpeg1/2 (that is really nasty currently) or something more complicated if needed). IMO the concept of "dump" should be rethought: usually when I want to dump video I want the terminal video stream (that currently is inaccessible in g1 in case of a chained demuxer) but I could also want one of the many intermediate chained streams (e.g. for debugging). Nico
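A sketch of how the caller could walk such nested demuxers recursively,
following Arpi's muxedstream idea and Rich's point that every sub-demuxer
has to be opened up front for runtime stream selection to work;
demux_open() is the entry point being discussed above, but its signature
and every other name here are invented for illustration:

/* Recursive opening of chained demuxers (ogg-in-avi, mpegps-in-mov...).
 * All types and signatures are placeholders, not real g2 declarations. */

enum stream_type { ST_VIDEO, ST_AUDIO, ST_SUB, ST_MUXED };

typedef struct sub_stream {
    enum stream_type type;
    const char *format;            /* e.g. "div3", "vorbis", "ogg" */
} sub_stream_t;

typedef struct demuxer {
    int           num_streams;
    sub_stream_t *streams;
} demuxer_t;

typedef struct stream stream_t;    /* byte-stream abstraction (type=ds) */

/* hypothetical helpers standing in for the real g2 calls */
stream_t  *stream_from_substream(demuxer_t *d, int idx);
demuxer_t *demux_open(stream_t *s, const char *format_hint);

/* report every directly playable a/v/sub stream, recursing into any
 * muxedstream entries so that e.g. both vorbis languages inside an
 * ogg-in-avi file become selectable */
void open_all_demuxers(demuxer_t *d,
                       void (*found)(demuxer_t *owner, int stream_idx))
{
    int i;
    for (i = 0; i < d->num_streams; i++) {
        if (d->streams[i].type == ST_MUXED) {
            stream_t  *sub   = stream_from_substream(d, i);
            demuxer_t *child = demux_open(sub, d->streams[i].format);
            if (child)
                open_all_demuxers(child, found);
        } else {
            found(d, i);
        }
    }
}

Whether this loop lives in each player or behind a single library call is
exactly Rich's question; hiding it behind one call would also give a
natural place to expose the intermediate streams Nico wants to be able to
dump.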