[MPlayer-G2-dev] Re: Limitations in vo2 api :(

Sat Dec 20 13:31:42 CET 2003

    Hi, D Richard Felker III!

Sometime (on Saturday, December 20 at 12:58) I've received something...
>On Sat, Dec 20, 2003 at 12:26:23PM +0200, Andriy N. Gritsenko wrote:
>> >IMO the vp/vo/muxer[/demuxer?] integration is only appropriate if it
>> >can be done at a level where the parts share a common api, but don't
>> >in any way depend on one another -- so they can still all be used
>> >independently.
>> 
>>     Fully agree with that.

>I don't think you were agreeing with what I said, but with something
>different...

    That may be. First of all, my native language is very different from
English, so there may be (and it seems there are) some misunderstandings.
Also, each person has their own way of thinking. :)  But I hope we aren't
too far apart anyway. At least we can talk everything through and find
the best solution. :)

>> There must be common (stream-independent) part
>> of the API - that part will be used by any layer and it must contain such
>> things as stream/codec type, PTS, duration, some frame/buffer pointer,
>> control and config proc pointers, and may be some others. Layer-specific
>> data (such as audio, video, subs, menu, etc. specific) must be specified
>> only in that layer.

>Eh? Are you saying to use a common transport/pipeline api for audio,
>video, and subs??? This is nonsense!! They have _very_ different
>requirements.

    Hmm, it seems I've said something unclear again, sorry. I didn't
mean a common transport but only a common API, i.e. the muxer must not
know about video/audio/subs/etc. - the encoder program must pass some
common API structure to it, and so on. ;)  I.e. I meant the muxer has no
right to know about _any_ layer and must be a fully independent layer.
And I think that's possible for the demuxer too. I haven't dug into the
demuxer, but I don't see any problem with that - the demuxer just splits
the input file/stream into streams, gets the stream type and PTS (if the
container has one), and passes all streams to the application.
Distinguishing between stream types isn't the demuxer's job but the
application's, and the application will pass each stream to the
appropriate decoder. The decoder must be layer-specific, so it has to
have a layer-specific API and be the first section of some chunk (audio,
video, or some other). The last section of that chunk will be either an
ao/vo/sub driver or a codec to encode with. Every section of a chunk must
use the same API to talk to its neighbours. That's the only place to be
layer-specific.

    Let's illustrate it in a diagram:

             ,---------> vd ---------> vf ---------> vo
            /  [common]      [video]       [video]
file --> demuxer
            \ 
             `---------> ad ---------> af ---------> ao
               [common]      [audio]       [audio]

             ,---------> vd -------> vf -------> ovc -. [common]
            /  [common]     [video]     [video]        \
file --> demuxer                                       muxer --> file
            \                                          /
             `---------> ad -------> af -------> oac -'
               [common]     [audio]     [audio]      [common]

[video], [audio], and [common] are the API structures of the video
layer, the audio layer, and the common part respectively.
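
    To make the [common] part a bit more concrete, here is a rough C
sketch of what I mean. All names in it (common_packet_t, app_router_t,
app_route_packet, muxer_write, counting_decoder) are made up for this
mail and are not any existing G2 API; the fields are only the ones
listed above (stream/codec type, PTS, duration, buffer pointer). It
shows the application, not the demuxer, choosing the decoder, and a
muxer that reads nothing but the common fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the actual G2 API. */

typedef enum {
    STREAM_VIDEO, STREAM_AUDIO, STREAM_SUB, STREAM_TYPE_COUNT
} stream_type_t;

/* Common (layer-independent) packet: all the demuxer and muxer see. */
typedef struct {
    stream_type_t type;     /* stream type reported by the demuxer */
    uint32_t      codec;    /* codec tag, e.g. a fourcc            */
    double        pts;      /* presentation time in seconds        */
    double        duration; /* duration in seconds, 0 if unknown   */
    const void   *data;     /* opaque payload, never parsed here   */
    size_t        len;      /* payload size in bytes               */
} common_packet_t;

/* A decoder's input takes only the common packet; everything behind
 * it speaks the layer-specific (video or audio) API. */
typedef int (*decoder_input_fn)(void *ctx, const common_packet_t *pkt);

typedef struct {
    decoder_input_fn input[STREAM_TYPE_COUNT]; /* one per stream type */
    void            *ctx[STREAM_TYPE_COUNT];
} app_router_t;

/* The application decides where each stream goes. Returns the
 * decoder's result, or -1 if no decoder handles this stream type. */
int app_route_packet(const app_router_t *r, const common_packet_t *pkt)
{
    assert(r != NULL && pkt != NULL);
    if ((unsigned)pkt->type >= (unsigned)STREAM_TYPE_COUNT ||
        r->input[pkt->type] == NULL)
        return -1;
    return r->input[pkt->type](r->ctx[pkt->type], pkt);
}

/* Stand-in decoder for illustration: counts the packets it receives. */
int counting_decoder(void *ctx, const common_packet_t *pkt)
{
    (void)pkt;
    ++*(int *)ctx;
    return 0;
}

/* Toy muxer: it only keeps statistics, but the point is that it uses
 * the common fields alone, so any number of streams can feed it. */
typedef struct { size_t packets; size_t bytes; double last_pts; } muxer_t;

int muxer_write(muxer_t *m, const common_packet_t *pkt)
{
    assert(m != NULL && pkt != NULL);
    m->packets++;
    m->bytes += pkt->len;
    m->last_pts = pkt->pts;
    return 0;
}
```

    With something like this, adding a second audio stream or a
subtitle stream to the output file changes nothing in the muxer: the
application just passes one more stream of common packets to
muxer_write().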

    I hope I've now said it clearly enough to be understood. :)

> That's not to say you can't interconnect them; you could
>have a module that serves both as a video and audio node. The most
>common case for this would be a 'visualization' effect for music.
>Another more esoteric but fun example is:

>vd ----> vf_split ----------------------------------------> vo
>           \     [video layer]
>            `---> vf/sf_ocr
>                    \            [subtitle layer]
>                     `--> sf/af_speechsynth
>                             \
>                              \           [audio layer]
>                               `--> af_merge -------------> ao
>                                      /`
>ad ----------------------------------'

>While this looks nice in my pretty ascii diagram, the truth of the
>matter is that audio, video, and text are nothing alike, and can't be
>passed around via the same interfaces. For example, video requires
>ultra-efficient buffering to avoid wasted copies, and comes in
>discrete frames. Text/subtitles are very small and can be passed
>around in any naive way you want. Audio should be treated as a
>continuous stream, with automatic buffering between filters.

    Yes, I think the same. What I said is just that the API from vd to
vo must be unified (let's say, a video stream API), but that API must
stay inside the video layer. The application must call the video layer
API to "connect" everything from vd to vo in some chunk, and that's all.
The application will know only the common data structures of the chunk's
members. The same goes for audio chunk(s) and any others. Common data
structures mean that application developers need to learn only the
common API structure and the API calls for the layers, and that's it.
Let's make it simpler. ;)
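
    A small sketch of that "connection" call, again in C and again
with invented names (node_t, chain_connect, chain_length,
video_chunk_demo), so take it as an illustration only. It assumes
every section starts with a common header, and that is the only thing
the application touches; the layer-specific API stays hidden behind it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch; these are not the real vd/vf/vo interfaces. */

typedef enum { LAYER_VIDEO, LAYER_AUDIO } layer_t;

/* Common header that every section (vd, vf, vo, ad, af, ao, ...)
 * starts with. The application never looks past it. */
typedef struct node {
    layer_t      layer; /* which layer's private API this node speaks */
    const char  *name;  /* e.g. "vd", "vf", "vo"                      */
    struct node *next;  /* downstream section, set by chain_connect() */
} node_t;

/* Link nodes[0] -> nodes[1] -> ... -> nodes[n-1] into one chunk.
 * Mixing layers is refused, because all sections of a chunk must share
 * one layer-specific API. Nothing is modified on failure.
 * Returns 0 on success, -1 on error. */
int chain_connect(node_t **nodes, size_t n)
{
    size_t i;

    if (nodes == NULL || n == 0)
        return -1;
    for (i = 0; i < n; i++)
        if (nodes[i] == NULL || nodes[i]->layer != nodes[0]->layer)
            return -1;
    for (i = 0; i + 1 < n; i++)
        nodes[i]->next = nodes[i + 1];
    nodes[n - 1]->next = NULL;
    return 0;
}

/* Number of sections reachable from head, head included. */
size_t chain_length(const node_t *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}

/* Usage: build vd -> vf -> vo and return the chunk length. */
size_t video_chunk_demo(void)
{
    node_t vd = { LAYER_VIDEO, "vd", NULL };
    node_t vf = { LAYER_VIDEO, "vf", NULL };
    node_t vo = { LAYER_VIDEO, "vo", NULL };
    node_t *chunk[] = { &vd, &vf, &vo };
    int rc = chain_connect(chunk, 3);

    assert(rc == 0);
    (void)rc;
    return chain_length(&vd);
}
```

    An audio chunk would be connected with exactly the same call, so
the application needs no knowledge of what the sections pass to each
other inside the chunk.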

>>     This way we could manipulate "connections" from streams to muxer in
>> some generic way and be free to have any number of audios/videos/subs in
>> resulting file.

>The idea is appealing, but I don't think it can be done... If you have
>a suggestion that's not a horrible hack, please make it.

    I have that idea in mind and I've tried to explain it above. If
something isn't clear yet, feel free to ask. I'll be glad to explain it.

>>     Also when we have some common part then wrapper may use only that
>> common part and it'll be as simple as possible and player/encoder don't
>> need to know layer internals and will be simpler too. This includes your
>> example above about muxer/demuxer without any codecs too. :)
>>     Also when we have common part in some common API then we'll document
>> that common part only once and specific parts also once so it'll reduce
>> API documentation too. :)

>And now we can be slow like Xine!! :))))

    I'm not sure about that. :)

    With best wishes.
    Andriy.