[MPlayer-G2-dev] Recommendations for DEMUXER layer

Mon Dec 29 18:26:53 CET 2003

On Mon, Dec 29, 2003 at 09:19:01AM +0100, Arpi wrote:
> Hi,
> 
> > I've been reading some demuxer code to figure out how pts is computed
> > for various demuxers, in order to understand how it needs to be
> > handled by the new video (and eventually new audio!) layer. In the
> > process, I've come up with a few recommendations for changes.
> > 
> > 1. Some demuxers, such as AVI, seek into the middle of an audio chunk
> >    without understanding audio packet boundaries at all (because the
> >    container format sucks too much to distinguish packets), forcing
> >    the decoder to recover. This also means (a) the demuxer will output
> >    a broken packet, which is bad if you just want to remux without
> >    using any codecs, and (b) pts is no longer exact, only approximate,
> >    which IMO sucks really bad.
> 
> agree, but you're wrong.
> AVI demuxer (we're talking about g1, as g2 avi has no seeking yet)
> does seek to frame boundaries, using packet size of nBlockAlign.
> although for some codec/encoders, it's set to 1, so it can seek to
> any position. most common case is cbr mp3, where it used to be 1.

My idea was for the demuxer to always seek only to the beginning of a
chunk -- or are encoded audio frames sometimes split across chunks?!
:(

> anyway the pts is still exact, as pts is calculated by samplerate
> (dwRate/dwScale) multiplied by block (nBlockAlign size!) number.
> so, for AVI files this is not an issue. anyway there may be formats
> where it can be.

Well, suppose you want to seek to pts X in a file, and you do so by
this method. But, the resulting byte position happens to be 10 bytes
after the start of the audio frame. So you lose this whole frame, and
begin framing/decoding at the next one, which is maybe 1000 bytes
later. This seems bad for perfect a/v sync. IMO it would be better for
the demuxer to seek to a point where it knows valid frames begin (if
this is always possible) and let the framer pick the exact frame to
start using.
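
To make that concrete, here is a minimal sketch in C. It assumes
constant-size encoded frames, which real files may not have, and every
name in it (audio_layout, seek_plan, align_to_frame, plan_seek) is made
up for illustration; none of it is existing MPlayer code. The demuxer
seeks to the start of the frame containing the target, and the audio
pipeline drops the leftover samples after decoding.

```c
#include <stdint.h>

/* Hypothetical layout: every encoded audio frame has the same size. */
typedef struct {
    uint32_t frame_bytes;       /* encoded bytes per frame */
    uint32_t samples_per_frame; /* decoded samples per frame */
} audio_layout;

/* Result of planning a seek: where to read, and what to drop afterwards. */
typedef struct {
    uint64_t byte_pos;      /* start of the frame containing the target */
    uint64_t first_sample;  /* exact sample index of that frame's start */
    uint32_t skip_samples;  /* samples to discard after decoding */
} seek_plan;

/* Round a byte offset down to the start of the frame containing it. */
static uint64_t align_to_frame(const audio_layout *a, uint64_t byte_pos)
{
    return byte_pos - byte_pos % a->frame_bytes;
}

/* Plan a seek to an exact sample without ever landing mid-frame. */
static seek_plan plan_seek(const audio_layout *a, uint64_t target_sample)
{
    seek_plan p;
    uint64_t frame = target_sample / a->samples_per_frame;

    p.byte_pos     = frame * a->frame_bytes;
    p.first_sample = frame * a->samples_per_frame;
    p.skip_samples = (uint32_t)(target_sample - p.first_sample);
    return p;
}
```

With frame_bytes = 1000 and samples_per_frame = 1152, a seek to sample
6000 reads from byte 5000 (sample 5760) and skips 240 decoded samples,
so no frame is lost and the sample count stays exact.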

> my "favourite" one is the quicktime mov, where the demuxer cannot
> work without knowing the compression ratio (actually compressed and
> uncompressed frame/block size), as mov audio chunk headers contain
> the uncompressed(!) size of block, while it contains compressed data.
> how dumb they were when they created this mess...

Quicktime is idiotic...

> > My recommendation would be to _always_ seek to a boundary the demuxer
> > understands. That way you have exact pts, and no broken packets for
> > the decoder or muxer to deal with. The demuxer can skip video frames
> > up to the next keyframe (the point you were trying to seek to) and the
> > audio pipeline can skip the audio _after_ decoding it so that it can
> > keep track of the exact number of samples. (Since audio decoding is
> > very fast, this should not impact performance when seeking.)
> 
> the framer api (the one we were talking about yesterday) should solve this.

:))))

> > 2. After seeking, demuxers call resync_audio_stream, which depends on
> >    there being an audio decoder! I found this problem a long time ago
> >    while adding seeking support to mencoder: it was crashing with -oac
> >    copy! It's bad because it makes the demuxer layer dependent on the
> >    codec layer.
> > 
> > My recommendation is to eliminate resync_audio_stream, and instead
> > just report a discontinuity the next time the demuxer stream is read.
> > That way the codec, if one exists, can decide what to do when it reads
> > from the demuxer, without having to use a callback from the demuxer
> > layer to the codec. Also, resync should become unnecessary for most
> > codecs if my above seeking recommendation is implemented.
> 
> i like this idea!
> that resync* shit was always an ugly hack :(

:)))

BTW the same thing works for video: it prevents misdecoding of B frames
after seeking, and it lets us flush the inverse telecine buffers too! :)
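
A rough sketch in C of reporting the discontinuity on the next read.
All names here (sketch_stream, sketch_packet, SKETCH_PKT_DISCONT,
sketch_seek, sketch_read) are hypothetical and not part of any existing
G1/G2 API. The point is only that the seek sets a flag and the reader
of the stream, whether codec or muxer, sees it on the next packet, so
the demuxer never has to call into the codec layer.

```c
#include <stddef.h>

#define SKETCH_PKT_DISCONT 0x1u  /* hypothetical "discontinuity" flag */

/* Hypothetical per-stream demuxer state. */
typedef struct {
    int    pending_discont;  /* set by a seek, cleared by the next read */
    size_t pos;              /* current packet position in the stream */
} sketch_stream;

/* Hypothetical packet handed to whoever reads the stream. */
typedef struct {
    unsigned flags;
    size_t   pos;
} sketch_packet;

/* Seeking only records that a discontinuity happened. */
static void sketch_seek(sketch_stream *ds, size_t pos)
{
    ds->pos = pos;
    ds->pending_discont = 1;
}

/* The first packet read after a seek carries the flag exactly once. */
static sketch_packet sketch_read(sketch_stream *ds)
{
    sketch_packet pkt = { 0, ds->pos };

    if (ds->pending_discont) {
        pkt.flags |= SKETCH_PKT_DISCONT;
        ds->pending_discont = 0;
    }
    ds->pos++;
    return pkt;
}
```

A decoder that sees the flag can reset its state (drop B-frame
references, flush telecine buffers); with -oac copy there is no decoder
and the flag is simply passed along or ignored.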

Rich