[MPlayer-dev-eng] About the future - libao2

Arpi arpi at thot.banki.hu
Thu Nov 8 14:18:40 CET 2001


Hi,

> > btw, thinking about libao2 (triggered by this mail :)) i've found that i'm
> > silly :)
> > get-delay() should return seconds (in float or millisec), so mplayer
> > shouldn't care about libao2 resampling and stuff.
> 
> I read the sync source yesterday. As I understand it the sync works like this:
> 
> avd = audio video delay   delay between current video frame and current audio sample

> vit = video input time    number of frames * frames/second
time position in video stream (frameno/fps)

> ait = audio input time    number of audio samples read from file * sample frequency 
time position in audio stream (sampleno/samplerate)

> aod = audio output delay  delay caused by buffering in HW and in stream buffer
yes

> avd(t) = avd(t-1) + 0.1*(vit - ait + aod)
                      ^^^^^^^^^^^^^^^^^^^^^ the value of this term is also
limited by -mc to avoid big jumps

> where t is the frame index
> 
> Which means that you estimate the audio video delay using a 1st order
> AR model (or if you are an electrical engineer a first order active rc
> filter with gain = 1.1 :), and if the true delay isn't the same as the
> estimated delay then it is compensated for in the sleep before
> displaying next frame.
> 
> I think you are right get-delay() should return the delay in ms it
> would simplify sync. 
yes. the above thing is timer correction based on timestamp differences.
the 0.1* factor and the limit are required because low bitrate and mpeg files
usually have no exact timestamps, but averaging their values is good enough
to detect and compensate for desync.
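
roughly, the correction looks like this (just a sketch with made-up variable
names and a hypothetical -mc default, not the exact mplayer.c code):

/* timer correction with clamping, as described above */
float a_pts;           /* ait: audio position = samples played / samplerate */
float v_pts;           /* vit: video position = frame number / fps          */
float audio_delay;     /* aod: seconds still buffered by the audio driver   */
float avd = 0.0;       /* running a/v delay estimate                        */
float max_pts_correction = 0.01;  /* the -mc limit (default is made up)     */

void correct_timer(void)
{
    float x = (v_pts - a_pts + audio_delay) * 0.1f;       /* averaged error */
    if (x >  max_pts_correction) x =  max_pts_correction; /* clamp, so bad  */
    if (x < -max_pts_correction) x = -max_pts_correction; /* timestamps...  */
    avd += x;              /* ...can't cause big jumps, only slow drift fix */
}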

the audio layer (and libvo2 too) should implement the get_delay() function to
return how many seconds are buffered in the audio device/driver.
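
e.g. a driver that knows how many bytes are still queued could implement it
roughly like this (sketch only; buffered_bytes() is a made-up helper standing
in for whatever ioctl/query the driver really has, and the globals are just
assumed to exist):

extern int ao_samplerate, ao_channels;  /* assumed output format globals    */
extern int buffered_bytes(void);        /* made-up: bytes not yet played    */

/* get_delay(): how many seconds of audio are buffered in device/driver */
static float get_delay(void)
{
    int bps = ao_samplerate * ao_channels * 2;  /* bytes per second,        */
                                                /* assuming 16bit samples   */
    return (float)buffered_bytes() / (float)bps;
}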

> I also had a look at the buffering of audio data, and I have a
> suggestion of how it could be made more efficient. As it is now the
> output buffer from the decoding of the audio data and the input buffer
> for the sound card has different sizes. Because of this there is a
> memcopy of decoded audio data that doesn't fit in the audio output
> buffer. The data copied here also needs to be taken into account when
> sync is calculated. I suggest that this mechanism should be slightly
> modified:
> 
> Estimate the optimum size of the input buffer to N * audio decoder
> output block size (where N is an integer) before playing is started,
> and inform libao2. This would make the return value of
> audio_out->get_space(); constant and thus remove the need for memcopy
> in mplayer.c.

not so simple :(
codecs are very different. a few codecs can return exactly the requested
amount of bytes/samples, but most codecs are unpredictable, especially the
windows ones.

and get_space() should return only the free buffer space, which depends on
audio playback progress, so it can't be constant.
if the audio buffer is full, it should return 0 to avoid blocking.
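
so the single-threaded loop in mplayer.c stays non-blocking, roughly like this
(simplified sketch; decode_audio() and MAX_OUTBURST are just placeholder names
here):

/* inside the a/v loop, using only non-blocking audio calls */
unsigned char buffer[MAX_OUTBURST];        /* decode output buffer           */
int space = audio_out->get_space();        /* free bytes in device buffer    */
if (space > 0) {
    int len = decode_audio(buffer, space); /* decode at most 'space' bytes   */
    audio_out->play(buffer, len, 0);       /* never blocks since len<=space  */
}
/* space == 0 -> device buffer is full, skip audio this time, no blocking */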

> This will not remove the memcopy (it will still be needed inside
> libao2) but it will reduce the amount of times the data is accessed in
> ram since the data can be processed at the same time as it is moved.
> 
> This approach will also make the sync calculation more simple since
> mplayer.c doesn't have to take the buffer usage into consideration,
> since libao2 is keeping track of it.
i don't understand this.

> What do you think of this idea?
> 
> > the only thing i expect: keep current buffering method, i mean:
> > - provide a method which returns how many bytes (or samples?) can be
> >   played without blocking (required for singlethreaded design)
> >   (currently it's get-space())
> See above.
> 
> > - provide a method to query current delay (both your buffers (for resampling
> >   or other effects) and sound card's buffering)
> See above.
> 
> > - use the general, extendable control() interface for driver-specific things.
> >   it's like ioctl()'s.
> Part of my idea is to make mplayer even more transparent to the
> details of the HW, and I guess that is sort of the point with having
> libao2 at all. I had a look at the control() function an as I can see
> it the only thing it is used for is volume control and status
> information, and I guess that is ok, but I don't think it should be
> extended beyond that. 
control() is not used yet. there are many things it could be used for, but
nothing calls it at the moment.
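
the idea is just an ioctl()-like entry point, so driver-specific things could
later be reached e.g. like this (purely hypothetical: the command name, return
code and struct don't exist yet):

/* hypothetical use of the generic control() call to set mixer volume */
struct { int left, right; } vol = { 50, 50 };
if (audio_out->control(AOCONTROL_SET_VOLUME, &vol) != CONTROL_OK) {
    /* the driver doesn't implement this control -> ignore or fall back */
}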

> I have however a suggestion of how HW specific stuff could be
> accessed: modules in libao2 that can be configured registers one or
> several calback functions and strings. This information is passed on
> to the GUI which displays a list (when pressing a button or whatever)
> of the strings. When the user selects from the list, the callback is
> executed and a window is opened (, state is toggled or whatever).
> 
> I just got an idea: This type of interface could be extended to any
> type of event, so that any type of configurable module could register
> it self to be triggered to any event. Example: The fast forward and
> backward audio click removal function could register it self to be
> executed wenever -> or <- is pressed, or the EQ to the sound control
> on the remote. Just an idea :).
ehh.

> > - thinking about: some codecs (mp3lib, ffmpeg?) can do some work about
> >   effects, resampling (1:2, 1:4 for mp3), mono<->stereo.
> I don't think we want to use them, reason: the resampling is made by
> zero padding the data before the IDCT. This is not as efficient as FIR
> resampling since it requires a lot of multiplications and additions on
> 0. The polyphase implementation avoids this. We also don't want to
> upsample before the data is filtered (EQs, effects, etc), to reduce
> number of calculations required.

hmm. i mean the natural side-effects of codecs.
the mpeg audio decoder has dct->pcm converters built for 1:1, 1:2 and 1:4
resampling, and if you select mono it does fewer calculations because it
mixes the channels in dct space, and so on.

i didn't mean PCM data resampling in codecs.


A'rpi / Astral & ESP-team

--
mailto:arpi at thot.banki.hu
http://esp-team.scene.hu


