[MPlayer-dev-eng] About the future - libao2

Anders Johansson ajh at atri.curtin.edu.au
Thu Nov 8 13:11:40 CET 2001


Hi,

> Hi,
> 
> > > no. my plan:
> > > video codec -> libvo2 core -> libvo2 driver
> 
> > I think that your idea of having a conversion layer before the output
> > layer is very good and that it should be extended to libao2 as well:
> > 
> > audio codec -> libao2 core -> libvo2 driver
> you mean libao2, right? :)

Yes, (bug).

> > What to do with sync (I tried to read the source but gave up (didn't
> > know where to start))? I have a few suggestions though.
> most (all :) libao2 drivers have some buffering (usually in hardware or dma
> buffers in the kernel) which causes some delay. this delay is usually around
> 0.3 sec for 44kHz audio, but in some extreme cases (alsa driver with fixed
> 192k buffer, low audio samplerate, mono) it can be up to many seconds.
> mplayer's A-V sync code requires this value for correct timing.
> currently, it asks the libao2 driver how many bytes are in the buffer, and
> divides by bytes/sec (sh_audio->o_bps) to get the delay in seconds.
> But if you change the playback rate of the soundcard, it will calculate a
> false delay (bytes divided by the input srate). yes, it can be fixed of course ;)
> 
> btw, thinking about libao2 (triggered by this mail :)) i've found that i'm
> silly :)
> get_delay() should return seconds (in float or millisec), so mplayer
> shouldn't care about libao2 resampling and stuff.

I read the sync source yesterday. As I understand it, the sync works like this:

avd = audio video delay   delay between the current video frame and the current audio sample
vit = video input time    number of frames / frame rate (frames per second)
ait = audio input time    number of audio samples read from the file / sample frequency
aod = audio output delay  delay caused by buffering in HW and in the stream buffer

avd(t) = avd(t-1) + 0.1*(vit - ait + aod)

where t is the frame index

Which means that you estimate the audio-video delay using a first-order
AR model (or, if you are an electrical engineer, a first-order active RC
filter with gain = 1.1 :), and if the true delay isn't the same as the
estimated delay, it is compensated for in the sleep before the next
frame is displayed.
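
Something like this is what I mean (just a sketch; the names
update_av_delay and AV_GAIN are mine, not the real mplayer.c ones):

#define AV_GAIN 0.1f        /* the 0.1 coefficient from the formula above */

static float avd = 0.0f;    /* estimated audio-video delay, in seconds */

/* called once per video frame t */
float update_av_delay(float vit, float ait, float aod)
{
    /* avd(t) = avd(t-1) + 0.1 * (vit - ait + aod) */
    avd += AV_GAIN * (vit - ait + aod);
    return avd;             /* used to adjust the sleep before the next frame */
}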

I think you are right: get_delay() should return the delay in ms, as it
would simplify the sync code.
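
For example (just a sketch of the idea with made-up names, not the real
libao2 interface; it returns float seconds here, but ms would work just
as well):

static int out_bps;   /* output samplerate * channels * bytes per sample */

/* delay reported in seconds instead of bytes */
static float get_delay(int buffered_bytes)
{
    /* buffered_bytes: data queued in the driver and kernel/DMA buffers,
     * counted at the *output* rate, so resampling can't skew the result */
    return (float)buffered_bytes / (float)out_bps;
}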


I also had a look at the buffering of audio data, and I have a
suggestion for how it could be made more efficient. As it is now, the
output buffer from the audio decoder and the input buffer for the
sound card have different sizes. Because of this, decoded audio data
that doesn't fit in the audio output buffer has to be memcopied. The
copied data also needs to be taken into account when the sync is
calculated. I suggest that this mechanism be slightly modified:

Estimate the optimum size of the input buffer as N * the audio decoder
output block size (where N is an integer) before playback is started,
and inform libao2. This would make the return value of
audio_out->get_space() constant and thus remove the need for the
memcopy in mplayer.c.

This will not remove the memcopy (it will still be needed inside
libao2), but it will reduce the number of times the data is accessed
in RAM, since the data can be processed at the same time as it is
moved.

This approach will also simplify the sync calculation, since mplayer.c
doesn't have to take the buffer usage into consideration; libao2 keeps
track of it.
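
In code the idea would look something like this (a sketch with invented
names, not a patch):

static int block_size;   /* audio decoder output block size, in bytes */
static int buffer_size;  /* libao2 buffer size, in bytes */
static int buffered;     /* bytes currently queued in the buffer */

/* called once before playback starts */
void ao_init_buffer(int decoder_block_size, int n_blocks)
{
    block_size  = decoder_block_size;
    buffer_size = n_blocks * decoder_block_size;  /* N * block size */
    buffered    = 0;
}

/* always returns a whole number of decoder blocks, so mplayer.c can
 * write a complete block whenever the result is non-zero and never
 * needs a partial-block memcopy on its side */
int get_space(void)
{
    return ((buffer_size - buffered) / block_size) * block_size;
}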

What do you think of this idea?

> the only thing i expect: keep current buffering method, i mean:
> - provide a method which returns how many bytes (or samples?) can be
>   played without blocking (required for singlethreaded design)
>   (currently it's get_space())
See above.

> - provide a method to query current delay (both your buffers (for resampling
>   or other effects) and sound card's buffering)
See above.

> - use the general, extendable control() interface for driver-specific things.
>   it's like ioctl()'s.
Part of my idea is to make mplayer even more transparent to the
details of the HW, and I guess that is sort of the point of having
libao2 at all. I had a look at the control() function, and as far as I
can see, the only things it is used for are volume control and status
information. I guess that is OK, but I don't think it should be
extended beyond that.

I do, however, have a suggestion for how HW-specific stuff could be
accessed: configurable modules in libao2 register one or several
callback functions and strings. This information is passed on to the
GUI, which displays a list of the strings (when a button is pressed,
or whatever). When the user selects from the list, the callback is
executed and a window is opened (or state is toggled, or whatever).
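
Roughly like this (a sketch; the struct and function names are made up,
nothing like this exists in libao2 today):

typedef void (*ao_menu_cb)(void *ctx);

typedef struct ao_menu_entry {
    const char *label;           /* string shown by the GUI */
    ao_menu_cb  callback;        /* run when the user selects the entry */
    void       *ctx;             /* module-private data */
    struct ao_menu_entry *next;
} ao_menu_entry;

static ao_menu_entry *menu_head;

/* a configurable module calls this once at init time */
void ao_register_menu_entry(ao_menu_entry *e)
{
    e->next = menu_head;
    menu_head = e;
}

/* the GUI walks menu_head to build its list, then fires the callback
 * of whatever entry the user picked */
void gui_select(ao_menu_entry *e)
{
    e->callback(e->ctx);
}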

I just got an idea: this type of interface could be extended to any
type of event, so that any configurable module could register itself
to be triggered by any event. Example: the fast forward/backward audio
click removal function could register itself to be executed whenever
-> or <- is pressed, or the EQ could register itself to the sound
control on the remote. Just an idea :).
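
Generalised to events it could look like this (again just a sketch,
all names invented):

typedef void (*ao_event_cb)(void *ctx);

enum ao_event { EV_SEEK_FWD, EV_SEEK_BACK, EV_REMOTE_SOUND, EV_COUNT };

static ao_event_cb event_cb[EV_COUNT];
static void       *event_ctx[EV_COUNT];

/* a module binds its callback to an event code instead of a menu string */
void ao_bind_event(enum ao_event ev, ao_event_cb cb, void *ctx)
{
    event_cb[ev]  = cb;
    event_ctx[ev] = ctx;
}

/* called by the input layer when e.g. -> or <- is pressed */
void ao_fire_event(enum ao_event ev)
{
    if (event_cb[ev])
        event_cb[ev](event_ctx[ev]);
}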


> - thinking about: some codecs (mp3lib, ffmpeg?) can do some work about
>   effects, resampling (1:2, 1:4 for mp3), mono<->stereo.
I don't think we want to use them. Reason: the resampling is done by
zero-padding the data before the IDCT. This is not as efficient as FIR
resampling, since it requires a lot of multiplications and additions
on zeros; a polyphase implementation avoids those. We also don't want
to upsample before the data is filtered (EQs, effects, etc.), to
reduce the number of calculations required.
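
To illustrate the polyphase trick (a rough sketch, not mplayer code):
upsampling by L with an NTAPS-long FIR the naive way means inserting
L-1 zeros between input samples and filtering, i.e. NTAPS
multiplications per output sample, most of them by zero. The polyphase
form only touches the non-zero inputs:

#define L     2     /* upsampling factor (e.g. mp3 1:2) */
#define NTAPS 32    /* FIR length, assumed to be a multiple of L */

/* x: input history with the newest sample at x[0]
 * h: FIR coefficients
 * out: the L output samples produced for one new input sample */
void polyphase_up(const float *x, const float *h, float *out)
{
    int p, k;
    for (p = 0; p < L; p++) {              /* one sub-filter per phase */
        float acc = 0.0f;
        for (k = 0; k < NTAPS / L; k++)
            acc += h[p + k * L] * x[k];    /* skips the zero-valued taps */
        out[p] = acc;
    }
}

That is NTAPS/L multiplications per output sample instead of NTAPS.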

> A'rpi / Astral & ESP-team

//Anders


