[FFmpeg-devel] Integrating the mod engine into FFmpeg - what is the best design approach?
Stefano Sabatini
stefano.sabatini-lala
Thu Aug 5 02:54:07 CEST 2010
On date Wednesday 2010-08-04 15:36:25 +0200, Sebastian Vater encoded:
[...]
> Yes, we can do this later. Anyway, in the meanwhile I made some thoughts
> on a more precise generic integration plan for lavseq.
> As you currently see, I have a directory lavseq containing:
> avsequencer.h (the connection to rest of FFmpeg).
>
> To give an overview of the AVSequencer:
> At root, we have avsequencer.h, i.e. the AVSequencerContext, which is the
> only structure linking to the rest of FFmpeg.
>
> root:
> AVSequencerContext contains a list of modules (module.h and module
> handling is implemented in module.c), the playback handler and a list of
> available mixing engines.
>
> depth 1:
> AVSequencerModule contains songs, instruments, keyboard definitions,
> arpeggio definitions and envelope structures.
>
> depth 2:
> AVSequencerSong contains tracks and an order list which references the
> track numbers being played for each channel (since the sequencer is
> internally track based instead of pattern based, to allow different speeds).
>
> AVSequencerInstrument contains samples, which can be assigned using the
> keyboard definition. For example, I tell the instrument to use sample
> number 1 for C-5 but number 2 instead for C-6.
> Instruments also determine how envelopes are used (you can assign them
> to vibrato, tremolo, volume handling, etc.).
>
> AVSequencerEnvelope contains the actual envelope data and also its
> properties like loop points.
>
> AVSequencerKeyboard contains the octave/note -> sample mapping for all
> notes from C-0 to B-9 which are 120 entries (10 octaves * 12 notes per
> octave).
>
> AVSequencerArpeggio is mostly like AVSequencerEnvelope with the
> difference you can specify a custom arpeggio layout and the structure is
> designed for that.
>
> depth 3:
> AVSequencerSample contains the sample loop points, auto vibrato
> envelopes and also the PCM data (the PCM data should later be obtained
> by the lavc, so you can also directly use ogg/mp3/flac/wav/etc.). It
> also contains a reference to AVSequencerSynth if it is a programmable
> synth sound.
>
> depth 4:
> AVSequencerSynth contains a list of "machine code" instructions for
> programming the synth sound "DSP", a symbol table for human-readability
> and properties like initial variables (16 general purpose registers).
>
> Hierarchy overview:
> SequencerContext (avsequencer.[hc])
>   Module (module.[hc])
>     Song (song.[hc])
>       Track (track.[hc])
>         TrackData
>           TrackDataEffect
>       OrderList (order.[hc])
>         OrderData
>     Instrument (instr.[hc])
>       Sample (sample.[hc])
>         SynthSound (synth.[hc])
>     Envelope (instr.[hc])
>     Keyboard (instr.[hc])
>     Arpeggio (instr.[hc])
>   Mixer (allmixers.c, mixer.[hc])
>     Null mixer (null_mix.[hc])
>     Low quality PCM mixer (lq_mix.[hc])
>     High quality PCM mixer (hq_mix.[hc])
>     FUTURE: OPL2/3 (AdLib/etc.) FM synthesizer
>             SID chip FM (as found in C64) synthesizer
>             Floating point mixers
>   Player (player.[hc])
OK that's a nice description of the whole BSS design.
Sebastian is currently working on this git branch:
http://github.com/BastyCDGS/ffmpeg-soc.git
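To make the keyboard part of the description concrete, here is a minimal
sketch of the octave/note -> sample mapping (10 octaves * 12 notes = 120
entries, C-0 to B-9). All names below (SeqKeyboard, seq_keyboard_*) are
made up for illustration and are not taken from the lavseq headers; the
real AVSequencerKeyboard may well be laid out differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEQ_OCTAVES          10
#define SEQ_NOTES_PER_OCTAVE 12
#define SEQ_KEYBOARD_ENTRIES (SEQ_OCTAVES * SEQ_NOTES_PER_OCTAVE) /* 120 */

/* Hypothetical stand-in for AVSequencerKeyboard: one sample number per note. */
typedef struct SeqKeyboard {
    uint16_t sample[SEQ_KEYBOARD_ENTRIES];
} SeqKeyboard;

/* Map octave (0..9) and note (0 = C .. 11 = B) to a table index.
 * Returns -1 if either value is out of range. */
static int seq_keyboard_index(int octave, int note)
{
    if (octave < 0 || octave >= SEQ_OCTAVES ||
        note   < 0 || note   >= SEQ_NOTES_PER_OCTAVE)
        return -1;
    return octave * SEQ_NOTES_PER_OCTAVE + note;
}

/* Assign a sample number to one note. Returns 0 on success, -1 on bad input. */
static int seq_keyboard_set(SeqKeyboard *kb, int octave, int note,
                            uint16_t sample)
{
    int idx = seq_keyboard_index(octave, note);

    assert(kb != NULL);
    if (idx < 0)
        return -1;
    kb->sample[idx] = sample;
    return 0;
}

/* Look up the sample number for a note, or -1 if the note is out of range. */
static int seq_keyboard_lookup(const SeqKeyboard *kb, int octave, int note)
{
    int idx = seq_keyboard_index(octave, note);

    assert(kb != NULL);
    return idx < 0 ? -1 : kb->sample[idx];
}
```

With this layout, Sebastian's example (sample 1 for C-5, sample 2 for C-6)
is seq_keyboard_set(&kb, 5, 0, 1) followed by seq_keyboard_set(&kb, 6, 0, 2).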
> Open discussion points are:
> 1. Best way of integration into rest of FFmpeg
I'm summarizing some of the designs which have already been proposed;
please correct me if some information is missing or incorrect.
1)
The MOD decoder does just one thing: decode an AVPacket to a BSS. It
does not know anything about the player (it doesn't even know _if_ it
will be played, converted to another format, or fed to visualization code).
Libavsequencer does just one thing: transform a BSS into PCM audio.
It knows nothing about file formats (it doesn't care or know if the BSS
was made from a MOD file or recorded from a MIDI keyboard).
That's why we insist on starting with the implementation of MOD -> XM
conversion: it is much simpler than MOD -> PCM conversion, and it doesn't
need an implementation of libavsequencer.
                          mod file - metadata               BSS +
                                                            sequencer SAMPLES
MOD file --> MOD demuxer --------------------> MOD decoder ------------------> application
The advantages of this approach are as follows:
- Allows for conversion from a format with more features to one with
fewer, without doing any mixing or sampling
- Makes each file format very modular (just reading the bitstream and
filling up the BSS)
- Better integration with the way FFmpeg works ATM
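As a rough illustration of "just reading the bitstream and filling up
the BSS", here is a minimal sketch of a decoder for the header of a
31-sample "M.K." MOD file. The SketchBSS struct and the function name are
hypothetical and far smaller than the real BSS; the offsets (title at 0,
song length at 950, order table at 952, signature at 1080) are those of
the usual ProTracker layout. Note that nothing here knows about mixing or
playback.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MOD_TITLE_LEN    20
#define MOD_LEN_OFFSET   950   /* number of orders actually used */
#define MOD_ORDER_OFFSET 952   /* 128-byte order table */
#define MOD_ORDER_LEN    128
#define MOD_SIG_OFFSET   1080  /* "M.K." for 4-channel, 31-sample files */
#define MOD_HEADER_SIZE  1084

/* Hypothetical, heavily reduced BSS: only what this sketch fills in. */
typedef struct SketchBSS {
    char    title[MOD_TITLE_LEN + 1];
    int     nb_orders;
    uint8_t orders[MOD_ORDER_LEN];
} SketchBSS;

/* Decode one packet holding a MOD header into a BSS.
 * Returns 0 on success, -1 on truncated or malformed input. */
static int sketch_mod_decode(SketchBSS *bss, const uint8_t *data, size_t size)
{
    int len;

    assert(bss && data); /* passing NULL is a caller bug, not bad input */
    if (size < MOD_HEADER_SIZE)
        return -1;
    if (memcmp(data + MOD_SIG_OFFSET, "M.K.", 4) != 0)
        return -1;

    len = data[MOD_LEN_OFFSET];
    if (len < 1 || len > MOD_ORDER_LEN)
        return -1;

    /* The title is not guaranteed to be NUL-terminated in the file. */
    memcpy(bss->title, data, MOD_TITLE_LEN);
    bss->title[MOD_TITLE_LEN] = '\0';

    bss->nb_orders = len;
    memset(bss->orders, 0, sizeof(bss->orders));
    memcpy(bss->orders, data + MOD_ORDER_OFFSET, (size_t)len);
    return 0;
}
```

An XM writer could then consume the same SketchBSS without any sequencer
code being involved, which is the point of starting with MOD -> XM.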
2)
The demuxer decodes the file to a BSS and outputs it in an
AVPacket. It would then define a CODEC_ID_SEQUENCER, and the decoder
would be just a wrapper around libavsequencer to do the BSS -> PCM
conversion.
The advantage of this approach is that the demuxer/decoder split
does not make much sense for these formats, so this avoids the
artificial distinction. Moreover, it makes a clean distinction between
transcoding from one MOD format to another (with -acodec copy) and
decoding it to PCM. The disadvantage is that API-wise it's less clear
how external applications get at the BSS data (by reading the AVPacket
payload). Besides, all the bit-reading API is part of lavc.
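To show what "reading the AVPacket payload" would mean for an
application, here is a minimal sketch of this second design. Only the name
CODEC_ID_SEQUENCER comes from the proposal; the enum, SketchPacket,
SketchBSS and both functions are hypothetical stand-ins, and the "decoder"
just writes silence (like a null mixer) where a real one would call into
libavsequencer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical codec IDs for this sketch only. */
enum SketchCodecID {
    SKETCH_CODEC_ID_NONE,
    SKETCH_CODEC_ID_SEQUENCER
};

/* Hypothetical, reduced BSS. */
typedef struct SketchBSS {
    int nb_channels;
    int nb_orders;
} SketchBSS;

/* Hypothetical stand-in for AVPacket: the payload is an opaque byte blob. */
typedef struct SketchPacket {
    enum SketchCodecID codec_id;
    const uint8_t     *data;
    size_t             size;
} SketchPacket;

/* What an external application has to do to reach the BSS: check the codec
 * ID and reinterpret the payload. Returns 0 on success, -1 otherwise. */
static int sketch_packet_get_bss(const SketchPacket *pkt, SketchBSS *out)
{
    assert(pkt && out);
    if (pkt->codec_id != SKETCH_CODEC_ID_SEQUENCER ||
        !pkt->data || pkt->size != sizeof(*out))
        return -1;
    memcpy(out, pkt->data, sizeof(*out)); /* copying avoids alignment issues */
    return 0;
}

/* Decoder as a thin wrapper: BSS in, PCM out. Returns the number of samples
 * written, or -1 on error. This placeholder only writes silence. */
static int sketch_sequencer_decode(const SketchPacket *pkt,
                                   int16_t *pcm, int nb_samples)
{
    SketchBSS bss;

    if (!pcm || nb_samples < 0 || sketch_packet_get_bss(pkt, &bss) < 0)
        return -1;
    memset(pcm, 0, (size_t)nb_samples * sizeof(*pcm));
    return nb_samples;
}
```

With -acodec copy, the packet would simply be handed to the muxer of the
target module format and sketch_sequencer_decode() would never be called.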
...
There are technical reasons for both solutions. I'll try to give more
info tomorrow.
> 2. How to do the mixer so it finally can playback channels in a way like:
> Channel 0: raw PCM
> Channel 1: ogg file
> Channel 2: mp3 file
> [..]
> Channel 62: ImpulseTracker instrument file
> Channel 63: GUS patch sound
> Channel 64: flac file
> (while all these can be played with different loop points, volumes,
> panning positions, etc.)
> 3. What's the best approach for writing demuxers / decoders using the
> AVSequencer
Regards.
--
FFmpeg = Fast & Forgiving Mere Philosofic Elitarian Game