[FFmpeg-devel] MOD support for FFmpeg (My GSoC 2010 task starts tomorrow)

Sun May 23 23:27:37 CEST 2010

Dear guys!

Since my main GSoC task starts tomorrow, I thought now would be a good
time to start discussing the MOD implementation in FFmpeg.

As some of you already know, I want to integrate MOD support by using
TuComposer's engine. Since it addressed all the issues with MOD
support over 10 years ago and is also designed as a static/shared
library, I think its design fits pretty well into FFmpeg (apart from
code style, but changing this is more a refactoring task than writing
lots of code).

But let's start with a small introduction first: What is a MOD file?

MOD is short for "music module", i.e. note data with attached samples.
Most of you probably know the MIDI format, which is almost the same,
except that MIDI doesn't contain the sample data itself but instead
loads it from sample banks (like GUS patches or a soundcard wavetable).

This has the advantage that MID files are very small (they just contain
the note data) but also the disadvantage that you can't easily add
customized samples and can't even ensure that the file sounds the same
on every device (just compare OPL2/3 MIDI with SB AWE32 MIDI as an
example of this).

Professional music makers usually solve that issue by rendering the MIDI
into a PCM file and mixing their additional speech stuff etc. into the
final PCM file.

This is where MOD differs: with MOD you have the freedom to attach custom
samples to the note data, so you have a sound file which always sounds
the same (like MP3, WAV, etc.).

That's the theory. In practice I realized that most module players don't
handle these formats very well, so lots of modules sound quite different
from the output of the tracker which created the music file.

To summarize, a module file can contain:
1. Module information (play time, artist, song message, initial speed, etc.)

2. Sub-song information (some module formats support more songs in one
file).

3. Position/Order list data (which tracks/patterns are played and in
which order).

4. Pattern/track data (this contains the actual notes, the associated
instrument/sample and effects). Some trackers organize this by single
tracks and some by patterns. The difference is that track-based trackers
allow each channel to run at a different tempo, while pattern-based
trackers share global values like tempo, so all tracks are played at the
same speed.

5. Instrument data (this does not contain the actual sample data but
more music-related information such as the keyboard <=> sample mapping,
NNA (New Note Action) settings and volume/pitch/resonance/panning
envelopes). One instrument can contain several samples or, if it's a
MIDI instrument, the MIDI instrument and channel number.

6. Sample data (this contains the actual sample data as well as
sample-related structures, i.e. bits per sample, base frequency,
initial/global volume and panning).

7. Synth sound data (some trackers even attach synths to samples, like
Adlib data, S3M being an example). TuComposer uses a complete synth
sound assembler, comparable to the instruction set of a regular
microprocessor, to allow the greatest possible degree of freedom.

How do these data structures depend on each other?

One module can contain multiple sub-songs, multiple instruments as well
as multiple envelopes.

One sub-song can contain exactly one position/order list table but
multiple patterns/tracks.

One order list table can contain multiple elements pointing to
patterns/tracks with additional information like speed change, transpose.

Each pattern = track * number of channels. So for a 16 channel module a
pattern consists of 16 tracks (which can be the same though).
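
To illustrate the pattern/track relationship, here is a minimal sketch
in C. It assumes a pattern is stored as one track reference per channel,
so the same track can be reused on several channels. All names
(DemoTrack, DemoPattern, DEMO_CHANNELS) are made up for this mail and are
not TuComposer's actual structures.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CHANNELS 4   /* hypothetical: a 4-channel module */

/* One track: the note rows for a single channel (note data omitted). */
typedef struct DemoTrack {
    int rows;             /* number of rows in this track */
} DemoTrack;

/* One pattern: exactly one track reference per channel. */
typedef struct DemoPattern {
    const DemoTrack *track[DEMO_CHANNELS];
} DemoPattern;

/* Count how many distinct tracks a pattern actually references. */
static int demo_pattern_unique_tracks(const DemoPattern *p)
{
    int unique = 0;

    assert(p != NULL);
    for (int ch = 0; ch < DEMO_CHANNELS; ch++) {
        int seen = 0;
        /* Was this track already referenced by an earlier channel? */
        for (int prev = 0; prev < ch; prev++)
            if (p->track[prev] == p->track[ch])
                seen = 1;
        if (!seen)
            unique++;
    }
    return unique;
}
```

A pattern whose four channels point at only two different tracks would
report 2 here, which is the "tracks can be the same" case.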

A sub-song can contain as many tracks/patterns as it likes.

Each instrument can contain multiple samples and assign multiple envelopes.

Each sample can have one synth sound assigned, and each synth sound can
have multiple wavetables and a code engine which defines how to
interpret and handle the wavetables.
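
The ownership rules above could be sketched roughly like this in C. This
is only an illustration under my own assumptions: every type and field
name here (DemoModule, DemoSubSong, ...) is hypothetical and does not
mirror the real TuComposer headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Synth sound: several wavetables plus the code that drives them. */
typedef struct DemoSynth {
    int            nb_wavetables;
    const uint8_t *code;        /* synth program (opaque here) */
    size_t         code_size;
} DemoSynth;

/* Sample: PCM parameters and at most one synth sound. */
typedef struct DemoSample {
    int              bits_per_sample;
    int              base_frequency;
    const DemoSynth *synth;     /* NULL for a plain PCM sample */
} DemoSample;

/* Instrument: several samples, several envelopes. */
typedef struct DemoInstrument {
    const DemoSample *samples;
    int               nb_samples;
    int               nb_envelopes;
} DemoInstrument;

/* Order list element: which pattern to play, plus extras. */
typedef struct DemoOrderEntry {
    int pattern;
    int transpose;
} DemoOrderEntry;

/* Sub-song: exactly one order list, any number of tracks. */
typedef struct DemoSubSong {
    const DemoOrderEntry *order;
    int                   nb_order;
    int                   nb_tracks;
} DemoSubSong;

/* Module: several sub-songs and several instruments. */
typedef struct DemoModule {
    const DemoSubSong    *songs;
    int                   nb_songs;
    const DemoInstrument *instruments;
    int                   nb_instruments;
} DemoModule;

/* Walk module -> instruments and sum up the samples they own. */
static int demo_module_count_samples(const DemoModule *m)
{
    int total = 0;

    assert(m != NULL);
    for (int i = 0; i < m->nb_instruments; i++)
        total += m->instruments[i].nb_samples;
    return total;
}
```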

Since I finished uploading the UAE stuff to upload.ffmpeg.org today
(see the AMIGA sub directory), you can download that and try out
TuComposer in (Win-)UAE.

This way you can convince yourself that TuComposer delivers good enough
quality to qualify for FFmpeg. ;-)

So why am I choosing TuComposer as part of FFmpeg for this?

I have spent many years debugging MOD/S3M/XM/IT playback so that it
sounds like the original tracker which invented each file format. This
can be hell and cost me most of the development time, because lots of
this stuff is poorly documented, if at all.

Another reason is that TuComposer is a complete composer/tracker engine,
not just a playback engine, which matters since FFmpeg is also capable
of encoding.

I would be glad to see TuComposer become an official part of FFmpeg
(under a different name like libavcomposer, though).

What would be the benefit for FFmpeg?

It can convert MOD/S3M/XM/IT and TCM modules to each other. For example,
you could run:
ffmpeg -i my_song.s3m my_song.xm
to convert an S3M file to XM, but you could also do:
ffmpeg -i my_song.it my_song.ogg
to render it as an Ogg Vorbis file, and of course:
ffplay my_song.mod

to simply play back a module with a nice pattern display (maybe like
OpenCubicPlayer or ImpulseTracker ;)). Look at TuCView in the UAE stuff
I uploaded today to see an example of this.

You can use FFmpeg as a base for a future tracker-like program, which
adds lots of additional functionality like allowing OGG/MP3/FLAC/APE
etc. samples (any format that FFmpeg supports), or even make a tracker
supporting the creation of music videos (thanks to FFmpeg's video
demuxers/decoders).

Even more, FFmpeg will probably be the first software in the world which
can handle MOD/S3M/XM/IT as the audio part of an AVI/MOV/etc.

Finally, I get my good old TuComposer mostly platform-independent (making
it so just requires changing the few Amiga-specific lines, like sound
output to hardware, where we could simply use libavdevice instead).

The thing is that TuComposer already contains everything we need for
proper MOD support, so we don't need to rewrite everything, which
would probably take some 2-3 years before it has enough quality, i.e.
far too long for this GSoC task.

Now for the implementation:

There are actually two approaches, which I will discuss in detail here:
1. Make a new libavcomposer (beside libavformat, libavfilter, libavcodec):

I know that some of you have already said that you wouldn't like to see
a new library in FFmpeg.

But I don't want libavcomposer to replace libavformat/libavcodec, i.e.
there will be MOD/S3M/XM/IT demuxers which transfer these file formats
into a common, shareable libavcomposer structure (almost everything in
libavcomposer will become part of the public FFmpeg API, just as almost
all functions in TuComposer are public, too).

In fact, the TuComposer module demuxer would simply be put in
libavformat/iff.c where the ILBM/8SVX stuff also resides. ;-)

The decoder, however, would not have to be rewritten or added for each
new module format (there are plenty of module formats), since the
decoder just uses the libavcomposer structure to handle the module.

If we want S3M to XM conversion or sth. like that and not just PCM
rendering, we need a common, shareable structure between MOD/S3M/XM/IT,
which is the job libavcomposer will take. In TuComposer terms,
libavcomposer would be the base library which contains linked lists of
attached modules, etc.
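
To show why a common structure helps with conversion, here is a small
sketch in C. Each format only needs a reader into the shared structure
and a writer out of it, so converting between any two formats goes
through the same path. Everything here is hypothetical: "toy_a" and
"toy_b" are invented text layouts, not real module formats, and the
DemoFormat table is not an existing FFmpeg or TuComposer API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format-independent representation shared by all readers/writers. */
typedef struct DemoModule {
    int channels;
    int patterns;
} DemoModule;

/* One entry per format: parse into, and serialize from, DemoModule. */
typedef struct DemoFormat {
    const char *name;
    int (*read)(const char *in, DemoModule *m);            /* 0 = ok */
    int (*write)(const DemoModule *m, char *out, size_t n); /* 0 = ok */
} DemoFormat;

static int demo_emit(int len, size_t n)
{
    /* snprintf succeeded and the output was not truncated. */
    return (len >= 0 && (size_t)len < n) ? 0 : -1;
}

static int toy_a_read(const char *in, DemoModule *m)
{
    return sscanf(in, "A %d %d", &m->channels, &m->patterns) == 2 ? 0 : -1;
}

static int toy_a_write(const DemoModule *m, char *out, size_t n)
{
    return demo_emit(snprintf(out, n, "A %d %d", m->channels, m->patterns), n);
}

static int toy_b_read(const char *in, DemoModule *m)
{
    return sscanf(in, "B patterns=%d channels=%d",
                  &m->patterns, &m->channels) == 2 ? 0 : -1;
}

static int toy_b_write(const DemoModule *m, char *out, size_t n)
{
    return demo_emit(snprintf(out, n, "B patterns=%d channels=%d",
                              m->patterns, m->channels), n);
}

static const DemoFormat demo_formats[] = {
    { "toy_a", toy_a_read, toy_a_write },
    { "toy_b", toy_b_read, toy_b_write },
};

static const DemoFormat *demo_find_format(const char *name)
{
    for (size_t i = 0; i < sizeof(demo_formats) / sizeof(demo_formats[0]); i++)
        if (!strcmp(demo_formats[i].name, name))
            return &demo_formats[i];
    return NULL;
}

/* Convert by reading into the shared structure, then writing it out. */
static int demo_convert(const char *src, const char *dst,
                        const char *in, char *out, size_t n)
{
    const DemoFormat *from = demo_find_format(src);
    const DemoFormat *to   = demo_find_format(dst);
    DemoModule m;

    if (!from || !to || from->read(in, &m) < 0)
        return -1;
    return to->write(&m, out, n);
}
```

With this layout, adding a new format means adding one table entry
instead of one converter per existing format.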

Another advantage is that you can simply turn off MOD support by:
../configure --disable-avcomposer (I think a lot of people using FFmpeg
don't really need module support, and forcing them to type
--disable-mod-codec --disable-s3m-codec --disable-xm-codec
--disable-it-codec --disable-tcm-codec --disable-669-codec
--disable-mtm-codec --disable-mid-codec [...] will just be a pain to
type and even to remember (is this damn fucking format now a module
format or not?)).

For those who have already taken a look at TuComposer's header files: A
valid mapping could be:
tucomposer/tucomposer.h => libavcomposer/avcomposer.h
tucomposer/module.h => libavcomposer/module.h
tucomposer/song.h => libavcomposer/song.h
tucomposer/external* => obsolete since FFmpeg already has this, thus can
be deleted.

Each header file also has a single C file (as opposed to TuComposer,
where each function has its own C file residing in a sub-directory).
So, for example, module.h has module.c which contains all functions from
Sources/C/tucomposer.library/Modules/*.c and so on.

Please note that the TuComposer executable is around 300k now on m68k
Amiga and an x86 build will be about twice as large (when I last checked
with DJGPP), so I think 600k is a reasonable size to justify a new
sub-library in FFmpeg.

Also, there are over 40k lines of C code already present, although I
think we can reduce that to 10k or sth. by using neat macro stuff and
removing unnecessary parts. I think that's also a good reason to manage
them in a different branch (which also ensures that module development
doesn't interfere much with the other stuff in FFmpeg).

Also libavcomposer will be able to save this structure in TCM format
which can then be used by the demuxer to transfer it to the decoder (so
that networking will work, too). The decoder can load the TCM file and
fill the libavcomposer structure and then do the playback.

2. Don't make a new libavcomposer but try to integrate everything with
libavfilter/libavformat/libavcodec

Some of you suggested this, but I'm not quite sure how well the design
I planned and discussed above will really fit into this.

Adding a huge bunch of structures which are probably rarely used
compared to most other structures FFmpeg offers sounds a bit
controversial to me at first.

Maybe we can do sth. like add an AVComposer structure pointer to AVPacket
or AVCodecContext which will simply be NULL if it's a non-MOD demuxer.
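
As a rough sketch of that idea in C: the context carries an optional
pointer which stays NULL for everything that is not a module, and module
code checks it before use. DemoCodecContext and DemoComposer are
stand-ins I made up here. They are not the real AVCodecContext or an
existing AVComposer type, and no such field exists in FFmpeg today.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the proposed module structure. */
typedef struct DemoComposer {
    int nb_sub_songs;
} DemoComposer;

/* Stand-in for a codec context; NOT the real AVCodecContext. */
typedef struct DemoCodecContext {
    int           sample_rate;
    DemoComposer *composer;   /* NULL unless fed by a module demuxer */
} DemoCodecContext;

/* Module-aware code has to check the pointer before touching it. */
static int demo_sub_song_count(const DemoCodecContext *ctx)
{
    assert(ctx != NULL);
    return ctx->composer ? ctx->composer->nb_sub_songs : 0;
}
```

Existing non-MOD code paths would never look at the extra pointer, so
they keep working unchanged apart from the structure growing by one
field.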

This, however, will probably (because of the public API change) only be
possible when we do a major version bump (well, the first idea might
need this too, but at least it won't break old software compiled
against older versions of FFmpeg).

What do you think? Hope I didn't leave anything important out!

-- 

Best regards,
                   :-) Basty/CDGS (-: