[FFmpeg-devel] Integrating the mod engine into FFmpeg - what is the best design approach?

Fri Jul 16 15:46:48 CEST 2010

Hello dears!

I had a discussion with Vitor and Stefano about the best way to
integrate the mod engine into FFmpeg.

Vitor's idea for doing this was (quoting him from his mail):
Note that in the way we are suggesting, the MOD decoder decodes the bulk
of the file to a format-independent Big Sound Struct (BSS). With our
approach:

1- The MOD demuxer will do three things:
    a) Probe if a file is a .mod
    b) Extract metadata
    c) Pass the rest of the file in an AVPacket.
2- The MOD decoder does just one thing: decode an AVPacket to a BSS. It
does not know anything about the player (it doesn't even know _if_ it
will be played, converted to another format, or fed to visualization
code).
3- Libavsequencer does just one thing: transform a BSS into PCM audio.
It knows nothing about file formats (it doesn't know or care whether the
BSS was made from a MOD file or recorded from a MIDI keyboard).

That's why we insist on starting with the implementation of MOD -> XM
conversion: it is much simpler than MOD -> PCM conversion, and it doesn't
need an implementation of libavsequencer.

                            mod file - metadata                  BSS + sequencer SAMPLES
MOD file --> MOD demuxer -------------------------> MOD decoder -------------------------> application
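
To make step 1 a bit more concrete, here is a minimal sketch of the
three demuxer duties in plain C. It is not FFmpeg code and not part of
Vitor's mail: ModMetadata, ModPayload and the mod_* functions are
hypothetical names invented for illustration, and ModPayload merely
stands in for the AVPacket. The sketch only assumes the classic
4-channel ProTracker layout (20-byte title at offset 0, "M.K."
signature at offset 1080) and treats the title as the only metadata.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ProTracker layout facts used here: 20-byte title at offset 0,
 * 4-byte signature "M.K." at offset 1080. */
#define MOD_TITLE_LEN   20
#define MOD_SIG_OFFSET  1080
#define MOD_SIG_LEN     4

/* Hypothetical stand-in for the metadata the demuxer would export. */
typedef struct ModMetadata {
    char title[MOD_TITLE_LEN + 1];
} ModMetadata;

/* Hypothetical stand-in for the AVPacket handed to the decoder. */
typedef struct ModPayload {
    const uint8_t *data;
    size_t         size;
} ModPayload;

/* (a) Probe: returns 1 if buf carries the "M.K." signature, else 0. */
static int mod_probe(const uint8_t *buf, size_t size)
{
    assert(buf != NULL);
    if (size < MOD_SIG_OFFSET + MOD_SIG_LEN)
        return 0;
    return memcmp(buf + MOD_SIG_OFFSET, "M.K.", MOD_SIG_LEN) == 0;
}

/* (b) Extract metadata: copy the title and NUL-terminate it. */
static void mod_read_metadata(const uint8_t *buf, size_t size,
                              ModMetadata *md)
{
    assert(buf != NULL && md != NULL && size >= MOD_TITLE_LEN);
    memcpy(md->title, buf, MOD_TITLE_LEN);
    md->title[MOD_TITLE_LEN] = '\0';
}

/* (c) Pass the rest: everything after the title, without copying. */
static ModPayload mod_rest_of_file(const uint8_t *buf, size_t size)
{
    assert(buf != NULL && size >= MOD_TITLE_LEN);
    ModPayload p = { buf + MOD_TITLE_LEN, size - MOD_TITLE_LEN };
    return p;
}
```

A caller would run mod_probe() first, then mod_read_metadata(), and
finally hand the result of mod_rest_of_file() to the decoder.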

Vitor summarized the advantages of his approach as follows:
1- Good coding practice enforcing code modularity
2- Allows for conversion from a format with more features to one with
fewer, without doing any mixing or sampling
3- Makes each file format very modular (just reading the bitstream and
filling up BSS)
4- Better integration with the way FFmpeg works ATM
5- No libavsequencer needed to do conversion, visualization or editing
6- No need for new API calls for applications that want to access the
BSS (just plain old avcodec_decode_frame())
7- Last, it gives the SoC a simple goal: getting all the code
besides lavs/ committed.

Since I now mostly agree with his approach, I decided to take it as the
sole starting point for discussion.

I see just one disadvantage here: extracting the metadata, removing it,
and passing the rest in an AVPacket requires parsing every module file
twice and also manipulating it (correcting the offsets, etc.). This
would require duplicate code in the demuxer and decoder, which I would
like to avoid if possible.
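
A small sketch of what "correcting the offsets" means in practice. It
assumes, purely for illustration, that the demuxer strips only the
20-byte title of a ProTracker module, so the "M.K." signature that
sits at offset 1080 in the file ends up at offset 1060 in the packet.
The function names are hypothetical and no FFmpeg API is used. Note
that the decoder still has to know the original file layout, which is
exactly the duplication I mean.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOD_TITLE_LEN   20    /* bytes removed by the hypothetical demuxer */
#define MOD_SIG_OFFSET  1080  /* signature position in the original file   */

/* Translate an offset in the original file into an offset in the
 * stripped payload. Returns 0 on success, -1 if the offset pointed
 * into the removed title. */
static int mod_payload_offset(size_t file_offset, size_t *payload_offset)
{
    assert(payload_offset != NULL);
    if (file_offset < MOD_TITLE_LEN)
        return -1;
    *payload_offset = file_offset - MOD_TITLE_LEN;
    return 0;
}

/* Decoder-side check: it must repeat the demuxer's layout knowledge
 * just to find the signature again. Returns 1 if found, else 0. */
static int mod_decoder_check_signature(const uint8_t *payload, size_t size)
{
    size_t off;

    assert(payload != NULL);
    if (mod_payload_offset(MOD_SIG_OFFSET, &off) < 0 || size < off + 4)
        return 0;
    return memcmp(payload + off, "M.K.", 4) == 0;
}
```

Every offset stored inside a more complex module format would need the
same kind of translation, once per stripped field.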

-- 

Best regards,
                   :-) Basty/CDGS (-: