[FFmpeg-devel] Integrating the mod engine into FFmpeg - what is the best design approach?

Fri Jul 16 16:15:26 CEST 2010

On 07/16/2010 03:46 PM, Sebastian Vater wrote:
> Hello dears!
>
> I had a discussion with Vitor and Stefano about the best way to
> integrate the mod engine into FFmpeg.
>
> Vitor's idea for doing this was (quoting him from the mail):
> Note that in the way we are suggesting, the MOD decoder decodes the bulk
> of the file to a format-independent Big Sound Struct (BSS). With our
> approach:
>
> 1- The MOD demuxer will do three things:
>      a) Probe if a file is a .mod
>      b) Extract metadata
>      c) Pass the rest of the file in an AVPacket.
> 2- The MOD decoder does just one thing: decode an AVPacket to a BSS. It
> does not know anything about the player (it doesn't even know _if_ it
> will be played, converted to another format, or fed to visualization code).
> 3- Libavsequencer does just one thing: transform a BSS into PCM audio.
> It knows nothing about file formats (it doesn't know or care whether the
> BSS was made from a MOD file or recorded from a MIDI keyboard).
>
> That's why we insist on starting with the implementation of MOD -> XM
> conversion: it is much simpler than MOD -> PCM conversion, and it doesn't
> need an implementation of libavsequencer.
>
>                           mod file - metadata                      BSS +
>                                                              sequencer SAMPLES
> MOD file -->  MOD demuxer -------------------->  MOD decoder  ------------------>  application
>
> Vitor summarized the advantages of his approach as follows:
> 1- Good coding practice enforcing code modularity
> 2- Allows for conversion from a format with more features to one with
> fewer, without any mixing or sampling
> 3- Makes each file format very modular (just reading the bitstream and
> filling up BSS)
> 4- Better integration with the way FFmpeg works ATM
> 5- No libavsequencer needed to do conversion, visualization or editing
> 6- No need for new API calls for applications that want to access the
> BSS (just plain old avcodec_decode_frame())
> 7- Lastly, it gives the SoC a simple goal: getting all the code
> besides lavs/ committed.
>
> Since I mostly agree with his approach now, I decided to take it alone
> as the starting point for discussion.

That is a pretty good description of what I initially suggested. I'd
just like to add that in this approach, the decoder would output the BSS
as some new SAMPLE_FMT_SEQUENCER.
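
To make the split concrete, here is a rough sketch of the three demuxer
steps and the decoder step for a plain ProTracker module. Every type and
function name in it (Packet, BSS, mod_probe, ...) is made up for
illustration and is not FFmpeg API. The only format facts it relies on are
that the title is the first 20 bytes, the song length is the byte at offset
950, and the "M.K." tag sits at offset 1080. As a simplification it passes
the whole buffer in the packet instead of stripping the metadata, so no
offsets need fixing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical stand-ins, not FFmpeg API. */

typedef struct Packet {          /* stand-in for an AVPacket */
    const uint8_t *data;
    size_t size;
} Packet;

typedef struct BSS {             /* stand-in for the Big Sound Struct */
    int channels;
    int song_length;             /* number of positions in the order table */
} BSS;

/* Demuxer step a): ProTracker modules carry the tag "M.K." at offset 1080. */
static int mod_probe(const uint8_t *buf, size_t size)
{
    return size >= 1084 && memcmp(buf + 1080, "M.K.", 4) == 0;
}

/* Demuxer step b): the song title occupies the first 20 bytes of the file. */
static void mod_read_title(const uint8_t *buf, char title[21])
{
    assert(buf != NULL && title != NULL);
    memcpy(title, buf, 20);
    title[20] = '\0';
}

/* Demuxer step c): hand the data to the decoder in a packet.
 * Simplification: the whole buffer is passed, metadata included. */
static Packet mod_read_packet(const uint8_t *buf, size_t size)
{
    Packet pkt = { buf, size };
    return pkt;
}

/* Decoder: turn a packet into a BSS. It knows nothing about playback. */
static int mod_decode(const Packet *pkt, BSS *bss)
{
    if (pkt->size < 1084)
        return -1;
    bss->channels    = 4;               /* "M.K." implies 4 channels */
    bss->song_length = pkt->data[950];  /* song length byte */
    return 0;
}
```

An application would call mod_probe(), mod_read_title() and
mod_read_packet() in that order, then mod_decode(), and would end up holding
a BSS without any sequencer code having been involved.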

> I see just one disadvantage here: extracting the metadata, removing it,
> and passing the rest in an AVPacket requires parsing every module file
> twice and also manipulating it (correcting the offsets, etc.). This
> would require duplicate code in the demuxer and decoder, which I would
> like to avoid, if possible.

And your first idea did not have this problem. Since it is not a bad
idea either, I'd like to explain it to see what the rest of the
community thinks. In Sebastian's original approach, the demuxer would
decode the file to a BSS and output it in an AVPacket. It would then
define a CODEC_ID_SEQUENCER, and the decoder would be just a wrapper around
libavsequencer to do the BSS -> PCM conversion.
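
For comparison, a rough sketch of how that second layout could look. Again,
all names are invented for illustration (CODEC_SEQUENCER stands in for the
proposed CODEC_ID_SEQUENCER), nothing here is existing FFmpeg or
libavsequencer API, and the "sequencer" is only a placeholder that writes
silence. The point is just the shape: the packet already carries the parsed
BSS, the decoder only forwards it to the sequencer, and a stream copy hands
the BSS on untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; not FFmpeg or libavsequencer API. */

enum CodecID { CODEC_SEQUENCER = 1 };  /* stand-in for CODEC_ID_SEQUENCER */

typedef struct BSS {
    int rows;             /* total rows in the song */
    int samples_per_row;  /* output samples rendered per row */
} BSS;

typedef struct Packet {
    enum CodecID codec;
    const BSS *bss;       /* payload is the already-parsed BSS */
} Packet;

/* Demuxer: parses the file into a BSS (parsing omitted) and wraps it. */
static Packet demux_to_bss(const BSS *parsed)
{
    Packet pkt = { CODEC_SEQUENCER, parsed };
    return pkt;
}

/* Placeholder sequencer: writes silence for the whole song.
 * Returns the number of samples written, or -1 if pcm is too small. */
static int sequencer_render(const BSS *bss, int16_t *pcm, size_t capacity)
{
    size_t needed = (size_t)bss->rows * (size_t)bss->samples_per_row;
    if (needed > capacity)
        return -1;
    memset(pcm, 0, needed * sizeof *pcm);
    return (int)needed;
}

/* Decoder: a thin wrapper that forwards the BSS to the sequencer. */
static int sequencer_decode(const Packet *pkt, int16_t *pcm, size_t capacity)
{
    assert(pkt != NULL && pcm != NULL);
    if (pkt->codec != CODEC_SEQUENCER || pkt->bss == NULL)
        return -1;
    return sequencer_render(pkt->bss, pcm, capacity);
}

/* Stream copy: the BSS is handed on unchanged, no sequencer involved. */
static const BSS *stream_copy(const Packet *pkt)
{
    return pkt->bss;
}
```

Here an application that wants the BSS itself has to go through the packet
payload (stream_copy() above), while sequencer_decode() is the only path
that needs the sequencer.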

The advantage of this approach is that the demuxer/decoder split
does not make much sense for these formats, so it avoids an artificial
distinction. Moreover, it cleanly separates transcoding from
one MOD format to another (with -acodec copy) from decoding to PCM. The
disadvantage is that, API-wise, it is less clear how external applications
would get the BSS data (by reading the AVPacket payload). Besides, all the
bit-reading API is part of lavc, not lavf, where the demuxer would live.

I'm still undecided on which approach is best.

-Vitor