[MEncoder-users] Muxing problems (was: mencoder vs ffmpeg)

Sat Jan 5 19:35:46 CET 2008

On quintidi 15 Nivôse, year CCXVI, Rich Felker wrote:
> No, there are other issues too. That's just the worst one.

By the way, since you seem to understand the problem pretty well, could you
explain it more precisely?

As far as I understand things, the process of video encoding needs the
following steps:

1. Obtaining the source. The source is one or more streams of either images
   or audio samples, all in the same format, and each with a timestamp
   (possibly implied by the position in the stream). Obtaining the source
   may involve decoding a video file or grabbing from a capture device, and
   applying filters to the content.

2. Adapting the source to the selected output settings, and especially
   (frame|sample)rate. Doing this involves duplicating or dropping some
   frames|samples (or applying a more subtle interpolation algorithm), and
   updating the timestamps of all frames|samples.

3. Feeding the source streams to the codecs. Each codec repeatedly eats one
   or several frame|sample(s), and outputs one or several packet(s) of
   binary data. Each packet (or maybe only some of them) has an implicit set
   of timestamps, which is probably an interval: the timestamps of the
   samples encoded by the packet.

4. Multiplexing the binary packets into a single binary stream. The
   difficult part of this step is to ensure that packets of different
   streams for corresponding timestamps are stored in nearby parts of the
   stream (although some advance for the audio streams may be wanted).
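
To make step 2 concrete, here is a minimal Python sketch of the crudest
form of frame rate adaptation: pure duplication and dropping, with no
interpolation. The function name resample_frames and the (timestamp,
payload) tuple layout are made up for the illustration; this is not what
mencoder does internally.

```python
from fractions import Fraction

def resample_frames(frames, out_fps):
    """Convert timestamped frames to a constant output frame rate.

    frames: list of (timestamp_in_seconds, payload), sorted by timestamp.
    Returns a list of (new_timestamp, payload). For each output tick, the
    latest source frame whose timestamp is <= the tick is used, so frames
    get duplicated when out_fps is higher than the source rate and dropped
    when it is lower.
    """
    if not frames:
        return []
    step = 1 / Fraction(out_fps)   # exact duration of one output frame
    start = frames[0][0]
    end = frames[-1][0]
    out = []
    i = 0                          # index of the current source frame
    n = 0                          # index of the current output frame
    while start + n * step <= end:
        tick = start + n * step
        # Advance to the last source frame that is not later than the tick.
        while i + 1 < len(frames) and frames[i + 1][0] <= tick:
            i += 1
        out.append((tick, frames[i][1]))
        n += 1
    return out

# Three frames at 2 fps, using exact fractions to avoid rounding issues.
src = [(Fraction(0), "A"), (Fraction(1, 2), "B"), (Fraction(1), "C")]
up = resample_frames(src, 4)    # every frame but the last is duplicated
down = resample_frames(src, 1)  # the middle frame is dropped
```

Note that the output timestamps are computed from the tick index rather
than copied from the source, which is the "updating the timestamps" part
of the step.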

In mplayer/mencoder, step 2 is specifically part of mencoder. I do not like
that design. Maybe I am missing something crucial here, but I think that it
should be part of the filtering. That is, in the case of mencoder, "-vf
fps=25" instead of "-ofps 25".

Step 4 seems quite straightforward to me. Maybe again I am missing the
whole difficulty, but I would do it by keeping a queue of packets for each
stream, queueing the packets coming from the codecs, and dequeueing and
outputting the packet with the earliest timestamp when all queues have at
least one packet. In the case of output to real-time streams, outputting the
packets as soon as they arrive from the codecs could even be a better
solution.
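
Here is that queueing scheme as a minimal Python sketch. It assumes that
each packet carries a single timestamp and that timestamps never decrease
within one stream; the name interleave and the (stream, timestamp, data)
tuple layout are only for the illustration, and real container formats
may well impose constraints that this ignores.

```python
import heapq
from collections import deque

def interleave(packets, n_streams):
    """Interleave packets from several streams by timestamp.

    packets: iterable of (stream_index, timestamp, data), in the order in
    which the codecs emit them.
    Yields (timestamp, stream_index, data) in output order.
    """
    queues = [deque() for _ in range(n_streams)]
    for stream, ts, data in packets:
        queues[stream].append((ts, stream, data))
        # A packet is only written once every stream has one waiting,
        # so that no stream can later produce an earlier timestamp.
        while all(queues):
            earliest = min(queues, key=lambda q: q[0][0])
            yield earliest.popleft()
    # End of input: flush what is left, still in timestamp order.
    yield from heapq.merge(*queues, key=lambda p: p[0])

# Stream 0 is video (40 ms per frame), stream 1 is audio (20 ms per packet).
incoming = [
    (0, 0, "v0"), (0, 40, "v1"), (1, 0, "a0"),
    (1, 20, "a1"), (0, 80, "v2"), (1, 40, "a2"),
]
muxed = list(interleave(incoming, 2))
order = [data for _, _, data in muxed]
```

One visible weakness: a stream that stops producing packets for a while
makes the queues of the other streams grow without bound.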

All things considered, I do not understand what mencoder can be doing wrong.
And that disturbs me, because I am still thinking of trying to merge
mencoder as an audio-video output device of mplayer.

I would really appreciate it if someone intimate with the subtleties of
codecs and container formats could point out the mistakes and omissions in
this summary I just wrote.

Regards,

-- 
  Nicolas George