[MPlayer-DOCS] encoding guide draft

D Richard Felker III dalias at aerifal.cx
Fri Apr 1 21:13:49 CEST 2005


This is the draft of the old encoding guide I was working on last fall
and never finished. It's far from complete and maybe slightly
inaccurate in some areas, but has a lot of good information. I'm
posting it here in case anyone wants to finish it or incorporate parts
of it into other docs in the meantime.

Rich


P.S. Especially note that I retract some of the things I've said in
the past about -mc 0 and -noskip... So let's discuss them before
including them in official docs.

-------------- next part --------------


Topics:


I. Preparing to encode
   1. Identifying source material and framerate
   2. Selecting the quality you want
   3. Constraints for efficient encoding
   4. Cropping and scaling
   5. Choosing resolution and bitrate

II. Containers and codecs
   1. Where the movie will be played
   2. Constraints of DVD, SVCD, and VCD
   3. Limitations of the AVI container

III. Basic MEncoder usage
   1. Selecting codecs & format
   2. Selecting input file or device
   3. Loading video filters
   4. Notes on A/V sync

IV. Encoding procedures
   1. Encoding progressive video
   2. Two-pass encoding
   3. Encoding interlaced video
   4. Deinterlacing
   5. Inverse telecine
   6. Capturing TV input
   7. Dealing with mixed-source content
   8. Low-quality & damaged sources

V. Optimizing encoding quality
   1. Noise removal
   2. Pure quality-gain options
   3. Questionable-gain options
   4. Advanced MPEG-4 features



I. Preparing to encode

Before you even think about encoding a movie, you need to take several
preliminary steps to 


I.1. Identifying source material and framerate

The first and most important step before you encode should be
determining what type of content you're dealing with. If your source
material comes from DVD or broadcast/cable/satellite TV, it will be
stored in one of two formats: NTSC for North America and Japan, and
PAL for Europe, etc. But it's important to realize that this is just
the formatting for presentation on a television, and often does NOT
correspond to the original format of the movie. In order to produce a
suitable encode, you need to know the original format. Failure to take
this into account will result in ugly combing (interlacing) artifacts
in your encode, and will greatly reduce the quality/bitrate ratio of
the encoder!

Here is a list of common types of source material, where you're likely
to find them, and their properties:

Standard Film: Produced for theatrical display at 24fps.

PAL video: Recorded with a PAL video camera at 50 fields per second. A
field consists of just the even or odd numbered lines of a frame.
Television was designed to refresh these in alternation as a cheap
form of analog compression. The human eye supposedly compensates for
this, but once you understand interlacing you'll learn to see it on TV
too and never enjoy TV again. Two fields do NOT make a complete frame,
because they are captured 1/50 of a second apart in time, and thus
they do not line up unless there is no motion.

NTSC Video: Recorded with an NTSC video camera at 59.94 fields per
second, or 60 fields per second in the pre-color era. Otherwise
similar to PAL.

Animation: Usually drawn at 24fps, but animation also comes in
mixed-framerate varieties.

Computer Graphics (CG): Can be any framerate, but 24 and 30 fps are
the most frequently encountered in NTSC regions, and 25 fps in PAL
regions.

Old Film: Various lower framerates.

Movies consisting of frames are referred to as progressive, while
those consisting of independent fields are called interlaced, or
sometimes video, although this latter term is ambiguous.

To further complicate matters, some movies will be a mix of several of
the above.

The most important distinction to make between all of these formats is
that some are frame-based, while others are field-based. WHENEVER a
movie is prepared for display on television (including DVD), it is
converted to a field-based format. The various methods by which this
can be done are collectively referred to as "pulldown", of which the
infamous NTSC "3:2 telecine" is one variety. Unless the original
material was also field-based (and the same fieldrate), you are
getting the movie in a format other than the original.

There are several common types of pulldown:

PAL 2:2 pulldown: The nicest of them all. Each frame is shown for two
fields duration, by extracting the even and odd lines and showing them
in alternation. If the original material is 24fps, this process speeds
up the movie by 4%.

PAL 2:2:2:2:2:2:2:2:2:2:2:3 pulldown: Every 12th frame is shown for
three fields duration, instead of just two. This avoids the 4% speedup
issue, but makes the process much more difficult to reverse. It is
usually seen in musical productions where adjusting the speed by 4%
would seriously damage the musical score.

NTSC 3:2 telecine: Frames are shown alternately for 3 fields or 2
fields duration. This gives a fieldrate 5/2 times the original
framerate. The result is also slowed down very slightly from 60 fields
per second to 59.94 fields per second to maintain NTSC fieldrate.

NTSC 2:2 pulldown: Used for showing 30fps material on NTSC. Nice, just
like 2:2 PAL pulldown.
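
The cadences above are easy to simulate. The following Python sketch
is only an illustration: frames are stand-in letters rather than
pictures, and the function names are made up for this example. It
repeats each frame for the number of fields given by a cadence, pairs
consecutive fields into stored frames the way a DVD does, and marks
the stored frames that mix two different source frames. Those are the
ones that look combed when there is motion.

```python
from itertools import cycle


def pulldown(frames, cadence):
    """Repeat each source frame for the number of fields in the cadence."""
    fields = []
    for frame, count in zip(frames, cycle(cadence)):
        fields.extend([frame] * count)
    return fields


def pair_fields(fields):
    """Group consecutive fields into stored frames (two fields each)."""
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]


def combed(stored_frames):
    """True for every stored frame built from two different source frames."""
    return [first != second for first, second in stored_frames]


if __name__ == "__main__":
    for name, cadence in (("NTSC 3:2", (3, 2)), ("2:2", (2,))):
        stored = pair_fields(pulldown("ABCD", cadence))
        print(name, stored, combed(stored))
```

With the 3:2 cadence, two stored frames out of every five come out
mixed; with 2:2, none do.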

There are also methods for converting between NTSC and PAL video. Such
topics are beyond the scope of this guide. If you encounter such a
movie and want to encode it, your best bet is to find a copy in the
original format. NTSC/PAL conversion is highly destructive and cannot
be reversed cleanly, so your encode will greatly suffer if it is made
from a converted source.

When video is stored on DVD, consecutive pairs of fields are grouped
as a frame, even though they are not intended to be shown at the same
moment in time. The MPEG-2 standard used on DVD and digital TV provides
a way to encode the original progressive frames, and store the number
of fields for which each should be shown in the frame headers. If this
method has been used, the term "soft telecine" will often be used to
describe the movie, since the process only directs the DVD player to
apply pulldown to the movie rather than altering the movie itself.
This case is highly preferable since it can easily be reversed
(actually ignored) by the encoder, and since it preserves maximal
quality. However, many DVD and broadcast production studios do not use
proper encoding techniques, and instead produce movies with "hard
telecine", where fields are actually duplicated in the encoded MPEG-2.

The procedures for dealing with these cases will be covered later in
this guide. For now, we leave you with some guides to identifying
which type of material you're dealing with:

NTSC regions:

- If MPlayer prints that the framerate has changed to 23.976 when
  watching your movie, and never changes back, it's almost certainly
  24fps content that has been "soft telecined".

- If MPlayer shows the framerate switching back and forth between
  23.976 and 29.97, and you see "combing" at times, then there are
  several possibilities. The 23.976 fps segments are almost certainly
  24fps progressive content, "soft telecined", but the 29.97 fps parts
  could be either hard-telecined 24fps content or NTSC video content.
  Use the same guidelines as the following two cases to determine
  which.

- If MPlayer never shows the framerate change, and every single frame
  with motion appears combed, your movie is NTSC video at 59.94 fields
  per second.

- If MPlayer never shows the framerate change, and two frames out of
  every five appear combed, your movie is "hard telecined" 24fps
  content.

PAL regions:

- If you never see any combing, your movie is 2:2 pulldown.

- If you see combing alternating in and out every half second, then
  your movie is 2:2:2:2:2:2:2:2:2:2:2:3 pulldown.

- If you always see combing during motion, then your movie is PAL
  video at 50 fields per second.

Hint: MPlayer can slow down movie playback with the -speed option. Try
using -speed 0.2 to watch the movie very slowly and identify the
pattern, if you can't see it at full speed.




I.2. Selecting the quality you want

It's possible to encode your movie at a wide range of qualities. With
modern video encoders and a bit of pre-codec compression (downscaling
and denoising), it's possible to achieve very good quality at 700 MB,
for a 90-110 minute widescreen movie. And all but the longest movies
can be encoded with near-perfect quality at 1400 MB.

If you do not plan to store your movies on CD or other size-limited
media, and you want maximal quality at all costs, you can encode in
constant quantizer mode, which will not aim to meet a specific target
bitrate or filesize but instead use the maximal accuracy encoding for
all frames. This is not recommended in most cases, because you can
achieve significantly smaller file sizes without noticeable loss.
However, it may be desirable for the hardcore archivists out there.








I.3. Constraints for efficient encoding

Due to the nature of MPEG-type compression, there are various
constraints you should follow for maximal quality. MPEG splits the
video up into 16x16 squares called macroblocks, each composed of four
8x8 blocks of luma (intensity) information and two half-resolution 8x8
chroma (color) blocks (one for red-cyan axis and the other for the
blue-yellow axis). Even if your movie width and height are not
multiples of 16, the encoder will use enough 16x16 macroblocks to
cover the whole picture area, and the extra space will go to waste. So
in the interests of maximizing quality at a fixed filesize, it is a
bad idea to use dimensions that are not multiples of 16.
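
To get a feel for how much space goes to waste, you can count the
macroblocks needed to cover a given picture. This Python sketch is
just arithmetic on the 16x16 macroblock size described above; the
function name is made up for this example.

```python
import math


def macroblock_waste(width, height, mb=16):
    """Return (macroblock count, fraction of coded area that is padding)."""
    cols = math.ceil(width / mb)
    rows = math.ceil(height / mb)
    coded_area = cols * mb * rows * mb
    wasted = 1 - (width * height) / coded_area
    return cols * rows, wasted


if __name__ == "__main__":
    for size in ((640, 360), (640, 352)):
        blocks, wasted = macroblock_waste(*size)
        print(size, blocks, "macroblocks,", f"{wasted:.1%} padding")
```

A height of 360 needs 23 rows of macroblocks, the last of which is
half empty, while 352 fills 22 rows exactly.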

Most DVDs also have some degree of black borders at the edges. Leaving
these in place can hurt quality in several ways:

1. MPEG-type compression is also highly dependent on frequency domain
   transforms, in particular the Discrete Cosine Transform (DCT),
   which is similar to the Fourier transform. This sort of encoding is
   efficient for representing patterns and smooth transitions, but it
   has a hard time with sharp edges. In order to encode them it must
   use many more bits, or else an artifact known as ringing will
   appear.

   The frequency transform (DCT) takes place separately on each
   macroblock (actually each block), so this problem only applies when
   the sharp edge is inside a block. If your black borders begin
   exactly at multiple-of-16 pixel boundaries, this is not a problem.
   However, the black borders on DVDs rarely come nicely aligned, so
   in practice you will always need to crop to avoid this penalty.

In addition to frequency domain transforms, MPEG-type compression uses
motion vectors to represent the change from one frame to the next.
Motion vectors naturally work much less efficiently for new content
coming in from the edges of the picture, because it is not present in
the previous frame. As long as the picture extends all the way to the
edge of the encoded region, motion vectors have no problem with
content moving out the edges of the picture. However, in the presence
of black borders, there can be trouble:

2. For each macroblock, MPEG-type compression stores a vector
   identifying which part of the previous frame should be copied into
   this macroblock as a base for predicting the next frame. Only the
   remaining differences need to be encoded. If a macroblock spans the
   edge of the picture and contains part of the black border, then
   motion vectors from other parts of the picture will overwrite the
   black border. This means that lots of bits must be spent either
   re-blackening the border that was overwritten, or (more likely) a
   motion vector won't be used at all and all the changes in this
   macroblock will have to be coded explicitly. Either way, encoding
   efficiency is greatly reduced.

   Again, this problem only applies if black borders do not line up on
   multiple-of-16 boundaries.

3. Finally, suppose we have a macroblock in the interior of the
   picture, and an object is moving into this block from near the edge
   of the image. MPEG-type coding can't say "copy the part that's
   inside the picture but not the black border." So the black border
   will get copied inside too, and lots of bits will have to be spent
   encoding the part of the picture that's supposed to be there.

   If the picture runs all the way to the edge of the encoded area,
   MPEG has special optimizations to repeatedly copy the pixels at the
   edge of the picture when a motion vector comes from outside the
   encoded area. This feature becomes useless when the movie has black
   borders. Unlike problems 1 and 2, aligning the borders at multiples
   of 16 does not help here.

4. Despite the borders being entirely black and never changing, there
   is at least a minimal amount of overhead involved in having more
   macroblocks.

For all of these reasons, it's recommended to fully crop black
borders. Further, if there is an area of noise/distortion at the edge
of the picture, cropping this will improve encoding efficiency as
well. Videophile purists who want to preserve the original as closely as
possible may object to this cropping, but unless you plan to encode at
constant quantizer, the quality you gain from cropping will
considerably exceed the amount of information lost at the edges.


I.4. Cropping and scaling

Recall from the previous section that the final picture size you
encode should be a multiple of 16 (in both width and height). This can
be achieved by cropping, scaling, or a combination of both.

When cropping, there are a few guidelines that must be followed to
avoid damaging your movie. The normal YUV format, 4:2:0, stores chroma
(color) information subsampled, i.e. chroma is only sampled half as
often in each direction as luma (intensity) information. Observe this
diagram, where L indicates luma sampling points and C chroma.

 L L L L L L L L
  C   C   C   C
 L L L L L L L L

 L L L L L L L L
  C   C   C   C
 L L L L L L L L

As you can see, rows and columns of the image naturally come in pairs.
Thus your crop offsets and dimensions MUST be even numbers. If they
are not, the chroma will no longer line up correctly with the luma. In
theory, it's possible to crop with odd offsets, but it requires
resampling the chroma which is potentially a lossy operation and not
supported by the crop filter.

Further, interlaced video is sampled as follows:

    TOP FIELD          BOTTOM FIELD

 L L L L L L L L
  C   C   C   C
                      L L L L L L L L

 L L L L L L L L
                       C   C   C   C
                      L L L L L L L L

 L L L L L L L L
  C   C   C   C
                      L L L L L L L L

 L L L L L L L L
                       C   C   C   C
                      L L L L L L L L

As you can see, the pattern does not repeat until after 4 lines. So
for interlaced video, your y-offset and height for cropping must be
multiples of 4.

So how do you determine a crop rectangle to begin with? Sometimes you
can guess, but the cropdetect filter in MPlayer can make it easy. Run
MPlayer with -vf cropdetect and it will print out the crop settings to
remove the borders. You should let the movie run long enough that the
whole picture area is used, in order to get accurate crop values.
Then, test the values you get with MPlayer, using the command line
cropdetect printed, and adjust the rectangle as needed. The rectangle
filter can help by allowing you to interactively position the crop
rectangle over your movie. Remember to follow the above divisibility
guidelines so that you do not misalign the chroma planes.

If you will be scaling your movie, it's usually best to crop only the
black borders and noise, then scale so that the resulting dimensions
are multiples of 16. This can slightly distort the aspect ratio of
your movie, but in practice the error cannot be seen. It's certainly
much less visible than the MPEG artifacts you will see from failing to
crop & scale well.

In certain cases, scaling may be undesirable. Scaling in the vertical
direction is difficult with interlaced video, and if you wish to
preserve the interlacing, you should usually refrain from scaling. If
you will not be scaling but you still want to use multiple-of-16
dimensions, you will have to overcrop. Do not undercrop, since black
borders are very bad for encoding!
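
The divisibility rules in this section can be applied mechanically.
The following Python sketch takes a crop rectangle (for instance one
suggested by cropdetect) and overcrops it so that width and height are
multiples of 16 and the offsets are even, or multiples of 4 vertically
for interlaced video. The helper names and the choice to keep the
result roughly centered are assumptions made for this example, not
something MEncoder does for you.

```python
def align_up(value, step):
    """Round value up to the next multiple of step."""
    return -(-value // step) * step


def overcrop(w, h, x, y, interlaced=False):
    """Shrink a crop rectangle until it satisfies the rules above.

    The result always lies inside the rectangle that was passed in,
    so no black border is ever added back.
    """
    def fit(size, offset, step):
        start = align_up(offset, step)        # legal offset, never smaller
        avail = size - (start - offset)       # pixels left after aligning
        new_size = avail // 16 * 16           # overcrop to a multiple of 16
        slack = avail - new_size
        start += (slack // 2) // step * step  # roughly center, stay aligned
        return new_size, start

    new_w, new_x = fit(w, x, 2)
    new_h, new_y = fit(h, y, 4 if interlaced else 2)
    return new_w, new_h, new_x, new_y


if __name__ == "__main__":
    print("crop=%d:%d:%d:%d" % overcrop(712, 432, 4, 72))
    print("crop=%d:%d:%d:%d" % overcrop(708, 360, 6, 60, interlaced=True))
```

The output uses the same width:height:x:y order as the crop filter.
Always check the result in MPlayer before encoding.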




I.5. Choosing resolution and bitrate

If you will not be encoding in constant quantizer mode, you need to
select a bitrate. The concept of bitrate is quite simple. It's the
(average) number of bits that will be consumed to store your movie,
per second. Normally bitrate is measured in kilobits (1000 bits) per
second. The size of your movie on disk is the bitrate times the length
of the movie in time, plus a small amount of "overhead" (see the
section on codecs and containers). Other parameters such as scaling,
cropping, etc. will NOT alter the file size unless you change the
bitrate as well!
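
Turning this relationship around gives the video bitrate for a target
file size. The Python sketch below is a rough calculator, not a
MEncoder feature. It assumes that 1 MB means 1024*1024 bytes and that
container overhead can be approximated as a fixed fraction of the
file; both are assumptions you should adjust for your own case.

```python
def video_bitrate_kbps(target_mb, minutes, audio_kbps, overhead_fraction=0.0):
    """Video bitrate (kbit/sec, 1 kbit = 1000 bits) that fits the target size."""
    usable_bits = target_mb * 1024 * 1024 * 8 * (1 - overhead_fraction)
    seconds = minutes * 60
    total_kbps = usable_bits / seconds / 1000
    return total_kbps - audio_kbps


if __name__ == "__main__":
    # A 100 minute movie on one 700 MB CD with 128 kbit/sec audio.
    print(round(video_bitrate_kbps(700, 100, 128)))
    # The same movie, reserving 2% of the file for container overhead.
    print(round(video_bitrate_kbps(700, 100, 128, overhead_fraction=0.02)))
```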

Bitrate does NOT scale proportionally to resolution. That is to say, a
320x240 file at 200 kbit/sec will not be the same quality as the same
movie at 640x480 and 800 kbit/sec! There are two reasons for this:

1. Perceptual: You notice MPEG artifacts more if they're scaled up
   bigger! Artifacts appear on the scale of blocks (8x8). Your eye
   will not see errors in 4800 small blocks as easily as it sees
   errors in 1200 large blocks (assuming you'll be scaling both to
   fullscreen).

2. Theoretical: When you scale down an image but still use the same
   size (8x8) blocks for the frequency space transform, you move more
   data to the high frequency bands. Roughly speaking, each pixel
   contains more of the detail than it did before. So even though your
   scaled-down picture contains 1/4 the information in the spatial
   directions, it could still contain a large portion of the
   information in the frequency domain (assuming that the high
   frequencies were underutilized in the original 640x480 image).

Past guides have recommended choosing a bitrate and resolution based
on a "bits per pixel" approach, but this is usually not valid due to
the above reasons. A better estimate seems to be that bitrates scale
proportionally to the square root of resolution, so that 320x240 at
400 kbit/sec would be comparable to 640x480 at 800 kbit/sec. However this
has not been verified with theoretical or empirical rigor. Further,
given that movies vary greatly with regard to noise, detail, degree of
motion, etc., it's futile to make general recommendations for bits per
length-of-diagonal (the analogue of bits per pixel, using the square
root).
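
For comparison, here is the square-root estimate next to the "bits per
pixel" approach, as a small Python sketch. Keep in mind that the
square-root rule is the unverified rule of thumb described above, and
the function names are made up for this example.

```python
import math


def sqrt_rule_bitrate(bitrate, from_size, to_size):
    """Scale a bitrate by the square root of the pixel-count ratio."""
    from_pixels = from_size[0] * from_size[1]
    to_pixels = to_size[0] * to_size[1]
    return bitrate * math.sqrt(to_pixels / from_pixels)


def bits_per_pixel_bitrate(bitrate, from_size, to_size):
    """Scale a bitrate linearly with pixel count (the older approach)."""
    from_pixels = from_size[0] * from_size[1]
    to_pixels = to_size[0] * to_size[1]
    return bitrate * to_pixels / from_pixels


if __name__ == "__main__":
    big, small = (640, 480), (320, 240)
    print(sqrt_rule_bitrate(800, big, small))       # square-root estimate
    print(bits_per_pixel_bitrate(800, big, small))  # bits-per-pixel estimate
```

Starting from 640x480 at 800 kbit/sec, the square-root rule suggests
400 kbit/sec for 320x240, where bits per pixel would suggest only 200.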

So far we have discussed the difficulty of choosing a bitrate and
resolution.

.................










II. Containers and codecs

II.1. Where the movie will be played

Perhaps the most important factor to choosing the format in which you
will encode your movie is where you want to be able to play it.
Usually this involves a tradeoff between quality and features, since
the formats supported by the widest variety of players are also the
worst with regard to compression.

If you want to be able to play your encode on standalone/set-top
players, your primary choices are DVD, VCD, and SVCD. There are also
extensions such as KVCD and XVCD which violate the standards but work
on many players and deliver higher quality. Modern players are
beginning to support MPEG-4 ("DivX") movies in AVI and perhaps other
containers as well, but these are often buggy and require you to
restrict your encodes to certain subsets of the full MPEG-4
functionality.

If you wish to be able to share your movies with Windows or Macintosh
users, without them having to install additional software, your
choices are very limited. The ancient MPEG-1 format with MP2 or PCM
audio is probably the only choice that is universally supported.
Interoperability with Windows/Mac also comes into play when deciding
how to encode and whether to scale to preserve aspect, since popular
media player applications for these systems do not honor the aspect
ratio encoding stored in MPEG-4 AVI files.


II.2. Constraints of DVD, SVCD, and VCD

Unfortunately, the DVD, SVCD, and VCD formats are subject to heavy
constraints. Only a small selection of encoded picture sizes & aspect
ratios are available. If your movie does not meet one of these, you
must scale and crop or add black borders (which are bad for quality!)
to make it compliant.

Format      Resolution  V.Codec A.Codec           FPS    Aspect
NTSC DVD    720x480 *   MPEG-2  MP2,AC3,PCM       24,30  4:3,16:9
NTSC SVCD   480x480     MPEG-2  MP2               30     4:3
NTSC VCD    352x240     MPEG-1  MP2               24,30  4:3
PAL DVD     720x576 *   MPEG-2  MP2,AC3,PCM       25     4:3,16:9
PAL SVCD    480x576     MPEG-2  MP2               25     4:3
PAL VCD     352x288     MPEG-1  MP2               25     4:3

* DVD also offers other resolutions but they are usually not
  desirable.

If your movie has 2.35:1 aspect (most recent action movies), you will
have to add black borders or crop the movie down to 16:9 to make a DVD
or VCD. If you add black borders, try to align them at 16-pixel
boundaries in order to minimize the impact on encoding performance.
Thankfully DVD has sufficiently excessive bitrate that you do not have
to worry too much about encoding efficiency, but SVCD and VCD are
highly bitrate-starved and require effort to obtain acceptable
quality.




II.3. Limitations of the AVI container

Although it's the most widely-supported format after MPEG-1, AVI also
has some major drawbacks. Perhaps the most obvious is the overhead.
For each chunk of the AVI file, 24 bytes are wasted on headers and
index. This translates into a little over 5 MB per hour, or 1-2.5%
overhead for a 700 MB movie. This may not seem like much, but it could
mean the difference between being able to use 700 kbit/sec video or
714 kbit/sec, and every bit of quality counts.
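
The per-hour figure can be checked with a little arithmetic. The
Python sketch below assumes one chunk per video frame and one chunk
per audio frame, with 25 fps video and MP3 audio at 44100 Hz (1152
samples per MP3 frame). Those chunking assumptions are mine, chosen
for illustration; real files may group audio differently.

```python
def avi_overhead_bytes(seconds, video_fps, audio_chunks_per_sec,
                       bytes_per_chunk=24):
    """Header and index bytes for the given duration (assumed chunking)."""
    chunks = seconds * (video_fps + audio_chunks_per_sec)
    return chunks * bytes_per_chunk


if __name__ == "__main__":
    mp3_frames_per_sec = 44100 / 1152  # assumption: one chunk per MP3 frame
    per_hour = avi_overhead_bytes(3600, 25, mp3_frames_per_sec)
    print(f"{per_hour / (1024 * 1024):.2f} MB per hour")
```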

In addition to gross inefficiency, AVI also has the following major
limitations:

1. Only fixed-fps content can be stored. This is particularly limiting
   if the original material you want to encode is mixed content, for
   example a mix of NTSC video and film material. Actually there are
   hacks that can be used to store mixed-framerate content in AVI, but
   they increase the (already huge) overhead fivefold or more so they
   are not practical.

2. Audio in AVI files must be either constant-bitrate (CBR) or
   constant-framesize (i.e. all frames decode to the same number of
   samples). Unfortunately, the most efficient codec, Vorbis, does not
   meet either of these requirements. Therefore, if you plan to store
   your movie in AVI, you'll have to use a less efficient codec such
   as MP3 or AC3.

With all of that said, MEncoder does not support variable-fps output
or Vorbis encoding. Therefore, you may not see these as limitations if
MEncoder is the only tool you will be using to produce your encodes.
However, it is possible to use MEncoder only for the video encoding,
and then use external tools to encode the audio and mux it into
another container format.







III. Basic MEncoder usage

III.1. Selecting codecs & format

Audio and video codecs for encoding are selected with the -oac and
-ovc options, respectively. The following choices are available,
although some may not have been enabled at compile time:

Audio Codecs
mp3lame   Encode VBR or CBR mp3 with LAME
lavc      Use one of libavcodec's audio encoders
pcm       Uncompressed PCM audio
copy      Do not reencode, just copy compressed frames

Video codecs
lavc      Use one of libavcodec's video encoders
xvid      Xvid
raw       Uncompressed video frames
copy      Do not reencode, just copy compressed frames
frameno   Used for 3-pass encoding (not recommended)

Several other video codecs are available, but not recommended. The
lavc audio and video encoders have additional suboptions to select
which codec to use within lavc. The syntax is:

  -lavcopts acodec=audio_codec_name
  -lavcopts vcodec=video_codec_name

Your choices for lavc audio are mp2, ac3, and various adpcm formats
(low efficiency). For lavc video, you have many more choices:

mpeg1video  MPEG-1 video
mpeg2video  MPEG-2 video
mpeg4       MPEG-4 video, standards-compliant
msmpeg4     Pre-standard MPEG-4 used by MS (aka DivX3)
msmpeg4v2   Pre-standard MPEG-4 used by MS (low quality)
msmpeg4v1   Pre-standard MPEG-4 used by MS (low quality)
wmv1        Windows Media Video, V1 (aka WMV7)
wmv2        Windows Media Video, V2 (aka WMV8)
dvvideo     DV video (used by DV cameras)
mjpeg       Motion JPEG
ljpeg       Lossless JPEG
ffv1        Lossless ffmpeg video codec #1 (slow)
huffyuv     A standard lossless codec

...and lots more that aren't worth mentioning for most people.



III.2. Selecting input file or device

MEncoder can encode from files or directly from a DVD or VCD disc.
Simply include the filename on the command line to encode from a file,
or dvd://titlenumber or vcd://tracknumber to encode from a DVD title
or VCD track. If you have already copied a DVD to your hard drive and
wish to encode from the copy, you should still use the dvd:// syntax,
along with -dvd-device followed by the path to the copied DVD root.
The -dvd-device and -cdrom-device options can also be used to override
the paths to the device nodes for reading directly from disc, if the
defaults of /dev/dvd and /dev/cdrom do not work on your system.

When encoding from DVD, it is often desirable to select a chapter or
range of chapters to encode. You can use the -chapter option for this
purpose. For example, -chapter 1-4 will only encode chapters 1 through
4 from the DVD. This is especially useful if you will be making a 1400
MB encode targeted for two CDs, since you can ensure the split occurs
exactly at a chapter boundary rather than in the middle of a scene.

If you have a supported TV capture card, you can also encode from the
TV-in device. Use tv://channelnumber as the filename, and -tv to
configure various capture settings. DVB input works similarly.


III.3. Loading video filters

Learning how to use MEncoder's video filters is essential to producing
good encodes. All video processing is performed through the filters --
cropping, scaling, color adjustment, noise removal, sharpening,
deinterlacing, telecine, inverse telecine, and deblocking, just to
name a few. Along with the vast number of supported input formats, the
variety of filters available in MEncoder is one of its main advantages
over other similar programs.

Filters are loaded in a chain using the -vf option:

  -vf filter1=options,filter2=options,...

Most filters take several numeric options separated by colons, but the
syntax for options varies from filter to filter, so read the man page
for details on the filters you wish to use.

Filters operate on the video in the order they are loaded. For
example, the following chain:

  -vf crop=688:464:12:4,scale=640:464

will first crop the 688x464 region of the picture with upper-left
corner at (12,4), and then scale the result down to 640x464.
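
If you generate command lines from a script, the chain syntax is easy
to build. This Python sketch uses a made-up helper, vf_chain, and
assumes colon-separated options, which holds for most filters but not
all (check the man page).

```python
def vf_chain(filters):
    """Build a -vf argument from (name, options) pairs, in load order."""
    parts = []
    for name, options in filters:
        if options:
            parts.append(name + "=" + ":".join(str(o) for o in options))
        else:
            parts.append(name)
    return ",".join(parts)


if __name__ == "__main__":
    chain = vf_chain([("crop", (688, 464, 12, 4)), ("scale", (640, 464))])
    print("-vf", chain)
```

Because the list is ordered, the crop is applied before the scale,
exactly as in the hand-written example.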

Certain filters need to be loaded at or near the beginning of the
filter chain, in order to take advantage of information from the video
decoder that will be lost or invalidated by other filters. The
principal examples are pp (postprocessing, only when it is performing
deblock or dering operations), spp (another postprocessor to remove
MPEG artifacts), pullup (inverse telecine), and softpulldown (for
converting soft telecine to hard telecine).

Advanced topics in filter chains and usage information for specific
filters will follow in chapters IV and V, as they are needed for the
topics covered.



III.4. Notes on A/V sync

MEncoder's audio/video synchronization algorithms were designed with
the intention of recovering files with broken sync. However they seem
to cause unnecessary skipping and duplication of frames, and possibly
slight A/V desync, when used with proper input. It is therefore
recommended that you switch to basic A/V sync with the -mc 0 option,
or put this in your ~/.mplayer/mencoder config file, as long as you
are only working with good sources (DVD, TV capture, high quality
MPEG-4 rips, etc) and not broken ASF/RM/MOV files.

If you want to further guard against strange frame skips and
duplication, you can use both -mc 0 and -noskip. This will prevent ALL
A/V sync, and copy frames one-to-one, so you cannot use it if you will
be using any filters that unpredictably add or drop frames, or if your
input file has variable framerate! Therefore, using -noskip is not in
general recommended.

The so-called "three-pass" encoding which MEncoder supports has been
reported to cause A/V desync. This will definitely happen if it is
used in conjunction with certain filters; therefore, it is now
recommended NOT to use three-pass mode. This feature is only left for
compatibility purposes and for expert users who understand when it is
safe to use and when it is not. If you have never heard of three-pass
mode before, forget that we even mentioned it!

There have also been reports of A/V desync when encoding from stdin
with MEncoder. Do not do this! Always use a file or CD/DVD/etc device
as input.





IV. Encoding procedures

IV.1. Encoding progressive video

As long as your input video is progressive (see section I.1), 


Let's finally see a few examples:

  Encoding from 2:2 pulldown PAL DVD, title 1
  2.35:1 picture aspect
  1200 kbit/sec mpeg4 video
  128 kbit/sec average-bitrate mp3 audio

  mencoder dvd://1 -vf crop=712:432,scale=640:288 -mc 0 -oac mp3lame \
  -lameopts abr:br=128 -ovc lavc -lavcopts vcodec=mpeg4:vbitrate=1200

The crop size was presumably obtained by using the cropdetect filter
in MPlayer, or experimenting first with crop rectangles in MPlayer.
The output framerate will be 25 fps, the same as the original DVD. It
would be preferable to adjust the playback speed to match the original
24 fps theatrical rate, but this is not yet possible with MEncoder.
The options we pass to libavcodec are the bare minimum, and will yield
relatively poor quality. We will refine them in subsequent sections.

Now, a second example:

  Encoding from soft-telecined NTSC DVD, title 3
  2.35:1 picture aspect
  900 kbit/sec mpeg4 video
  Keeping the original AC3 audio

  mencoder dvd://3 -vf crop=708:360,scale=640:288 -mc 0 -oac copy \
  -ovc lavc -lavcopts vcodec=mpeg4:vbitrate=900 -ofps 23.976023976

This example is very similar to the first example, except for the
-ofps option to adjust the output framerate. Unless you tell it
otherwise, MEncoder takes its output framerate from the input
framerate. This is reported as 29.97 fps (actually 30000/1001), or
rather, 29.97 pairs of fields per second. But since the DVD is
soft-telecined, 1/5 of these fields are not actually present, but
intended to be added by the player when it telecines the movie in
realtime. There are actually only 23.976 (24000/1001) frames per
second. If you leave the framerate at the default, 29.97, it will
still work, but every 4th frame will get encoded in duplicate, making
the motion appear choppy.
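
A simplified model shows where the duplicates come from. The Python
sketch below assumes that each output frame simply shows the most
recent source frame at that moment. This is an assumption made for
illustration, not a description of MEncoder's actual sync code.

```python
import math
from fractions import Fraction

FILM = Fraction(24000, 1001)   # frames actually stored on the DVD
NTSC = Fraction(30000, 1001)   # framerate reported for the stream


def source_frame_for(output_index, in_fps=FILM, out_fps=NTSC):
    """Index of the source frame shown at a given output frame."""
    return math.floor(output_index * in_fps / out_fps)


if __name__ == "__main__":
    print([source_frame_for(k) for k in range(10)])
```

In this model source frames 0, 4, 8, ... each appear twice, which is
the "every 4th frame" duplication described above.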

Finally, a comment on the number 23.976023976. You'll often see
recommendations to use -ofps 23.976, but this is wrong. MEncoder will
reduce 23.976 to 2997/125, which is not the same as 24000/1001. So in
order to get the right framerate written in the output file's header,
always use plenty of precision.
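
You can verify the arithmetic with exact fractions. This Python sketch
uses only the standard fractions module; it shows what the decimal
strings mean mathematically and makes no claim about how MEncoder
parses them internally.

```python
from fractions import Fraction

ntsc_film = Fraction(24000, 1001)
short = Fraction("23.976")
longer = Fraction("23.976023976")

if __name__ == "__main__":
    print(short)                 # the three-decimal value as a fraction
    print(ntsc_film - short)     # error of the three-decimal value
    print(float(ntsc_film - longer))
    # Closest fraction to the long decimal with a denominator up to 1001:
    print(longer.limit_denominator(1001))
```

The three-decimal value reduces to 2997/125, while the longer decimal
is close enough to 24000/1001 that a bounded-denominator search
recovers it exactly.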




IV.2. Two-pass encoding

The complexity (and thus the number of bits) required to compress the
frames of a movie can vary greatly from one scene to another. Modern
video encoders can adjust to these needs as they go and vary the
bitrate. However, they cannot exceed the requested average bitrate for
long stretches of time, because they do not know the bitrate needs of
future scenes.

Two-pass encoding solves this problem by encoding the movie twice.
During the first pass, statistics are generated regarding the number
of bits used by each frame and the quantization level (quality) at
which it was encoded. Then, when the second pass begins, the encoder
reads these statistics and redistributes the bits from frames where
they are in excess to frames that are suffering from low quality.
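
The idea can be sketched with a toy model in Python. This is not the
rate control algorithm used by libavcodec or XviD; the functions, the
"bits = complexity / quantizer" rule and the sample numbers are all
assumptions made up for illustration.

```python
def first_pass(frame_complexities, quantizer=4.0):
    """Pretend to encode at a fixed quantizer and log the bits per frame.

    Toy model: a frame costs complexity / quantizer bits.
    """
    return [c / quantizer for c in frame_complexities]


def second_pass_budget(first_pass_bits, average_bits_per_frame):
    """Rescale the logged sizes so that the total matches the budget."""
    total_budget = average_bits_per_frame * len(first_pass_bits)
    scale = total_budget / sum(first_pass_bits)
    return [bits * scale for bits in first_pass_bits]


# Two quiet frames, two action frames, one quiet frame.
complexities = [1000, 1200, 8000, 9000, 800]
log = first_pass(complexities)
budget = second_pass_budget(log, average_bits_per_frame=1000)

for c, b in zip(complexities, budget):
    print(f"complexity {c:5d} -> {b:7.1f} bits")
```

The complex frames end up with far more than the average of 1000 bits
and the simple ones with far less, while the total stays on budget. A
single-pass encoder cannot do this because it has not seen the later
frames yet.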

In order for the process to work properly, the encoder should be given
exactly the same sequence of frames during both passes. This means
that the same filters must be used, the same encoder parameters must
be used (with the possible exception of bitrate), and the same frame
drops and duplications (if any) must take place.

In theory it's possible to use -oac pcm or -oac copy during the first
pass to avoid spending time encoding the audio. However, this can
result in slight variations in which frames get dropped or duplicated,
so it may be preferable to encode the audio during the first pass as
well as the second. This also allows you to examine the final audio
bitrate and filesize, and to adjust the audio or video bitrate
slightly between passes if you don't meet your target size.
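
If you are aiming for a target file size, the video bitrate can be
estimated from the size, the running time and the audio bitrate. The
Python sketch below is a rough calculation only: the function name is
hypothetical, it assumes 1 kbit = 1000 bits, and it ignores container
overhead unless you pass a guess for it.

```python
def video_bitrate_kbit(target_mib, duration_s, audio_kbit,
                       overhead_fraction=0.0):
    """Estimate the video bitrate (kbit/sec) that fills target_mib.

    overhead_fraction is a guessed share of the file lost to container
    overhead; the default of 0 ignores it.
    """
    total_kbit = target_mib * 1024 * 1024 * 8 / 1000
    usable_kbit = total_kbit * (1 - overhead_fraction)
    return usable_kbit / duration_s - audio_kbit


# A 2-hour movie with 96 kbit/sec audio on a 700 MiB target.
rate = video_bitrate_kbit(700, 2 * 3600, 96)
print(round(rate))   # roughly 720
```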

Here is an example:

  Encoding from an existing AVI file
  500 kbit/sec mpeg4 video
  96 kbit/sec average-bitrate mp3 audio

  mencoder bar.avi -vf scale=448:336 -mc 0 -oac mp3lame -lameopts \
  abr:br=96 -ovc lavc -lavcopts vcodec=mpeg4:vbitrate=500:vpass=1

  mencoder bar.avi -vf scale=448:336 -mc 0 -oac mp3lame -lameopts \
  abr:br=96 -ovc lavc -lavcopts vcodec=mpeg4:vbitrate=500:vpass=2

Note the addition of the vpass option in this example. If vpass is
not specified, single-pass encoding is performed. If vpass=1, a log
file is written with statistics from the first pass. If vpass=2, the
log file is read and the second pass is encoded based on those
statistics.

If you do not want to overwrite the output from the first pass when
you begin the second, you can use the -o option to choose a different
output filename. If you are short on disk space or don't want the
extra disk wear from writing the file twice, you can use -o /dev/null
during the first pass instead. However, sometimes it is beneficial to
watch the first-pass file before beginning the second pass to make
sure nothing went wrong in the encoding.
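
Since both passes must see the same filters and encoder parameters,
it can help to generate the two command lines from a single
description. The Python sketch below does that using only options
already shown in this section. The helper name and the output
filename bar-final.avi are hypothetical.

```python
def two_pass_commands(source, final_output, vf, audio_opts, lavcopts):
    """Return the argument lists for pass 1 and pass 2.

    Everything except vpass and the -o target is shared, so both
    passes use the same filters and encoder parameters.
    """
    common = (["mencoder", source, "-vf", vf, "-mc", "0"]
              + audio_opts + ["-ovc", "lavc"])
    commands = []
    # Pass 1 discards its video output; pass 2 writes the real file.
    for vpass, output in ((1, "/dev/null"), (2, final_output)):
        commands.append(
            common + ["-lavcopts", f"{lavcopts}:vpass={vpass}", "-o", output]
        )
    return commands


pass1, pass2 = two_pass_commands(
    "bar.avi", "bar-final.avi", "scale=448:336",
    ["-oac", "mp3lame", "-lameopts", "abr:br=96"],
    "vcodec=mpeg4:vbitrate=500",
)
print(" ".join(pass1))
print(" ".join(pass2))
```

The printed lines can be pasted into a shell, or the lists can be
handed to subprocess.run() directly.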

Next, an example using XviD instead of libavcodec:

  Encoding from an existing AVI file
  400 kbit/sec mpeg4 video
  Copying the existing audio stream unmodified

  mencoder foo.avi -vf scale=320:240 -mc 0 -oac copy -ovc xvid \
  -xvidencopts bitrate=400:pass=1

  mencoder foo.avi -vf scale=320:240 -mc 0 -oac copy -ovc xvid \
  -xvidencopts bitrate=400:pass=2

The options used are slightly different, but the process is otherwise
the same.




IV.3. Encoding interlaced video

If the movie you want to encode is interlaced (NTSC video or PAL
video), you will need to choose whether you want to deinterlace or
not. While deinterlacing will make your movie usable on progressive
scan displays such as computer monitors and projectors, it comes at a
cost: the field rate of 50 or 59.94 fields per second is halved to 25
or 29.97 frames per second, and roughly half the information in your
movie will be lost during scenes with significant motion.

Therefore, if you are encoding for high quality archival purposes, it
is recommended not to deinterlace. You can always deinterlace the
movie at playback time when displaying it on progressive scan devices,
and future players will be able to deinterlace to full fieldrate,
interpolating 50 or 59.94 entire frames per second from the interlaced
video.

Special care must be taken when working with interlaced video:

1. Crop height and y-offset must be multiples of 4.

2. Any vertical scaling must be performed in interlaced mode.

3. Postprocessing and denoising filters may not work as expected
   unless you take special care to operate them a field at a time, and
   they may damage the video if used incorrectly.
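
For the first rule, a crop rectangle found by eye can be snapped to
multiples of 4 before it goes into the crop filter. The Python helper
below is a sketch of one way to do it; the function name and the
sample values are hypothetical. It shrinks the rectangle rather than
growing it, so it never extends past the area you originally chose.

```python
def align_interlaced_crop(height, y_offset, multiple=4):
    """Return (height, y_offset) snapped to multiples of `multiple`.

    The top edge moves down to the next multiple, and the height is
    reduced so the bottom edge stays inside the original rectangle.
    """
    bottom = y_offset + height
    new_y = -(-y_offset // multiple) * multiple          # round up
    new_height = ((bottom - new_y) // multiple) * multiple  # round down
    return new_height, new_y


print(align_interlaced_crop(362, 59))   # (360, 60)
print(align_interlaced_crop(360, 60))   # already aligned: (360, 60)
```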

With these things in mind, here is our first example:

  mencoder capture.avi -mc 0 -oac lavc -ovc lavc -lavcopts \
  vcodec=mpeg2video:vbitrate=6000:ilme:ildct:acodec=mp2:abitrate=224

Note the ilme and ildct options, which enable interlaced motion
estimation and interlaced DCT respectively.






















