[MPlayer-dev-eng] design for motion-adaptive deinterlacer

D Richard Felker III dalias at aerifal.cx
Sat Apr 24 22:09:35 CEST 2004


I've been thinking about writing a good motion-adaptive deinterlacer,
since they all seem to suck. So I'm sending a few ideas to the list
for feedback (or in case someone wants to implement it since I'm
lazy... :)

Basic procedure is:

[Note: all operations are applied to the older field]

1. Identify areas of motion.

Using a simple difference threshold pixel-by-pixel is no good. If the
threshold is too low you'll get tons of motion from temporal noise,
while if it's too high, you'll miss slow changes in the low-frequency
components, which cause very ugly, noticeable combing (think of a
gradual change in light level).

So the search for motion needs to be done in a small windowed
frequency space, with low-frequency coefficients weighted much higher
than high-frequency ones.
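
One way this could be sketched (illustrative Python, not filter code;
approximating the low-frequency coefficients by the window mean and
the high-frequency content by the residuals is my simplification, and
the weights and window size are made-up):

```python
def motion_score(old, new, x0, y0, size=8, w_lo=4.0, w_hi=1.0):
    """Motion score for a size x size window at (x0, y0).

    Crude stand-in for the windowed frequency-space comparison: the
    difference of the window means approximates the low-frequency
    coefficients (weighted heavily, so slow light-level changes are
    caught), and the mean absolute residual difference approximates
    the high-frequency content (weighted lightly, so temporal noise
    does not dominate)."""
    n = size * size
    a = [old[y0 + j][x0 + i] for j in range(size) for i in range(size)]
    b = [new[y0 + j][x0 + i] for j in range(size) for i in range(size)]
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    lo = abs(mean_a - mean_b)
    hi = sum(abs((p - mean_a) - (q - mean_b)) for p, q in zip(a, b)) / n
    return w_lo * lo + w_hi * hi
```

With this weighting, a uniform +10 brightness shift scores higher than
pixel-level noise of the same amplitude, which is exactly the behavior
a flat per-pixel threshold cannot give you.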

2. Smooth the motion map.

Tiny components should be eliminated as false positives. The remaining
components should be expanded at their boundaries in case we missed
some pixels.
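
A sketch of that cleanup pass (illustrative Python; the 4-connectivity,
the min_size threshold, and one-pixel dilation are my arbitrary
choices):

```python
def smooth_map(mask, min_size=4):
    """Clean a binary map: drop connected components smaller than
    min_size (false positives), then dilate the survivors by one
    pixel (4-neighbourhood) to catch pixels the detector missed."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    seen = [[False] * w for _ in range(h)]
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and not seen[sy][sx]:
                # flood-fill one 4-connected component
                comp, stack = [], [(sy, sx)]
                seen[sy][sx] = True
                while stack:
                    y, x = stack.pop()
                    comp.append((y, x))
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                if len(comp) >= min_size:
                    for y, x in comp:
                        out[y][x] = 1
    # expand surviving components at their boundaries
    grown = [row[:] for row in out]
    for y in range(h):
        for x in range(w):
            if out[y][x]:
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        grown[ny][nx] = 1
    return grown
```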

3. Within the motion map, look for combing.

All local extrema in the vertical direction are potential points of
combing.
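
The extremum test might look like this (illustrative Python; the
product-of-differences form and the threshold value are my own way of
rejecting noise-level extrema, not from the design above):

```python
def combing_map(frame, threshold=100):
    """Mark pixels that are vertical local extrema: brighter (or
    darker) than both vertical neighbours.  These are the candidate
    combing points."""
    h, w = len(frame), len(frame[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(w):
            d1 = frame[y][x] - frame[y - 1][x]
            d2 = frame[y][x] - frame[y + 1][x]
            # at a local extremum both differences share a sign, so
            # their product is positive; thresholding the product
            # rejects small extrema caused by noise
            if d1 * d2 > threshold:
                out[y][x] = 1
    return out
```

A combed region (rows alternating between the two fields' levels)
lights up solidly, while a smooth vertical gradient produces nothing.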

4. Smooth the combing map.

Same procedure as for the motion map.

5. Identify pairs of similar size/shape components in the combing map.

If we find such components, generate a conformal map from one to the
other, mark them as a common region, and use the map between them to
initialize a map of motion transformations.

6. Perform per-pixel motion estimation.

Comparison function should use a small window with smooth falloff.
Test within a neighborhood of the region in the combing map. Use
guesses from step 5 as a starting point, if present. Optionally also
use motion vectors from the decoding phase as a guide, if they are
available.
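
A rough sketch of such a weighted-window search (illustrative Python;
the 1/(1+d^2) falloff, window radius, and search range are arbitrary
choices of mine, and it assumes the window plus search range stays
inside the frame):

```python
def estimate_motion(old, new, x0, y0, radius=2, search=3):
    """Find the displacement (dx, dy) that best aligns the window of
    `old` centred at (x0, y0) with `new`, searching a +/-search pixel
    neighbourhood.  The comparison is a sum of absolute differences
    weighted by a window with smooth falloff from the centre."""
    # smooth falloff: weight 1/(1 + distance^2) from the window centre
    weights = {(dy, dx): 1.0 / (1 + dx * dx + dy * dy)
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)}
    best, best_vec = None, (0, 0)
    for vy in range(-search, search + 1):
        for vx in range(-search, search + 1):
            cost = sum(w * abs(old[y0 + dy][x0 + dx] -
                               new[y0 + dy + vy][x0 + dx + vx])
                       for (dy, dx), w in weights.items())
            if best is None or cost < best:
                best, best_vec = cost, (vx, vy)
    return best_vec
```

In a real filter the guesses from step 5 (or decoder motion vectors)
would seed the search instead of scanning the whole neighborhood.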

7. Apply motion compensation to bring the fields in line.

Deform the older field so that it mostly matches with the newer field,
using the motion vectors/transformations from step 6.

8. Perform a final combing test.

If any regions of combing remain, blend them away.
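
A minimal sketch of that fallback (illustrative Python; the vertical
[1 2 1]/4 kernel is my choice of a simple lowpass blend, not something
specified above):

```python
def blend_combing(frame, comb_map):
    """Where combing survives motion compensation, fall back to a
    vertical [1 2 1]/4 lowpass blend: it trades some vertical
    resolution for removing the comb, but only on flagged pixels."""
    h, w = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for y in range(1, h - 1):
        for x in range(w):
            if comb_map[y][x]:
                out[y][x] = (frame[y - 1][x] + 2 * frame[y][x]
                             + frame[y + 1][x]) // 4
    return out
```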



I've performed tests for steps 3 and 4, and the results look promising. The
resulting data also suggested a procedure like 5. I suspect the
results would be much better with steps 1 and 2 to help us throw away
false positives (by only checking for combing where there's motion).

I also tested some motion estimation/compensation stuff, but I have no
experience with that so it came out really bad... :)

Naturally this process (at least with steps 5-7) is not suitable for
realtime use. (I don't really care about realtime since my cpu is too
slow to do any deinterlacing in realtime except -vf field=0... :) The
point is high-quality deinterlacing for producing progressive content
from video sources.

BTW, some notes on why kerndeint and other similar deinterlacers
suck... For one thing, they deinterlace anything with motion, without
checking for combing. If there's no combing even with motion, then
there's no need to deinterlace. With pure video content the two should
always come together, but with telecine or added computer
effects/overlay during production, it's easy to have noncombed motion.
Also, most of these filters I see use bad deinterlacing, such as a
convolution kernel, which produces horrible ghosting/outlines. Quite
often with video, you _have_ the information for the missing field,
but it's just misaligned/distorted from camera panning or motion. So
why not try to recover it, instead of throwing it away or messing it
up with a convolution?

Rich



