[FFmpeg-user] decomb versus deinterlace

Mark Filipak markfilipak.windows+ffmpeg at gmail.com
Sun Apr 19 09:01:49 EEST 2020


On 04/19/2020 12:26 AM, pdr0 wrote:
> Mark Filipak wrote
>>> Deinterlacing does not necessarily have to be used in the context of
>>> "telecast".  e.g. a consumer camcorder recording home video interlaced
>>> content is technically not "telecast".  Telecast implies "broadcast on
>>> television"
>>
>> You are right of course. I use "telecast" (e.g., i30-telecast) simply to
>> distinguish the origin of
>> scans from hard telecines. Can you suggest a better term? Perhaps
>> "i30-camera" versus "i30"? Or
>> maybe the better approach would be to distinguish hard telecine: "i30"
>> versus "i30-progressive"? Or
>> maybe distinguish both of them: "i30-camera" versus "i30-progressive"?
> 
> Some home video cameras can shoot native progressive modes too - 24p ,
> 23.976p. Some DV cameras shoot 24p advanced pulldown or standard.
> 
>   So why not use a descriptive term for what it actually is in terms of content,
> and how it's arranged or stored?  (see below)

I would adopt a descriptive term for myself, but I fear that ordinary people would find it difficult 
to understand and remember.

>>> The simplest operational definition is double rate deinterlacing
>>> separates
>>> and resizes each field to a frame +/- other processing. Single rate
>>> deinterlacing does the same as double, but discards either even or odd
>>> frames (or fields if they are discarded before the resize)
>>
>> I think I understand your reference to "resize": line-doubling of
>> half-height images to full-height
>> images, right?
> 
> "Resizing " a field in this context is any method of taking a field and
> enlarging it to a full sized frame. There are dozens of different
> algorithms. Line doubling is one method, but that is essentially a "nearest
> neighbor" resize without any interpolation. That's the simplest type. Some
> complex deinterlacers use information from other fields to fill in the
> missing information with adaptive motion compensation

I would love to use motion compensation but I can't, at least not with ffmpeg. Now, if there were 
such a thing as smart telecine...

A A A+B B B -- input
A A     B B -- pass 4 frames directly to output
   A A+B B   -- pass 3 frames to filter
      X      -- motion compensated filter to output

Unfortunately, ffmpeg can't do that because ffmpeg does not recurse the filter complex. That is, 
ffmpeg can pass the 2nd & 4th frames to the output, or to the filter, but not both.
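The routing above can be sketched as a hypothetical Python model (not an ffmpeg filtergraph; 
`mc_filter` is a stand-in for a motion-compensated smoother, which is the part ffmpeg's graph 
can't express here):

```python
# Hypothetical model of the proposed "smart telecine" routing.
# One 55-telecine group of five output frames per two source frames:
#   A  A  A+B  B  B
# Frames 1, 2, 4, 5 pass straight through; frames 2, 3, 4 also feed a
# motion-compensated filter whose single output X replaces A+B.

def smart_telecine_group(group, mc_filter):
    """group is the five-frame cadence [A, A, A+B, B, B]."""
    direct = [group[0], group[1], group[3], group[4]]  # pass 4 frames
    x = mc_filter(group[1], group[2], group[3])        # filter 3 frames
    return direct[:2] + [x] + direct[2:]               # A A X B B

# A stand-in "filter" that just labels its output.
def fake_mc(prev, comb, nxt):
    return f"X({prev},{comb},{nxt})"

out = smart_telecine_group(["A", "A", "A+B", "B", "B"], fake_mc)
```

Note that frames 2 and 4 appear in both paths, which is exactly the duplication described above: a 
single select chain can route each frame to the output or to the filter, but not to both.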

>> But I don't understand how "double rate" fits in. Seems to me that fields
>> have to be converted
>> (resized) to frames no matter what the "rate" is. I also don't understand
>> why either rate or
>> double-rate would discard anything.
> 
> The "rate" describes the output frame rate.

Of course.

> Double rate deinterlacing keeps all the temporal information. Recall what
> "interlace content" really means. It's 59.94 distinct moments in time
> captured per second . In motion you have 59.94 different images.

That may be true in general, but the result of 55-telecine is
A A A+B B B ... repeat to end-of-stream
So there aren't 59.94 distinct moments. There're only 24 distinct moments (the same as the input).
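That arithmetic can be checked with a short Python sketch, modeling each output frame as the set of 
source moments it samples (a blend frame A+B samples two moments but creates no new one):

```python
# Model one second of 55-telecine: 24 source moments -> 60 output frames.

def telecine55(src):
    out = []
    for a, b in zip(src[0::2], src[1::2]):   # two source frames -> five out
        out += [{a}, {a}, {a, b}, {b}, {b}]  # A A A+B B B
    return out

src = list(range(24))            # 24 distinct moments per second
frames = telecine55(src)         # 60 frames per second

distinct_moments = set().union(*frames)   # still only 24 moments
```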

> Single rate deinterlacing drops 1/2 the temporal information (either even,
> or odd fields are retained)
> 
> single rate deinterlace: 29.97i interlaced content => 29.97p output
> double rate deinterlace: 29.97i interlaced content => 59.94p output

There is no 29.97i interlaced content. There's p24 content (the source) and p60 content that combs 
frames 2, 7, 12, 17, etc. (the transcode).

Okay, by "deinterlace" you mean what I mean by "combing". Let's see:
A   B     B+C     C+D     D   ... repeat to end-of-stream == p30
A A B   B B+C B+C C+D C+D D D ... repeat to end-of-stream == p30x2
A A A+B B B   C   C   C+D D D ... repeat to end-of-stream == p60
Sorry, my friend, I just don't see any temporal information missing.
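The three cadences can be modeled the same way, frames as sets of the source moments they sample; 
in each cadence below the union of the frames covers all four moments of an A..D group, i.e., 
nothing temporal is discarded:

```python
# One group of four source moments per cadence; X+Y is a blend of two
# moments.  p30 is 23-telecine, p30x2 doubles each of its frames, and
# p60 is 55-telecine.
def cadences(a, b, c, d):
    p30   = [{a}, {b}, {b, c}, {c, d}, {d}]
    p30x2 = [f for f in p30 for _ in (0, 1)]          # each frame doubled
    p60   = [{a}, {a}, {a, b}, {b}, {b},
             {c}, {c}, {c, d}, {d}, {d}]
    return p30, p30x2, p60

p30, p30x2, p60 = cadences("A", "B", "C", "D")
moments = {"A", "B", "C", "D"}
```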

>>> I know you meant telecine up conversion of 23.976p to 29.97i (not "p").
>>> But
>>> other framerates can be telecined eg. An 8mm 16fps telecine to 29.97i.
>>
>> Well, when I've telecined, the result is p30, not i30. Due to the presence
>> of ffmpeg police, I
>> hesitate to write that ffmpeg outputs only frames -- that is certainly
>> true of HandBrake, though.
>> When I refer to 24fps and 30fps (and 60fps, too) I include 24/1.001 and
>> 30/1.001 (and 60/1.001)
>> without explicitly writing it. Most ordinary people (and most BD & DVD
>> packaging) don't mention or
>> know about "/1.001".

> The result of telecine is progressive content (you started with progressive
> content) , but the output signal is interlaced.

According to the Moving Picture Experts Group, it's not interlaced because the odd/even lines are 
not separated by 1/fieldrate seconds; they are separated by 1/24 sec.

> That's the reason for
> telecine in the first place - that 29.97i signal is required for equipment
> compatibility. So it's commonly denoted as 29.97i  . That can be confusing
> because interlaced content is also 29.97i.  That's why /content/ is used to
> describe everything .

Once again, this is a terminology problem. You are one person who acknowledges that terminology 
problems exist, and that I find heartening. Here's how I have resolved it:
I call a 30fps stream that's 23-telecined from p24, "t30" -- "t" for "telecine".
I call a 30fps stream that's 1/fieldrate interlaced, "s30" -- "s" for "scan" (see Note).
I call a 60fps stream that's 23-telecined, then frame doubled, "t30x2".
I call a 60fps stream that's 55-telecined from p24, "t60".
Note: I would have called this "i30", but "i30" is already taken.

Now, the reason I write "p24" instead of "24p" -- I'm not the only person who does this -- is so it 
fits an overall scheme that's compact, but that an ordinary human being can pretty much understand:
16:9-480p24 -- this is soft telecine
4:3-480t30 -- this is hard telecine
16:9-1080p24
I'm not listing all the possible combinations of aspect ratio & line count & frame rate, but you 
probably get the idea.
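As a sanity check that the scheme is unambiguous, here is a small hypothetical Python parser for 
labels of that shape (the field names are my own, just for illustration):

```python
import re

# Parse labels of the form "<aspect>-<lines><type><rate>[x2]", e.g.
# "4:3-480t30": 4:3 aspect, 480 lines, "t" = telecine, 30 fps.
LABEL = re.compile(r"^(\d+:\d+)-(\d+)([pts])(\d+)(x2)?$")

def parse_label(label):
    m = LABEL.match(label)
    if not m:
        raise ValueError(f"not a valid label: {label!r}")
    aspect, lines, kind, rate, doubled = m.groups()
    kinds = {"p": "progressive", "t": "telecine", "s": "scan"}
    return {"aspect": aspect, "lines": int(lines),
            "kind": kinds[kind], "rate": int(rate),
            "frame_doubled": doubled == "x2"}
```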

> When I'm lazy I use 23.976p notation (but it really means 24000/1001) ,
> because 24.0p is something else - for example, there are both 24.0p and
> 23.976p blu-ray and they are different frame rates . Similarly, I use
> "29.97" (but it really means 30000/1001), because "30.0" is something else.
> You can have cameras or web video as 30.0p. Both exist and are different and
> should be differentiated otherwise you have time and sync issues

I use 24/1.001 because it's shorter. Then, unless it really, really matters, I use 24. So, who's 
lazier? You or me?  ;-)...Me! Me! Me!

>>> "Combing" is just a generic, non-specific visual description. There can
>>> be
>>> other causes for "combing". eg. A warped film scan that causes spatial
>>> field
>>> misalignment can look like "combing". Interlaced content in motion , when
>>> viewed on a progressive display without processing is also described as
>>> "combing" - it's the same underlying mechanism of upper and lower field
>>> taken at different points in time
>>
>> Again, good points. May I suggest that when I use "combing" I mean the
>> frame content that results
>> from a 1/24th second temporal difference between the odd lines of a
>> progressive image and the even
>> line of the same progressive image that results from telecine? If there's
>> a better term, I'll use
>> that better term. Do you know of a better term?
> 
> I know what you're trying to say , but the term "combing" , it's appearance,
> and underlying mechanism is the same.  This is how the term "combing" is
> currently used in both general public and industry professionals. If you
> specifically mean combing on frames from telecine, then you should say so
> , otherwise you will confuse many people.
> 
> You know this already - there is an important distinction between the actual
> underlying video, but on a single frame or 2 field examination you will not
> see that. You need to either 1) separate the fields and examine the field
> pattern over a range with motion or 2) double rate deinterlace and examine
> the pattern over a range with motion . This is how you determine what the
> content actually is, whether it's interlaced or progressive

Erm... I'm not analyzing a mystery video. I'm transcoding from a known source. I know what the 
content actually is.

>>>> Decombing is smoothing combed frames.
>>>
>>> Yes, but this is an ambiguous term. "Decombing" can imply anything from
>>> various methods of deinterlacing to inverse telecine / removing pulldown
>>> .
>>
>> To the best of my knowledge, ffmpeg doesn't use the terms "combing" or
>> "decombing" -- certainly
>> there's no decomb filter. I don't have a term that distinguishes smoothing
>> of a 1/24th second comb
>> (what I call "decombing") from smoothing of a 1/60th second (or 1/50th
>> second) comb that results
>> from deinterlace (which I don't call "decombing"). Can you suggest a term
>> for the latter? Or terms
>> for the both of them?
> 
> Why do you need a specific "term?"

To facilitate communication and understanding.

> There are dozens of processing algorithms (not just talking about ffmpeg).
> There are many ways to "decomb"  something . The one you ended up using is
> categorized as  a blend deinterlacer because the top and bottom field are
> blended with each other. If you examine the separated fields , the fields
> are co-mingled, no longer distinct. You needed to retain both fields for
> your purpose

No, I don't. I don't want to retain both fields. I want to blend them. That's what 
'pp=linblenddeint' does, and that's why I'm happy with it.

> There is no distinction in terms of distribution of application for this
> type of filter.  You put the distinction on filtering specific frames by
> using select.  You could apply blend deinterlace to every frame too (for
> interlaced content) - how is that any different visually in terms of any
> single frame there vs. your every 5th frame ?

I honestly don't know. What I do know is if I pass select='not(eq(mod(n+1\,5)\,3))' to the output 
unaltered but I filter select='eq(mod(n+1\,5)\,3)',pp=linblenddeint before the output, the video looks 
better on a 60Hz TV. I don't want to pass the progressive frames through 'pp=linblenddeint'.
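That two-way split can be modeled in Python (assuming the intended selection is (n+1) mod 5 == 3, 
which matches the combed frames 2, 7, 12, 17 mentioned earlier):

```python
# Model the two-way frame split: frames where (n+1) % 5 == 3 go to the
# blend deinterlacer; everything else passes through untouched.
def route(n_frames):
    blended, passed = [], []
    for n in range(n_frames):          # n is the 0-based frame index
        if (n + 1) % 5 == 3:
            blended.append(n)          # -> pp=linblenddeint
        else:
            passed.append(n)           # -> output unaltered
    return blended, passed

blended, passed = route(20)
```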

> You could have used a different filter, maybe a box blur , applied on the
> same frame selection . Would you still call that "decomb" ?

Yes, but only because I call the original frame "combed".

I think I'll call telecine "combed" frames "t-combed" and the combing that results from interlacing 
odd & even fields "i-combed".

>> Regarding inverse telecine (aka INVT), I've never seen INVT that didn't
>> yield back uncombed, purely
>> progressive pictures (as "picture" is defined in the MPEG spec). Can
>> you/will you enlighten me
>> because it's simply outside my experience.
> 
> It happens all the time. This is simply reversing the process. Inverse
> telecine is the same thing as "removing pulldown". You get back the original
> progressive frames you started with.

Okay, that's what I thought. Since INVT produces the original p24, it's not combed. I thought you 
said that inverse telecine can produce combing. My bad. :)

