[FFmpeg-user] decomb versus deinterlace

Sun Apr 19 10:35:42 EEST 2020

On 04/19/2020 02:56 AM, pdr0 wrote:
> Mark Filipak wrote
>>
>> I would love to use motion compensation but I can't, at least not with
>> ffmpeg. Now, if there was
>> such a thing as smart telecine...
>>
>> A A A+B B B -- input
>> A A     B B -- pass 4 frames directly to output
>>     A A+B B   -- pass 3 frames to filter
>>        X      -- motion compensated filter to output
>>
>> Unfortunately, ffmpeg can't do that because ffmpeg does not recurse the
>> filter complex. That is,
>> ffmpeg can pass the 2nd & 4th frames to the output, or to the filter, but
>> not both.
> 
> What kind of motion compensation did you have in mind?

Hahahaha... ANY motion compensation. ...Is there more than one?

> "recurse" works if your timestamps are correct.

Carl Eugen said that, too. How could the timestamps be wrong? I was using the test video ("MOVE" 
"TEXT") that you gave me. I proved that, for that video (which I assumed was perfect), ffmpeg 
traversal of the filter complex does not occur.

> Interleave works with
> timestamps. If "X" has the timestamp of position 3 (A+B or the original
> combed frame)

Interleave may work with timestamps (PTS), but is interleave going to prevent recursion to the 
input of a 2nd select?
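
To make the timestamp argument concrete, here is a toy model in Python. It is not ffmpeg code and 
says nothing about how ffmpeg actually schedules a filter graph. It only assumes that a merge step 
emits whichever queued frame has the lowest PTS, and that the filtered frame "X" keeps the PTS of 
the combed frame it replaces. The names (CADENCE, is_combed, and so on) are mine, for illustration.

```python
import heapq

CADENCE = ["A", "A", "A+B", "B", "B"]  # one 55-telecine cycle

def is_combed(n):
    # Frame n is the combed one when (n + 1) mod 5 equals 3
    return (n + 1) % 5 == 3

# Ten frames; for simplicity the PTS is just the frame index
frames = [(pts, label) for pts, label in enumerate(CADENCE * 2)]

# Branch 1: clean frames pass through untouched
passthrough = [f for f in frames if not is_combed(f[0])]

# Branch 2: combed frames are replaced by "X" but keep their original PTS
filtered = [(pts, "X") for pts, _ in frames if is_combed(pts)]

# Merge step: always emit the frame with the lowest PTS next
merged = list(heapq.merge(passthrough, filtered))
labels = [label for _, label in merged]
print(" ".join(labels))  # A A X B B A A X B B
```

In this model "X" lands back in position 3 only because its timestamp was preserved. If the 
filter branch changed the PTS, the merge order would change too.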

>>> Double rate deinterlacing keeps all the temporal information. Recall what
>>> "interlace content" really means. It's 59.94 distinct moments in time
>>> captured per second. In motion you have 59.94 different images.
>>
>> That may be true in general, but the result of 55-telecine is
>> A A A+B B B ... repeat to end-of-stream
>> So there aren't 59.94 distinct moments. There're only 24 distinct moments
>> (the same as the input).
> 
> Exactly !
> 
> That refers to interlaced content. This does not refer to your specific
> case. You don't have interlaced content. Do you see the distinction ?

Well, of course, but why are we (you) discussing interlaced content?

> You
> have 23.976 distinct moments/sec, just "packaged" differently with repeat
> fields . That's progressive content. Interlaced content means 59.94 distinct
> moments/sec
> 
>>> Single rate deinterlacing drops 1/2 the temporal information (either
>>> even,
>>> or odd fields are retained)
>>>
>>> single rate deinterlace: 29.97i interlaced content => 29.97p output
>>> double rate deinterlace: 29.97i interlaced content => 59.94p output
>>
>> There is no 29.97i interlaced content. There's p24 content (the source)
>> and p60 content that combs
>> frames 2 7 12 17 etc. (the transcode).
> 
> I thought it was obvious but those comments do not refer to your specific
> side case. You don't have 29.97i interlaced content.

Okay, it's academic. I'd like to stick to 55-telecine. Okay? I was getting confused.
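
For reference, here is the 55-telecine cadence as a small Python sketch. It models each output 
frame as a (top field source, bottom field source) pair. Which field of the combed frame comes from 
A and which from B is an assumption on my part, and the function names are made up for this post.

```python
def telecine_55(frames):
    """Turn each input pair (A, B) into five output frames: A A A+B B B."""
    out = []
    for i in range(0, len(frames) - 1, 2):
        a, b = frames[i], frames[i + 1]
        # (top, bottom) field sources; the middle frame mixes A and B
        out += [(a, a), (a, a), (a, b), (b, b), (b, b)]
    return out

def is_combed(frame):
    top, bottom = frame
    return top != bottom

source = list(range(24))      # one second of p24, frames numbered 0..23
t60 = telecine_55(source)     # one second of 55-telecined output

combed = [n for n, f in enumerate(t60) if is_combed(f)]
moments = {src for f in t60 for src in f}

print(len(t60))       # 60
print(combed[:4])     # [2, 7, 12, 17]
print(len(moments))   # 24
```

So one second of output has 60 frames but still only 24 distinct moments, and the combed frames 
are exactly the ones where (n + 1) mod 5 equals 3.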

> You apply deinterlacing to interlaced content.  You don't deinterlace
> progressive content (in general)

I agree wholeheartedly. So why are we discussing deinterlace? I'll tell you why (I think). It's 
because the only ffmpeg filters that will decomb are called deinterlacers.

> 29.97i interlaced content are things like soap operas, sports, some types of
> home video, documentaries

Agreed.

>> Once again, this is a terminology problem. You are one person who
>> acknowledges that terminology
>> problems exist, and that I find heartening. Here's how I have resolved it:
>> I call a 30fps stream that's 23-telecined from p24, "t30" -- "t" for
>> "telecine".
>> I call a 30fps stream that's 1/fieldrate interlaced, "s30" -- "s" for
>> "scan" (see Note).
>> I call a 60fps stream that's 23-telecined, then frame doubled, "t30x2".
>> I call a 60fps stream that's 55-telecined from p24, "t60".
>> Note: I would have called this "i30", but "i30" is already taken.
>>
>> Now, the reason I write "p24" instead of "24p" -- I'm not the only person
>> who does this -- is so it
>> fits an overall scheme that's compact, but that an ordinary human being
>> can pretty much understand:
>> 16:9-480p24 -- this is soft telecine
>> 4:3-480t30 -- this is hard telecine
>> 16:9-1080p24
>> I'm not listing all the possible combinations of aspect ratio & line count
>> & frame rate, but you
>> probably get the idea.
> 
> It's descriptive, but the problem is not very many people use this
> terminology. NLEs, professional programs, broadcast stations, post
> production houses do not use this notation. e.g. "480p24" anywhere else
> would be native progressive such as web video. Some places use pN to denote
> native progressive. So you're going to have problems with communication... I
> would write out the full sentence

In my guide, I do explain what "16:9-480t30" means, for example. I do it one time, in one chapter 
that's dedicated to terminology. Then I use it throughout. That saves having to write a paragraph 
(and readers having to read a paragraph) every time I refer to 30fps hard telecine, for example.
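
The notation is regular enough to be read mechanically. As a rough sketch (Python, with a 
hypothetical parse_format helper that is not part of any tool), assuming the form is always 
aspect-lines-kind-fps with kind being one of p, t or s:

```python
import re
from typing import NamedTuple

class StreamFormat(NamedTuple):
    aspect: str   # e.g. "16:9"
    lines: int    # e.g. 480
    kind: str     # spelled-out meaning of p / t / s
    fps: int      # nominal frame rate

# Meanings as used in this thread; the wording is illustrative
KINDS = {"p": "progressive", "t": "telecine", "s": "scan"}

_PATTERN = re.compile(r"(\d+:\d+)-(\d+)([pts])(\d+)")

def parse_format(text):
    """Split a string such as '4:3-480t30' into its four parts."""
    m = _PATTERN.fullmatch(text)
    if m is None:
        raise ValueError(f"not in aspect-lines-kind-fps form: {text!r}")
    aspect, lines, kind, fps = m.groups()
    return StreamFormat(aspect, int(lines), KINDS[kind], int(fps))

print(parse_format("4:3-480t30"))
```

Anything outside that form (for example a frame-doubled "t30x2") is rejected rather than guessed at.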

>> Erm... I'm not analyzing a mystery video. I'm transcoding from a known
>> source. I know what the
>> content actually is.
> 
> I thought it was obvious, those comments refer to "in general", not your
> specific case. More communication issues..
> 
> 
>>> There are dozens of processing algorithms (not just talking about
>>> ffmpeg).
>>> There are many ways to "decomb" something. The one you ended up using
>>> is
>>> categorized as a blend deinterlacer because the top and bottom field are
>>> blended with each other. If you examine the separated fields, the fields
>>> are co-mingled, no longer distinct. You needed to retain both fields for
>>> your purpose
>>
>> No, I don't. I don't want to retain both fields. I want to blend them.
>> That's what
>> 'pp=linblenddeint' does, and that's why I'm happy with it.
> 
> Yes, that should have said retain both fields blended. The alternative is
> dropping a field like a standard deinterlacer

Thank you. I think my sanity is returning now.

>>> There is no distinction in terms of distribution of application for this
>>> type of filter.  You put the distinction on filtering specific frames by
>>> using select.  You could apply blend deinterlace to every frame too (for
>>> interlaced content) - how is that any different visually in terms of any
>>> single frame there vs. your every 5th frame ?
>>
>> I honestly don't know. What I do know is if I pass
>> select='not(eq(mod(n+1\,5)\,3))' to the output
>> unaltered but I filter select='eq(mod(n+1\,5)\,3)',pp=linblenddeint before the
>> output, the video looks
>> better on a 60Hz TV. I don't want to pass the progressive frames through
>> 'pp=linblenddeint'.
> 
> It was meant as a rhetorical question.., failed...
> 
> Your combed frame looks exactly like every frame for an interlaced video. If
> I took a 59.94p video and converted it to 29.97i, every frame would look
> like your combed frame.

That's not so. For your hypothetical video, the two fields in a combed frame would be 1/60th second 
apart. For a telecined video, the two fields are 1/24th second apart -- much worse.
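
The arithmetic, as a quick Python check. It uses the nominal rates 60 and 24 rather than 59.94 and 
23.976, which is my simplification.

```python
from fractions import Fraction

field_rate = Fraction(60)   # fields per second in true interlaced video (nominal)
film_rate = Fraction(24)    # frames per second in the p24 source (nominal)

# Time between the two fields that make up one combed frame
interlaced_gap = 1 / field_rate   # neighbouring fields of interlaced video
telecine_gap = 1 / film_rate      # fields taken from adjacent p24 frames A and B

ratio = telecine_gap / interlaced_gap

print(f"{float(interlaced_gap) * 1000:.1f} ms")  # 16.7 ms
print(f"{float(telecine_gap) * 1000:.1f} ms")    # 41.7 ms
print(ratio)                                     # 5/2
```

The 2.5x ratio is the same with the exact NTSC rates, because the 1000/1001 factor cancels.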

> The point is, visually, when looking at individual
> frames, combing looks the same, the mechanism is the same.

(see above).

> The underlying
> video can be different in terms of content (it might be interlaced, it might
> be progressive) , but you can't determine that on a single frame

Sure I can. If I put 1/60th second of comb next to 1/24th second of comb, I think anyone would see 
that the 1/24th second frame looks worse.

>>> You could have used a different filter, maybe a box blur , applied on the
>>> same frame selection . Would you still call that "decomb" ?
>>
>> Yes, but only because I call the original frame "combed".
> 
> Same here - anything that works to reduce the "combing" can be categorized
> as a decomb filter, including deinterlacing. Including blurring. Including
> whatever x,y,z filter. It's non-specific

You know, I think we agree on everything.

>>>> Regarding inverse telecine (aka INVT), I've never seen INVT that didn't
>>>> yield back uncombed, purely
>>>> progressive pictures (as "picture" is defined in the MPEG spec). Can
>>>> you/will you enlighten me
>>>> because it's simply outside my experience.
>>>
>>> It happens all the time. This is simply reversing the process. Inverse
>>> telecine is the same thing as "removing pulldown". You get back the
>>> original
>>> progressive frames you started with.
>>
>> Okay, that's what I thought. Since INVT produces the original p24, it's
>> not combed. I thought you
>> said that inverse telecine can produce combing. My bad. :)
> 
> Yes.
> 
> But there is something else called "residual combing". It's not the same
> thing, it's when fields are slightly misaligned, such as with some old low
> quality DVDs. It's not as distinct as combing, the lines are very fine.

Shall we stick to 55-telecine, please? This academic ride you are putting me on is making me dizzy.

> Also, sometimes there are things like cadence breaks - such as when edits
> were made before IVTCing. Again, with lower quality productions. So the IVTC
> process in software often includes adaptive field matching and additional
> post processing and comb detection. It's not just a fixed pattern.