[FFmpeg-devel] [PATCH] lavf/vf_freezedetect: improve for the freeze frame detection

Marton Balint cus at passwd.hu
Tue Jul 16 00:07:32 EEST 2019



On Mon, 15 Jul 2019, Limin Wang wrote:

[...]

>> >>>        if (s->width[plane]) {
>> >>>            uint64_t plane_sad;
>> >>>            s->sad(frame->data[plane], frame->linesize[plane],
>> >>>@@ -140,8 +146,12 @@ static int is_frozen(FreezeDetectContext *s, AVFrame *reference, AVFrame *frame)
>> >>>        }
>> >>>    }
>> >>>    emms_c();
>> >>>-    mafd = (double)sad / count / (1ULL << s->bitdepth);
>> >>>-    return (mafd <= s->noise);
>> >>>+    mafd = (double)sad /(count >> (s->bitdepth > 8));
>> >>
>> >>Why? MAFD should be the mean difference normalized to [0..1].
>> >If the bitdepth is 16, it will divide by 1UL << 16, which makes the mafd
>> >very small. So I chose the scenecut approach below.
>> 
>> The metric is supposed to be independent of the bit depth. If you get
>> different mafd for the same source expressed in 8bit, 10bit or 16bit
>> pixel format (provided that you use the same subsampling), then you
>> are doing something wrong.
>> 
>> And you can't reinvent MAFD according to your needs. It is the mean
>> absolute difference of the whole image normalized to 0..1 in this
>> case. If you are not calculating that, then it is not MAFD.
>
> Sorry, currently the scenecut detection supports RGB only, so I'm changing it to support
> more formats without autoscaling, and I need to work out how to calculate mafd for
> 10 bits or 12 bits. Below is my understanding of the mafd calculation, correct me if
> I'm misunderstanding. For example, if the bitdepth is 8 bits, it means the pixel
> block is 8x8 pixels, so maybe we should divide by 8x8 instead of 256?
>
> mafd = sad / count / (bitdepth * bitdepth)

bitdepth is the bitdepth of one pixel. For 8 bit, a color component can be
0..255. For 16 bit, a color component can be 0..65535.

8x8 and 16x16 pixel blocks are a completely different thing, unrelated to
bit depth. Yes, SAD is usually calculated over 8x8 or 16x16 blocks for
motion estimation or compression, but not here. Here, SAD is the sum of
absolute differences over the entire image (width x height). Count
= width * height, so SAD/count is the average difference per pixel. In order
to get it to [0..1] you have to divide by 2^bitdepth.
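
A rough sketch of that calculation in plain C, to make the normalization
concrete. This is not the filter code: plane_sad() and mafd_from_sad() are
made-up names, samples are assumed to be stored in uint16_t regardless of bit
depth, and the stride is assumed to be given in samples rather than bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not FFmpeg API): sum of absolute differences over a
 * whole plane. Samples live in uint16_t; stride is counted in samples. */
static uint64_t plane_sad(const uint16_t *a, const uint16_t *b,
                          size_t width, size_t height, size_t stride)
{
    uint64_t sad = 0;
    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            int d = (int)a[y * stride + x] - (int)b[y * stride + x];
            sad += (uint64_t)(d < 0 ? -d : d);
        }
    }
    return sad;
}

/* Hypothetical helper: mean absolute difference, normalized by 2^bitdepth.
 * count is the number of samples that went into sad (width * height). */
static double mafd_from_sad(uint64_t sad, uint64_t count, int bitdepth)
{
    assert(count > 0 && bitdepth >= 1 && bitdepth <= 16);
    return (double)sad / count / (1ULL << bitdepth);
}
```

With this, a 2x2 plane that changes from 0 to 128 in 8 bit and the same
content expressed in 16 bit (0 to 32768) give the same value, which is the
bit depth independence the metric is supposed to have.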

Regards,
Marton

