[FFmpeg-devel] [PATCH 1/4] ssim: refactor a weird double loop.

Michael Niedermayer michael at niedermayer.cc
Mon Jul 13 02:38:43 CEST 2015


On Sun, Jul 12, 2015 at 10:07:16PM +0000, Paul B Mahol wrote:
> On 7/12/15, Ronald S. Bultje <rsbultje at gmail.com> wrote:
> > Hi,
> >
> > On Sun, Jul 12, 2015 at 10:29 AM, Paul B Mahol <onemda at gmail.com> wrote:
> >
> >> On 12 Jul 2015 at 14:18, "Ronald S. Bultje" <rsbultje at gmail.com> wrote:
> >> >
> >> > Hi,
> >> >
> >> > On Sun, Jul 12, 2015 at 6:48 AM, Paul B Mahol <onemda at gmail.com> wrote:
> >> >
> >> > > On 12 Jul 2015 at 01:56, "Ronald S. Bultje" <rsbultje at gmail.com> wrote:
> >> > > >
> >> > > > ---
> >> > > >  libavfilter/vf_ssim.c | 5 ++---
> >> > > >  1 file changed, 2 insertions(+), 3 deletions(-)
> >> > > >
> >> > > > diff --git a/libavfilter/vf_ssim.c b/libavfilter/vf_ssim.c
> >> > > > index 0721ddd..3ef122f 100644
> >> > > > --- a/libavfilter/vf_ssim.c
> >> > > > +++ b/libavfilter/vf_ssim.c
> >> > > > @@ -134,7 +134,7 @@ static float ssim_end1(int s1, int s2, int ss, int s12)
> >> > > >           / ((float)(fs1 * fs1 + fs2 * fs2 + ssim_c1) * (float)(vars + ssim_c2));
> >> > > >  }
> >> > > >
> >> > > > -static float ssim_end4(int sum0[5][4], int sum1[5][4], int width)
> >> > > > +static float ssim_endn(int (*sum0)[4], int (*sum1)[4], int width)
> >> > > >  {
> >> > > >      float ssim = 0.0;
> >> > > >      int i;
> >> > > > @@ -169,8 +169,7 @@ static float ssim_plane(uint8_t *main, int main_stride,
> >> > > >                                  &sum0[x]);
> >> > > >          }
> >> > > >
> >> > > > -        for (x = 0; x < width - 1; x += 4)
> >> > > > -            ssim += ssim_end4(sum0 + x, sum1 + x, FFMIN(4, width - x - 1));
> >> > > > +        ssim += ssim_endn(sum0, sum1, width - 1);
> >> > > >      }
> >> > > >
> >> > > >      return ssim / ((height - 1) * (width - 1));
> >> > > > --
> >> > > > 2.1.2
> >> > > >
> >> > > >
> >> > >
> >> > > Why? There was a reason behind this code, I guess.
> >> > >
> >> >
> >> > I think it's for SIMD code simplification. I'm guessing the code you took
> >> > from libvpx had an extra condition to process only 4-sized chunks through a
> >> > function pointer, and then handle the odd tail in C code. If you do this, the
> >> > SIMD code has a fixed size (always 4), which makes the implementation much
> >> > more trivial: 4 16-byte loads, add, transpose4x4d, and then ssim_end1 to get
> >> > 4 results, which you horizontal-add and return.
> >> >
> >>
> >> I took this from tiny_ssim.c, as pengvado said it's OK to relicense to LGPL.
> >
> >
> > I think the same reasoning still applies - this will get better
> > performance, particularly if we consider avx2.
> 
> OK, patch lgtm.

applied

thanks
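
For readers following the discussion above, here is a minimal C sketch (not the applied patch) of the two shapes being compared: a single pass over all windows, as in the new ssim_endn, versus fixed 4-wide chunks plus a scalar tail, as Ronald describes for SIMD. The body of ssim_end1 is reconstructed from the standard SSIM formula and the denominator visible in the diff, so treat its constants as an assumption. ssim_end4_lanes, ssim_row_chunked and demo_difference are hypothetical names introduced only for illustration, and the "lanes" are plain scalar C standing in for the loads, add, transpose and horizontal add that a real SIMD version would perform.

```c
#include <assert.h>

#define MAX_ROWS 33 /* hypothetical buffer size for the demo data */

/* Per-window score from the four accumulated sums {s1, s2, ss, s12}.
 * Assumption: constants are the usual 8-bit SSIM stabilizers. */
static float ssim_end1(int s1, int s2, int ss, int s12)
{
    static const int ssim_c1 = (int)(.01 * .01 * 255 * 255 * 64 + .5);
    static const int ssim_c2 = (int)(.03 * .03 * 255 * 255 * 64 * 63 + .5);
    int vars  = ss * 64 - s1 * s1 - s2 * s2;
    int covar = s12 * 64 - s1 * s2;

    return (float)(2 * s1 * s2 + ssim_c1) * (float)(2 * covar + ssim_c2)
         / ((float)(s1 * s1 + s2 * s2 + ssim_c1) * (float)(vars + ssim_c2));
}

/* Single pass over `width` windows; reads rows 0..width of each buffer. */
static float ssim_endn(int (*sum0)[4], int (*sum1)[4], int width)
{
    float ssim = 0.0f;

    for (int i = 0; i < width; i++)
        ssim += ssim_end1(sum0[i][0] + sum0[i + 1][0] + sum1[i][0] + sum1[i + 1][0],
                          sum0[i][1] + sum0[i + 1][1] + sum1[i][1] + sum1[i + 1][1],
                          sum0[i][2] + sum0[i + 1][2] + sum1[i][2] + sum1[i + 1][2],
                          sum0[i][3] + sum0[i + 1][3] + sum1[i][3] + sum1[i + 1][3]);
    return ssim;
}

/* Scalar model of a fixed-size kernel that always handles exactly 4 windows. */
static float ssim_end4_lanes(int (*sum0)[4], int (*sum1)[4])
{
    int rows[4][4];  /* rows[i]: one 16-byte vector {s1, s2, ss, s12} for window i */
    int lanes[4][4]; /* lanes[k]: component k of windows 0..3 (the transpose) */
    float ssim = 0.0f;

    for (int i = 0; i < 4; i++)       /* "loads" and "add" */
        for (int k = 0; k < 4; k++)
            rows[i][k] = sum0[i][k] + sum0[i + 1][k] + sum1[i][k] + sum1[i + 1][k];

    for (int i = 0; i < 4; i++)       /* "transpose4x4d" */
        for (int k = 0; k < 4; k++)
            lanes[k][i] = rows[i][k];

    for (int i = 0; i < 4; i++)       /* four results, then "horizontal add" */
        ssim += ssim_end1(lanes[0][i], lanes[1][i], lanes[2][i], lanes[3][i]);
    return ssim;
}

/* Full 4-wide chunks through the fixed kernel, odd tail through the C loop. */
static float ssim_row_chunked(int (*sum0)[4], int (*sum1)[4], int width)
{
    float ssim = 0.0f;
    int x = 0;

    for (; x + 4 <= width; x += 4)
        ssim += ssim_end4_lanes(sum0 + x, sum1 + x);
    if (x < width)
        ssim += ssim_endn(sum0 + x, sum1 + x, width - x);
    return ssim;
}

/* Fills two buffers with made-up sums and returns chunked minus single-pass. */
float demo_difference(int nb_windows)
{
    static int sum0[MAX_ROWS][4], sum1[MAX_ROWS][4];

    assert(nb_windows >= 0 && nb_windows < MAX_ROWS);
    for (int i = 0; i < MAX_ROWS; i++)
        for (int k = 0; k < 4; k++) {
            sum0[i][k] = (i * 7 + k * 3) % 50 + 1;
            sum1[i][k] = (i * 5 + k * 11) % 50 + 1;
        }
    return ssim_row_chunked(sum0, sum1, nb_windows) - ssim_endn(sum0, sum1, nb_windows);
}
```

Calling demo_difference(10) exercises two full 4-wide chunks plus a tail of two windows. The two paths should agree up to float rounding, since only the order of the additions differs.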

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Everything should be made as simple as possible, but not simpler.
-- Albert Einstein