[FFmpeg-cvslog] r24926 - trunk/libavcodec/x86/vp56dsp.asm

Thu Aug 26 22:53:34 CEST 2010

On Thu, Aug 26, 2010 at 09:27:55PM +0100, Måns Rullgård wrote:
> Reimar Döffinger <Reimar.Doeffinger at gmx.de> writes:
> > But in case I was unclear, I actually managed to construct
> > a case (of course very artificial) that shows the issue
> > _in principle_ even with non-broken/ancient compilers.
> >
> > extern long j;
> > extern char *array;
> > int test(void)
> > {
> >   int i;
> >   int s = 0;
> >   for (i = 0; i < j; i++)
> >       s += array[i];
> >   return s;
> > }
> > int test2(void)
> > {
> >   unsigned i;
> >   int s = 0;
> >   for (i = 0; i < j; i++)
> >       s += array[i];
> >   return s;
> > }
> > int test3(void)
> > {
> >   unsigned long i;
> >   int s = 0;
> >   for (i = 0; i < j; i++)
> >       s += array[i];
> >   return s;
> > }
> >
> > Compiled on PowerPC 64 the three loops are:
> > .L3:
> >         lbzx 0,9,11
> >         addi 11,11,1
> >         add 0,0,3
> >         extsw 3,0
> >         bdnz .L3
> >
> > .L9:
> >         lbzx 9,10,11
> >         addi 0,11,1
> >         rldicl 11,0,0,32
> >         add 9,9,3
> >         cmpd 7,11,8
> >         extsw 3,9
> >         blt 7,.L9
> >
> > .L14:
> >         lbzx 0,9,11
> >         addi 11,11,1
> >         add 0,0,3
> >         extsw 3,0
> >         bdnz .L14
> 
> Yes, the compiler did badly with the 32-bit unsigned counter.  What
> is that supposed to prove?

That PPC64, it seems, also has no 32x32->32 addition (an add that
wraps at 32 bits), and that, just as on x86, a register-size type is
the easiest way to tell the compiler that no, we really don't care
about the overflow case, so please don't add extra instructions to
fix it up.
And yes, in particular I don't think the compiler messed up here;
this is the best it could do with the information it had.
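
To make the overflow case concrete, here is a minimal sketch in C. The
function names (next_index_u32, next_index_u64, sum_bytes) are made up
for illustration and appear nowhere in FFmpeg. A 32-bit unsigned counter
is required to wrap to 0 after 0xFFFFFFFF, so on a 64-bit target the
compiler has to keep it reduced modulo 2^32 (the rldicl in the .L9 loop
clears the upper 32 bits), while a register-size counter needs no such
fix-up.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: advance a 32-bit unsigned counter.
 * Unsigned wraparound is defined behaviour, so the result must be
 * reduced modulo 2^32 before it can serve as a 64-bit index. */
uint64_t next_index_u32(uint32_t i)
{
    return (uint32_t)(i + 1);
}

/* Hypothetical helper: advance a 64-bit counter.
 * No truncation step is needed; the add is the whole operation. */
uint64_t next_index_u64(uint64_t i)
{
    return i + 1;
}

/* Hypothetical loop in the style of test3(): a register-size counter
 * (size_t) lets the compiler index the array without extra masking. */
long sum_bytes(const signed char *array, size_t n)
{
    long s = 0;
    size_t i;

    for (i = 0; i < n; i++)
        s += array[i];
    return s;
}
```

The first two helpers only differ at the wrap point, which is exactly
the case the loop never cares about but the compiler still has to
honour when the counter is declared as a 32-bit unsigned type.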