[FFmpeg-devel] [PATCH] adpcm: Reset the ssd back to zero more often

Martin Storsjö martin
Sun Nov 21 20:30:15 CET 2010


On Sun, 21 Nov 2010, Michael Niedermayer wrote:

> On Sun, Nov 21, 2010 at 01:11:12AM +0200, Martin Storsjö wrote:
> > On Sat, 20 Nov 2010, Michael Niedermayer wrote:
> > 
> > > On Sat, Nov 20, 2010 at 08:59:25AM +0200, Martin Storsjö wrote:
> > > > On Sat, 20 Nov 2010, Michael Niedermayer wrote:
> > > > 
> > > > > On Thu, Nov 18, 2010 at 04:01:31PM +0200, Martin Storsjo wrote:
> > > > > > If using very large trellis sizes (e.g. -trellis 15), the frontier
> > > > > > is so large that the difference between the best and the worst
> > > > > > trellis node in the frontier is large enough to cause wraparound.
> > > > > > 
> > > > > 
> > > > > > Resetting at (1<<20) is enough to avoid the issue at -trellis 16
> > > > > 
> > > > > have you come to this conclusion by proof or by simply changing the
> > > > > threshold and seeing no problems?
> > > > 
> > > > By changing the threshold until there were no more problems with that 
> > > > particular sample - sorry, I should have mentioned that.
> > > > 
> > > > I think it can't be proven how often it needs to be reset. In the worst 
> > > > pathological case, the best trellis node has ssd 0 while the worst one has 
> > > > a ssd of 65535*65535 added in each generation, overflowing even if we'd 
> > > > subtract the best node's ssd each round.
> > > > 
> > > > The attached patch should avoid the issue properly regardless of how 
> > > > pathologically bad case it is, giving a small but tolerable slowdown. Do 
> > > > you think that is enough, or should that one be combined with this 
> > > > previous patch, resetting it to 0 more often (or perhaps even every round) 
> > > > to avoid this issue happening at all?
> > > 
> > > have you tried changing ssd to 64 bit?
> > > maybe it's faster
> > 
> > In 64 bit mode, changing ssd to 64 bit is a bit faster (current code 61 
> > sec, adding the wraparound check 64 sec, 64 bit ssd 62 sec), but in 32 bit 
> > mode, it's quite a bit slower (current code 69.6 sec, with wraparound 
> > check 69.8 sec, 64 bit ssd 79.6 sec), so I wouldn't suggest that solution.
> 
> 70->80sec hmm, did you look at the generated code?

No, I'm not familiar enough with x86 asm to be able to say whether the 
compiler is messing up or not.

But I tested these changes on a few other compiler versions (on other 
machines), running in 32 bit mode:
          current   wraparound check   64 bit ssd
gcc 4.0:  68 sec    66 sec             73 sec
gcc 4.1:  67 sec    71 sec             74 sec
gcc 4.4:  83 sec    89 sec             96 sec

So in all cases, the 64 bit ssd is slower than the current code, but on 
gcc 4.1 the difference isn't all that big. And I don't know how to explain 
the speedup on gcc 4.0 from adding the wraparound check in the loop - 
perhaps the extra code just perturbs the compiler's code generation enough 
that it happens to produce faster code?
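
For illustration only, here is a minimal sketch of one common way to compare 
and rebase 32 bit scores that may have wrapped. It is not taken from the 
attached patch or from adpcm.c; the names ssd_less and ssd_rebase are 
hypothetical. It assumes the true spread between any two scores stays below 
2^31, which is exactly the assumption the pathological case quoted above 
breaks.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if score a is smaller than score b,
 * even when one of them has wrapped past 2^32. Only valid while the
 * true difference between the two scores is below 2^31. */
int ssd_less(uint32_t a, uint32_t b)
{
    /* Modular difference; the top bit set means a is "behind" b. */
    return (uint32_t)(a - b) >= 0x80000000u;
}

/* Hypothetical helper: subtract the smallest score from every entry
 * so that all scores restart near zero. */
void ssd_rebase(uint32_t *ssd, int n)
{
    assert(n > 0);
    uint32_t best = ssd[0];
    for (int i = 1; i < n; i++)
        if (ssd_less(ssd[i], best))
            best = ssd[i];
    for (int i = 0; i < n; i++)
        ssd[i] -= best; /* unsigned arithmetic, wraps safely */
}
```

With a per-generation error as large as 65535*65535 (0xFFFE0001), the spread 
exceeds 2^31 after a single step, so ssd_less(0, 65535u * 65535u) already 
gives the wrong answer. A comparison like this can only help if the scores 
are rebased or reset often enough to keep the spread small.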

// Martin


