[FFmpeg-devel] [PATCH] adpcm: Store trellis nodes in a heap structure

Martin Storsjö martin
Wed Nov 10 14:13:09 CET 2010


On Wed, 10 Nov 2010, Michael Niedermayer wrote:

> On Wed, Nov 10, 2010 at 11:42:48AM +0200, Martin Storsj? wrote:
> > On Wed, 3 Nov 2010, Martin Storsj? wrote:
> > 
> > > As pointed out by Michael when reviewing the G.722 trellis encoder, the 
> > > stored trellis nodes could be stored in a heap-like structure, instead of 
> > > in a straight sorted array.
> > > 
> > > Currently, when inserting a new trellis node, a linear search (which in 
> > > itself perhaps could be sped up by converting it to a binary search) is 
> > > used to find the spot where it should be inserted, and then all later 
> > > node pointers are moved back one step with memmove. Since only a subset of 
> > > all evaluated nodes are stored, the worst one is removed once the array is 
> > > full.
> > > 
> > > Instead of doing this, the attached patch set stored the node pointers in 
> > > a heap structure, by first adding all evaluated nodes to a heap, as long 
> > > as they all fit. Once they don't all fit, we check through all the 
> > > frontier/2 leaf nodes to find the worst one, replace that one with the 
> > > current and restore the heap property.
> > > 
> > > This doesn't give identical results to the initial version, since the 
> > > nodes from the previous round are used for doing the next step in the 
> > > order they're stored in the array, which is different.
> > > 
> > > Instead of chekcing all the frontier/2 leaf nodes to find the worst one, 
> > > patch #3 just picks one of the leaf nodes and tries replacing that one. By 
> > > picking a different one of the leaf nodes each time, we more or less 
> > > achieve the same thing. (If only one spot would be tested, only the path 
> > > from that node up to the root would be updated once the heap is full.)
> > > 
> > > The last patch removes the search for nodes with an equal sample value to 
> > > the one currently inserted, since it's an O(frontier) operation, which 
> > > speeds things up immensely, without notably affecting the quality.
> > > 
> > > Some numbers, runtime:
> > > Original: 16.1 s
> > > After patch #1: 14.8 s
> > > After patch #3: 10.9 s
> > > After patch #4: 5.6 s
> > > 
> > > Output from tiny-psnr:
> > > No trellis:
> > > stddev:  101.13 PSNR: 56.23 MAXDIFF: 7183 bytes:  4865398/  4865408
> > > -trellis 5, original code:
> > > stddev:   81.78 PSNR: 58.08 MAXDIFF: 4798 bytes:  4865398/  4865408
> > > After patch #1:
> > > stddev:   81.70 PSNR: 58.08 MAXDIFF: 4798 bytes:  4865398/  4865408
> > > After patch #3:
> > > stddev:   80.77 PSNR: 58.18 MAXDIFF: 4766 bytes:  4865398/  4865408
> > > After patch #4:
> > > stddev:   80.94 PSNR: 58.17 MAXDIFF: 4524 bytes:  4865398/  4865408
> > > 
> > > So even if patch #3 and #4 in theory should worsen the output slightly, 
> > > they actually seem to improve the result in this case, since other nodes 
> > > happen to be stored/thrown away. The main point is that it doesn't seem to 
> > > harm the output quality significantly while improving the runtime 
> > > performance massively.
> > 
> > Updated benchmarks of this code - the sample in question is a 30 second 
> > clip, 44 kHz mono. Now I've done the testing with adpcm_ima_wav as codec,
> 
> can that clip be downloaded somewhere?

I uploaded it to http://albin.abo.fi/~mstorsjo/adpcm-input.wav.

For testing, I do e.g. this:
time ./ffmpeg -y -i adpcm-input.wav -acodec adpcm_ima_wav -trellis 7 adpcm.wav
./ffmpeg -y -i adpcm.wav out.wav > /dev/null 2>&1
tests/tiny_psnr adpcm-input.wav out.wav 2


> > Also, for reference, the same input with different trellis sizes, after 
> > patch #4:
> 
> it would be interresting to also see this with pre #4 so one can compare it

Ok, here are even more numbers:

Original:
0:
user    0m0.037s
stddev:   40.03 PSNR: 64.28 MAXDIFF: 2874 bytes:  2646016/  2649218

1:
user    0m0.245s
stddev:   37.28 PSNR: 64.90 MAXDIFF: 2683 bytes:  2646016/  2649218

2:
user    0m0.544s
stddev:   35.32 PSNR: 65.37 MAXDIFF: 2697 bytes:  2646016/  2649218

3:
user    0m1.444s
stddev:   34.20 PSNR: 65.65 MAXDIFF: 2676 bytes:  2646016/  2649218

4:
user    0m3.267s
stddev:   33.42 PSNR: 65.85 MAXDIFF: 2653 bytes:  2646016/  2649218

5:
user    0m7.224s
stddev:   32.55 PSNR: 66.08 MAXDIFF: 2640 bytes:  2646016/  2649218

6:
user    0m16.551s
stddev:   31.99 PSNR: 66.23 MAXDIFF: 2643 bytes:  2646016/  2649218

7:
user    0m39.347s
stddev:   31.57 PSNR: 66.34 MAXDIFF: 1869 bytes:  2646016/  2649218

8:
user    1m30.373s
stddev:   31.30 PSNR: 66.42 MAXDIFF: 1539 bytes:  2646016/  2649218


After patch #1:
0:
user    0m0.037s
stddev:   40.03 PSNR: 64.28 MAXDIFF: 2874 bytes:  2646016/  2649218

1:
user    0m0.210s
stddev:   37.26 PSNR: 64.90 MAXDIFF: 2683 bytes:  2646016/  2649218

2:
user    0m0.519s
stddev:   35.53 PSNR: 65.32 MAXDIFF: 2698 bytes:  2646016/  2649218

3:
user    0m1.206s
stddev:   34.19 PSNR: 65.65 MAXDIFF: 2664 bytes:  2646016/  2649218

4:
user    0m2.832s
stddev:   33.34 PSNR: 65.87 MAXDIFF: 2652 bytes:  2646016/  2649218

5:
user    0m6.074s
stddev:   32.58 PSNR: 66.07 MAXDIFF: 2640 bytes:  2646016/  2649218

6:
user    0m13.320s
stddev:   32.02 PSNR: 66.22 MAXDIFF: 2638 bytes:  2646016/  2649218

7:
user    0m29.337s
stddev:   31.50 PSNR: 66.36 MAXDIFF: 1798 bytes:  2646016/  2649218

8:
user    1m1.238s
stddev:   31.25 PSNR: 66.43 MAXDIFF: 1834 bytes:  2646016/  2649218


After patch #3:
0:
user    0m0.037s
stddev:   40.03 PSNR: 64.28 MAXDIFF: 2874 bytes:  2646016/  2649218

1:
user    0m0.205s
stddev:   37.26 PSNR: 64.90 MAXDIFF: 2683 bytes:  2646016/  2649218

2:
user    0m0.477s
stddev:   35.15 PSNR: 65.41 MAXDIFF: 2724 bytes:  2646016/  2649218

3:
user    0m1.092s
stddev:   33.83 PSNR: 65.74 MAXDIFF: 2649 bytes:  2646016/  2649218

4:
user    0m2.426s
stddev:   32.85 PSNR: 66.00 MAXDIFF: 2654 bytes:  2646016/  2649218

5:
user    0m5.177s
stddev:   32.28 PSNR: 66.15 MAXDIFF: 2682 bytes:  2646016/  2649218

6:
user    0m11.403s
stddev:   31.77 PSNR: 66.29 MAXDIFF: 2435 bytes:  2646016/  2649218

7:
user    0m26.120s
stddev:   31.38 PSNR: 66.39 MAXDIFF: 1942 bytes:  2646016/  2649218

8:
user    0m57.067s
stddev:   31.16 PSNR: 66.46 MAXDIFF: 1512 bytes:  2646016/  2649218


After patch #4:
0:
user    0m0.038s
stddev:   40.03 PSNR: 64.28 MAXDIFF: 2874 bytes:  2646016/  2649218

1:
user    0m0.190s
stddev:   37.48 PSNR: 64.85 MAXDIFF: 3026 bytes:  2646016/  2649218

2:
user    0m0.438s
stddev:   35.22 PSNR: 65.39 MAXDIFF: 2681 bytes:  2646016/  2649218

3:
user    0m0.881s
stddev:   33.87 PSNR: 65.73 MAXDIFF: 2643 bytes:  2646016/  2649218

4:
user    0m1.684s
stddev:   32.96 PSNR: 65.97 MAXDIFF: 2650 bytes:  2646016/  2649218

5:
user    0m3.254s
stddev:   32.34 PSNR: 66.13 MAXDIFF: 2467 bytes:  2646016/  2649218

6:
user    0m6.079s
stddev:   31.69 PSNR: 66.31 MAXDIFF: 1907 bytes:  2646016/  2649218

7:
user    0m11.909s
stddev:   31.57 PSNR: 66.34 MAXDIFF: 2105 bytes:  2646016/  2649218

8:
user    0m22.973s
stddev:   31.35 PSNR: 66.40 MAXDIFF: 1782 bytes:  2646016/  2649218


// Martin



More information about the ffmpeg-devel mailing list