[FFmpeg-devel] [PATCH] RealAudio 14.4K encoder
Michael Niedermayer
michaelni
Mon May 24 02:53:53 CEST 2010
On Sun, May 23, 2010 at 08:52:30PM +0200, Francesco Lavra wrote:
> On Sun, 2010-05-23 at 00:51 +0200, Michael Niedermayer wrote:
> > On Sat, May 22, 2010 at 07:33:13PM +0200, Francesco Lavra wrote:
> > > > > Floating point, with orthogonalization, with gain quantization done the
> > > > > fast way
> > > > > stddev: 818.14 PSNR: 38.07 bytes: 200320/ 200334
> > > > > stddev: 986.48 PSNR: 36.45 bytes: 144000/ 144014
> > > > > stddev: 811.68 PSNR: 38.14 bytes: 745280/ 745294
> > > > > stddev: 3762.86 PSNR: 24.82 bytes: 5370880/ 5370880
> > > > > stddev: 2635.10 PSNR: 27.91 bytes: 814400/ 814400
> > > > > stddev: 3647.02 PSNR: 25.09 bytes: 432640/ 432640
> > > > > stddev: 2862.79 PSNR: 27.19 bytes: 1741440/ 1741440
> > > >
> > > > some files lose quality by enabling orthogonalization, that's odd but
> > > > possible.
> > > > assuming there is no bug in the orthogonalization then you could try to
> > > > run the quantization with both codebooks found with and without
> > > > orthogonalization, this should always be better. And/or avoid codebook
> > > > choices that would need quantization factors that are far away from
> > > > available values
> > >
> > > The first 3 files are uncompressed recordings, while the last 4 files
> > > are RealAudio decoded samples, so statistics for the latter probably are
> > > not that meaningful.
> > > If you are wondering why PSNR values are so low for the last 4 files
> > > (ideally, they should approach infinity), the problem is that I couldn't
> > > come up with an exact method of calculating the frame energy (assuming
> > > one exists, because from the current decoder output I'm not sure we can
> > > reconstruct the encoded stream exactly as it was), so having an energy
> > > value different from what it ought to be influences negatively the
> > > codebook searches.
> >
> > how far away is the correct value from what you choose?
> > (if it's just +-1 maybe brute force search might be an option)
>
> I chose the formula to calculate the energy such that in most cases it
> is either the correct value or +-1. But a brute force approach on the
> energy value would be extremely slow: you have to re-encode the whole
> frame as many times as the number of energy values you want to try.
> Also, there are the LPC coefficients, whose values don't correspond
> exactly to those of the original encoded stream, so I don't know how
> much improvement a brute force approach on the energy value could bring.
> Last but not least, yesterday I made some mistakes getting the PSNR
> values, messing up with the shift and skip arguments to tiny_psnr: now
> the results are far better :) see below.
>
> > orthogonalization is a win and should be done of course.
> > the 5 entry quantization needs work, there should be no quality
> > loss. What about 10 or 20 entries?
>
> Below are the correct results (a bug in the floating point code has been
> fixed too, and PSNR has benefited from that). As you can see, the fast
> gain quantization is as good as the brute force one, so there is no need
> to worry about a mixed approach.
>
> Fixed point, without orthogonalization, with brute force gain
> quantization
> stddev: 424.27 PSNR: 43.78 bytes: 200000/ 200320
> stddev: 263.80 PSNR: 47.90 bytes: 143680/ 144000
> stddev: 380.05 PSNR: 44.73 bytes: 744960/ 745280
> stddev: 854.26 PSNR: 37.70 bytes: 5370560/ 5370880
> stddev: 472.50 PSNR: 42.84 bytes: 814080/ 814400
> stddev: 548.55 PSNR: 41.54 bytes: 432320/ 432640
> stddev: 428.05 PSNR: 43.70 bytes: 1741120/ 1741440
>
> Floating point, without orthogonalization, with brute force gain
> quantization
> stddev: 422.45 PSNR: 43.81 bytes: 200000/ 200320
> stddev: 268.66 PSNR: 47.75 bytes: 143680/ 144000
> stddev: 381.76 PSNR: 44.69 bytes: 744960/ 745280
> stddev: 851.79 PSNR: 37.72 bytes: 5370560/ 5370880
> stddev: 486.95 PSNR: 42.58 bytes: 814080/ 814400
> stddev: 568.53 PSNR: 41.23 bytes: 432320/ 432640
> stddev: 436.89 PSNR: 43.52 bytes: 1741120/ 1741440
>
> Floating point, with orthogonalization, with brute force gain
> quantization
> stddev: 210.49 PSNR: 49.86 bytes: 200000/ 200320
> stddev: 201.69 PSNR: 50.24 bytes: 143680/ 144000
> stddev: 200.49 PSNR: 50.29 bytes: 744960/ 745280
> stddev: 784.77 PSNR: 38.43 bytes: 5370560/ 5370880
> stddev: 422.10 PSNR: 43.82 bytes: 814080/ 814400
> stddev: 484.69 PSNR: 42.62 bytes: 432320/ 432640
> stddev: 392.32 PSNR: 44.46 bytes: 1741120/ 1741440
>
> Floating point, with orthogonalization, with gain quantization done the
> fast way
> stddev: 210.14 PSNR: 49.88 bytes: 200000/ 200320
> stddev: 202.50 PSNR: 50.20 bytes: 143680/ 144000
> stddev: 196.30 PSNR: 50.47 bytes: 744960/ 745280
> stddev: 786.06 PSNR: 38.42 bytes: 5370560/ 5370880
> stddev: 422.29 PSNR: 43.82 bytes: 814080/ 814400
> stddev: 495.53 PSNR: 42.43 bytes: 432320/ 432640
> stddev: 396.24 PSNR: 44.37 bytes: 1741120/ 1741440
>
> Floating point, with orthogonalization, with gain quantization done
> taking into account the rounding error of the 5 best entries
> stddev: 210.49 PSNR: 49.86 bytes: 200000/ 200320
> stddev: 201.69 PSNR: 50.24 bytes: 143680/ 144000
> stddev: 200.05 PSNR: 50.31 bytes: 744960/ 745280
> stddev: 786.22 PSNR: 38.42 bytes: 5370560/ 5370880
> stddev: 419.41 PSNR: 43.88 bytes: 814080/ 814400
> stddev: 497.65 PSNR: 42.39 bytes: 432320/ 432640
> stddev: 395.23 PSNR: 44.39 bytes: 1741120/ 1741440
>
> I'd say we should go for the fast gain quantization, and attached is
> a cleaned up patch for it, with code duplication removed.
the attached code looks like the float brute force variant
> I still have to try the iterative method, will do that in a few days I
> think.
great
[...]
> +    best_error = FLT_MAX;
> +    gain = 0;
> +    for (n = 0; n < 256; n++) {
> +        g[1] = ((ff_gain_val_tab[n][1] * m[1]) >> ff_gain_exp_tab[n]) *
> +               (1/4096.0);
> +        g[2] = ((ff_gain_val_tab[n][2] * m[2]) >> ff_gain_exp_tab[n]) *
> +               (1/4096.0);
> +        error = 0;
> +        if (cba_idx) {
> +            g[0] = ((ff_gain_val_tab[n][0] * m[0]) >> ff_gain_exp_tab[n]) *
> +                   (1/4096.0);
> +            for (i = 0; i < BLOCKSIZE; i++) {
> +                data[i] = zero[i] + g[0] * cba[i] + g[1] * cb1[i] +
> +                          g[2] * cb2[i];
> +                error += (data[i] - sblock_data[i]) *
> +                         (data[i] - sblock_data[i]);
> +            }
> +        } else {
> +            for (i = 0; i < BLOCKSIZE; i++) {
> +                data[i] = zero[i] + g[1] * cb1[i] + g[2] * cb2[i];
> +                error += (data[i] - sblock_data[i]) *
> +                         (data[i] - sblock_data[i]);
> +            }
> +        }
> +        if (error < best_error) {
> +            best_error = error;
> +            gain = n;
> +        }
> +    }
[...]
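The quoted loop tries every gain index and keeps the one with the smallest
squared error, which is what "brute force" means here. A stripped-down,
standalone sketch of that search follows. It is not the patch code: the
function name search_gain, its parameters, and the precomputed float gain
table are hypothetical, and BLOCKSIZE = 40 is assumed. The patch instead
derives each gain from ff_gain_val_tab, ff_gain_exp_tab and m[].

```c
#include <float.h>
#include <stddef.h>

#define BLOCKSIZE 40 /* ASSUMPTION: subblock length, not stated in the thread */

/* Exhaustive gain search (illustrative only).
 * gains[n][0..2]: candidate gains for the adaptive codebook vector (cba)
 *                 and the two fixed codebook vectors (cb1, cb2).
 * cba may be NULL when no adaptive codebook vector is used.
 * Returns the index with the lowest squared error against target. */
static int search_gain(const float (*gains)[3], int n_gains,
                       const float *zero, const float *cba,
                       const float *cb1, const float *cb2,
                       const float *target, float *best_error_out)
{
    float best_error = FLT_MAX;
    int best = 0;

    for (int n = 0; n < n_gains; n++) {
        float error = 0.0f;
        for (int i = 0; i < BLOCKSIZE; i++) {
            /* Reconstruct the sample this gain triple would produce. */
            float v = zero[i] + gains[n][1] * cb1[i] + gains[n][2] * cb2[i];
            if (cba)
                v += gains[n][0] * cba[i];
            float d = v - target[i];
            error += d * d;
        }
        if (error < best_error) {
            best_error = error;
            best = n;
        }
    }
    if (best_error_out)
        *best_error_out = best_error;
    return best;
}
```

The cost is n_gains * BLOCKSIZE multiply-adds per subblock, which is why a
faster gain quantization that matches it in PSNR is attractive.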
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Republics decline into democracies and democracies degenerate into
despotisms. -- Aristotle