[BoW] r14 - trunk/video_coding/qns.txt

Wed Oct 15 12:16:50 CEST 2008

Author: darkshikari
Date: Wed Oct 15 12:16:49 2008
New Revision: 14

Log:
Quantization noise shaping.

Added:
   trunk/video_coding/qns.txt   (contents, props changed)

Added: trunk/video_coding/qns.txt
==============================================================================

--- (empty file)
+++ trunk/video_coding/qns.txt	Wed Oct 15 12:16:49 2008
@@ -0,0 +1,27 @@
+Quantization Noise Shaping
+
+We have nice algorithms like trellis to optimize quantization for optimal rate-distortion score, using sum of squared error (PSNR) as the distortion metric.  But what about other metrics?  PSNR is a special case because the sum of squared error is the same in both frequency and spatial domain, according to Parseval's theorem.  However, this is not true of other metrics, and most spatial quality metrics cannot be measured in the frequency domain--in other words, each frequency component of the block is not independent of other frequency components with regard to effect on quality.  This presents a problem, as it means the optimal quantization process is no longer polynomial-time, but rather NP, and thus completely intractable.
+
+However, this is a solution: the greedy algorithm method known as QNS, or Quantization Noise Shaping.  This works as follows:
+
+1.  For each quantized frequency coefficient, try raising or lowering the coefficient by 1.
+2.  Measure the RD score for each possibility by doing an inverse transform and taking the appropriate bit cost and spatial distortion measurements.
+3.  Pick the best possibility of all the ones tried above, and apply it to the quantized coefficients.
+4.  Repeat this process until none of the options give any benefit.
+
+There are many shortcuts to improve performance:
+
+1.  Don't look at any coefficient beyond the last nonzero coefficient.
+2.  Don't look at any possible coefficient that isn't a "round up" or "round down" from the actual coefficient.
+3.  Use an optimized inverse transform that takes into account the fact that only a single coefficient has been changed.
+4.  Use a fast bit cost calculation algorithm, like the one used in trellis.
+
+The most common spatial distortion measurement, and the one used in ffmpeg, is SSD weighted by local 3x3 variance.  The purpose of this is to try to hide quantization noise in areas with high detail, where it is less noticable.  However, QNS theoretically allows any arbitrary spatial metric to be used; this makes it extremely useful for research into psychovisual metrics.  Its also a poor man's trellis in the case that trellis itself is not available for the given entropy coding scheme.
+
+However, one must be careful, for QNS will break your metric.  To explain why this is the case, an anecdote might be useful:
+
+Once upon a time, there was a programmer who built a genetic algorithm to analyse the rather complicated process of walking.  He built it with a complex physics engine to design creatures that could walk as fast as possible given a set of constraints.  The creatures were built out of various blocks and joints and so forth.  Finally, after finishing it, he ran it for months--bit by bit he watched its progress numerically.  Over time, the algorithm improved, getting to the finish line faster and faster with each new generation.  Eventually, he opened it up to see what the results were.
+
+The algorithm had stacked all the blocks on top of each other lengthwise to build a gigantic tower creature.  The tower fell down, crossing the finish line almost instantly.
+
+And thus, much like the genetic algorithm, QNS will find the hole in your "interesting" metric.  It will find a way to optimize for it in a way you probably didn't expect, and that may not be a good thing.  So one should be cautious when using QNS on arbitrary metrics.  Of course, just as well, this provides an extra check on one's metrics to ensure that they are actually good quality measurements, and not merely working out of coincidence.