[FFmpeg-devel] [PATCH] correct make test failure from 15261 release until now (15899)
Fri Nov 21 22:52:05 CET 2008
On Friday 21 November 2008, Siarhei Siamashka wrote:
> On Friday 21 November 2008, Michael Niedermayer wrote:
> > On Fri, Nov 21, 2008 at 09:15:30PM +0100, Vitor Sessak wrote:
> > > David Geldreich wrote:
> > > > Hello Guillaume,
> > > >
> > > > On 21 Nov. 08 at 17:25, Guillaume POIRIER wrote:
> > > >> This is not the way to go. Reg tests pass on AMD64/Linux, so the
> > > >> code must be fixed to work the same on any platform. The md5sum
> > > >> should not match the results of platform X, but of all platforms.
> > > >
> > > > That's why I made another post saying to ignore my proposed patch.
> > > >
> > > > I found no way of making sin/sinf work the same way on all
> > > > platforms. In my case, OSX ppc and intel give different results.
> > > >
> > > > So changes r15261 and r14982 are incomplete ... they correct the
> > > > problem for AMD64 but break it on Intel32.
> > > >
> > > > We must iterate to find a "stable" sine window generating function.
> > >
> > > Even if we find a way to generate a sine window in an arch-independent
> > > way, the codec still uses floating points in other places, so if it
> > > ever is bit-identical across PPC and I32, I don't see any reason not to
> > > see a different output when testing on ARM or SH or GCC 6.4 or whatever
> > > we'll encounter in future. Unless someone tells me why it is supposed
> > > to work as is, I think that this test should be removed...
> > ratecontrol in video uses floats, and other parts do too, we aren't seeing
> > problems with these and aren't disabling them
> > If you argue that the wma test should be disabled because it is not
> > matching between some important systems, that's something i can understand
> > but, arguing that it should be disabled because it might theoretically
> > not work on some architecture or not yet existing compiler is well ...
> > And last, i'd say disable the float code for the regression tests,
> > replacing the whole fft by a memcpy() if it doesn't match between archs
> > is a lot better than removing a regression test.
> > These tests are important to catch bugs early ...
> Still what about trying to make regression tests resistant to minor
> acceptable differences in the generated output?
OK, let's start brainstorming. Here is the first idea.
What about comparing the sum of all samples in the file generated by the
reference decoder with the sum of all samples generated by the tested
decoder? If the PSNR is good, the difference between these two sums will be
reasonably small (zero for identical files).
Of course a single check is not enough, so this test can be extended by
calculating not just a plain sum, but also sums in which the sign of each
sample is flipped according to some (semirandom) patterns while accumulating.
The set of such values would form a fingerprint of the decoding result, to be
compared against the fingerprint generated by the reference decoder.
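
To make the idea a bit more concrete, here is a rough Python sketch of such a
fingerprint. Everything in it is made up for illustration: the function name
fingerprint, the default of 16 patterns, and the seed. It also leans on
Python's built-in PRNG for the sign patterns; a real regression test would
presumably need a small fixed generator that is guaranteed to produce the same
patterns on every platform.

```python
import random


def fingerprint(samples, num_patterns=16, seed=1234):
    """Return num_patterns signed sums over an iterable of integer samples.

    Pattern 0 is the plain sum. Every other pattern flips the sign of each
    sample pseudo-randomly. The seed is fixed so that the reference decoder
    and the tested decoder are accumulated with identical sign patterns.
    (Hypothetical helper, not part of FFmpeg.)
    """
    rng = random.Random(seed)
    sums = [0] * num_patterns
    for s in samples:
        # One random bit per pattern; clear bit 0 so pattern 0 never flips.
        bits = rng.getrandbits(num_patterns) & ~1
        for k in range(num_patterns):
            sums[k] += -s if (bits >> k) & 1 else s
    return sums


def fingerprint_difference(reference_samples, tested_samples, **kwargs):
    """Element-wise difference between the fingerprints of two decodes."""
    ref = fingerprint(reference_samples, **kwargs)
    tst = fingerprint(tested_samples, **kwargs)
    return [t - r for r, t in zip(ref, tst)]
```

For identical files every difference is zero. If a single sample is off by
one, every entry of the difference is +1 or -1, so small per-sample rounding
differences stay small in the fingerprint, while a real decoding bug that
changes many samples by large amounts should stand out.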
The differences between corresponding values of the two fingerprints will
probably follow a Gaussian distribution, by the central limit theorem (at
least when the per-sample differences are minor). This set of differences can
then be analyzed with some statistical method. I would suggest having a look
at http://en.wikipedia.org/wiki/Chi-square_distribution for a start.
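
As a rough illustration of how such an analysis might look, here is a Python
sketch. It assumes the two fingerprints are lists of K signed sums over the
same N samples, and that the per-sample decoding error is zero-mean with a
known variance (for pure rounding to the nearest integer, 1/12 would be a
plausible guess). Under those assumptions each difference has variance of
about N times that error variance, and the sum of the squared, normalized
differences is approximately chi-square distributed with K degrees of
freedom, i.e. mean K and variance 2K. The function names and the acceptance
limit of "mean plus a few standard deviations" are my own assumptions, not an
established procedure.

```python
import math


def chi_square_statistic(ref_fp, test_fp, num_samples, error_variance):
    """Sum of squared fingerprint differences, normalized by their
    expected variance (num_samples * error_variance)."""
    scale = num_samples * error_variance
    return sum((t - r) ** 2 for r, t in zip(ref_fp, test_fp)) / scale


def looks_acceptable(ref_fp, test_fp, num_samples, error_variance,
                     num_sigmas=4.0):
    """Heuristic check: accept if the statistic is within num_sigmas
    standard deviations above the chi-square mean.

    For K degrees of freedom the chi-square mean is K and the standard
    deviation is sqrt(2 * K). The cutoff itself is an arbitrary choice.
    """
    k = len(ref_fp)
    limit = k + num_sigmas * math.sqrt(2 * k)
    stat = chi_square_statistic(ref_fp, test_fp, num_samples, error_variance)
    return stat <= limit
```

A proper test would take the limit from a chi-square quantile table instead of
this crude cutoff, and would need the error variance to be calibrated on known
good decoders first.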
But any kind of empirical test might be useful if it is reliable enough to
detect bugs in practical cases. One might also search the Internet: maybe
algorithms for this kind of comparison already exist and there is no need to
reinvent the wheel :)