[MPlayer-dev-eng] comprehensive lavcopts test - in progress. NODAEMON -- latest revision of idea.

Sun Jun 27 19:51:15 CEST 2004

On Sunday 27 June 2004 08:50, you wrote:
> I think that all this preopts/lavcopts separation is just a way to somehow
> cope with the fact that the option space is simply too big to be fully
> searched.
> You are trying to classify options into "fundamental" and "tweaks"
> categories to reduce the number of tries, but you can't be sure you aren't
> completely wrong in this distinction.
>
> You didn't comment on the specific idea I proposed, which only needs
> a listing of the options.
>
> For example, something as simple as (invented options)
>   option[0].name="motionestim"
>   option[0].values="auto,algo1,algo2,algo3"
>   option[1].name="bframes"
>   option[1].values="enabled,disabled"
>   option[2].name="threshold"
>   option[2].values="0,5,25,100"
>   option[3].name="maxrepetitions"
>   option[3].values="1,2,3,4,5"
> then choose one value for each option[i] (using the random/inheritance
> rules I described)

Not 100% sure, but I think that's what I'm doing. Lavcopts is in fact something
like: option[0].name="motionestim"; option[0].values="auto,algo1,algo2,algo3".
Have a look:

predia=-5,predia=1,predia=5,\
dia=-3,dia=-2,dia=2,dia=3,dia=5,\

You see? I just grouped certain things to save screen space. You do have a
very valid point though: I will change the options into a dictionary, so it
will look like this:

predia:-5,1,5;\
dia:-3,-2,2,3,5;\

I don't see why the tests should be run randomly, but I suppose that could be
added to the script (generate a command list, then go through the list in
random order).
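
A minimal sketch of how the dictionary form above could be parsed and expanded
into a command list, assuming Python. The names `parse_opts` and `expand` and
the fixed seed are made up for illustration and are not part of the actual
script:

```python
import random

def parse_opts(spec):
    """Parse 'name:v1,v2;name2:v3' into {name: [values]} (order preserved)."""
    opts = {}
    # Drop the shell-style line continuations before splitting on ';'.
    flat = spec.replace("\\\n", "")
    for entry in flat.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        name, _, values = entry.partition(":")
        opts[name.strip()] = [v.strip() for v in values.split(",") if v.strip()]
    return opts

def expand(opts):
    """Turn {name: [values]} into a flat list of 'name=value' strings."""
    return [f"{name}={value}" for name, values in opts.items() for value in values]

spec = "predia:-5,1,5;\\\ndia:-3,-2,2,3,5;\\\n"
lavcopts = expand(parse_opts(spec))

# Optional: walk the list in random order; a fixed seed keeps runs repeatable.
order = lavcopts[:]
random.Random(0).shuffle(order)
```

Shuffling a copy keeps the original list around, so the same tests are still
covered, only in a different order.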

Now, my preopts are short at the moment, yes, but they will grow after the
first test, and I am testing how the lavcopts work together.

For example, if I find out that vme=1 improves PSNR and keeps filesize the same
(or lower) for all current preopts, I add a preopt like mbd=2:v4mv:vme=1 and
check which options improve it or make it worse using my lavcopts. Get it? And
of course the log file is parsed, so an identical test is never repeated.

So after the first step, I instantly see the interdependency of the lavcopts,
but not to a very deep level, so I start grouping them based on the results I
get and run the test again with more preopts. The lavcopts never change.
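
The grouping step could be sketched roughly like this, assuming Python and an
in-memory results table. The data layout, the function name `promote`, and the
numbers are invented for illustration; they are not real measurements:

```python
def promote(baseline, results):
    """Return lavcopts that raise PSNR without growing filesize for every preopt.

    baseline: {preopt: (psnr, size)} measured with the preopt alone
    results:  {(preopt, lavcopt): (psnr, size)} measured with the lavcopt added
    """
    winners = []
    lavcopts = sorted({lavcopt for _, lavcopt in results})
    for lavcopt in lavcopts:
        ok = True
        for preopt, (base_psnr, base_size) in baseline.items():
            if (preopt, lavcopt) not in results:
                ok = False  # untested combination: do not promote yet
                break
            psnr, size = results[(preopt, lavcopt)]
            if psnr <= base_psnr or size > base_size:
                ok = False
                break
        if ok:
            winners.append(lavcopt)
    return winners

# Made-up numbers, only to show the shape of the data.
baseline = {"mbd=2:v4mv": (40.0, 20.0), "mbd=1": (39.5, 21.0)}
results = {
    ("mbd=2:v4mv", "vme=1"): (40.2, 20.0),
    ("mbd=1", "vme=1"): (39.7, 20.8),
    ("mbd=2:v4mv", "dia=5"): (40.1, 20.5),
    ("mbd=1", "dia=5"): (39.6, 21.0),
}
new_preopts = [f"{p}:{o}" for p in baseline for o in promote(baseline, results)]
```

With these invented figures only vme=1 passes for both preopts, so it is the
only option appended to form the next round of preopts.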

It would be nice if you could review my lavcopts and add/remove/change parts:
for example, maybe I'm not testing enough dia values, or maybe too many, etc.

>
> I would also log all the tests so we can use a different quality metric
> (PSNR/time for example) in the future without recalculating everything.
>

Naturally all tests are logged, and the log file is parsed before each command
is run, so tests are never repeated, even when you re-run the script.
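
As a rough sketch of the skip-if-logged check, assuming Python and a
tab-separated log with one line per finished test (file, options, then the
results). The log layout and the names `load_done` and `pending` are
assumptions, not the real script's format:

```python
import io

def load_done(log):
    """Collect (file, options) pairs already present in the log."""
    done = set()
    for line in log:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            done.add((fields[0], fields[1]))
    return done

def pending(tests, done):
    """Keep only the tests that have not been logged yet."""
    return [t for t in tests if t not in done]

# An in-memory stand-in for the log file; a real run would open() it instead.
log = io.StringIO("paycheck-lowmotion.vob\tmbd=2:v4mv:scplx_mask=0\tpass\n")
tests = [
    ("paycheck-lowmotion.vob", "mbd=2:v4mv:scplx_mask=0"),
    ("paycheck-lowmotion.vob", "mbd=1:vmax_b_frames=1"),
]
todo = pending(tests, load_done(log))
```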

> The genetic algorithm could be run by many people at the same time
> and "good candidates" exchanged via email, so to have children
> from them.

Indeed, that will be quite possible with the script I'm making.

> It could be a sort of competition about who is providing the best
> candidates.
> All the logs should be saved to create a big collection of the tested
> combinations (useful when we change the quality metric).
>

Yes, you're right, but input files have to be carefully chosen. I think I 
chose a nice range: low/high motion movie/cartoon.

This script can of course also be run on a full-size movie for a fairer
test, but that will take a while :) There is definitely some potential in
releasing this script on the net and letting people run their own tests, but
it's not done yet.

>
> So you already have the list, the problem is that you will not test the
> combinations.
>
> I have some experience designing video codecs and the options are often
> strongly interdependent. For example in some stages you make an apparently
> wrong choice just to have something else work better (for example,
> zeroing low coefficients certainly degrades quality, but it can
> improve the efficiency of the following statistical compressor, so
> the quantizer can be finer and there is an overall net gain).
>

Hopefully I have made it clear now: I am testing combinations. I just don't
have many combinations at the moment; I will group and match after the first
results. So if, for example, dia=2 gives a good result for all my current
preopts, I add it to a new preopt with other lavcopts.

Interdependency is important, and I certainly don't have your experience, but
when lavcopts improve the PSNR value *and* make the filesize smaller, they
don't end up having a negative effect when combined with other lavcopts,
although the improvement can be smaller than when they are used on their own.
So there is no point testing each combination with and without those lavcopts;
tests on the net have shown this to be pointless. (Have a look at mbd=2, v4mv
and trell for example: no matter what you combined them with, they always made
an improvement compared to not using them.)

But theoretically, yes, there may be lavcopts that improve PSNR and make the
filesize smaller with some preopts, but not with others. I have only heard of
this in theory though; I wasn't able to find anything to back up that claim,
and it would be nice if you could provide an example.

>
> This is a complication deriving from the arbitrary separation in two
> classes.
>
> > This is a test for both quality and filesize, and I will compare ratios
> > of filesize:PSNR to see what options are better.
> >
> > Each file in $tests is tested with each preopt and each preopt is tested
> > with each lavcopt (unless the test was already done or there are
> > conflicting options). This gives a large variety and a very fair test.
>
> Large variety, but the sector of the options space you're exploring is
> strongly based on your classification of the options.
>
> > I've decided to generate graphs using calc from openoffice after the test
> > returns its results.
>
> Maybe the best graph would be a 3D scatter diagram (PSNR, filesize, time).
> Anyway having the data is the first step.
>
> (Any reason you replied personally without CC to the ML?).

If I understand correctly, the results depend on the type of input file,
yes. I think that's an advantage though, as then you can make your choice
based on what you're encoding, not based on inaccurate random results found
all over the mailing list, where they re-encode already encoded trailers,
etc.

Also, yes, graphs will be dealt with later, but I would like some comments on
the result formatting. Would something like this be sufficient:

--------------------------------------------------------------------------
file: paycheck-lowmotion.vob 900
--------------------------------------------------------------------------
mbd=2:v4mv:scplx_mask=0   pass   40.23   20.6Mb   1.952912621   82sec
mbd=1:vmax_b_frames=1     pass   40.25   20.2Mb   1.992574257   94sec
precmp=9:cmp=9:subcmp=9   fail    0       8Mb     0             13sec

Not too great for parsing though... Any suggestions?
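
One possibility, sketched here in Python purely as an illustration, would be a
tab-separated file with a header row, so any tool can split it without
guessing at column widths. The field names are made up, sizes are assumed to
be in megabytes, and the ratio is computed as PSNR divided by file size, which
is what the sample rows above show (40.23 / 20.6):

```python
import csv
import io

FIELDS = ["file", "options", "status", "psnr", "size_mb", "ratio", "seconds"]

def make_row(file, options, status, psnr, size_mb, seconds):
    # Ratio as in the sample rows: PSNR divided by file size (0 when PSNR is 0).
    ratio = psnr / size_mb if size_mb else 0.0
    return {
        "file": file, "options": options, "status": status,
        "psnr": f"{psnr:.2f}", "size_mb": f"{size_mb:.1f}",
        "ratio": f"{ratio:.9f}", "seconds": str(seconds),
    }

# Written to memory here; a real run would write to the log file instead.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
writer.writeheader()
writer.writerow(make_row("paycheck-lowmotion.vob", "mbd=2:v4mv:scplx_mask=0",
                         "pass", 40.23, 20.6, 82))
writer.writerow(make_row("paycheck-lowmotion.vob", "precmp=9:cmp=9:subcmp=9",
                         "fail", 0.0, 8.0, 13))

# Reading it back needs no custom parsing.
rows = list(csv.DictReader(io.StringIO(out.getvalue()), delimiter="\t"))
```

Repeating the file name on every row costs a few bytes but means each line
stands on its own, which also makes the "was this test already run" check a
simple lookup.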

And no, there was no particular reason I replied to you personally; I just hit
reply and didn't notice the mplayer mailing list address wasn't in the CC.