[Scip] Inconsistent solving on different computers

Timo Berthold berthold at zib.de
Wed Nov 20 18:18:46 CET 2013


Hi Stefan,

It's exactly as Pierre says. This is a behavior that can be observed for all
major solvers, and it is often even more extreme than in your example.

A few amendments to Pierre's e-mail:

1) The good news: it can even be exploited algorithmically. See, e.g., a
recent report by Matteo Fischetti et al.:
http://www.dei.unipd.it/~salvagni/pdf/randomness.pdf or the ramp-up
phase of ParaSCIP: http://opus4.kobv.de/opus4-zib/files/1192/ZR-10-27.pdf

2) A main reason is imperfect tie breaking: due to numerical errors,
values are equal although they should not be in exact arithmetic, or
values are different although they should be the same.
Whether or not such errors occur can depend on the architecture, since
basic arithmetic operations are implemented differently on different
systems. Thus, floating-point values that arise from the same piece of
code may differ in the last bit. a) Such differences can propagate.
b) Whether or not you treat two values as equal will at some point result
in different paths being taken by the solver. This behavior is therefore,
to a certain extent, unavoidable, although I am not claiming that every
possible effort has been made to avoid it in SCIP (or Cplex, or ...). It
is a bit like tilting at windmills.


3)
>> Asked provocatively: How can I ever make statements about the
>> performance if it's not reproducible (at least up to scaling)?
The results are reproducible; you simply need the same architecture and
the same OS, with the same versions of the compiler, libraries, and so on.
That is asking a lot, but it is a good reason why one should specify
exactly how computational experiments were carried out.

4) The most important question has not been asked yet: How can one ensure
that improvements in the algorithm that one observes are actually caused
by a new idea that one added, and not only an effect of performance
variability?
Partial answers that I am aware of are:
a) use huge test sets (hoping that the random noise flattens out),
b) use several permutations of each instance and average the results to
capture the performance variability,
c) use statistical tests to analyze your results,
d) use more than one (or two, or three) performance measures at a time
to illustrate the impact of a new algorithm.
I would be happy to hear about further suggestions.

Best regards,
Timo

> Stefan,
>
> I believe what you describe is referred to as performance variability.
> This is a known phenomenon and it is not specific to SCIP.
> See for instance http://coral.ie.lehigh.edu/~jeff/mip-2008/talks/danna.pdf
> See also Section 5 of Mixed Integer Programming Library version 5, Koch et
> al, Math. Prog. Comp. (2011).
>
> I hope this helps.
> Pierre
>
> ----- Original Message -----
>
>> From: "Stefan Lörwald" <stefan.loerwald at gmail.com>
>> To: "SCIP Mailingliste" <scip at zib.de>
>> Sent: Wednesday, November 20, 2013 11:11:29 AM
>> Subject: [Scip] Inconsistent solving on different computers
>
>> Dear SCIP Developers,
>
>> a friend of mine recently discovered that SCIP behaves differently on
>> different machines.
>
>> Although on every tested machine, the solution value is identical,
>> the solution process differs quite radically. These are the results
>> for the LOP example with problem ex1.mat
>
>> Virtual machine (64 bit)
>
>> Solutions found : 11 (11 improvements)
>> First Solution : +2.65100000000000e+03 (in run 1, after 1 nodes, 0.29
>> seconds, depth 592, found by <shiftandpropagate>)
>> Solving Time (sec) : 21.39
>> Solving Nodes : 506
>> Root Dual Bound : +3.10268636989738e+03
>
>> Virtual machine (32 bit)
>
>> Solutions found : 10 (9 improvements(null))
>> First Solution : +2.65100000000000e+03 (in run 1, after 1 nodes, 0.38
>> seconds, depth 592, found by <shiftandpropagate>)
>> Solving Time (sec) : 28.52
>> Solving Nodes : 546
>> Root Dual Bound : +3.10271332082478e+03
>
>> netbook (32 bit)
>
>> Solutions found : 11 (10 improvements(null))
>> First Solution : +2.65100000000000e+03 (in run 1, after 1 nodes, 3.17
>> seconds, depth 592, found by <shiftandpropagate>)
>> Solving Time (sec) : 245.61
>> Solving Nodes : 1093
>> Root Dual Bound : +3.10271332082478e+03
>
>> Our expectation was that both the number of solutions and the number
>> of solving nodes should be identical, but they aren't.
>> The actual solving time is of course irrelevant because it depends
>> heavily on the hardware.
>> To my knowledge, on every platform the code was compiled with the
>> newest version of SCIP / SoPlex available.
>
>> Now two possible causes come to my mind:
>
>> Undefined behaviour and/or unspecified behaviour.
>
>> So the question is: Is this behaviour intended? If so, why? For
>> comparability it seems logical to me to compare the number of
>> solving nodes rather than solving time. That's because time
>> measurements are not portable (practically always scaled). However
>> the number of logical steps the algorithm takes (if deterministic)
>> should be the same.
>> Asked provocatively: How can I ever make statements about the
>> performance if it's not reproducible (at least up to scaling)?
>
>> Thanks in advance for clarification,
>> Stefan
>> _______________________________________________
>> Scip mailing list
>> Scip at zib.de
>> http://listserv.zib.de/mailman/listinfo/scip
> _______________________________________________
> Scip mailing list
> Scip at zib.de
> http://listserv.zib.de/mailman/listinfo/scip
>



