<div dir="ltr"><div>Hi Selin,</div><div><br></div><div>Depending on what you are working on, you may be able to use the total number of LP iterations or B&B nodes as a surrogate measure for time. These are unaffected by cluster issues.</div><div><br></div><div>As a start, you could determine how closely related to time these measures are on your single-thread experiments, using a simple linear regression for example.</div><div><br></div><div>Depending on how related these measures are, you may be able to keep using the cluster with 48 concurrent jobs and track performance using these surrogate measures for some or all of your results (except perhaps the final ones).</div><div><br></div><div>I hope that makes sense.</div><div><br></div><div>Kind regards</div><div>Pierre<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 28 Nov 2024 at 11:12, Bayramoglu, Selin <<a href="mailto:sbayramoglu3@gatech.edu">sbayramoglu3@gatech.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="msg-4030861796854737413">
<div lang="en-TR" style="overflow-wrap: break-word;">
<div class="m_-4030861796854737413WordSection1">
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt">Hi Sascha,<u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt">Many thanks for your detailed response! I will definitely experiment with allocating a fraction of the cores for the runs.<u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt">Best wishes,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt"><u></u> <u></u></span></p>
<div>
<div>
<p class="MsoNormal"><b><span lang="TR" style="font-size:12pt;font-family:"Calibri",sans-serif;color:rgb(51,51,51)">Selin Bayramoglu<u></u><u></u></span></b></p>
<p class="MsoNormal"><span lang="TR" style="font-size:12pt;font-family:"Calibri",sans-serif;color:rgb(51,51,51)">Ph.D. Student<u></u><u></u></span></p>
<p class="MsoNormal"><b><span style="font-size:12pt;font-family:"Calibri",sans-serif;color:rgb(51,51,51)">H. Milton Stewart School of Industrial and Systems Engineering</span></b><span style="font-size:12pt;font-family:"Calibri",sans-serif;color:black"><br>
</span><span style="font-size:12pt;font-family:"Calibri",sans-serif;color:rgb(51,51,51)">Georgia Institute of Technology</span><span style="font-size:12pt;font-family:"Calibri",sans-serif;color:black"><u></u><u></u></span></p>
</div>
</div>
<p class="MsoNormal"><a href="mailto:sbayramoglu3@gatech.edu" target="_blank"><span lang="TR" style="font-size:12pt;font-family:"Calibri",sans-serif;color:rgb(5,99,193)">sbayramoglu3@gatech.edu</span></a><span style="font-size:11pt"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt"><u></u> <u></u></span></p>
<div id="m_-4030861796854737413mail-editor-reference-message-container">
<div>
<div>
<div style="border-width:1pt medium medium;border-style:solid none none;border-color:rgb(181,196,223) currentcolor currentcolor;padding:3pt 0cm 0cm">
<p class="MsoNormal" style="margin-right:0cm;margin-bottom:12pt;margin-left:36pt">
<b><span style="font-size:12pt;color:black">From: </span></b><span style="font-size:12pt;color:black">s schnug <<a href="mailto:sascha.schnug@gmail.com" target="_blank">sascha.schnug@gmail.com</a>><br>
<b>Date: </b>Tuesday, 26 November 2024 at 18:04<br>
<b>To: </b>Bayramoglu, Selin <<a href="mailto:sbayramoglu3@gatech.edu" target="_blank">sbayramoglu3@gatech.edu</a>><br>
<b>Cc: </b><a href="mailto:scip@zib.de" target="_blank">scip@zib.de</a> <<a href="mailto:scip@zib.de" target="_blank">scip@zib.de</a>><br>
<b>Subject: </b>Re: [SCIP] About variances in solving time<u></u><u></u></span></p>
</div>
<div>
<div>
<p class="MsoNormal" style="margin-left:36pt"><span style="font-size:12pt">Hi,<br>
<br>
I think this is a very complex, multi-dimensional question.<br>
<br>
Here are some of the sub-questions, with my personal opinions (I'm not a SCIP dev):<br>
<br>
1: What influences solving time for "deterministic tasks" (meaning here: same decisions/path, but potentially finishing earlier or later)?<br>
- 1a: Thread starvation (e.g. overcommitment of CPU cores)<br>
- 1b: Memory limit (leading to memory thrashing)<br>
- 1c: Memory-bandwidth limit<br>
- 1d: Cache invalidation<br>
- 1e: I/O bottleneck<br>
<br>
1a is related to question 3.<br>
1b should be easy to estimate and check. Take care here, because memory thrashing is catastrophic for solving time (it is probably not your issue, as it usually causes far worse than a 70% slow-down).<br>
1c can be a real issue (imho) when many LP/MILP solvers run concurrently, as some subroutines (e.g. simplex) are quite memory-bandwidth hungry. Tuning is recommended (see the recommendation at the end).<br>
1d: there is not much one can do short of removing concurrent workers / noisy neighbors. Hopefully it is not that important, as it is mostly an L3-cache issue (and L3 is slow anyway) until threads are highly overcommitted (many context switches).<br>
1e is only relevant when all 48 of your tasks read their input at the same time. I'm not sure it is relevant, but it is easy to fix if needed: add a randomized sleep before doing I/O and do not count it towards solving time.<br>
<br>
Conclusion: 1a would be a real problem (see 3); 1b should be analyzed and 1c might need tuning. We ignore 1d, and 1e can be analyzed and addressed if nothing else works.<br>
<br>
2: "Determinism" (again: decisions/path)<br>
- 2a: Slow-down while using a different solver-path (different decisions)<br>
- 2b: Slow-down while not using a different solver-path (same path, but faster or slower)<br>
<br>
Non-deterministic throughput (OS scheduling and all the topics above) is bad in itself, but it is probably even worse when the algorithm behaves in a time-dependent manner. In that case (2a), any bad effect can become much, much worse.<br>
<br>
I'm somewhat afraid (<u>and <b>this is a question for the SCIP devs</b></u>) that SCIP, unlike some other solvers that are based on cooperative scheduling / task interleaving, does not necessarily follow the same path before being cut off by a time limit (or terminating
with a solution). I read source code such as "#define DEFAULT_HEURTIMELIMIT 60.0 /**< time limit for a single heuristic run */" as an indication that the effects from 1 can make two solver runs "diverge" in their decisions (run A can work with
a solution found by a heuristic; run B cannot). The bad thing here is that such decisions can have an exponentially bad influence on solving time (compare "performance variability in mixed-integer programming").<br>
<br>
Conclusion: Probably not much one can do except reduce the effects of 1.<br>
<br>
3: CPU-core overcommitment (self-scheduled tasks plus "noisy neighbors", i.e. other people's scheduled tasks):<br>
- 3a: Scheduling 100 tasks on 48 cores leads to <= 48 tasks running concurrently (bounded pool)<br>
- 3b: Scheduling 100 tasks on 48 cores leads to 49 or more tasks running concurrently (unbounded pool)<br>
<br>
This is related to 1a.<br>
<br>
Conclusion: 3b is bad in any case, but even worse when other people's tasks are not controlled by the scheduler. I assume 3a is the case.
<u>If not, it's not a good environment/setup for benchmarking.</u><br>
<br>
4: Single-core performance variability<br>
- 4a: Turbo-boosted CPUs<br>
- 4b: More "stable" CPUs<br>
<br>
Desktop CPUs in particular are hard to use for benchmarks because of turbo boosting and similar features. Some libraries, such as "google benchmark", even try to detect this and print strong warnings.<br>
There are similar (inverse) effects related to noisy neighbors (CPU-clock throttling based on AVX-512 instruction counting).<br>
<br>
This is a terrible situation for everyone doing benchmarks. But hopefully your VM is a classic server-like setup where turbo boosting and the like are absent and things are much more symmetric.
</span></p>
<div>
<p class="MsoNormal" style="margin-left:36pt"><u><span style="font-size:12pt">If not: again... bad environment/setup.</span></u><span style="font-size:12pt"><u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:36pt"><span style="font-size:12pt">AVX-512 throttling can hurt a lot (~30% lower clock speed), but very little software uses these instructions. I would guess that simplex-based MILP solvers (SCIP in particular) do not use them either. I'm not so sure about
interior-point methods (IPMs), though.<br>
<br>
-----<br>
<br>
Long story short:<br>
<br>
Assuming that it's a server-like environment ("stable CPU cores" + high memory bandwidth), I have only one real recommendation:<br>
<br>
- Don't let the scheduler run with a pool size of N = number of CPU cores = 48, but with some M ≪ N.<br>
- Tune M by experiment: as M shrinks towards 1 (your serial experiments), performance variability should converge to 0. Decide for yourself whether 1%, 5% or 10% deviation is acceptable (given that 70% is not), then pick an M that achieves it.<br>
- This should help with all topics in 1, 2, and 3.<br>
<br>
***** Hopefully, you have some kind of control over the scheduler to do this! (Maybe there is an admin you can contact.) *****<br>
Best case: M includes other people's tasks (probably a strong no from your admin for economic reasons).<br>
Maybe okay: M includes only your tasks (which, as we now know, are very CPU- and memory-bandwidth-hungry).</span></p>
</div>
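<div>
<p class="MsoNormal" style="margin-left:36pt"><span style="font-size:12pt">A minimal Python sketch of the tuning step above, assuming you have already measured per-instance runtimes for the serial run and for a few pool sizes M. The function names and all numbers are hypothetical, and choosing the largest M within tolerance is only one possible reading of "pick M achieving the % desired":</span></p>
<pre style="margin-left:36pt">
```python
from statistics import mean


def relative_slowdown(serial, parallel):
    """Mean relative slowdown of the parallel runtimes versus the serial ones.

    Both lists hold one runtime per instance, in the same instance order.
    """
    return mean((p - s) / s for s, p in zip(serial, parallel))


def pick_pool_size(serial, runs_by_m, tolerance):
    """Return the largest tested pool size M whose slowdown is within tolerance.

    runs_by_m maps a pool size M to the runtimes measured with that pool size.
    Returns None if no tested M qualifies.
    """
    acceptable = [
        m
        for m, times in runs_by_m.items()
        if tolerance >= relative_slowdown(serial, times)
    ]
    return max(acceptable, default=None)


if __name__ == "__main__":
    # Made-up runtimes in seconds for two instances (illustration only).
    serial = [10.0, 20.0]
    runs_by_m = {8: [10.2, 20.4], 24: [11.0, 22.0], 48: [17.0, 34.0]}
    print(pick_pool_size(serial, runs_by_m, tolerance=0.05))  # prints 8
```
</pre>
<p class="MsoNormal" style="margin-left:36pt"><span style="font-size:12pt">With real data, the same comparison could also use the worst-case slowdown instead of the mean if single outliers matter for your results.</span></p>
</div>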
<div>
<p class="MsoNormal" style="margin-left:36pt"><span style="font-size:12pt"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:36pt"><span style="font-size:12pt">(Without control over the scheduler, there might be some merit in scheduling dummy tasks that we control ourselves to limit the negative impact. But in most cases this assumes something about the scheduler's decisions
and is a rather hacky endeavour.)<br>
<br>
Greetings,<br>
Sascha<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:36pt"><span style="font-size:12pt"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:36pt"><span style="font-size:12pt">PS: There is, of course, also statistics. Maybe it makes sense to replace the serial runs with
<u>repeating each parallel run multiple times and smoothing the results</u> (including an analysis of variance).</span></p>
</div>
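<div>
<p class="MsoNormal" style="margin-left:36pt"><span style="font-size:12pt">One way to act on the PS, sketched in Python under the assumption that each instance has been run several times per setting. The helper names and sample numbers are hypothetical; the F statistic is the textbook one-way ANOVA ratio and nothing SCIP-specific:</span></p>
<pre style="margin-left:36pt">
```python
from statistics import mean, stdev


def summarize(repeats):
    """Mean, sample standard deviation and coefficient of variation."""
    m = mean(repeats)
    s = stdev(repeats)
    return {"mean": m, "stdev": s, "cv": s / m}


def f_statistic(groups):
    """One-way ANOVA F statistic: between-group over within-group variance.

    Each group holds the repeated runtimes of one setting on one instance.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups) / (k - 1)
    within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n - k)
    return between / within


if __name__ == "__main__":
    # Made-up repeated runtimes (seconds) of one instance under two settings.
    setting_a = [10.0, 12.0, 14.0]
    setting_b = [11.0, 13.0, 15.0]
    print(summarize(setting_a))
    print(f_statistic([setting_a, setting_b]))
```
</pre>
<p class="MsoNormal" style="margin-left:36pt"><span style="font-size:12pt">A large F value suggests that the difference between settings exceeds the run-to-run noise; a small one suggests the noise dominates and more repetitions (or a smaller pool size) may be needed.</span></p>
</div>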
</div>
<p class="MsoNormal" style="margin-left:36pt"><span style="font-size:12pt"><u></u> <u></u></span></p>
<div>
<div>
<p class="MsoNormal" style="margin-left:36pt"><span style="font-size:12pt">On Tue, 26 Nov 2024 at 21:48, Bayramoglu, Selin <</span><a href="mailto:sbayramoglu3@gatech.edu" target="_blank"><span style="font-size:12pt">sbayramoglu3@gatech.edu</span></a><span style="font-size:12pt">> wrote:</span></p>
</div>
<blockquote style="border-width:medium medium medium 1pt;border-style:none none none solid;border-color:currentcolor currentcolor currentcolor rgb(204,204,204);padding:0cm 0cm 0cm 6pt;margin:5pt 0cm 5pt 4.8pt">
<div>
<div>
<div>
<p class="MsoNormal" style="margin-left:36pt">
<span lang="EN-US" style="font-size:11pt">Hello,</span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:36pt">
<span lang="EN-US" style="font-size:11pt"> </span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:36pt">
<span lang="EN-US" style="font-size:11pt">I have a question regarding the timing of algorithms. I am recording the solving times of SCIP on 100 problems. My end goal is to compare SCIP under different configurations. I am running my experiments on a dedicated
48-core virtual machine. I have run these problems in two settings:</span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:36pt">
<span lang="EN-US" style="font-size:11pt"> </span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:36pt">
<span style="font-size:11pt">1. </span><span lang="EN-US" style="font-size:11pt">Submit a single job which solves the problems sequentially on a single core. I did not submit any other job, but the machine has regular processes going on in the background.</span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:36pt">
<span style="font-size:11pt">2. </span><span lang="EN-US" style="font-size:11pt">Submit 100 jobs and use all cores of the machine. The jobs get assigned to different cores of the machine and run in parallel. </span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:36pt">
<span lang="EN-US" style="font-size:11pt"> </span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:36pt">
<span lang="EN-US" style="font-size:11pt">I use HTCondor for scheduling jobs. I observed that the solving time can increase by up to 70% in case 2 compared to case 1. In addition, running case 1 several times gives very consistent runtimes, but the runtimes
vary a lot more in case 2. </span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:36pt">
<span lang="EN-US" style="font-size:11pt"> </span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:36pt">
<span lang="EN-US" style="font-size:11pt">I was wondering whether there are measures I can take so that the timings are both stable (one can count on a similar solving time for every run on the same problem) and short. Even though the results
from case 1 are satisfactory, it is very time-consuming to run all jobs on a single core and keep the remaining 47 cores idle.</span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:36pt">
<span lang="EN-US" style="font-size:11pt"> </span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:36pt">
<span lang="EN-US" style="font-size:11pt">I would be happy to take any suggestions on this matter.</span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:36pt">
<span lang="EN-US" style="font-size:11pt"> </span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:36pt">
<span lang="EN-US" style="font-size:11pt">Thanks.</span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:36pt">
<span lang="EN-US" style="font-size:11pt"> </span><span style="font-size:12pt"><u></u><u></u></span></p>
<div>
<div>
<p class="MsoNormal" style="margin-left:36pt">
<b><span lang="TR" style="font-size:12pt;font-family:"Calibri",sans-serif;color:rgb(51,51,51)">Selin Bayramoglu</span></b><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:36pt">
<span lang="TR" style="font-size:12pt;font-family:"Calibri",sans-serif;color:rgb(51,51,51)">Ph.D. Student</span><span style="font-size:12pt"><u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:36pt">
<b><span style="font-size:12pt;font-family:"Calibri",sans-serif;color:rgb(51,51,51)">H. Milton Stewart School of Industrial and Systems Engineering</span></b><span style="font-size:12pt;font-family:"Calibri",sans-serif;color:black"><br>
</span><span style="font-size:12pt;font-family:"Calibri",sans-serif;color:rgb(51,51,51)">Georgia Institute of Technology</span><span style="font-size:12pt"><u></u><u></u></span></p>
</div>
</div>
<p class="MsoNormal" style="margin-left:36pt">
<a href="mailto:sbayramoglu3@gatech.edu" target="_blank"><span lang="TR" style="font-size:12pt;font-family:"Calibri",sans-serif;color:rgb(5,99,193)">sbayramoglu3@gatech.edu</span></a><span style="font-size:12pt"><u></u><u></u></span></p>
</div>
</div>
<p class="MsoNormal" style="margin-left:36pt"><span style="font-size:12pt">_______________________________________________<br>
Scip mailing list<br>
</span><a href="mailto:Scip@zib.de" target="_blank"><span style="font-size:12pt">Scip@zib.de</span></a><span style="font-size:12pt"><br>
</span><a href="https://listserv.zib.de/mailman/listinfo/scip" target="_blank"><span style="font-size:12pt">https://listserv.zib.de/mailman/listinfo/scip</span></a><span style="font-size:12pt"><u></u><u></u></span></p>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div></blockquote></div>
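<div dir="ltr">
<div>A minimal Python sketch of the regression check suggested at the top of this thread, assuming you have logged LP iterations (or branch-and-bound nodes) and solving time per instance from the single-thread runs. The function name and the sample numbers are hypothetical:</div>
<pre>
```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y against x.

    Returns (slope, intercept, r_squared) for the model
    y = slope * x + intercept.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Made-up single-thread measurements, one entry per instance.
    lp_iterations = [1000, 2000, 3000, 4000]
    seconds = [1.5, 2.5, 3.5, 4.5]
    slope, intercept, r_squared = fit_line(lp_iterations, seconds)
    print(slope, intercept, r_squared)
```
</pre>
<div>An R^2 close to 1 on your own data would suggest that the iteration or node count can stand in for time; a low value would suggest it cannot.</div>
</div>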