[SCIP] Segmentation fault caused by 64-bit address returned from SCIPgetBestSol

avrech at campus.technion.ac.il avrech at campus.technion.ac.il
Sun May 31 10:20:34 CEST 2020


Hello,

I'm using SCIP and PySCIPOpt to learn a better cut selection algorithm.
I added some functionality to the SCIP source code to help me calculate the directed cutoff distance the same way it is done in SCIPselectCuts().
My function is:

SCIP_Real SCIProwGetDirCutoffDistance(
   SCIP*                 scip,               /**< SCIP data structure */
   SCIP_ROW*             row                 /**< LP row */
   )
{
    /* get the best current solution as in SCIPselectCuts */
    SCIP_SOL* sol = SCIPgetBestSol(scip);
    if( sol != NULL )
    {
        return SCIProwGetLPSolCutoffDistance(row, scip->set, scip->stat, sol, scip->lp);
    }
    else
    {
        return 0.0;
    }
}
I implemented this function in `lp.c`, declared it with `SCIP_EXPORT` in `pub_lp.h`, and added the corresponding declaration to PySCIPOpt's `scip.pxd`.
At runtime, the function is called from within a separator's sepaexeclp() method, which is written in Python.
My program runs, and at some point, after solving many instances without any problem,
I get a segmentation fault:
Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007fffd7b3e3c4 in SCIPsolGetVal (sol=0xffffffff8043b850, set=0x7e470390, stat=0x7fc95db0, var=0x60eb2b78) at /home/avrech/scipoptsuite-6.0.2-avrech/scip/src/scip/sol.c:1314
1314   assert(sol->solorigin == SCIP_SOLORIGIN_ORIGINAL
1: sol = (SCIP_SOL *) 0xffffffff8043b850
2: *sol = <error: Cannot access memory at address 0xffffffff8043b850>
...
The code crashes while trying to access `sol->solorigin`.
Printing the backtrace:
(gdb) bt
#0  0x00007fffd7b3e3c4 in SCIPsolGetVal (sol=0xffffffff8043b850, set=0x7e470390, stat=0x7fc95db0, var=0x60eb2b78) at /home/avrech/scipoptsuite-6.0.2-avrech/scip/src/scip/sol.c:1314
#1  0x00007fffd78b5b73 in SCIProwGetLPSolCutoffDistance (row=0x7c3b9ff0, set=0x7e470390, stat=0x7fc95db0, sol=0xffffffff8043b850, lp=0x790ce040)
    at /home/avrech/scipoptsuite-6.0.2-avrech/scip/src/scip/lp.c:6730
#2  0x00007fffd78b5eba in SCIProwGetDirCutoffDistance (scip=0x7eecb5c0, row=0x7c3b9ff0) at /home/avrech/scipoptsuite-6.0.2-avrech/scip/src/scip/lp.c:6772
#3  0x00007fffd84c8229 in __pyx_pf_9pyscipopt_4scip_5Model_530getState (__pyx_v_prev_state=<optimized out>, __pyx_v_state_format=0x7ffff5aa45e0, __pyx_v_query=<optimized out>,
    __pyx_v_get_available_cuts=0x9d3580 <_Py_TrueStruct>, __pyx_v_self=<optimized out>) at src/pyscipopt/scip.c:138080
I see that `sol` is unexpectedly a 64-bit address.
This pointer was produced by the call `SCIP_SOL* sol = SCIPgetBestSol(scip);` above.
I traced back to the frame where it was generated and found that it really is a 64-bit value there.
(gdb) frame 2
#2  0x00007fffd78b5eba in SCIProwGetDirCutoffDistance (scip=0x7eecb5c0, row=0x7c3b9ff0) at /home/avrech/scipoptsuite-6.0.2-avrech/scip/src/scip/lp.c:6772
warning: Source file is more recent than executable.
6772   case 'd':
(gdb) info locals
sol = 0xffffffff8043b850
(gdb) display sol
81: sol = (SCIP_SOL *) 0xffffffff8043b850
(gdb) display *sol
82: *sol = <error: Cannot access memory at address 0xffffffff8043b850>
Looking into `SCIPgetBestSol(scip)`, I see that it returns `scip->primal->sols[0]`, which holds the expected value, an address that fits in 32 bits:
(gdb) display scip->primal->sols[0]
100: scip->primal->sols[0] = (SCIP_SOL *) 0x8043b850
Moreover, if I override `sol` manually using `SCIPgetBestSol(scip)`, it gets the correct 32-bit value:
(gdb) set var sol = SCIPgetBestSol(scip)
(gdb) display sol
101: sol = (SCIP_SOL *) 0x8043b850
(gdb) display *sol
102: *sol = {obj = -14.386630020491893, time = 0.031049999999999998, ..., solorigin = SCIP_SOLORIGIN_ZERO, hasinfval = 0}
(gdb) display sol->solorigin
103: sol->solorigin = SCIP_SOLORIGIN_ZERO
Now that `sol` is a 32-bit address, it is accessible, and everything seems to work fine.
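
For reference, the faulty pointer equals the correct one with bit 31 copied into the upper 32 bits: 0x8043b850 has its top bit set, and sign-extending it gives 0xffffffff8043b850. In C this pattern typically appears when a pointer passes through a 32-bit signed `int` somewhere along the way; whether that is what happens in this build is an assumption that the traces above do not confirm. The sketch below is standalone C, not SCIP code, and the helper names `sign_extend_low32` and `check_addresses_from_trace` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of SCIP): keep the low 32 bits of an
 * address and sign-extend them to 64 bits, mimicking the effect of
 * storing a pointer in a 32-bit signed int and widening it again. */
static uint64_t sign_extend_low32(uint64_t addr)
{
    uint64_t low = addr & UINT64_C(0xffffffff);

    /* If bit 31 is set, the widened value has all upper bits set. */
    if (low & UINT64_C(0x80000000))
        low |= UINT64_C(0xffffffff00000000);

    return low;
}

/* Check the addresses that appear in the gdb output above. */
static void check_addresses_from_trace(void)
{
    /* scip->primal->sols[0] turns into the faulty `sol` value. */
    assert(sign_extend_low32(UINT64_C(0x8043b850)) == UINT64_C(0xffffffff8043b850));

    /* `set` and `row` lie below 0x80000000 and come back unchanged. */
    assert(sign_extend_low32(UINT64_C(0x7e470390)) == UINT64_C(0x7e470390));
    assert(sign_extend_low32(UINT64_C(0x7c3b9ff0)) == UINT64_C(0x7c3b9ff0));
}
```

Calling `check_addresses_from_trace()` from a `main` function runs the checks. The other pointers in the backtrace (`scip`, `set`, `stat`, `row`, `lp`) all lie below 0x80000000, so they would survive such a round trip unchanged, which would be consistent with the crash appearing only sporadically.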

I don't understand how this faulty conversion into a 64-bit address happens.
It seems to happen sporadically, at any step of the solving process.
I have reproduced it many times; it can happen after solving hundreds of instances, either in the first separation round
or after many LP rounds that ran without any problem.

What could be the reason for this weird segfault?
Have I done something wrong in my code?
Is there an alternative way to compute the directed cutoff distance of an LP row using PySCIPOpt?

I didn't check the compilation output for warnings the last time I compiled.
If needed, I will compile again and report any warnings.

More technical details:
I'm using
scipoptsuite-6.0.2 (with some extensions as shown)
PySCIPOpt-2.2.3 (also with some extensions)
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Python 3.6.9 (virtualenv)
pip 20.1.1
Cython 0.29.18


Thank you for helping!
Avrech







