Detailed Description
ensemble cut selector
This cut selector is based on the paper: M. Turner, T. Berthold, and M. Besançon.
A Context-Aware Cutting Plane Selection Algorithm for Mixed-Integer Programming.
arXiv preprint arXiv:2307.07322 (2023).
The ensemble cut selector scores cuts by using a weighted sum of normalised efficacy, normalised directed cutoff distance (only at the root node), normalised expected objective improvement, objective parallelism, integer support, density, dynamism, normalised pseudo-costs, and normalised number of locks. It also has a variety of filtering methods. If density filtering is enabled, then it filters all cuts before scoring over some relative density threshold. After scoring, it selects the cuts with the highest score, where after each selection, the remaining cuts are either filtered or penalised via parallelism checks. Whether the cuts are filtered or penalised is a users choice. The algorithm terminates when some limit of selected cuts is reached, there are no cuts remaining to select, or the score of all remaining cuts is below minscore.
If a cut is given by \( a^T x \leq b \), then
- the efficacy is defined as the distance between the LP solution and the hyperplane \( a^T x = b \). It is normalised by the largest efficacy from the given array of cuts. ((log(eff(cut) + 1) / log(maxeff + 1))**2 ;
- the directed cutoff distance is defined as the distance between the LP solution and the hyperplane \( a^T x = b \) restricted to the line segment joining the LP solution to the currently best primal solution; therefore, it is only defined when a primal solution is available. It is normalised by the largest cutoff distance from the given array of cuts. ((log(dcd(cut) + 1) / log(maxdcd + 1))**2;
- the objective parallelism is how parallel the vector \( a \) is w.r.t. the objective function \( c \). That is, the objective parallelism is given by \( \frac{a^T c}{\|a\| \|c\|} \). Notice that the vectors are parallel when this formula returns 1;
- the expected objective improvement is defined by the difference of the objective evaluated at the LP solution and when evaluated at the orthogonal projection onto the cut. As we normalise the value, we remove the objective vector multiplication from its calculation. We calculate it as efficacy * objective parallelism. We also normalise it according to the equation ((log(expimprov(cut) + 1) / log(maxexpimprov + 1))**2;
- the integer support of a cut is the ratio between the number of nonzero integer columns and the number of nonzero columns.
- the density of a cut is the ratio of non-zero entries divided by the number of columns in the LP;
- the dynamism of a cut is the ratio between the maximum absolute value over all coefficients and the minimum absolute value over all coefficients. If this ratio is below a threshold, we give the cut a flat reward for good numerics;
- the pseudo-cost score of the cut is the pseudo-cost score of each variable with non-zero coefficient in the cut multiplied by the distance in that variable dimension to the orthogonal projection of the LP solution onto the cut. We normalise the result by the maximum over all cuts: pscost / maxpscost
- the number of variable locks for a cut is the average amount of locks attached to variables with non-zero coefficients in the cut. We normalise the result by the maximum over all cuts: numlocks / maxnumlocks
These features of a cut can be recovered and/or computed with the functions SCIPgetCutEfficacy(), SCIPgetCutLPSolCutoffDistance(), SCIPgetRowObjParallelism(), and SCIPgetRowNumIntCols(), SCIProwGetNNonz(), SCIProwGetVals(), SCIProwGetCols(), SCIPgetVarPseudocostScore(), SCIPvarGetNLocksUp(), SCIPvarGetNLocksDown().
The filtering (density) works as follows: The non-forced cuts are scanned through, and any cut over a density threshold (0,1) is dropped from consideration.
The filtering / penalise (parallelism) step works as follows: First, the forced cuts — cuts that are going to enter the LP no matter what — are used to filter / penalise the non-forced cuts. This means that for each forced cut, fcut
, the parallelism between fcut
and every non-forced cut, cut
, is computed (the parallelism between two cuts \( a^T x \leq b \) and \( d^T x \leq e\) is \( \frac{a^T d}{\|a\| \|d\|} \)). If the parallelism with fcut
is larger or equal than some maximum threshold then it is either removed from consideration (if filter by parallelism), or its score is decreased (if penalise by parallelism). If the score drops below some threshold minscore
, then the cut is removed from consideration. Each time a cut is selected (which is always greedily w.r.t. the scores), the same filtering procedure for parallelism described above is run.
- Note
- The maximum parallelism is a parameter that can be set, as well as the weights for the score.
- In the case of no primal solution, the weight assigned to the directed cutoff distance is transferred to the efficacy.
Definition in file cutsel_ensemble.h.
#include "scip/scip.h"
Go to the source code of this file.
Functions | |
SCIP_RETCODE | SCIPincludeCutselEnsemble (SCIP *scip) |
SCIP_RETCODE | SCIPselectCutsEnsemble (SCIP *scip, SCIP_ROW **cuts, SCIP_ROW **forcedcuts, SCIP_CUTSELDATA *cutseldata, SCIP_Bool root, int ncuts, int nforcedcuts, int maxselectedcuts, int *nselectedcuts) |