I am trying to learn to fit a linear integer programming optimization model in R using the ompr
package that a colleague had previously fit using CPLEX/GAMS (specifically, the one described here: Haight et al. 2021). I am running my implementation on a Linux Supercomputing server at my University that has 248gb of memory, which I'd think would be sufficient for the job.
Here is my code and output from the failure report from the server:
#Read in the necessary pre-generated data and packages
library(pacman); library(dplyr); library(ROI); library(ompr); library(ompr.roi)
n.ij = readRDS(file="nij1.rds") #An indexing vector.
B = 10 #Budget constraint--inspect only 10 lakes maximum
#Initialize model prior to setting the objective.
mod1 = MILPModel() %>%
add_variable(u[i, j], type = "binary", i = 1:n.ij, j = 1:n.ij) %>%
add_variable(x[i], type = "binary", i = 1:n.ij) %>%
add_variable(x[j], type = "binary", j = 1:n.ij) %>%
add_constraint(x[i] + x[j] >= u[i,j], i = 1:n.ij, j = 1:n.ij) %>%
add_constraint(sum_expr(x[i], i = 1:n.ij) <= B)
#Read in the relevant adjacency matrix of boat movements between every pair of lakes.
boats.n.ij = readRDS(file="boatsnij1.rds")
#Some system and object size info.
system(paste0("cat /proc/",Sys.getpid(),"/status | grep VmSize"))
VmSize: 13017708 kB
object.size(mod1)
6798778288 bytes
#Now, set objective with this specific boats.n.ij file.
mod1.full = mod1 %>%
set_objective(sum_expr(u[i,j] * boats.n.ij[i, j], i = 1:n.ij, j = 1:n.ij))
Error in subCsp_ij(x, i, j, drop = drop) :
Cholmod error 'problem too large' at file ../Core/cholmod_sparse.c, line 89
Calls: %>% ... [ -> callGeneric -> eval -> eval -> [ -> [ -> subCsp_ij
Execution halted
For the purposes of creating a reproducible example, mock versions of n.ij
and boats.n.ij
can be generated as follows:
library(Matrix)
boats = rpois(7940*7940, 2)
keep = sample(c(0,1), 7940*7940, replace=T, prob = c(0.8, 0.2))
boat.dat = boats*keep
boats.n.ij = matrix(boat.dat, nrow=7940, ncol=7940)
diag(boats.n.ij) = 0
boats.n.ij = Matrix(boats.n.ij, sparse = T)
boats.n.ij[1:10, 1:10]
n.ij = 1:7940
Why am I failing to add the objective to my model? Is it just that I am implying the existence of three very large matrices (the decision matrix u
, the boats.n.ij
matrix, and their product matrix)? Is it because the model is already a file that is about 6.8gb? Is there a cap on memory or object size imposed by R I am running into? Are these functions just not capable of considering an objective with this many decision points?
I can confirm that I have been able to run a scaled-down version of the model on a very small subset of boats.n.ij
that optimizes just fine, so I don't think it's an issue with my model specification, but I could be wrong...I should also state explicitly I am not interested in solutions that do not involve solving this model in R, as that is the express objective here. I am open, however, to using other packages if there's a more robust one available (although I like this one otherwise).
Note: Unlike in the paper I've cited, I have eliminated the need for a vector called b.ij
that my colleague does use, so that isn't the issue here.
Edit: Note that @nicola's reforming of the objective will set and solve, but the original constraints and/or variables would no longer have the same relationship with it, so it'd be fitting a different model than the one I want to fit. In the original construction, only a max of 10 values in x[i], and thus a max of 10 unique values of i within the decision variable u[i,j], would be allowed to be 1s thanks to the constraint involving our budget parameter B
. In @nicola's version, far more than 10 unique values of i are permitted to be 1s within u[i,j]. It's actually unclear to me at least how the constraints as originally written interact with @nicola's objective, if at all. However, I suspect an objective like @nicola's could definitely be used to exploit the sparsity of my boats.n.ij matrix so as to avoid the "problem too large" error, but it would require the variables and/constraints to be modified accordingly. I changed the title of the question to be much clearer as to what I am looking for--I want to avoid the error but otherwise fit an equivalent model.
Second edit: @nicola's solution does work after all! However, the variables and constraints needed a little modification due to updates to ompr
since I posted this question. See the following toy example:
library(Matrix)
library(slam)
library(dplyr)
library(tidyr)
library(ROI)
library(ompr)
library(ompr.roi)
library(Rglpk)
library(ROI.plugin.glpk)
library(lattice)
set.seed(101)
N = 500
boats = rpois(N*N, 2)
keep = sample(c(0,1), N*N, replace=T, prob = c(0.97, 0.03))
boat.dat = boats*keep
boats.n.ij = Matrix(boat.dat, nrow=N, ncol=N, sparse =T)
diag(boats.n.ij) = 0
boats.n.ij[1:10, 1:10]
n.ij = N
B = 5
mod1 = MIPModel() %>%
add_variable(u[i, j], type = "binary", i = 1:n.ij, j = 1:n.ij) %>%
add_variable(x[i], type = "binary", i = 1:n.ij) %>%
add_variable(y[j], type = "binary", j = 1:n.ij) %>%
add_constraint(x[i] == y[j], i = 1:n.ij, j = 1:n.ij, i == j) %>%
add_constraint(sum_over(x[i], i = 1:n.ij) <= B) %>%
add_constraint(u[i,j] <= x[i] + y[j], i = 1:n.ij, j = 1:n.ij)
boatsSTM = as.simple_triplet_matrix(boats.n.ij)
#setting the objective function
mod.2nd = mod1 %>% set_objective(sum_over(u[boatsSTM$i[k], boatsSTM$j[k]] * boatsSTM$v[k], k = 1:length(boatsSTM$i)))
mod.2nd.solved = mod.2nd %>%
solve_model(with_ROI("glpk", verbose=TRUE))
testB = get_solution(mod.2nd.solved, u[i,j])
test2B = pivot_wider(testB, names_from = j, values_from = value) %>% dplyr::select(-variable, -i)
test3B = as.matrix(test2B, nrow=100)
levelplot(test3B)
An attempt:
require(slam)
boatsSTM<-as.simple_triplet_matrix(boats.n.ij)
...
#setting the objective function
set_objective(sum_expr(u[boatsSTM$i[k], boatsSTM$j[k]] * boatsSTM$v[k], k = 1:length(boatsSTM$i)))
We exploit the sparsity of the matrix. In a simple triplet matrix you just list the values that are not zero, implying that if an element is not listed is equal to zero. The values are denoted with a (i, j, v)
triplet, where i
denotes the row index, j
the column index and v
the value. So, for instance, a (2, 4, 10.32)
triplet denotes that m[2, 4] = 10.32
.
In your sum_expr
line, we exploit this and just add the elements that are different from zero. We don't multiply each element of u
with each element of boats
, since most are zero and irrelevant to the sum; rather we just do the above for the elements that matter.
The slam
package implements the simple triplet matrix which is, at its root, just a list of i
, j
and v
values.