I've been running some regressions at work and have been asked to set the intercept and regression coefficients to certain values (I have no specific numbers at the moment, but being able to do it is the point). I've had a look online and found some answers for linear regression using offset() (the example below is taken from a Stack Overflow answer that models prices with linear regression):
price ~ offset(c1*memory) + offset(c2*screen_size) + rep(c0, length(memory)) - 1
When I try applying the above approach to my logistic regression, I can't seem to change the intercept and I don't get anything from the summary() function. So, my logit:
## Logistic regression formula
my_logit <- glm(Insolvency ~ profits + assets, data = my_data, family = "binomial")
## Summary of logit
summary(my_logit)
And when I try offsetting:
## Logistic regression formula
my_logit <- glm(Insolvency ~ offset(c1 * profits) + offset(c2 * assets), data = my_data, family = "binomial")
## Summary of logit
summary(my_logit)
If I add rep(c0, length(Insolvency)) - 1 to the formula, the regression doesn't work. If I run the above, summary() reports only the intercept.
My question is, how can I manually set the regression coefficients and intercept to a particular value of my choosing and get R to spit out the summary? I realise that, given that R is trying to optimise something, if I set the values then I will probably not get the optimised result. But that's what I've been asked to do.
I can't post the real data as I may lose my job, but it looks something like the below, only much larger:
Insolvency  Lprofits      Lassets
0           -23.43471027  25.68146508
0           -23.39077178  25.7462893
0           -23.1376606   22.72271675
0           -22.95771212  24.3626251
0           -22.88628836  24.39917186
0           -22.69567881  26.66993697
0           -22.29604723  21.91259524
1           -22.07703701  23.80678002
You can use offset() for the intercept, too; it just has to be a vector as long as the other variables. You can also use -1 in the formula to suppress the estimated intercept:
my_data <- structure(
  list(
    Insolvency = c(0, 0, 0, 0, 0, 0, 0, 1),
    Lassets = c(25.68146508, 25.7462893, 22.72271675, 24.3626251,
                24.39917186, 26.66993697, 21.91259524, 23.80678002),
    Lprofits = c(-23.43471027, -23.39077178, -23.1376606, -22.95771212,
                 -22.88628836, -22.69567881, -22.29604723, -22.07703701)
  ),
  class = "data.frame",
  row.names = c(NA, -8L)
)

# Fixed values for the intercept (c0) and the two slopes (c1, c2)
c0 <- -1
c1 <- 1
c2 <- 1

# -1 drops the estimated intercept; every term is an offset, so nothing is estimated
my_logit <- glm(Insolvency ~ -1 +
                  offset(rep(c0, nrow(my_data))) +
                  offset(c1 * Lprofits) +
                  offset(c2 * Lassets),
                data = my_data,
                family = "binomial")
summary(my_logit)
#>
#> Call:
#> glm(formula = Insolvency ~ -1 +
#> offset(rep(c0, nrow(my_data))) +
#> offset(c1 * Lprofits) +
#> offset(c2 * Lassets),
#> family = "binomial",
#> data = my_data)
#>
#> Deviance Residuals:
#> Min 1Q Median 3Q Max
#> -2.4593 -1.7439 -1.3775 -0.6665 0.8870
#>
#> No Coefficients
#>
#> (Dispersion parameter for binomial family taken to be 1)
#>
#> Null deviance: 17.682 on 8 degrees of freedom
#> Residual deviance: 17.682 on 8 degrees of freedom
#> AIC: 17.682
#>
#> Number of Fisher Scoring iterations: 0
Created on 2023-03-22 with reprex v2.0.2
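Because every term is an offset, the model has no free parameters, which is presumably why the summary reports "No Coefficients" and zero Fisher scoring iterations. As a sanity check, the sketch below recomputes the fitted probabilities and the residual deviance by hand from c0, c1 and c2 and compares them with what glm() returns. The names eta_manual, p_manual and dev_manual are made up for this illustration; the data and constants are the ones from the reprex.

```r
# Same data and fixed values as in the reprex above
my_data <- data.frame(
  Insolvency = c(0, 0, 0, 0, 0, 0, 0, 1),
  Lprofits = c(-23.43471027, -23.39077178, -23.1376606, -22.95771212,
               -22.88628836, -22.69567881, -22.29604723, -22.07703701),
  Lassets = c(25.68146508, 25.7462893, 22.72271675, 24.3626251,
              24.39917186, 26.66993697, 21.91259524, 23.80678002)
)
c0 <- -1
c1 <- 1
c2 <- 1

my_logit <- glm(Insolvency ~ -1 +
                  offset(rep(c0, nrow(my_data))) +
                  offset(c1 * Lprofits) +
                  offset(c2 * Lassets),
                data = my_data,
                family = "binomial")

# Linear predictor and probabilities computed directly from the fixed values
# (eta_manual, p_manual, dev_manual are illustrative names, not part of glm)
eta_manual <- with(my_data, c0 + c1 * Lprofits + c2 * Lassets)
p_manual <- plogis(eta_manual)  # inverse logit

# For 0/1 outcomes the deviance is -2 times the Bernoulli log-likelihood
y <- my_data$Insolvency
dev_manual <- -2 * sum(y * log(p_manual) + (1 - y) * log(1 - p_manual))

# Both comparisons should be TRUE, and no coefficients should be estimated
all.equal(unname(fitted(my_logit)), p_manual)
all.equal(deviance(my_logit), dev_manual)
length(coef(my_logit))
```

If only some of the values need to be fixed, leaving the remaining variables in the formula as ordinary terms (rather than wrapping them in offset()) should let glm() estimate those while holding the offsets constant.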