model convergence issues using glmer

I am trying to run a mixed effect poisson model. I am having a problem with model convergence when I enter a specific variable and I am hoping to get thoughts on why that might be. Here is a segment of my data.

id   gender race gene   grade  y
1     0      1    -1.5     6   4
1     0      1    -2.1     7   2
1     0      1     1.5     8   6
2     1      2     3.6     6   4
2     1      2     2.1     7   3
2     1      2     1.6     8   1

I used the code below and I am getting the error message below.

m2<-glmer(y ~ gender + race + gene + grade +
            (1 | id), data=data_long_1, family = "poisson"(link = "log"), control = glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=2e5)))
                                                                                               
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model failed to converge with max|grad| = 0.00392577 (tol = 0.002, component 1)

The problem is the "grade" variable as when I remove the variable, I don't get that error message. Everyone has 3 grandes (6,7,8). I, ideally, want to run grade x gene interactions, but I won't be able to do that if grade isn't in the model.

The estimated coefficients are:

Fixed effects:
                                Estimate Std. Error z value Pr(>|z|)    
(Intercept)                    1.683e+00  4.653e-02  36.159  < 2e-16 ***
gender1                       -3.194e-02  3.584e-02  -0.891  0.37288    
race1                          1.329e-01  4.249e-02   3.127  0.00177 ** 
gene                           8.298e-03  2.499e-02   0.332  0.73983    
grade                          2.980e-07  6.552e-03   0.000  0.99996    
gene:grade                     3.346e-07  6.768e-03   0.000  0.99996

Can someone provide insight into why this variable might be a problem?

Solution

I can't replicate your convergence warnings: with the data you sent off-line, on Linux, with a development version of lme4, I don't get any convergence warnings — such platform-dependence is not terribly unusual ...

However, I think I can explain your results based on the structure of the data you sent. Here is a sample for a typical individual, with the values modified for confidentiality:

     id gender race  y      gene grade
1  xxxx      1    1  8 -1.543210     6
2  xxxx      1    1  8 -1.543210     7
3  xxxx      1    1  8 -1.543210     8
4  xxxx      1    1  8 -1.543210     9

there are many individuals (between 500 and 1000) in the data set
the values of gender, race, gene, and y, the response variable do not vary within id (this is important)
only the value for grade varies within id, and it is perfectly balanced — each id has exactly four observations, for grade=6,7,8,9

This means that the average effect of grade on y, or the interaction of anything with grade, is exactly zero!

Since this data set doesn't really have more than one observations' worth of information about each id (i.e. the same values are repeated 4 times for each individual, except for grade), it might be to better to take only the first observation for each individual and fit

glm(y ~ gender + race + gene, data=..., family=poisson)

(I usually omit the (link="log") because it's the default, but it's fine to include it if it makes the code clearer).

A similar question shows that things get more pathological if you try to fit a model with a residual variance term (e.g. LMM/Gaussian response) to such a data set ...