
PLM regression with log variables returning non-finite values error when there are no null or NA values in the data

I'm using the plm package to analyse my panel data, which covers a set of states over 14 years. While running plm regressions, I have repeatedly hit the error "model matrix or response contain non-finite values", and each time I eventually resolved it by deleting observations with null or NA values. However, I'm now running the regression:

mod_3.1_within_log_b <- plm(log(PIB) ~ txinad + prod + op + emp + log(RT) + log (DC) + log(DK) + Gini + I(log(DC)*Gini) + I(log(DK)*Gini), data = dd, effect = 'individual')

summary (mod_3.1_within_log_b)

which returns

Error in model.matrix.pdata.frame(data, rhs=1, model=model, effect=effect,
model matrix or response contains non-finite values (NA/NaN/inf/-inf)

But, as I said, my data no longer contains any null or NA values. To test this, I ran two separate regressions:

mod_3.1_within_log_b <- plm(log(PIB) ~ txinad + prod + op + emp + log(RT) + log (DC) + Gini + I(log(DC)*Gini) + I(log(DK)*Gini), data = dd, effect = 'individual')


mod_3.1_within_log_b <- plm(log(PIB) ~ txinad + prod + op + emp + log(RT) + log(DK) + Gini + I(log(DC)*Gini) + I(log(DK)*Gini), data = dd, effect = 'individual')

summary (mod_3.1_within_log_b)

and both worked, indicating that it is when I run with log(DK) and log(DC) together that I receive the error.

Thanks in advance!


  • As @StupidWolf suggested in the comments, your model matrix may contain zeros or negative values (log(-1) returns NaN and log(0) returns -Inf).

    plm does not handle this by removing the offending observations automatically, but we can do so manually by checking the model matrix used (or by looking at the original data). Without the complete data, this is just a suggestion for checking some simple problems in the model matrix.

    Note that I've shortened the formula to improve readability.

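    ## Build the model frame without dropping rows, so its rows line up with dd
    mf <- model.frame(~ txinad + prod + op + emp + log(RT) +
                        (log(DC) + log(DK)) * Gini, data = dd, na.action = na.pass)
    mm <- model.matrix(attr(mf, "terms"), data = mf)
    ## Flag rows with any non-finite entry (complete.cases() alone would miss Inf/-Inf)
    if(any(bad <- apply(mm, 1, function(r) any(!is.finite(r))))){
        cat('Rows in dd causing trouble:\n')
        print(dd[bad, ])
    }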

    This prints any rows in dd that cause problems in the model matrix.
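
To see why a dataset with no NA values can still trigger this error, here is a minimal base-R sketch. The data frame `toy` and its values are invented for illustration; only the column names DC, DK and Gini are borrowed from the question. It shows that a zero becomes -Inf under log(), that a check for missing values alone does not flag it, and that the same rows can be found by looking for non-positive values in the raw columns.

```r
# Toy data standing in for dd; the values are invented for illustration only.
toy <- data.frame(
  DC   = c(10, 0, 5, 3),
  DK   = c(2, 4, -1, 6),
  Gini = c(0.40, 0.50, 0.45, 0.55)
)

# log() of zero is -Inf; log() of a negative number is NaN (with a warning).
logged <- suppressWarnings(
  data.frame(logDC = log(toy$DC), logDK = log(toy$DK))
)

# complete.cases() only looks for NA/NaN, so the -Inf in row 2 is not flagged.
flagged_by_cc <- !complete.cases(logged)

# is.finite() is FALSE for NA, NaN, Inf and -Inf alike.
non_finite <- !is.finite(logged$logDC) | !is.finite(logged$logDK)

# The same rows can be found in the raw data: any value <= 0 breaks log().
bad_raw <- toy$DC <= 0 | toy$DK <= 0

# Show the offending rows of the original data.
print(toy[non_finite, ])
```

If rows like these turn up in the real data, whether to drop them or transform the variable differently is a modelling decision that depends on what a zero or negative DC or DK means in context.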