
Model.matrix contrast error


I build a model.matrix call dynamically as a string. The value of the string is as follows:

total_matrix_str
[1] "model.matrix( ~ date + MDSE_ITEM_I + COLR_N + SLS_TYPE_GRP_C + dayofwk +
 MDSE_ITEM_REF_I + WK_END_D + GREG_D + SIZE_N + MDSE_STYL_N + COLR_FMLY_N + LATTD_I 
+ LNGTD_I + weekend + dsp + assort_size + colr_per + pctTillDate + weeknr + MEANTEMPM 
+ MEANVISM + MEANWINDSPDM + MAXHUMIDITY + MINHUMIDITY + MEANDEWPTM + MEANPRESSUREM 
+ FOG + RAIN + THUNDER ,data =  total ,
contrasts.arg =list( MDSE_ITEM_I=contrasts(total$MDSE_ITEM_I,contrasts = F) , 
CO_LOC_I=contrasts(total$CO_LOC_I,contrasts = F) ,
COLR_N=contrasts(total$COLR_N,contrasts = F) ,
dayofwk=contrasts(total$dayofwk,contrasts = F) ,
SIZE_N=contrasts(total$SIZE_N,contrasts = F) ,
MDSE_STYL_N=contrasts(total$MDSE_STYL_N,contrasts = F) ,
COLR_FMLY_N=contrasts(total$COLR_FMLY_N,contrasts = F) ,
assort_size=contrasts(total$assort_size,contrasts = F) ,
weeknr=contrasts(total$weeknr,contrasts = F) ))"

Here are the distinct value counts of the categorical variables listed in contrasts.arg:

> length(unique(total$MDSE_ITEM_I))
[1] 30
> length(unique(total$CO_LOC_I))
[1] 5
> length(unique(total$COLR_N))
[1] 6
> length(unique(total$dayofwk))
[1] 7
> length(unique(total$SIZE_N))
[1] 9
> length(unique(total$MDSE_STYL_N))
[1] 6
> length(unique(total$COLR_FMLY_N))
[1] 4
> length(unique(total$assort_size))
[1] 7
> length(unique(total$weeknr))
[1] 7

Still, evaluating this command results in the following error:

total_matrix <- eval(parse(text = total_matrix_str))
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels

Any clues as to why I might be getting this error? How can I resolve it dynamically by automatically detecting this situation?


Solution

  • This problem occurs because one of the string variables has only a single unique value.

    In Gregor's words:

    All non-integer/numeric variables will have contrasts, either the default contrasts or the ones you specify. Your contrasts.arg overrides the default contrasts for certain variables you specify - any other categorical variables will get default contrasts.

    So essentially all factor and string variables in the formula will inevitably get contrasts, whether or not they appear in contrasts.arg. This fails if any factor or string variable has just one unique value.
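
To detect this situation automatically, one option is to scan the data frame for factor or character columns with fewer than two distinct values and leave them out of the formula before calling model.matrix. The following is a minimal sketch on toy data, not the asker's actual data: the columns y, colour, type and size and the helper is_single_level are hypothetical names chosen for illustration.

```r
# Toy data standing in for `total`; all column names here are illustrative.
total <- data.frame(
  y      = c(1.2, 3.4, 2.2, 5.1),
  colour = factor(c("red", "blue", "red", "green")),
  type   = c("A", "A", "A", "A"),   # character column with one unique value
  size   = factor(c("S", "M", "L", "S")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for factor/character columns that have
# fewer than two distinct non-NA values.
is_single_level <- function(x) {
  (is.factor(x) || is.character(x)) && length(unique(x[!is.na(x)])) < 2
}

# Names of the columns that would trigger the contrasts error.
bad_cols <- names(total)[vapply(total, is_single_level, logical(1))]
print(bad_cols)

# Build the formula from the remaining predictors only.
keep <- setdiff(names(total), c(bad_cols, "y"))
form <- reformulate(keep)

# Full dummy coding (contrasts = FALSE) for the factor columns that remain.
fac_cols   <- keep[vapply(total[keep], is.factor, logical(1))]
contr_list <- lapply(total[fac_cols], contrasts, contrasts = FALSE)

total_matrix <- model.matrix(form, data = total, contrasts.arg = contr_list)
print(dim(total_matrix))
```

The check counts observed values rather than declared factor levels, so it also drops a factor that has several levels but only one of them present in the data. Such a column is constant and carries no information, so dropping it is usually harmless, but that is a design choice rather than something the error itself requires.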