r, variables, hierarchical-clustering

Error: "There are columns in X.quanti where all the values are identical"


The R package ClustOfVar provides functions for clustering variables with each other. Calling hclustvar gives the error below:

> train2 = train[!duplicated(lapply(train, summary))]
> tree <- hclustvar(train2[, 2:10])
Error in recodquant(X.quanti) : 
  There are columns in X.quanti where all the values are identical

As I understand it, the variables must not be identical, so I applied the duplicated logic above to remove duplicate variables.

I checked the package code at https://github.com/cran/PCAmixdata/blob/master/R/recodquant.R but could not identify the mistake.

Any ideas?

Thanks, Manish


Solution

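  • The code below does not reliably identify duplicate columns, because summary() reduces each column to a few statistics and different columns can share the same summary: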

    train2 = train[!duplicated(lapply(train, summary))]
    

    Please use the following instead:

    library(digest)
    train2 = train[!duplicated(lapply(train, digest))]
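
To illustrate why summary() can mislead here, below is a small sketch on made-up data (the data frame and the column names a, b, c, k are hypothetical). It compares full column contents with base R's duplicated(as.list(...)), used here as a stand-in for the digest approach above so that no extra package is needed. It also checks for constant columns, since the wording of the error message refers to values being identical within a column; whether that is what happens in the original data is an assumption.

```r
# Toy data (hypothetical): b is a reordering of a, c is an exact copy of a,
# and k holds a single repeated value.
train <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(1, 2, 3, 5, 4),
  c = c(1, 2, 3, 4, 5),
  k = c(7, 7, 7, 7, 7)
)

# summary() keeps only min, quartiles, median, mean and max, so a and b
# produce the same summary even though they differ row by row.
dup_by_summary <- duplicated(lapply(train, summary))

# Comparing the full column contents flags only the exact copy.
dup_by_content <- duplicated(as.list(train))

# Columns with at most one distinct non-missing value.
is_constant <- vapply(
  train,
  function(col) length(unique(col[!is.na(col)])) <= 1,
  logical(1)
)

# Keep columns that are neither exact duplicates nor constant.
train2 <- train[!dup_by_content & !is_constant]
print(names(train2))
```

If is_constant is TRUE for any column passed to hclustvar, dropping those columns first may be worth trying in addition to removing duplicates.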