Getting Error while doing correlation in R


I have code that reads a CSV file into R. The data has close to 40-50 variables, and I want to reduce the dimensions for further analysis. Most of the columns in the dataset are integer, factor, or numeric. typeof(BO) for my data frame BO returns "list". The code below fails with: Error in cor(BO) : 'x' must be numeric

heatmap(cor(BO),Rowv = NA,Colv = NA)

Solution

  • As mentioned in the comments, your data frame contains non-numeric columns, which you need to exclude before calling cor:

    heatmap(cor(BO[, sapply(BO, is.numeric)]),Rowv = NA,Colv = NA)
    

    Explanation

    With sapply you loop through all columns of your data frame (which is internally stored as a list, with the invariant that all elements must have the same length) and apply the function is.numeric to each column. You get back a logical vector with one element per column, which is TRUE for the numeric columns and FALSE otherwise. With this vector you can select just the numeric columns.

    Example with a built-in dataset

    ## does not work for the same reason: Species is a factor
    heatmap(cor(iris))
    # Error in cor(iris) : 'x' must be numeric
    
    ## works
    heatmap(cor(iris[, sapply(iris, is.numeric)]))
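To see what the selection step is doing, it can help to look at the intermediate values separately. The following is only a sketch on the built-in iris data; the names num_cols, iris_num and cm are made up for illustration, and drop = FALSE is an extra precaution (not part of the answer above) so that the result stays a data frame even if only one column is numeric.

```r
# Logical vector: one entry per column, TRUE where the column is numeric
num_cols <- sapply(iris, is.numeric)
print(num_cols)

# Keep only the numeric columns; drop = FALSE keeps a data frame
# even when a single column is selected
iris_num <- iris[, num_cols, drop = FALSE]

# Correlation matrix of the remaining columns (4 x 4 for iris)
cm <- cor(iris_num)
print(dim(cm))

# Same heatmap call as in the question, now on numeric data only
heatmap(cm, Rowv = NA, Colv = NA)
```

For your own data, replacing iris with BO should work the same way, assuming BO is a regular data frame as described in the question.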