r dataframe csv number-formatting

R won't read in numeric values of CSV in data frame


I am a beginner in R, so please bear with me! I have read a CSV into R that shows the relative abundance of different species at 9 sites. I am trying to sum the rows, but R will not treat the numerical values as numeric, which is going to cause more problems when I try to run an NMDS ordination analysis. Here are some details about my data:

class(MSBees.spp)

[1] "data.frame"

First few rows of what str() prints:

str(MSBees.spp)  

'data.frame':   9 obs. of  29 variables:  
 $ X     : chr  "Site 1" "Site 2" "Site 3" "Site 4" ...  
 $ COLALG: num  0.02013 0.00997 0 0 0 ...  
 $ COLCAL: num  0 0 0 0 0 0 0 0 0  
 $ COLCON: num  0 0 0 0 0 0 0.00339 0 0  
 $ COLFUL: num  0 0 0 0 0 0.00361 0 0 0  
 $ COLOBS: num  0.00671 0.00664 0.00755 0.00292 0 ... 

What my code looks like so far:

MSBees.spp <- read.csv("data_transposed.csv", header = TRUE)   
MSBees.env <- read.csv("data_transposed.env.csv", header = TRUE, row.names = 1)   
MSBees_PA.spp <- read.csv("data_transposed.csv", header = TRUE, row.names = 1)  
MSBees_abund.spp <- read.csv("data_transposed.csv", header = TRUE, row.names = 1)   
sum.of.rows <- apply(MSBees.spp, 1, sum) 

which returns:
Error in FUN(newX[, i], ...) : invalid 'type' (character) of argument

I have tried to coerce them to be numeric with:

as.numeric (MSBees.spp)

and

data.frame(lapply(MSBees.spp [2:9, 2:29], as.numeric)) 

but no luck in being able to run the apply() function successfully.
I am sure I am just not reading the CSV in properly or not coercing the values properly, but I just can't get it right. Any tips? Please and thank you!


Solution

  • The row-wise sum

    apply(MSBees.spp, 1, sum) 
    

    fails because the first column, as the str() output shows, is character:

    $ X     : chr  "Site 1" "Site 2" "Site 3" "Site 4" ...  
    

    We need to exclude the first column

    apply(MSBees.spp[-1], 1, sum) 
    

    and get the sum, or use the vectorized rowSums instead:

    rowSums(MSBees.spp[-1], na.rm = TRUE)
    

    Also, there is no need to convert the dataset to numeric, because the remaining columns were already read in as numeric. In addition, as.numeric expects a vector as input, whereas MSBees.spp is a data.frame; as.numeric(MSBees.spp[[2]]) works because MSBees.spp[[2]] is a vector.
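To see the points above in isolation, here is a minimal sketch on a toy data frame. The name `toy` and its values are made up for illustration and are not the asker's data; only the column layout (one character site column followed by numeric columns) mirrors the str() output in the question.

```r
# Toy stand-in for MSBees.spp: a character site column plus numeric columns.
# The values are invented for illustration only.
toy <- data.frame(
  X      = c("Site 1", "Site 2", "Site 3"),
  COLALG = c(0.02, 0.01, 0),
  COLOBS = c(0.005, 0.25, 0.5),
  stringsAsFactors = FALSE
)

# apply() converts the data frame with as.matrix() first; a single character
# column turns the whole matrix into character, which sum() then rejects.
is.character(as.matrix(toy))   # TRUE

# Dropping the first column leaves only numeric columns, so the sum works.
row_totals <- rowSums(toy[-1], na.rm = TRUE)

# Alternative: keep the site names as row names instead of as a column,
# which is roughly what read.csv(..., row.names = 1) does at read time.
toy_named <- toy[-1]
rownames(toy_named) <- toy$X
named_totals <- rowSums(toy_named)

# as.numeric() accepts a single column (a vector), not the whole data frame.
first_species <- as.numeric(toy[[2]])
```

If the numeric columns in the real file are all read correctly, the objects in the question that were read with row.names = 1 (for example MSBees_abund.spp) should already work with rowSums() directly, since the site names are then row names rather than a character column.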