
How do I ignore NA values in a data frame when using sapply on it?


I have a data frame:

            colA           colB
1           15.3           1.76
2           10.8           1.34
3            8.1           1.27
4           19.5           1.47
5            7.2           1.27
6            5.3           1.49
7            9.3           1.31
8           11.1           1.09
9            7.5           1.18
10          12.2           1.22
11           6.7           1.25
12           5.2           1.19
13          19.0           1.95
14          15.1           1.28
15           6.7           1.52
16           8.6             NA
17           4.2           1.12
18          10.3           1.37
19          12.5           1.19
20          16.1           1.05
21          13.3           1.32
22           4.9           1.03
23           8.8           1.12
24           9.5           1.70

How can I remove or change the NA values so that when I use sapply (e.g. sapply(x, mean)), the mean is taken over 24 rows for colA and 23 rows for colB?

I understand that every column of a data frame must have the same number of rows, so something like na.omit() would not work here: it would remove row 16 entirely, and I'd lose a valid value when calculating the mean for colA.

Thanks!


Solution

  • Pass na.rm = TRUE through to mean(). The NA values are then skipped within each column, so no rows have to be removed from the data frame.

    Example:

    df <- data.frame(A = 1:3, B = c(NA, 1, 2))
    apply(df, 2, mean, na.rm = TRUE)
    
    #   A   B 
    # 2.0 1.5
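
Since the question asks about sapply specifically, here is a minimal sketch of the same idea with sapply. It relies on sapply forwarding extra arguments to the function it calls. The data frame x below is a hypothetical four-row stand-in for the one in the question, and the names col_means, n_used and col_means_alt are illustrative only.

```r
# Hypothetical small data frame mirroring the question: one NA in colB.
x <- data.frame(colA = c(15.3, 10.8, 8.1, 8.6),
                colB = c(1.76, 1.34, 1.27, NA))

# sapply forwards na.rm = TRUE to mean(), so the NA is skipped
# within colB only; colA still uses all of its rows.
col_means <- sapply(x, mean, na.rm = TRUE)

# Count of non-missing values that went into each mean
# (4 for colA, 3 for colB in this stand-in data).
n_used <- sapply(x, function(col) sum(!is.na(col)))

# colMeans() gives the same result for an all-numeric data frame.
col_means_alt <- colMeans(x, na.rm = TRUE)

print(col_means)
print(n_used)
```

Applied to the 24-row data frame in the question, this approach should average 24 values for colA and 23 for colB without dropping row 16.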