I have a data frame:
colA colB
1 15.3 1.76
2 10.8 1.34
3 8.1 1.27
4 19.5 1.47
5 7.2 1.27
6 5.3 1.49
7 9.3 1.31
8 11.1 1.09
9 7.5 1.18
10 12.2 1.22
11 6.7 1.25
12 5.2 1.19
13 19.0 1.95
14 15.1 1.28
15 6.7 1.52
16 8.6 NA
17 4.2 1.12
18 10.3 1.37
19 12.5 1.19
20 16.1 1.05
21 13.3 1.32
22 4.9 1.03
23 8.8 1.12
24 9.5 1.70
How can I remove or ignore the NA values so that when I use sapply (i.e. sapply(x, mean)), I am taking the mean of 24 rows for colA and 23 rows for colB?
I understand that data frames have to have the same number of rows, so using something like na.omit() would not work: it would remove row 16 entirely, and I'd lose a row of data when calculating the mean for colA.
Thanks!
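You should be able to pass na.rm = TRUE to mean(). The NA values are then dropped column by column before averaging, so no rows need to be removed from the data frame.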
Example:
df <- data.frame(A = 1:3, B = c(NA, 1, 2))
apply(df, 2, mean, na.rm = TRUE)
# A B
# 2.0 1.5
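
The same idea should carry over to the sapply call from the question, since sapply forwards extra arguments to the function it applies. Below is a minimal sketch; the data frame x is a small stand-in with made-up dimensions (four rows taken from the question's values), not the full 24-row data, and the names means, counts and means2 are only illustrative.

```r
# Small stand-in for the asker's data frame (a few rows only)
x <- data.frame(colA = c(15.3, 10.8, 8.6, 4.2),
                colB = c(1.76, 1.34, NA, 1.12))

# Extra arguments after the function are passed on to mean(),
# so NA values are skipped within each column
means <- sapply(x, mean, na.rm = TRUE)

# Count how many non-missing values contributed to each mean
# (here: 4 for colA, 3 for colB)
counts <- sapply(x, function(col) sum(!is.na(col)))

# For an all-numeric data frame, colMeans() gives the same result
means2 <- colMeans(x, na.rm = TRUE)
```

Because the NA is ignored only within colB, colA still uses every row, which is what na.omit() would not give you.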