I just want to calculate the mean/median of all the values in my table. I tried several functions, but nothing seems to work. I always get errors like "argument is not numeric or logical: returning NA" or "object cannot be coerced to type 'double'".
I have a table consisting of 11 columns. I have several NA's in my data.
I tried the following:
mean(WDB1, na.rm=TRUE)
That didn't work, so I thought maybe as.numeric would help:
as.numeric(WDB1, na.rm=TRUE)
I also tried to build a data frame and to use apply.
The output from str(WDB1) is:
'data.frame': 18 obs. of 11 variables:
$ Artname: Factor w/ 18 levels "Andrena carotonica",..: 11 9 10 7 8 12 15 14 1 3 ...
$ X1 : int 2 0 7 NA NA NA NA NA NA NA ...
$ X2 : int 4 1 41 NA NA NA NA NA NA NA ...
$ X3 : int 27 7 39 5 NA NA NA NA NA NA ...
$ X4 : int 37 5 32 NA 7 2 NA 1 NA NA ...
$ X5 : int 38 3 33 2 NA NA NA NA NA NA ...
$ X6 : int 35 12 33 NA NA NA NA NA NA NA ...
$ X7 : int 12 4 44 NA NA NA NA NA NA NA ...
$ X8 : int 12 15 24 NA NA NA NA NA NA NA ...
$ X9 : int 30 0 39 NA NA NA NA NA NA NA ...
$ X10 : int 18 2 33 1 NA NA NA NA 1 NA ...
dput(WDB1)
structure(list(Artname = structure(c(11L, 9L, 10L, 7L, 8L, 12L,
15L, 14L, 1L, 3L, 2L, 4L, 5L, 17L, 13L, 16L, 18L, 6L), .Label = c("Andrena carotonica",
"Andrena cineraria", "Andrena dorsata", "Andrena flavipes", "Andrena nigriceps",
"Anthopora plumipes", "Bombus hortorum", "Bombus humilis", "Bombus lapidarius",
"Bombus lucorum", "Bombus pascuorum", "Bombus pratorium", "Colletes similis",
"Heriades truncorum", "Lasioglossum punctatissimum", "Lasioglosum lucidulum",
"Melitta haemorrhoridales", "Sphecodes puncticeps"), class = "factor"),
X1 = c(2L, 0L, 7L, NA, NA, NA, NA, NA, NA, NA, NA, 1L, NA,
2L, 1L, 1L, NA, NA), X2 = c(4L, 1L, 41L, NA, NA, NA, NA,
NA, NA, NA, NA, 1L, 1L, NA, NA, NA, 1L, NA), X3 = c(27L,
7L, 39L, 5L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA), X4 = c(37L, 5L, 32L, NA, 7L, 2L, NA, 1L, NA,
NA, 1L, NA, NA, NA, NA, NA, NA, 3L), X5 = c(38L, 3L, 33L,
2L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
), X6 = c(35L, 12L, 33L, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, 1L, NA), X7 = c(12L, 4L, 44L, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), X8 = c(12L,
15L, 24L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA), X9 = c(30L, 0L, 39L, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA), X10 = c(18L, 2L, 33L,
1L, NA, NA, NA, NA, 1L, NA, NA, 1L, 1L, NA, NA, 1L, NA, 1L
)), class = "data.frame", row.names = c(NA, -18L))
I'm new to R and really thankful for any help!
I already have the mean/median of each column. Now I need it for all values in my data frame.
Presumably, that means the mean of all columns except the first one (which is a factor column).
The steps for doing that are:
Subset the data.frame to remove the first column:
WDB1[,-1]
# X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
#1 2 4 27 37 38 35 12 12 30 18
#2 0 1 7 5 3 12 4 15 0 2
#3 7 41 39 32 33 33 44 24 39 33
#4 NA NA 5 NA 2 NA NA NA NA 1
#5 NA NA NA 7 NA NA NA NA NA NA
#6 NA NA NA 2 NA NA NA NA NA NA
#7 NA NA NA NA NA NA NA NA NA NA
#8 NA NA NA 1 NA NA NA NA NA NA
#9 NA NA NA NA NA NA NA NA NA 1
#10 NA NA NA NA NA NA NA NA NA NA
#11 NA NA NA 1 NA NA NA NA NA NA
#12 1 1 NA NA NA NA NA NA NA 1
#13 NA 1 NA NA NA NA NA NA NA 1
#14 2 NA NA NA NA NA NA NA NA NA
#15 1 NA NA NA NA NA NA NA NA NA
#16 1 NA NA NA NA NA NA NA NA 1
#17 NA 1 NA NA NA 1 NA NA NA NA
#18 NA NA NA 3 NA NA NA NA NA 1
Transform the result into a vector, because mean doesn't accept data.frames as input. I use unlist because data.frames are lists, but you could also use as.matrix:
unlist(WDB1[,-1])
# X11 X12 X13 X14 X15 X16 X17 X18 X19 X110 X111 X112 X113 X114 X115 X116 X117 X118 X21 X22 X23 X24 X25 X26
# 2 0 7 NA NA NA NA NA NA NA NA 1 NA 2 1 1 NA NA 4 1 41 NA NA NA
# X27 X28 X29 X210 X211 X212 X213 X214 X215 X216 X217 X218 X31 X32 X33 X34 X35 X36 X37 X38 X39 X310 X311 X312
# NA NA NA NA NA 1 1 NA NA NA 1 NA 27 7 39 5 NA NA NA NA NA NA NA NA
# X313 X314 X315 X316 X317 X318 X41 X42 X43 X44 X45 X46 X47 X48 X49 X410 X411 X412 X413 X414 X415 X416 X417 X418
# NA NA NA NA NA NA 37 5 32 NA 7 2 NA 1 NA NA 1 NA NA NA NA NA NA 3
# X51 X52 X53 X54 X55 X56 X57 X58 X59 X510 X511 X512 X513 X514 X515 X516 X517 X518 X61 X62 X63 X64 X65 X66
# 38 3 33 2 NA NA NA NA NA NA NA NA NA NA NA NA NA NA 35 12 33 NA NA NA
# X67 X68 X69 X610 X611 X612 X613 X614 X615 X616 X617 X618 X71 X72 X73 X74 X75 X76 X77 X78 X79 X710 X711 X712
# NA NA NA NA NA NA NA NA NA NA 1 NA 12 4 44 NA NA NA NA NA NA NA NA NA
# X713 X714 X715 X716 X717 X718 X81 X82 X83 X84 X85 X86 X87 X88 X89 X810 X811 X812 X813 X814 X815 X816 X817 X818
# NA NA NA NA NA NA 12 15 24 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
# X91 X92 X93 X94 X95 X96 X97 X98 X99 X910 X911 X912 X913 X914 X915 X916 X917 X918 X101 X102 X103 X104 X105 X106
# 30 0 39 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 18 2 33 1 NA NA
# X107 X108 X109 X1010 X1011 X1012 X1013 X1014 X1015 X1016 X1017 X1018
# NA NA 1 NA NA 1 1 NA NA 1 NA 1
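As a rough illustration of the as.matrix alternative mentioned above, here is a minimal sketch on a small stand-in data frame. The name toy and its values are made up for the example and are not the real WDB1 data. It suggests that both routes yield the same values in the same column-by-column order, with the only difference being that unlist keeps element names.

```r
# Hypothetical stand-in for WDB1: one factor column plus two integer columns
toy <- data.frame(
  Artname = factor(c("a", "b", "c")),
  X1 = c(2L, 0L, NA),
  X2 = c(4L, NA, 41L)
)

# Route 1: unlist() flattens the list of columns into one named vector
v_list <- unlist(toy[, -1])

# Route 2: as.matrix() builds a matrix; as.vector() flattens it column by column
v_matrix <- as.vector(as.matrix(toy[, -1]))

# Compare the values while ignoring the names that unlist() attaches
same_values <- isTRUE(all.equal(unname(v_list), v_matrix))
print(same_values)
```

Either vector can then be handed to mean in the next step.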
Pass the vector to the mean function (make sure to deal with NA values by setting na.rm = TRUE):
mean(unlist(WDB1[,-1]), na.rm = TRUE)
#[1] 12.2549
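The question also asks for the median, which should work the same way. The sketch below uses a small made-up data frame (toy, not the real WDB1) and, as an optional variation that is not part of the steps above, picks the numeric columns with is.numeric instead of assuming the factor column is always first.

```r
# Hypothetical stand-in for WDB1: one factor column plus two integer columns
toy <- data.frame(
  Artname = factor(c("a", "b", "c")),
  X1 = c(2L, 0L, NA),
  X2 = c(4L, NA, 41L)
)

# Flag the numeric columns; the factor column comes back FALSE
num_cols <- vapply(toy, is.numeric, logical(1))

# Flatten only the numeric columns into a single vector
values <- unlist(toy[num_cols])

# na.rm = TRUE drops the NA entries before computing each statistic
overall_mean <- mean(values, na.rm = TRUE)
overall_median <- median(values, na.rm = TRUE)

print(overall_mean)
print(overall_median)
```

With the real data, replacing toy with WDB1 should give the overall mean shown above along with the overall median.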