I have a data frame whose columns contain a varying number of numeric values and a varying number of NAs. The data frame looks like this:
    V1 V2 V3 V4 V5 V6
1    0 11  4  0  0 10
2    0 17  3  0  2  2
3   NA  0  4  0  1  9
4   NA 12 NA  1  1  0
<snip>
743 NA NA NA NA  8 NA
744 NA NA NA NA  0 NA
I want to make a boxplot out of this, but when I do
boxplot(dataframe)
I get the error
adding class "factor" to an invalid object
When I do
lapply(dataframe,class)
I get the following output:
$V1
[1] "factor"
$V2
[1] "factor"
<snip>
$V6
[1] "factor"
So how can I change my dataframe so that the columns are seen as numeric?
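You want to apply as.numeric(as.character(...)) to each factor column. Calling as.numeric() directly on a factor returns its internal integer codes rather than the numbers shown in its labels, so the conversion has to go through as.character() first. The code below, using dummy data, does this for the factor columns only and leaves the values of the numeric columns untouched.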
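To see why the as.character() step matters, here is a minimal sketch on a single factor. The vector f and the names codes and values are made up for illustration and are not part of the question's data.

```r
# A factor stores integer codes plus a set of character levels.
# Here the levels sort as "0", "11", "12", "17".
f <- factor(c("11", "17", "0", "12"))

# as.numeric() alone returns the internal codes, not the original numbers
codes <- as.numeric(f)

# Going through as.character() first recovers the numbers the labels spell out
values <- as.numeric(as.character(f))

print(codes)   # 2 4 1 3
print(values)  # 11 17 0 12
```

The same conversion, applied to every factor column of a data frame, follows.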
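## dummy data
df <- data.frame(V1 = factor(sample(1:5, 10, replace = TRUE)),
                 V2 = factor(sample(99:101, 10, replace = TRUE)),
                 V3 = factor(sample(1:2, 10, replace = TRUE)),
                 V4 = 1:10)

## convert factor columns via their labels; pass other columns through.
## sapply() simplifies the result to a matrix, so this suits data frames
## whose columns all end up numeric
df2 <- data.frame(sapply(df, function(x) {
    if (is.factor(x)) {
        as.numeric(as.character(x))
    } else {
        x
    }
}))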
This gives:
> df2
   V1  V2 V3 V4
1   4 101  2  1
2   1 100  1  2
3   5  99  2  3
4   4  99  2  4
5   2 100  1  5
6   2 100  2  6
7   2 101  2  7
8   4 100  1  8
9   2 101  2  9
10  4 101  1 10
> str(df2)
'data.frame': 10 obs. of 4 variables:
$ V1: num 4 1 5 4 2 2 2 4 2 4
$ V2: num 101 100 99 99 100 100 101 100 101 101
$ V3: num 2 1 2 2 1 2 2 1 2 1
$ V4: num 1 2 3 4 5 6 7 8 9 10
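As a variation on the approach above, the conversion can be written with lapply() and assigned back into the factor columns only, so each column is handled as its own vector and no intermediate matrix is built. This is a sketch under assumptions: the data frame below is a small hypothetical stand-in for the asker's data, and the names converted and is_fac are invented for the example.

```r
# Hypothetical stand-in for the asker's data: factor columns that hold
# numbers, with missing values in some rows
dataframe <- data.frame(
  V1 = factor(c(0, 0, NA, NA)),
  V2 = factor(c(11, 17, 0, 12)),
  V3 = factor(c(4, 3, 4, NA))
)

# Find the factor columns, then convert only those via their labels;
# any non-factor columns would keep their original type
converted <- dataframe
is_fac <- vapply(converted, is.factor, logical(1))
converted[is_fac] <- lapply(converted[is_fac],
                            function(x) as.numeric(as.character(x)))

str(converted)

# With numeric columns, boxplot() draws one box per column and
# leaves the NA values out of each column's statistics
boxplot(converted)
```

Once every column is numeric, boxplot() can be called on the data frame directly, which is what the original question was after.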