I have a string vector in R as follows:
[1] Type 1 Type 2 Type 4 Type 3 Type 4 Type 6 Type 2 Type 5
[9] Type 2 Type 3 Type 7
Also note:
str(data)
# Factor w/ 7 levels "Type 1","Type 2",..: 1 2 1 3 4 1 2 4 2 3 ...
I want to convert this to an integer vector so that I can perform cluster analysis (get a cluster performance index), because I am currently getting the following error: argument 'part' must be an integer vector
What would be the most effective solution?
The str output shows that you have a factor, not a vector of character strings. It also shows that the level labels are Type 1, Type 2 and so on. A factor represents the first level internally as 1, the second as 2 and so on. So, supposing we have data as shown reproducibly in the Note at the end, to convert it to an integer vector we only have to use as.integer:
as.integer(data)
## [1] 1 2 1 3 4 1 2 4 2 3
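To make the link between level labels and internal codes concrete, here is a minimal sketch. The factor f and its values are made up for illustration and are not the asker's data; the levels are given explicitly so that their order does not depend on locale sorting.

```r
# Hypothetical factor for illustration only
f <- factor(c("Type 1", "Type 2", "Type 1", "Type 3"),
            levels = c("Type 1", "Type 2", "Type 3"))

# as.integer() returns the internal codes, i.e. the position of each
# value's label within levels(f)
codes <- as.integer(f)
codes
is.integer(codes)

# Indexing the levels by the codes recovers the original labels
labels_back <- levels(f)[codes]
labels_back
```

The result is a genuine integer vector, which is what a function complaining that an argument "must be an integer vector" would presumably expect.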
If the level labels are not actually Type 1, Type 2 and so on, so that, for example, the third level is labelled Type 93 rather than Type 3, then the internal codes no longer match the numbers in the labels. In that case we can let gsub implicitly convert the factor to character, remove the non-digit characters, and finally convert the result to an integer vector.
as.integer(gsub("\\D", "", data))
## [1] 1 2 1 3 4 1 2 4 2 3
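The difference between the two approaches only shows up when the labels do not match the level positions. The following sketch uses a hypothetical factor x with a Type 93 level (not the asker's data) to contrast them. The last line is an alternative that strips the digits from the levels once and then indexes by the factor, relying on R treating a factor index as its integer codes.

```r
# Hypothetical factor whose third level is "Type 93" rather than "Type 3"
x <- factor(c("Type 1", "Type 2", "Type 93", "Type 2"),
            levels = c("Type 1", "Type 2", "Type 93"))

# Internal codes: position of each label within levels(x)
by_code <- as.integer(x)                      # 1 2 3 2

# Numbers embedded in the labels: gsub() coerces the factor to character,
# "\\D" matches every non-digit character
by_label <- as.integer(gsub("\\D", "", x))    # 1 2 93 2

# Alternative: clean the levels once, then index by the factor's codes
by_level <- as.integer(gsub("\\D", "", levels(x)))[x]
```

For a cluster partition either form labels the same groups, so the choice mainly matters when the numbers themselves carry meaning.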
Note
data <- structure(c(1L, 2L, 1L, 3L, 4L, 1L, 2L, 4L, 2L, 3L), .Label = c("Type 1",
"Type 2", "Type 3", "Type 4"), class = "factor")