I have a column of data with the following types of dates and number entries:
I want to convert all of these into numbers by doing two things. First, where an entry has a number before a dash (as in the first three examples), I want to drop everything from the dash onwards, so those entries become 16, 21 and 7.
Second, where the entry is written in month-date format (e.g. Aug-99), I want to convert the month to its number and then trim it in the same way. In this example, Aug-99 would become 8-99 and then just 8.
How can I do this in R? When I use the grep, sub and match commands, as in the answer below, I get: [1] 16 21 7 5 8
What I am after is: [1] 16 21 7 8 5
We use grepl to find the elements that start with a letter, and sub to remove the substring that runs from the first - to the end of the string. We then subset 'v2' to the elements not flagged by 'i1' and convert them to numeric, match the elements that start with a letter against month.abb to get the month index, and concatenate the two results.
i1 <- grepl("^[A-Z]", v1)
v2 <- sub("-.*", "", v1)
c(as.numeric(v2[!i1]), match(v2[i1], month.abb))
#[1] 16 21 7 8
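For the new dataset, where numeric and month entries are interleaved, concatenating the two subsets puts all the numeric results first (16 21 7 5 8). We can use ifelse instead, which keeps the results in the original order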
i1 <- grepl("^[A-Z]", df1$v1)
v2 <- sub("-.*", "", df1$v1)
as.numeric(ifelse(i1, match(v2, month.abb), v2))
#[1] 16 21 7 8 5
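If this conversion is needed in more than one place, the steps above could be wrapped in a small function. The following is only a sketch: the name leading_number is hypothetical, and it identifies months with match against month.abb rather than with the "^[A-Z]" pattern, which is a variation on the answer and not part of it. It assumes English three-letter abbreviations with a capitalised first letter, as in the example data.

```r
# Hypothetical helper (not part of the original answer): returns the number
# before the dash, or the month number when the prefix is a month abbreviation.
leading_number <- function(x) {
  # Keep only the text before the first dash
  prefix <- sub("-.*", "", x)
  # match() gives the month index, or NA when the prefix is not in month.abb
  month_idx <- match(prefix, month.abb)
  is_month <- !is.na(month_idx)
  # Fill a result vector by position, so the original order is preserved
  out <- rep(NA_real_, length(x))
  out[is_month] <- month_idx[is_month]
  # Prefixes that are neither a month nor a number are left as NA
  out[!is_month] <- suppressWarnings(as.numeric(prefix[!is_month]))
  out
}

x <- c("16-Jun", "21-01A", "7-04", "Aug-99", "5-09")
leading_number(x)
#[1] 16 21 7 8 5
```

Because the results are assigned by position instead of being concatenated, the output lines up with the input and can be stored directly as a new column, e.g. df1$num <- leading_number(df1$v1).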
v1 <- c('16-Jun','21-01A','7-04','Aug-99')
df1 <- structure(list(v1 = c("16-Jun", "21-01A", "7-04", "Aug-99", "5-09"
)), .Names = "v1", class = "data.frame", row.names = c(NA, -5L))