r string-parsing

Convert strings representing units of time or distance to numeric


I would like to search a data.frame column that holds distances as strings and convert them to numeric values. I would also like to use the same function on Twitter-style dates such as '3 days ago'.

If I were starting with:

x <- c("5 days ago", "1 day ago", "6 days ago")

I would end up with (days converted to hours):

x <- c(120, 24, 144)

Any help would be appreciated!


Solution

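  • If your data consist only of strings of the form "number days ago" or "number miles", you can use regular expressions. Note that `[ daysago]` and `[ miles]` are character classes: they strip every listed letter and every space, which is safe here only because nothing but the number is left over: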

    > x <- c("5 days ago", "1 day ago", "6 days ago", "21.2 miles", "1 mile")
    > x[grep(" day",x)] <- as.numeric(gsub("[ daysago]","",x[grep(" day",x)] ))*24
    > x
    [1] "120"        "24"         "144"        "21.2 miles" "1 mile"    
    > x[grep(" mile",x)] <- as.numeric(gsub("[ miles]","",x[grep(" mile",x)] )) 
    > x
    [1] "120"  "24"   "144"  "21.2" "1"   
    > x <- as.numeric(x)
    > x
    [1] 120.0  24.0 144.0  21.2   1.0
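
If the column may contain other units later, a lookup table can be easier to extend than one `gsub` call per unit. The following is only a sketch under stated assumptions: the function name `to_numeric_units` and the `multipliers` table are hypothetical, time units are converted to hours, and distances are left in miles.

```r
# Hypothetical helper: convert strings such as "5 days ago" or "21.2 miles"
# to numbers. Time units become hours; distances stay in miles.
to_numeric_units <- function(x) {
  # Illustrative multipliers (assumption): extend this table as needed.
  multipliers <- c(hour = 1, day = 24, week = 168, mile = 1)

  # Capture the leading number and the unit word that follows it.
  pattern <- "^[[:space:]]*([0-9]*\\.?[0-9]+)[[:space:]]+([A-Za-z]+).*$"
  matched <- grepl(pattern, x)

  # Extract the numeric part; unmatched strings stay NA.
  value <- rep(NA_real_, length(x))
  value[matched] <- as.numeric(sub(pattern, "\\1", x[matched]))

  # Normalise the unit: lower case, drop a trailing plural "s".
  unit <- rep(NA_character_, length(x))
  unit[matched] <- sub("s$", "", tolower(sub(pattern, "\\2", x[matched])))

  # Units missing from the table index to NA, so the result is NA for them.
  unname(value * multipliers[unit])
}

x <- c("5 days ago", "1 day ago", "6 days ago", "21.2 miles", "1 mile")
to_numeric_units(x)
# 120.0  24.0 144.0  21.2   1.0
```

Strings that do not match the pattern, or that use a unit missing from the table, come back as `NA` rather than being silently mangled, which makes bad rows easy to spot in a data.frame column.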