Remove text after second colon


I need to remove everything after the second colon. I have several date formats that need to be cleaned using the same algorithm.

a <- "2016-12-31T18:31:34Z"
b <- "2016-12-31T18:31Z"

I have tried to match the two colon groups, but I cannot figure out how to remove only the second match group.

sub("(:.*){2}", "", "2016-12-31T18:31:34Z")

Solution

  • A regex you can use: (:[^:]+):.*

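    The group captures the first colon and the text up to the second colon, and \\1 puts that part back while everything from the second colon onward is dropped. You can check it on regex101 and use it like this: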

    sub("(:[^:]+):.*", "\\1", "2016-12-31T18:31:34Z")
    [1] "2016-12-31T18:31"
    sub("(:[^:]+):.*", "\\1", "2016-12-31T18:31Z")
    [1] "2016-12-31T18:31Z"
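
For reuse on a whole vector of timestamps, the same idea can be wrapped in a small function. This is only a sketch: the name strip_after_second_colon is made up for illustration, and the pattern is an anchored variant of the regex above rather than the exact one from the answer. It also shows what the original attempt from the question does, since `(:.*){2}` matches from the first colon onward.

```r
# Hypothetical helper (name is illustrative): keep everything before the
# second colon. Strings with fewer than two colons are returned unchanged,
# because sub() leaves the input alone when the pattern does not match.
strip_after_second_colon <- function(x) {
  sub("^([^:]*:[^:]*):.*$", "\\1", x)
}

a <- "2016-12-31T18:31:34Z"
b <- "2016-12-31T18:31Z"

# sub() is vectorised, so both formats can be cleaned in one call.
cleaned <- strip_after_second_colon(c(a, b))
print(cleaned)
# [1] "2016-12-31T18:31"  "2016-12-31T18:31Z"

# The attempt from the question: the match starts at the first colon,
# so the minutes are removed as well.
attempt <- sub("(:.*){2}", "", a)
print(attempt)
# [1] "2016-12-31T18"
```

Note that b keeps its trailing Z because it contains only one colon, so there is nothing after a second colon to remove.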