Tags: .net, regex, replace, nintex-workflow

Regex to capture timestamp in different formats


I have different time formats that come into my report, and I'm trying to standardize them. The two I have seen so far are as follows.

3. When: 150845Z - 161045Z Jan 16
When: 15 08:45Z - 16 10:45Z Jan 16

My goal is to transform that data from the above input into the following

Start date and time 01/15/2016 08:45
End date and time 01/16/2016 10:45

I have multiple capture groups and splits to make this happen, and to be honest it is rather large and I think it can be simplified.

I could post each step of the code I have, but it would really bloat this post. For the start date and time I do the following

(?s)(?<=^.When:\s)[a-zA-Z0-9]+

For the end date and time I do the following

When:.+(?<=- )(\w.*)

I would really like to reduce this as much as possible. I tried to implement the method from "Regex for capturing different date formats", but I'm really new to regex and just piece items together until I get something to work.

Thanks

Additional Information

I'm currently bound to using Nintex Workflows to transform the data. I would like to start the capture after When:\s, and then use [, :] to remove the remaining spaces and colons. This would leave the data in a format I can manipulate.


Solution

  • You can use

    ^.*?When:\s*(\d{2})\s*(\d{2}):?(\d{2}Z)\s*-\s*(\d{2})\s*(\d{2}):?(\d{2}Z)\s*(\w+)\s*(\d{1,2})$
    

    And replace with $1$2$3$4$5$6$7$8.

    See the regex demo

    The point is to match and capture the parts we need and reinsert those captured texts with backreferences ($n) in the replacement pattern. The parts that are matched but not captured are removed from the resulting string.
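To make the capture-and-reinsert idea concrete, here is a minimal sketch in Python rather than in Nintex itself. It assumes the pattern above behaves the same way in Python's re module. Note that Python spells the backreferences \1 ... \8 where .NET writes $1 ... $8. The helper name normalize is made up for this example.

```python
import re

# Same pattern as in the answer; Python's re module accepts this syntax unchanged.
PATTERN = re.compile(
    r"^.*?When:\s*(\d{2})\s*(\d{2}):?(\d{2}Z)\s*-\s*"
    r"(\d{2})\s*(\d{2}):?(\d{2}Z)\s*(\w+)\s*(\d{1,2})$"
)


def normalize(line: str) -> str:
    """Keep only the eight captured pieces and drop everything else.

    Hypothetical helper: \\1..\\8 are Python's spelling of the .NET
    $1..$8 backreferences used in the replacement above.
    """
    return PATTERN.sub(r"\1\2\3\4\5\6\7\8", line)


print(normalize("3. When: 150845Z - 161045Z Jan 16"))   # 150845Z161045ZJan16
print(normalize("When: 15 08:45Z - 16 10:45Z Jan 16"))  # 150845Z161045ZJan16
```

Both input formats collapse to the same string, because the prefix, spaces, colons and the hyphen are matched but never captured.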

    Here are some more details for you to be able to adjust the pattern later:

    • ^ - start of string/line (no idea if the tool allows matching across lines)
    • .*? - 0+ characters other than a newline, as few as possible, up to the first occurrence of what follows
    • When: - literal string When:
    • \s* - 0+ whitespace symbols
    • (\d{2}) - 2 digits (Group 1)
    • \s* - 0+ whitespace symbols
    • (\d{2}) - 2 digits (Group 2)
    • :? - optional :
    • (\d{2}Z) - 2 digits + Z (Group 3)
    • \s*-\s* - 0+ whitespaces, a literal - and 0+ whitespaces
    • (\d{2})\s*(\d{2}):?(\d{2}Z)\s* - see above (Group 4, 5, 6)
    • (\w+) - 1+ word characters (letters, digits, or underscore) (Group 7)
    • \s* - 0+ whitespaces
    • (\d{1,2}) - 1 or 2 digits (Group 8)
    • $ - end of string
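Going one step further, the captured pieces could be turned into the Start/End format asked for in the question. The following is only a sketch in Python, not something Nintex runs. The named groups, the parse_when helper, and the choice to keep Z outside the minute groups are assumptions made for this example. It also treats the trailing "Jan 16" as month and two-digit year, which is what the desired output in the question implies, and parsing the month name with %b assumes an English (or default C) locale.

```python
import re
from datetime import datetime

# Named-group variant of the pattern explained above (group names are made up).
WHEN = re.compile(
    r"^.*?When:\s*(?P<d1>\d{2})\s*(?P<h1>\d{2}):?(?P<m1>\d{2})Z\s*-\s*"
    r"(?P<d2>\d{2})\s*(?P<h2>\d{2}):?(?P<m2>\d{2})Z\s*"
    r"(?P<mon>\w+)\s*(?P<yy>\d{1,2})$"
)

OUT_FORMAT = "%m/%d/%Y %H:%M"


def parse_when(line: str) -> tuple[datetime, datetime]:
    """Return (start, end) datetimes for either input format."""
    m = WHEN.match(line)
    if m is None:
        raise ValueError(f"unrecognized When: line: {line!r}")
    g = m.groupdict()

    def stamp(day: str, hour: str, minute: str) -> datetime:
        # Start and end share the month and year given at the end of the line.
        # zfill pads a one-digit year because %y expects two digits.
        text = f"{day} {hour}:{minute} {g['mon']} {g['yy'].zfill(2)}"
        return datetime.strptime(text, "%d %H:%M %b %y")

    return stamp(g["d1"], g["h1"], g["m1"]), stamp(g["d2"], g["h2"], g["m2"])


start, end = parse_when("When: 15 08:45Z - 16 10:45Z Jan 16")
print("Start date and time", start.strftime(OUT_FORMAT))  # 01/15/2016 08:45
print("End date and time", end.strftime(OUT_FORMAT))      # 01/16/2016 10:45
```

Going through datetime rather than concatenating strings means an impossible value (hour 25, day 32) raises an error instead of silently producing a bad report line.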