I am having trouble with a complicated regex. I have tried this every way I can think of, and I can only ever get "almost" there.
I have a block of 5 messages:
---Agent 1: Wednesday 08/16/2017 | 11:43 AM ---
Message 1
--- Agent 1: Friday 06/09/2017 | 9:02 AM ---
Message 2
--- Agent 1: Friday 04/14/2017 | 10:35 AM ---
Message 3
--- Agent 1: Monday 02/13/2017 | 12:07 AM ---
This
is
message
3
--- Agent 1: Monday 12/19/2016 | 1:31 PM ---
Message 4
--- Agent 1: Monday 10/24/2016 | 10:48 AM ---
Message 5
One problem is that some of them have a space before the first ---. Another is multi-line messages.
What I am trying to do is peel out all the individual messages: basically everything between the first occurrence of --- and every other occurrence thereafter. I would like my result to look like:
---Agent 1: Wednesday 08/16/2017 | 11:43 AM ---
Message 1
I have tried variations of ---.*? (---) (matching every other ---), but then I have no way of parsing out the message itself. I have also tried to do this manually:
(?<=\: )(.*?)(?= \|)|(\---)(\r\n|\r|\n)(\r\n|\r|\n)(.*?)(\r\n|\r|\n)(\r\n|\r|\n)(\---)
This works until you hit a multi-line message (the second "message 3").
I have also tried multiple steps -- trimming the first or last --- using str_replace -- but this is foiled by the ones that have a preceding space!
It's always the second --- after the time, the multi-line message, or the preceding space before --- that trips me up. Does anyone have a more elegant solution than the monstrosity I am creating?
/---.*---\s*\R.*(?=---|$)/gsU
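---.*---\s*\R matches the first line, which holds the message description. Then .*(?=---|$) matches the rest of the message, up to the start of the next one (---) or the end of the string. The s flag lets . match newlines, so multi-line messages are covered, and the U flag makes the quantifiers lazy, so each match stops at the nearest --- instead of running to the last one.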
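To see the same idea run end to end, here is a minimal sketch in Python. It is a port, not the exact pattern above: Python's re module has no \R and no U flag, so the line break is spelled out and the quantifiers are made lazy with ?. The names MESSAGE_RE, split_messages and SAMPLE are made up for this illustration, and the header part is restricted to a single line, which is an extra assumption on my side.

```python
import re

# Hypothetical port of the PCRE pattern to Python's re module.
# Assumption: a header always sits on one line and ends with "---".
MESSAGE_RE = re.compile(
    r"---(?P<header>[^\r\n]*?)---[ \t]*(?:\r\n|\r|\n)"  # header line between two "---"
    r"(?P<body>.*?)"                                     # body, lazily, may span lines
    r"(?=[ \t]*---|\Z)",                                 # stop before next header or at end
    re.DOTALL,                                           # let "." match newlines (PCRE "s")
)


def split_messages(text):
    """Return a list of (header, body) tuples, both stripped of outer whitespace."""
    return [
        (m.group("header").strip(), m.group("body").strip())
        for m in MESSAGE_RE.finditer(text)
    ]


# Sample taken from the question, including one header with a leading space
# and one multi-line message.
SAMPLE = (
    "---Agent 1: Wednesday 08/16/2017 | 11:43 AM ---\n"
    "Message 1\n"
    " --- Agent 1: Friday 06/09/2017 | 9:02 AM ---\n"
    "Message 2\n"
    "--- Agent 1: Monday 02/13/2017 | 12:07 AM ---\n"
    "This\nis\nmessage\n3\n"
)

if __name__ == "__main__":
    for header, body in split_messages(SAMPLE):
        print(repr(header), "->", repr(body))
```

Capturing the header and the body as separate groups means nothing has to be trimmed off afterwards with str_replace. Like the original pattern, this sketch would cut a message short if its body itself contained ---.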