I have a string value and, because of how the string is populated (out of my control), it has \n newline instances right in the middle of a company name.
I want to do a regex replace on the particular matches to replace the \n with a space.
This is a snippet of my output (it can change). I'm trying to match every occurrence, starting from the first \n it finds before a date, and extract the text between those.
\nGBP*\nAA1234567 A random company name - I 03-Mar-2023 BUY 42.6400 42.6900 GBP 1,820.3016 1.0000 1,842.4400\nAA1234568 Another randon company name - H-M 03-Mar-2023 BUY 11.9880 845.6000 GBP 10,137.0528 1.0000 10,159.1700\nAA12345679 Third Party Utilies - Fund - Class\nAA-B Income\n03-Mar-2023 BUY 6.4120 836.9100 GBP 5,366.2669 1.0000 5,388.5200\nAA12345670 Company 4 - M 03-Mar-2023 BUY 205.6830 10.8500 GBP 2,231.6606 1.0000 2,253.7800\nAA2345678 Another random page up company - I 03-Mar-2023 BUY 66.3850 45.4400 GBP 3,016.5344 1.0000 3,038.6500\nASSET SCHEDULE\nPolicy Number 1234-56789\nAA2345679 Company 5 Utilities- M 03-Mar-2023 BUY 76.7370 13.7400 GBP 1,054.3664 1.0000 1,076.4900\nTotal
It's currently returning:
GBP*\nAA1234567 A random company name - I 03-Mar-2023
AA1234568 Another random company name - H-M 03-Mar-2023
AA12345679 Third Party Utilities - Fund - Class\nAA-B Income\n03-Mar-2023
AA12345670 Company 4 - M 03-Mar-2023
AA2345678 Another random page up company - I 03-Mar-2023
ASSET SCHEDULE\nPolicy Number 1234-56789\nAA2345679 Company 5 Utilities- M 03-Mar-2023
But what I want to retrieve is the following.
AA1234567 A random company name - I 03-Mar-2023 BUY 42.6400 42.6900 GBP 1,820.3016 1.0000 1,842.4400
AA1234568 Another random company name - H-M 03-Mar-2023 BUY 11.9880 845.6000 GBP 10,137.0528 1.0000 10,159.1700
AA12345679 Third Party Utilities - Fund - Class\nAA-B Income\n03-Mar-2023 BUY 6.4120 836.9100 GBP 5,366.2669 1.0000 5,388.5200
AA12345670 Company 4 - M 03-Mar-2023 BUY 205.6830 10.8500 GBP 2,231.6606 1.0000 2,253.7800
AA2345678 Another random page up company - I 03-Mar-2023 BUY 66.3850 45.4400 GBP 3,016.5344 1.0000 3,038.6500
AA2345679 Company 5 Utilities- M 03-Mar-2023 BUY 76.7370 13.7400 GBP 1,054.3664 1.0000 1,076.4900
The third row in this case contains 2 newlines: Class\nAA-B Income\n
My Pattern is as follows
If there's an easier way, please let me know.
Thanks in advance.
I've tried multiple patterns but can't quite seem to get it.
You may use this regex:
RegEx Demo:
(?<=\\n) : Lookbehind to assert presence of \n at the previous position
(?: : Start non-capture group
[A-Z]+ : Match 1+ of uppercase letters
[0-9] : Match a digit
[A-Z0-9]* : Match 0 or more uppercase letters or digits
| : Or (alternation), then match a -
) : End non-capture group
(?:\s+\w+)+ : Match company separated with 1+ whitespaces
.*? : Match 0+ of any character (non-greedy)
[a-zA-Z]{3}-\d{4} : Match month-year
.+? : Match 1+ of any character (non-greedy)
(?=\\n) : Lookahead to assert presence of \n at the next position
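
For illustration, here is a minimal Python sketch of the same idea. It is not the exact pattern described above. It assumes the text contains real newline characters (not a literal backslash followed by n), that every row starts at the beginning of a line with an ID made of two uppercase letters followed by digits, and that the date looks like 03-Mar-2023. The names ROW_PATTERN and extract_rows are made up for this example.

```python
import re

# Assumptions (not confirmed by the question): rows begin at a line start
# with an ID like "AA1234567", and the date has the form DD-Mon-YYYY.
ROW_PATTERN = re.compile(
    r"^[A-Z]{2}\d+\s"            # row ID at the start of a line, then whitespace
    r".*?"                       # company name; may span lines because of DOTALL
    r"\d{2}-[A-Za-z]{3}-\d{4}"   # first date after the ID, e.g. 03-Mar-2023
    r"[^\n]*",                   # remainder of the line that holds the date
    re.MULTILINE | re.DOTALL,
)


def extract_rows(text):
    """Return each matched row with embedded newlines replaced by spaces."""
    return [m.group(0).replace("\n", " ") for m in ROW_PATTERN.finditer(text)]


sample = (
    "\nGBP*\n"
    "AA1234567 A random company name - I 03-Mar-2023 BUY 42.6400 42.6900 GBP 1,820.3016 1.0000 1,842.4400\n"
    "AA12345679 Third Party Utilities - Fund - Class\nAA-B Income\n"
    "03-Mar-2023 BUY 6.4120 836.9100 GBP 5,366.2669 1.0000 5,388.5200\n"
    "ASSET SCHEDULE\nPolicy Number 1234-56789\n"
    "AA2345679 Company 5 Utilities- M 03-Mar-2023 BUY 76.7370 13.7400 GBP 1,054.3664 1.0000 1,076.4900\nTotal"
)

for row in extract_rows(sample):
    print(row)
```

On this sample it prints three single-line rows. Lines such as GBP*, ASSET SCHEDULE and Policy Number are skipped because they do not start with the assumed ID shape, and the row with the wrapped company name comes back with "Class AA-B Income 03-Mar-2023" joined by spaces. If your IDs or dates can look different, the first and third parts of the pattern would need adjusting.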