We receive auto-generated emails from an application, and we export those to our database as they arrive at the Inbox. The table is called dbo.MailArchive
.
Up until recently, the body of the email has always looked like this...
Status: Completed
Successful actions count: 250
Page load count: 250
...except with different numbers and statuses. Note that there is a carriage return on the blank line after Page load count.
The entirety of this data gets written to a field called Mail_Body
- then we run the following statement using OPENJSON
to parse those lines into their own columns in the record:
DECLARE @PI varchar(7) = '%[^' + CHAR(13) + CHAR(10) + ']%';
SELECT j.Status,
j.Successful_Actions_Count,
j.Page_Load_Count
FROM dbo.MailArchive m
CROSS APPLY(VALUES(REVERSE(m.Mail_Body),PATINDEX(@PI,REVERSE(m.Mail_Body)))) PI(SY,I)
CROSS APPLY(VALUES(REVERSE(STUFF(PI.SY,1,PI.I,''))))S(FixedString)
CROSS APPLY OPENJSON (CONCAT('{"', REPLACE(REPLACE(S.FixedString, ': ', '":"'), CHAR(13) + CHAR(10), '","'), '"}'))
WITH (Status varchar(100) '$.Status',
Successful_Actions_Count int '$."Successful actions count"',
Page_Load_Count int '$."Page load count"') j;
Beginning today, there are certain emails where the body of the email looks like this:
Agent did not meet defined success criteria on this run.
Status: Completed
Successful actions count: 250
Page load count: 250
To clarify, that's one new line at the top, a carriage return at the end of that line, and a carriage return on the blank line between the new line and the Status
line. At this time, there is no consistent way to predict which emails will come in with this new line, and which ones won't.
How can I modify our OPENJSON
statement to say, If this first line exists in the body, skip/ignore it and parse lines 3 through 5, else just do exactly what I have above? Or perhaps even better to future-proof it, always ignore everything before the word Status
?
Since your data has new leading and trailing rows, I think a simple aggregation in concert with a string_split()
and a CROSS APPLY
would be more effective than my previous XML answer and the current JSON approach
Example or dbFiddle
Select A.ID
,Status = stuff(Pos1,1,charindex(':',Pos1),'')
,Action = try_convert(int,stuff(Pos2,1,charindex(':',Pos2),''))
,PageCnt = try_convert(int,stuff(Pos3,1,charindex(':',Pos3),''))
From YourTable A
Cross Apply (
Select [Pos1] = max(case when Value like 'Status:%' then value end)
,[Pos2] = max(case when Value like '%actions count:%' then value end)
,[Pos3] = max(case when Value like 'Page load count:%' then value end)
From string_split(SomeCol,char(10))
) B
Returns
ID Status Action PageCnt
1 Completed 250 250
Note: Use an OUTER APPLY
if you want to see NULLs