To keep it short, I have this ".xes" (Extensible Event Stream) file, which is in fact an XML, and looks like this (this code only shows an example of an event - the file contains multiple events similar to this one):
<?xml version="1.0" encoding="UTF-8" ?>
<log xes.version="1.0" xes.features="nested-attributes" openxes.version="1.0RC7" xmlns="http://www.xes-standard.org/">
<trace>
<string key="concept:name" value="0"/>
<event>
<string key="org:resource" value="Call Centre Agent"/>
<date key="time:timestamp" value="2006-01-01T00:00:00.000+01:00"/>
<string key="concept:name" value="check if sufficient information is available"/>
<string key="lifecycle:transition" value="start"/>
</event>
</trace>
...
This file represents in fact a business process event log which contains the events of different activities of a process with timestamp and further information.
I need to extract the information from this data and prepare some SQL queries as well.
I am currently using a SQL Server 2014 Express database, and having trouble importing the data and querying it.
This is a general approach to get a file's content into a variable: This is a general approach to get a file's content into a variable:
DECLARE @xml XML=
(SELECT * FROM OPENROWSET(BULK 'C:\YourPath\XMLFile.xml',SINGLE_CLOB) AS x);
SELECT @xml;
As this is nested data (with unclear level of nesting...) this is my suggestion:
DECLARE @log XML=
'<log xmlns="http://www.xes-standard.org/" xes.version="1.0" xes.features="nested-attributes" openxes.version="1.0RC7">
<trace>
<string key="concept:name" value="0" />
<event>
<string key="org:resource" value="Call Centre Agent" />
<date key="time:timestamp" value="2006-01-01T00:00:00.000+01:00" />
<string key="concept:name" value="check if sufficient information is available" />
<string key="lifecycle:transition" value="start" />
</event>
<event>
<string key="second-resouce" value="Call Centre Agent" />
<date key="second:timestamp" value="2006-01-01T00:00:00.000+01:00" />
<string key="second:name" value="check if sufficient information is available" />
<string key="second:transition" value="start" />
</event>
</trace>
</log>';
WITH XMLNAMESPACES(DEFAULT 'http://www.xes-standard.org/')
SELECT TraceNode.value('string[1]/@key','varchar(max)') AS Trace_String_Key
,TraceNode.value('string[1]/@value','int') AS Trace_String_Value
,EventNode.value('date[1]/@key','varchar(max)') AS Trace_Event_Date_Key
,EventNode.value('date[1]/@value','datetime') AS Trace_Event_Date_Value
,EventStringNode.value('@key','varchar(max)') AS Trace_Event_String_Key
,EventStringNode.value('@value','varchar(max)') AS Trace_Event_String_Value
FROM @log.nodes('/log/trace') AS a(TraceNode)
OUTER APPLY TraceNode.nodes('event') AS b(EventNode)
OUTER APPLY EventNode.nodes('string') AS c(EventStringNode)
do you have any suggestions on how, and for what, could I query this data? Some practical examples would be useful
Well, that's really up to you... If you ask such a question, you should know what you need it for :-)
One idea:
Create a relational table structure
You can use the query to retrieve your data to insert this into your tables...