I need to parse a XML response in C# and load in SQL. Just to brief,i know how to use XMLSerializer to parse the xml, so that is not i am looking for. My concern is my XML structure which i received from from web request. Below is the subset of xml, i received from xml
<apiXML>
<recordList>
<record id="31" >
<administration_name>admin1</administration_name>
<creator>Leekha, Mohit</creator>
<object_category>painting</object_category>
<object_number>1243</object_number>
<id>31</id>
<reproduction.reference>2458.jpg</reproduction.reference>
<title lang="nl-NL" invariant="false">The Title1</title>
<title lang="nl-NL" invariant="false">The Title</title>
<title lang="nl-NL" invariant="false">Different Title</title>
</record>
<record id="32" >
<administration_name>admin1</administration_name>
<creator>Leekha, Mohit</creator>
<object_category>painting</object_category>
<object_number>AL1111</object_number>
<id>32</id>
<reproduction.reference>AL1111.jpg</reproduction.reference>
<title lang="nl-NL" invariant="false">Health</title>
</record>
<record id="34" >
<administration_name>admin2</administration_name>
<creator>Leekha,Mohit</creator>
<creator>System</creator>
<object_category>earthenware</object_category>
<object_category>ABC</object_category>
<object_category>Remote</object_category>
<object_number>Z.567 & X-124</object_number>
<id>34</id>
<reproduction.reference>Z.567 & X-124(1).jpg</reproduction.reference>
<reproduction.reference>Z.567 & X-124(2).jpg</reproduction.reference>
<reproduction.reference>Z.567 & X-124(3).jpg</reproduction.reference>
</record>
</recordList>
</apiXML>
My Concerns:
So all i am asking is for suggestions how could i deal with scenario. Any suggestions are welcome
You'll need a couple of supporting classes to get that XML deserialized as is, as you didn't specify any other requirements.
Your database wold have tables for your record elements and all the collections within.
Those classes will hold an in memory representation of your XML. At the root will be the Api
class.
[XmlRoot("apiXML")]
public class Api
{
[XmlArray("recordList")]
[XmlArrayItem("record", typeof(Record))]
public List<Record> RecordList {get;set;}
}
[Serializable]
public class Record
{
[XmlAttribute("id")]
public int RecordId {get;set;}
[XmlElement("id")]
public int Id {get;set;}
[XmlElement("administration_name")]
public string AdministrationName {get;set;}
[XmlElement("object_number")]
public string ObjectNumber {get;set;}
[XmlElement("creator")]
public List<Creator> Creators {get;set;}
[XmlElement("object_category")]
public List<ObjectCategory> ObjectCategories {get;set;}
[XmlElement("reproduction.reference")]
public List<ReproductionReference> ReproductionReferences {get;set;}
[XmlElement("title")]
public List<Title> Titles {get;set;}
}
[Serializable]
public class Title:Child
{
[XmlAttribute("invariant")]
public bool Invariant {get;set;}
[XmlAttribute("lang")]
public string Culture {get;set;}
[XmlText]
public string Text {get;set;}
}
public class Child
{
[XmlIgnore]
public int ParentId {get;set;}
}
[Serializable]
public class Creator:Child
{
[XmlText]
public string Text {get;set;}
}
[Serializable]
public class ObjectCategory:Child
{
[XmlText]
public string Text {get;set;}
}
[Serializable]
public class ReproductionReference:Child
{
[XmlText]
public string Text {get;set;}
}
With the classes correctly annotated deserializing the XML only needs a couple of lines:
var ser = new XmlSerializer(typeof(Api));
var sr = new StringReader(xml);
var api = (Api) ser.Deserialize(sr);
In the variable api
we now have the in-memory object graph which you can project on a relational database schema. For normalized model you'll need the following tables:
Between these tables you'll need link tables that follow all the same convention like the one between Record and Creator:
I'll assume you know how to create those tables and create a connection to your database.
// use an SqlAdapter.Fill to get the below dataset call
// sqlAdapter.Fill(ds);
var ds = new DataSet();
// this is here so you can test without a database
// test mocking code
var recTable = ds.Tables.Add("Record");
recTable.Columns.Add("Id");
recTable.Columns.Add("AdministrationName");
recTable.Columns.Add("ObjectNumber");
var creTable = ds.Tables.Add("Creator");
creTable.Columns.Add("Id", typeof(int)).AutoIncrement = true;
creTable.Columns.Add("Text");
var reccreTable = ds.Tables.Add("RecordCreator");
reccreTable.Columns.Add("RecordId");
reccreTable.Columns.Add("CreatorId");
// end mocking code
// copy object graph and build link tables
foreach(var record in api.RecordList)
{
// each main record is created
var rtRow = recTable.NewRow();
rtRow["Id"] = record.Id;
rtRow["AdministrationName"] = record.AdministrationName;
rtRow["ObjectNumber"] = record.ObjectNumber;
recTable.Rows.Add(rtRow);
// handle each collection
foreach(var creator in record.Creators)
{
DataRow creRow; // will hold our Creator row
// first try to find if the Text is already there
var foundRows = creTable.Select(String.Format("Text='{0}'", creator.Text));
if (foundRows.Length < 1)
{
// if not, add it to the Creator table
creRow = creTable.NewRow(); // Id is autoincrement!
creRow["Text"] = creator.Text;
creTable.Rows.Add(creRow);
}
else
{
// otherwise, we found an existing one
creRow = foundRows[0];
}
// link record and creator
var reccreRow = reccreTable.NewRow();
reccreRow["RecordId"] = record.Id;
reccreRow["CreatorId"] = creRow["Id"];
reccreTable.Rows.Add(reccreRow);
}
// the other collections follow a similar pattern but is left for the reader
}
// now call Update to write the changes to the db.
// SqlDataAdapter.Update(ds);
That concludes the code and structure you'll need to store that SQL in an RDBMS database without losing information.