Tags: c#, linq-to-xml, compact-framework, xmldocument, xmlnodelist

How can XML be parsed into custom classes?


I have strings containing XML that I need to loop through and parse, building instances of custom classes from them to insert into my database.

The pseudocode for what I need would be something like this:

private List<SiteMapping> ExtractSiteMappingsFromXML(String xmlData)
{
    List<SiteMapping> sitemaps = new List<SiteMapping>();
    // parse xmlData, dynamically instantiating a SiteMapping class for each SiteMapping "record" in the xml
    foreach (record rec in xmlData)
    {
        SiteMapping sm = new SiteMapping();
        sm.Id = //current id found in the xml data
        sm.siteName = // current site name found in the xml data
        . . .
        sitemaps.Add(sm);
    }
    return sitemaps;
}

The caller of ExtractSiteMappingsFromXML() would then loop through the returned list of SiteMapping objects and insert the records into the database.

Based on an idea I got from here, I'm thinking something like this might be possible:

XmlDocument doc = new XmlDocument();
doc.LoadXml(xmlData);
XmlNodeList _ids = doc.GetElementsByTagName("Id");
XmlNodeList _sitenames = doc.GetElementsByTagName("siteName");
. . . // add an XmlNodeList for each element

And then I could loop through the XmlNodeLists, something like:

for (int i = 0; i < _ids.Count; i++)
{
    SiteMapping sm = new SiteMapping();
    sm.Id = _ids[i].InnerText;           // the list holds XmlNodes, so read the text
    sm.siteName = _sitenames[i].InnerText;
    . . . // add the rest
    sitemaps.Add(sm);
}

Is this sensible? Will this still work if one or more of the elements has blank values? IOW, if an element is sometimes blank, will it add a blank value to the corresponding XmlNodeList (that's what I'd want), or would it add nothing, and thus create a mismatch?
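
On the blank-value question: GetElementsByTagName returns every element with that name, including empty ones such as <siteName></siteName>, so an empty element still occupies a slot. An element that is missing altogether does not, and that is where parallel lists would drift out of step. A minimal sketch that sidesteps the problem by walking one record node at a time follows. The SiteMapping shape (two string properties), the "SiteMapping" record element name, and the ChildText helper are assumptions for illustration, since the question does not show the real class or XML at this point.

```csharp
using System.Collections.Generic;
using System.Xml;

public class SiteMapping
{
    // Assumed shape; the real SiteMapping class is not shown in the question.
    public string Id { get; set; }
    public string siteName { get; set; }
}

public static class SiteMappingParser
{
    // Hypothetical helper: text of the named child element,
    // or "" when the child is empty or absent.
    private static string ChildText(XmlNode record, string name)
    {
        XmlElement child = record[name];
        return child == null ? "" : child.InnerText;
    }

    public static List<SiteMapping> Extract(string xmlData)
    {
        List<SiteMapping> sitemaps = new List<SiteMapping>();
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xmlData);

        // One pass per record element, so the fields of one record
        // are always read from that same record.
        foreach (XmlNode record in doc.GetElementsByTagName("SiteMapping"))
        {
            SiteMapping sm = new SiteMapping();
            sm.Id = ChildText(record, "Id");
            sm.siteName = ChildText(record, "siteName");
            sitemaps.Add(sm);
        }
        return sitemaps;
    }
}
```

Calling SiteMappingParser.Extract(xmlData) yields one SiteMapping per record element, with "" in any field whose element is empty or missing. XmlDocument should be available on Compact Framework, though that is worth confirming against the version in use.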

Is there, perhaps, an elegant linqy (LINQ-to-XML) way of doing this?

Note: This is a Compact Framework app, and thus suffers from those limitations, implementation-wise.

UPDATE

I thought maybe this code:

XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(omnivore);
List<SiteQuery> sitequeries =
  (from sitequery in xmlDoc.Descendants("SiteQuery")
   select new SiteQuery
   {
       Id = sitequery.Element("Id").Value,
       UPC_PackSize = sitequery.Element("UPC_PackSize").Value,
       UPC_Code = sitequery.Element("UPC_Code").Value,
   }).ToList<SiteQuery>();

...which I adapted from here, would do the trick, but I get, "No overload for method 'Descendants' takes 1 arguments"

UPDATE 2

I tried this (XDocument instead of XmlDocument):

XDocument xmlDoc = new XDocument();
XDocument.Parse(omnivore);
List<SiteQuery> sitequeries =
 (from sitequery in xmlDoc.Descendants("SiteQuery")
  select new SiteQuery
  {
      Id = Convert.ToInt32(sitequery.Element("Id").Value),
      UPC_PackSize = Convert.ToInt32(sitequery.Element("UPC_PackSize").Value),
      UPC_Code = sitequery.Element("UPC_Code").Value
  }).ToList<SiteQuery>();

It seemed bizarre to me that I had to use "XDocument.Parse(omnivore);" instead of "xmlDoc.Parse(omnivore);", but the compiler informed me that that was necessary...?!?

Not too surprisingly, sitequeries had a count of 0 after this code ran, though...
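
Two things probably explain the count of 0. XDocument.Parse is a static method that returns a new document, so its result has to be assigned; the xmlDoc created with "new XDocument()" stays empty. Also, the XML shown in UPDATE 4 declares a default namespace, so element names have to be qualified with an XNamespace or Descendants finds nothing. A minimal sketch under those assumptions follows. The trimmed SiteQuery class (with Id kept as a string to preserve leading zeros) and the SiteQueryParser name are illustrative only, and the element is spelled UPCPackSize as in the XML, not UPC_PackSize.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class SiteQuery
{
    // Trimmed, assumed shape for illustration.
    public string Id { get; set; }
    public int UPCPackSize { get; set; }
    public string UPC_Code { get; set; }
}

public static class SiteQueryParser
{
    public static List<SiteQuery> Extract(string xml)
    {
        // Parse is static: keep the document it returns.
        XDocument doc = XDocument.Parse(xml);

        // Take the namespace from the root element rather than hard-coding it;
        // this is XNamespace.None when the XML declares no namespace.
        XNamespace ns = doc.Root.Name.Namespace;

        return doc.Descendants(ns + "SiteQuery")
                  .Select(sq => new SiteQuery
                  {
                      // The explicit casts read the element's text.
                      Id = (string)sq.Element(ns + "Id"),
                      UPCPackSize = (int)sq.Element(ns + "UPCPackSize"),
                      UPC_Code = (string)sq.Element(ns + "UPC_Code")
                  })
                  .ToList();
    }
}
```

The (int) cast throws if the element is missing or empty, so fields that can be blank are safer read as strings first. Whether System.Xml.Linq ships with a given Compact Framework version is an assumption to check; I believe 3.5 includes it.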

UPDATE 3

Perhaps Nitin Aggarwal's code would work (it does compile), but at runtime I get:

System.InvalidOperationException was unhandled
  _HResult=-2146233079
  _message=There is an error in XML document (1, 2).
  HResult=-2146233079
  IsTransient=false
  Message=There is an error in XML document (1, 2).
  Source=System.Xml
  StackTrace:
       at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader, String encodingStyle, XmlDeserializationEvents events)
       at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader). . .

It may be just that the XML is bad; but also, I don't know if these jet-age classes are available to me in Compact Framework (I've got it compiling in a .NET 4.5.1 test app).

UPDATE 4

Vishal, to answer your question, here's the XML that I'm trying to parse:

<ArrayOfSiteQuery xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/CStore.DomainModels.HHS"><SiteQuery><Id>00006000002</Id><UPCPackSize>1</UPCPackSize><UPC_Code>00006000002</UPC_Code><crvId></crvId><dept>8</dept><description>ZZ</description><openQty>0.0</openQty><packSize>1</packSize><subDept>80</subDept><unitCost>1.25</unitCost><unitList>5.0</unitList><vendorId>CONFLICT</vendorId><vendorItem>123456</vendorItem></SiteQuery>
.  . . (beaucoup other SiteQuery "records")
<SiteQuery><Id>5705654</Id><UPCPackSize>1</UPCPackSize><UPC_Code>5705654</UPC_Code><crvId></crvId><dept>2</dept><description>what do you want</description><openQty>0.0</openQty><packSize>1</packSize><subDept>0</subDept><unitCost>0.55</unitCost><unitList>1.62</unitList><vendorId></vendorId><vendorItem></vendorItem></SiteQuery></ArrayOfSiteQuery>

Do I need to first strip out the preliminary bits (<ArrayOfSiteQuery xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/CStore.DomainModels.HHS">) and the "close tag" at the end (</ArrayOfSiteQuery>)?

BTW, "CStore.DomainModels.HHS" is in the server app, and the client presumably has no clue what that is.

UPDATE 5

After looking at the xml in the string, I saw that its contents did not match my custom class (it was the same data, but some of the member names differed, and they were out of order with each other), so I changed the custom class to match the xml:

public class SiteQuery
{
    public int Id { get; set; }
    public int UPCPackSize { get; set; }
    public String UPC_Code { get; set; }
    public String crvId { get; set; }
    public int dept { get; set; }
    public String description { get; set; }
    public Double openQty { get; set; }
    public int packSize { get; set; }
    public int subDept { get; set; }
    public Decimal unitCost { get; set; }
    public Decimal unitList { get; set; }
    public String vendorId { get; set; }
    public String vendorItem { get; set; }
}

...but I still get that same InvalidOp exception...

UPDATE 6

Even after I stripped out the preamble and postamble from the xml, so that it only contains SiteQuery "xml records", saved that as a file and loaded it up for processing:

String testData = File.ReadAllText("siteQueryTest.txt");
XmlSerializer serializer = new XmlSerializer(typeof(List<SiteQuery>));
XmlReader reader = XmlReader.Create(new StringReader(testData));
List<SiteQuery> siteQueries;
siteQueries = (List<SiteQuery>)serializer.Deserialize(reader);

...I still get the runtime error:

System.InvalidOperationException was unhandled
  _HResult=-2146233079
  _message=There is an error in XML document (1, 2).
  HResult=-2146233079
  IsTransient=false
  Message=There is an error in XML document (1, 2).
  Source=System.Xml
  StackTrace:
       at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader, String encodingStyle, XmlDeserializationEvents events)
       at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader)
       at Sandbox.Form1.button56_Click(Object sender, EventArgs e) in c:\HoldingTank\Sandbox\Form1.cs:line 2061
    . . .
       StackTrace:
            at Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializationReaderList1.Read3_ArrayOfSiteQuery()
       InnerException: 

How can this be? The contents of the "testData" string are:

<SiteQuery><Id>00006000002</Id><UPCPackSize>1</UPCPackSize><UPC_Code>00006000002</UPC_Code><crvId></crvId><dept>8</dept><description>ZZ</description><openQty>0.0</openQty><packSize>1</packSize><subDept>80</subDept><unitCost>1.25</unitCost><unitList>5.0</unitList><vendorId>CONFLICT</vendorId><vendorItem>123456</vendorItem></SiteQuery>
. . . // a ton of other SiteQuery records
<SiteQuery><Id>5705654</Id><UPCPackSize>1</UPCPackSize><UPC_Code>5705654</UPC_Code><crvId></crvId><dept>2</dept><description>what do you want</description><openQty>0.0</openQty><packSize>1</packSize><subDept>0</subDept><unitCost>0.55</unitCost><unitList>1.62</unitList><vendorId></vendorId><vendorItem></vendorItem></SiteQuery>

How could there be "an error in XML document (1, 2)"?

Line 1, column 2 is "S"; what's the matter with "S"? I presume nothing, so what is it expecting, as it also didn't like "A" (from <ArrayOfSiteQuery)?

UPDATE 7

I prepended:

<?xml version="1.0" encoding="UTF-8"?>

...to the file, and I get the same error, but now it's at 1,40 (still the "S" in the first "<SiteQuery>").


Solution

You can try this:

XmlSerializer serializer = new XmlSerializer(typeof(List<SiteMapping>));
XmlReader reader = XmlReader.Create(new StringReader(xmlData));
List<SiteMapping> siteMappings;
siteMappings = (List<SiteMapping>)serializer.Deserialize(reader);

Please let me know if this works.
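
Against the XML in the question, a serializer built this way is likely to fail with "There is an error in XML document (1, 2)". A serializer for List<SiteQuery> expects a root element named ArrayOfSiteQuery (the Read3_ArrayOfSiteQuery frame in the stack trace hints at this) in the empty namespace, while the real document puts everything in the datacontract namespace. Stripping the root does not help, because the remainder starts with SiteQuery and has several top-level elements. A hedged sketch that leaves the original XML untouched and passes the namespace to the serializer instead is below. The trimmed SiteQuery class and the SiteQueryDeserializer name are illustrative, and whether this constructor overload exists on Compact Framework is an assumption to verify.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class SiteQuery
{
    // Trimmed copy of the question's class; elements with no
    // matching property are skipped by the serializer.
    public int Id { get; set; }
    public int UPCPackSize { get; set; }
    public string UPC_Code { get; set; }
    public decimal unitCost { get; set; }
}

public static class SiteQueryDeserializer
{
    // Namespace copied from the xmlns attribute of the question's XML.
    private const string DataContractNs =
        "http://schemas.datacontract.org/2004/07/CStore.DomainModels.HHS";

    public static List<SiteQuery> Deserialize(string xml)
    {
        // The second argument sets the default namespace for all elements,
        // so ArrayOfSiteQuery, SiteQuery and the fields are matched in it.
        XmlSerializer serializer =
            new XmlSerializer(typeof(List<SiteQuery>), DataContractNs);

        using (StringReader reader = new StringReader(xml))
        {
            return (List<SiteQuery>)serializer.Deserialize(reader);
        }
    }
}
```

Pass the full, unstripped string (including the ArrayOfSiteQuery open and close tags) to SiteQueryDeserializer.Deserialize. If the exception persists, its InnerException usually names the element that was not expected.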