I have a large XML file that I need to import and parse into tabular structure ("flatten") in SQL Server. By "large" I mean a file that is around 450 MB, contains up to 6-7 nested levels and lots of elements, ~300.
I tried parsing the file both with OPENXML and Xml.Nodes. Both of the methods are slow. A partial query which reads a parent element and it's nested grandchildren takes several minutes if not dozens to run.
I tried using the SQLXML Bulk Load method. Unfortunately I couldn't - because the file isn't structured properly. There is an element which is logically a parent element which isn't nested as a parent physically.
Do you think the only posiblle solution left is to use .NET or Java? Is there something I'm missing?
I would strongly prefer a dynamic solution, to some degree. I don't want the SQL Server developers to relay on a procedural, compiled, code that they have no control/knowledge about - in the event that some changes will occur (in the XML structure).
Thank you very much.
OK. I created an XML Index on the XML data column. (Just a primary one for now).
A query that took ~4:30 minutes before takes ~9 seconds now! Seems that a table that stores the XML with a proper XML Index and the parsing the data with the xml.nodes() function are a feasible solution.
Thank you all.