This question is theoretical as well as practical. Any results indicating useful resources on optimizing queries will be appreciated.
There is a large SQL database which stores a large amount of data stored in SQLXML fields. Querying the XML directly is not fast enough.
I have looked at some MSDN articles on optimizing SQLXML (i.e. http://msdn.microsoft.com/en-us/library/aa902661(SQL.80).aspx), and I am aware that indexing the searchable xml fields will increase search speed.
Can anyone recommend any additional resources for optimizing databases, either in this environment or in general, that are particularly useful? As always, I appreciate y'alls help
It depends on what you need to do to your XML. I have a similar setup where the table structure was made "generic" and anything product-specific was stashed into an XML field.
We also noticed the hard way that querying the XML is not exceptionally fast.... and using XML indices (which SQL Server offers, too) caused our database size to jump from roughly 1 GB to over 10 GB.....
What we're doing now for select elements from the XML is this:
With this, we can extract certain key values (e.g. "ProductID" or "ProductName") from the XML, and store them on the parent table as a column. It's computed, e.g. it's always up to date, and since it's also persisted, it's stored with the table data, e.g. it isn't constantly re-queried and re-computed. And since it's persisted with the table data, you can even put an index on it.
This works very well, but it only works for cases where you have isolated, single-value things that you want to fish out of the XML. For that case, it's a great solution, and it does speed up queries on those values by several orders of magnitude.