umbraco lucene.net umbraco7 umbraco6 examine

Differences in Umbraco cache structure?

OK, so I have just spent the last 6-8 weeks in the weeds of Umbraco and have made some fixes and improvements to our site and environments. I spent a lot of that time trying to correct lower-level, caching-related Umbraco issues. Reflecting on the experience, I still don't have a clue what the conceptual differences are between the following:

Examine indexes
umbraco.config
cached xml file in memory (supposedly similar to umbraco.config)
CMSContentXML Table

Thanks Again,

Devin

Solution

Examine indexes are indexes of Umbraco content.

Whenever you create, update or delete content, the current content information is indexed.

These indexes are used for searching. Under the hood they are Lucene indexes, and the Umbraco back office uses them for its own search as well.

You can create your own index if you want.

For more information, see the overview and explanation in "Examining Examine" by Peter Gregory.

umbraco.config and the cached XML in memory are really the same thing.

The front-end UmbracoHelper API gets content from the cache, not the database, and the cache is loaded from umbraco.config.

CMSContentXML contains each content item's information as XML.

So essentially this XML represents all the information for a content node.

So in a nutshell they really represent 3 things:

Examine is used for searching
umbraco.config is cached data that saves a round trip to the DB
CMSContentXML stores the full information for each content item

Edit: added better clarification from Robert Foster's comment, plus a comparison of UmbracoHelper vs ExamineManager.

For umbraco.config and the CMSContentXML table, @robert-foster commented:

umbraco.config stores the most recent version of all published content only; the in-memory cache is a cached version of this file; and the cmscontentxml table stores a representation of all content and is used primarily for preview mode - it is updated every time a content item is saved. IIRC it also stores a representation of other content types

Regarding UmbracoHelper vs ExamineManager

The UmbracoHelper API mostly gets its content from the memory cache. IMO it works best when locating content directly, such as when you already know the id of the content you want and can just call Umbraco.TypedContent(id).

But where do you get the id in the first place? Or, to put it another way, say you want to find all content whose Title property contains the word "Test". Then you would use Examine to search for it. Because Examine is really a Lucene wrapper, the search is going to be fast and efficient.

You can also traverse the tree with something like Umbraco.TypedContent(id).Children and then filter the result with LINQ, but I think this is done in memory using LINQ to Objects, so it is not as efficient and performant as Lucene.
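The tree traversal described above might look roughly like the sketch below. It assumes Umbraco 7, a property alias of "title" and a known parent id; the class and method names are hypothetical and only there for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Umbraco.Core.Models;
using Umbraco.Web;

public static class TraversalSketch
{
    // Filters the direct children of a node in memory with LINQ to Objects.
    // Assumption: "title" is the alias of the property you want to match.
    public static IEnumerable<IPublishedContent> FindInChildren(
        UmbracoHelper umbraco, int parentId, string word)
    {
        // Comes from the published content cache, not the database.
        var parent = umbraco.TypedContent(parentId);
        if (parent == null)
        {
            return Enumerable.Empty<IPublishedContent>();
        }

        // Every child is visited and its property value is read and compared.
        return parent.Children
            .Where(child => (child.GetPropertyValue<string>("title") ?? string.Empty)
                .IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0)
            .ToList();
    }
}
```

Every child gets visited here, so the cost grows with the number of nodes you walk.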

So personally I think:

Use Examine when you are searching for (locating) content, because you get the capabilities of a proper search engine, Lucene.
Once you have the ids from the search result, use UmbracoHelper to load the full published representation of each content id into a strongly typed model and work with the data.
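A minimal sketch of that two-step approach, assuming Umbraco 7 with the searcher named "ExternalSearcher" from a default ExamineSettings.config and a property alias of "title". Both names are assumptions, so adjust them to your setup; the class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using Examine;
using Umbraco.Core.Models;
using Umbraco.Web;

public static class TitleSearchSketch
{
    public static IEnumerable<IPublishedContent> FindByTitle(
        UmbracoHelper umbraco, string word)
    {
        // Step 1: let Examine (Lucene) locate the matching node ids.
        // Assumption: "ExternalSearcher" exists in your Examine configuration.
        var searcher = ExamineManager.Instance.SearchProviderCollection["ExternalSearcher"];
        var criteria = searcher.CreateSearchCriteria();
        var query = criteria.Field("title", word).Compile();

        // Step 2: load each hit from the published content cache by id.
        return searcher.Search(query)
            .Select(result => umbraco.TypedContent(result.Id))
            .Where(content => content != null) // skip ids that are not in the cache
            .ToList();
    }
}
```

From a view you could call it as TitleSearchSketch.FindByTitle(Umbraco, "Test").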

One thing @robert-foster mentioned in the comments, which I did not know, is that UmbracoHelper provides a Search method that is a wrapper around Examine, so use that if you are more familiar with that API.
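As far as I know, in Umbraco 7 the strongly typed variant of that wrapper is TypedSearch, so a sketch could be as short as this (the class and method names are hypothetical):

```csharp
using System.Collections.Generic;
using Umbraco.Core.Models;
using Umbraco.Web;

public static class HelperSearchSketch
{
    // Uses the default searcher and returns published content directly,
    // so there is no separate lookup by id afterwards.
    public static IEnumerable<IPublishedContent> FindAnywhere(
        UmbracoHelper umbraco, string term)
    {
        return umbraco.TypedSearch(term);
    }
}
```

This searches with the default provider rather than a single field, so reach for ExamineManager when you need a more specific query.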

Lastly, if any of the statements above are wrong or not quite correct, please help me clarify them so that anyone reading this later does not get the wrong idea. Thanks all.