Tags: umbraco, lucene.net, umbraco7, umbraco6, examine

Differences in Umbraco cache structure?


OK, so I have just spent the last 6-8 weeks in the weeds of Umbraco and have made some fixes and improvements to our site and environments. I spent a lot of that time trying to correct lower-level Umbraco caching issues. Reflecting on that experience, I still don't have a clue what the conceptual differences are between the following:

  • Examine indexes
  • umbraco.config
  • Cached XML file in memory (supposedly similar to umbraco.config)
  • CMSContentXML table

Thanks again,

Devin


Solution

  • Examine indexes are indexes of Umbraco content.

    Whenever you create, update, or delete content, the current content information is indexed.

    These indexes are used for searching. Under the hood they are Lucene indexes, and the Umbraco backend uses them for its own search.

    You can create your own index if you want.

    For more information, check out the overview and explanation in "Examining Examine" by Peter Gregory.

    umbraco.config and the cached XML in memory are really the same thing.

    The front-end UmbracoHelper API gets content from the cache, not the database; the cache is loaded from umbraco.config.

    CMSContentXML contains each content item's information as XML,

    so essentially this XML represents all the information of a content node.

    So, in a nutshell, they really represent three things:

    1. Examine is used for searching.
    2. umbraco.config is cached data that saves a round trip to the database.
    3. CMSContentXML stores the full information of a content item.

    Edit: added a better clarification from Robert Foster's comment, plus a note on UmbracoHelper vs ExamineManager.

    On umbraco.config and the CMSContentXML table, @robert-foster commented:

    umbraco.config stores the most recent version of all published content only; the in-memory cache is a cached version of this file; and the cmscontentxml table stores a representation of all content and is used primarily for preview mode - it is updated every time a content item is saved. IIRC it also stores a representation of other content types

    Regarding UmbracoHelper vs ExamineManager:

    The UmbracoHelper API mostly gets its content from the memory cache. IMO it works best for locating content directly, such as when you know the id of the content you want and can just call Umbraco.TypedContent(id).

    But where do you get the id in the first place? Put another way, if you want to find all content whose Title property contains the word "Test", you would use Examine to search for it. Because Examine is really a Lucene wrapper, it is going to be fast and efficient.

    Although you can traverse the tree with methods such as Umbraco.TypedContent(id).Children and then use LINQ to filter the result, I think this is done in memory using LINQ to Objects, so it is not as efficient and performant as Lucene.
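
To make the difference concrete, here is a minimal conceptual sketch in Python. It is not Umbraco or Examine code: the node ids, the titles, and the function names are all hypothetical, and the index is a toy word-to-ids map rather than a real Lucene index. It only illustrates why an index lookup does less work than visiting every node and filtering.

```python
from collections import defaultdict

# Hypothetical stand-in for a few content nodes: id -> Title property.
TITLES = {
    1001: "Test page",
    1002: "About us",
    1003: "Another test",
}


def build_index(titles):
    """Map each lowercased word to the set of node ids whose title contains it."""
    index = defaultdict(set)
    for node_id, title in titles.items():
        for word in title.lower().split():
            index[word].add(node_id)
    return index


def search_index(index, word):
    """Index-style search: a single dictionary lookup for the word."""
    return sorted(index.get(word.lower(), set()))


def scan_titles(titles, word):
    """Traversal-style search: visit every node and filter in memory."""
    return sorted(
        node_id
        for node_id, title in titles.items()
        if word.lower() in title.lower().split()
    )


INDEX = build_index(TITLES)
print(search_index(INDEX, "Test"))  # [1001, 1003]
print(scan_titles(TITLES, "Test"))  # [1001, 1003]
```

Both functions return the same ids, but the scan touches every node on every query, while the index pays that cost once, when content is indexed.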

    So personally I think:

    1. Use Examine when you are searching for (locating) content, because you get the capabilities of a proper search engine, Lucene.
    2. Once you have the ids from the search results, use UmbracoHelper to load the full published content for each id into a strongly typed model and work with the data.
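
The second step can be sketched roughly as follows. This is a conceptual Python model under stated assumptions, not the Umbraco API: `PUBLISHED_CACHE`, `Article`, `get_article`, and `load_articles` are hypothetical names, the dictionary stands in for the in-memory cache of published content, and the search result ids are hard-coded where a real search would supply them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Article:
    """Hypothetical strongly typed model for one published node."""
    id: int
    title: str
    body: str


# Stand-in for the in-memory cache of published content, keyed by node id.
PUBLISHED_CACHE = {
    1001: {"title": "Test page", "body": "Hello"},
    1003: {"title": "Another test", "body": "World"},
}


def get_article(node_id: int) -> Optional[Article]:
    """Direct lookup by id; returns None when the id is not in the cache."""
    raw = PUBLISHED_CACHE.get(node_id)
    if raw is None:
        return None
    return Article(id=node_id, title=raw["title"], body=raw["body"])


def load_articles(ids):
    """Turn ids from a search result into typed models, skipping cache misses."""
    articles = (get_article(node_id) for node_id in ids)
    return [a for a in articles if a is not None]


# Step 1 would be the search; its result is hard-coded here for illustration.
search_result_ids = [1001, 1002, 1003]
print([a.title for a in load_articles(search_result_ids)])
# ['Test page', 'Another test']
```

Id 1002 is skipped because it is not in the published cache, which is the kind of mismatch to expect if an index contains something the cache does not.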

    One thing @robert-foster mentioned in the comments, which I did not know, is that UmbracoHelper provides a Search method that is a wrapper around Examine, so use that if you are more familiar with that API.

    Lastly, if any of the statements above are wrong or not quite correct, please help me clarify them so that anyone reading this later does not get it wrong. Thanks, all.