I'm running into a strange issue with Hibernate4 caching in a Grails 2.5.0 application that is serving as a platform for data migrated from a legacy system. The migration involves direct database inserts and removals (while testing migration SQL) of database records. These operations are causing pageload errors in the system because cached data is different from the actual state of the database. Stacktrace errors on a particular failed page load indicate missing records whose IDs are not currently referenced by anything in the database via foreign key. For example, one page fails to render with the following error:
018-02-27 10:16:32,495 http-bio-8080-exec-8 | ERROR StackTrace | superAdmin | Full Stack Trace:
org.hibernate.UnresolvableObjectException: No row with the given identifier exists: [com.tlc.worx.company.CompanyQuestion#48466]
at org.hibernate.UnresolvableObjectException.throwIfNull(UnresolvableObjectException.java:68)
at org.hibernate.event.internal.DefaultRefreshEventListener.onRefresh(DefaultRefreshEventListener.java:179)
at org.hibernate.event.internal.DefaultRefreshEventListener.onRefresh(DefaultRefreshEventListener.java:61)
at org.hibernate.internal.SessionImpl.fireRefresh(SessionImpl.java:1121)
at org.hibernate.internal.SessionImpl.refresh(SessionImpl.java:1094)
at org.hibernate.internal.SessionImpl.refresh(SessionImpl.java:1089)
at org.codehaus.groovy.grails.orm.hibernate.GrailsHibernateTemplate$10.doInHibernate(GrailsHibernateTemplate.java:342)
at org.codehaus.groovy.grails.orm.hibernate.GrailsHibernateTemplate.doExecute(GrailsHibernateTemplate.java:188)
at org.codehaus.groovy.grails.orm.hibernate.GrailsHibernateTemplate.refresh(GrailsHibernateTemplate.java:339)
at org.codehaus.groovy.grails.orm.hibernate.GrailsHibernateTemplate.refresh(GrailsHibernateTemplate.java:335)
at org.codehaus.groovy.grails.orm.hibernate.HibernateGormInstanceApi.refresh(HibernateGormInstanceApi.groovy:150)
at com.tlc.worx.company.CompanyQuestion.refresh(CompanyQuestion.groovy)
at com.tlc.worx.company.CompanyQuestion$refresh.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:110)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:114)
at com.tlc.worx.checklist.CompanyQuestionController$_index_closure1$_closure2$_closure3.doCall(CompanyQuestionController.groovy:49)
at com.tlc.worx.checklist.CompanyQuestionController$_index_closure1$_closure2$_closure3.doCall(CompanyQuestionController.groovy)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
--
A search of the database for the record number being referenced reveals it only in a completely unrelated field and a Tally table:
(Note the page hitting the error does not have to do with the missing record itself, only an object that could be related to a CompanyQuestion but is currently NOT related to it).
I suspected a Hibernate caching issue, especially since this has coincided with the removal of records. Furthermore, migrating the same database to another environment for testing does not give rise to the same error on the new environment--corroborating my theory that this is related to environment-specific caching. But oddly, a Tomcat7 restart (app runs on Tomcat) within the original environment does not cause the problem to go away. Hibernate configuration is as follows:
hibernate {
cache.use_second_level_cache = true
cache.use_query_cache = false
cache.region.factory_class = 'org.hibernate.cache.ehcache.SingletonEhCacheRegionFactory' // Hibernate 4
singleSession = true // configure OSIV singleSession mode
flush.mode = 'auto' // pre-Hibernate4 default behavior was auto, so we'll stick with that for now. See https://grails.org/2.4.3+Release+Notes
}
It's the restart not causing the issue to disappear that has me scratching my head--is this normal Hibernate behavior to cache things for eternity even between Tomcat restarts? Am I missing the mark entirely here? My next step is running the application in the first environment with the second level cache disabled, but I would like to get community feedback as well that I am at least on the right track regarding my theory--it seems crazy. Any recommendations/feedback appreciated!
Wanted to close this up as I eventually found the issue.
Our application utilises ElasticSearch for compiling a set of data to query for searches. We have always reindexed ElasticSearch on app restart, and not run into this issue before, however I learned that the reindex operation does not always do exactly what we want to do and can actually index old data as new records, leading to either duplicates or a mixed bag of good/bad records.
The error in my question occurred when hitting one of the "bad", stale records from a previous reindex. Purging all ElasticSearch indices prior to reindexing resolved the issue.