I have a web service that touches 50+ database tables (the database is heavily normalized) in order to create the response. The service returns all voyages modified within a date range specified by the client.
For performance reasons I want to avoid lazy-loading, fetching as much of the graph as possible before mapping to the response type.
I have broken my query down into smaller parts, using Nhibernate Fetch + ToFuture to eager load the data I need:
var fetchQuery = Session.Query<Voyage>()
.Fetch(v => v.VoyageStatus)
.FetchMany(v => v.VoyageLocations)
.Where(v => voyageIds.Contains(v.VoyageID))
.ToFuture();
Session.Query<Ship>()
.FetchMany(s => s.ShipCsos)
.Where(s => shipIds.Contains(s.ShipID))
.ToFuture();
Session.Query<Ship>()
.Fetch(s => s.ShipFlagCode)
.ThenFetch(sf => sf.Country)
.Fetch(s => s.ShipType)
.Fetch(s => s.ShipStatus)
.Fetch(s => s.ShipSource)
.Fetch(s => s.ShipHullType)
.Fetch(s => s.ShipLengthType)
.Fetch(s => s.ShipBreadthType)
.Fetch(s => s.ShipSpeedType)
.Fetch(s => s.ShipPowerType)
.FetchMany(s => s.ShipAttributes)
.ThenFetch(sa => sa.ShipAttributeName)
.Where(s => shipIds.Contains(s.ShipID))
.ToFuture();
//[Lots of similar Session.Query<X>...ToFuture() calls]
return fetchQuery.ToList();
Problem
I'm starting to hit the SQL Server parameter limit of 2100 when the date range reaches a certain span. I thought the parameter limit only applied to a single IN clause, but it apparently applies to a query as a whole; using Futures I end up with a single SQL query with one SELECT statement for each ToFuture call (each SELECT statement contains a moderately sized IN clause).
Is there a workaround for this? For instance, is there a way to send of smaller groups of Futures to stay within the parameter limit and still hydrate the entities?
I've tried doing a fetchQuery.ToList() call halfway through the Futures. This keeps the parameter limit exceptions at bay, but the entities are not hydrated properly according to Nhibernate Profiler (properties are lazy-loaded).
Any pointers would be much appreciated!
You may in fact have better keeping lazy-loading for performances reasons with NHibernate, even in your case.
(Wanting to switch to eager loading for performance reasons may be a sign of not knowing how to optimize lazy-loading with NHibernate. NHibernate can avoid the classical n+1 performance issue of lazy loads.)
(Even in your case.)
Lazy-loading with NHibernate can be extremely performing. It tends to keep a good balance between runtime performances and development performances. Efficient to execute, and efficient to develop and maintain.
Adjust the lazy loading batch-size
property on your entities and collections mappings.
(Linked reference give a detailed explanation of how it works.)
<class name="YourEntity" batch-size="20">
...
<set name="SomeChildren" batch-size="15" ...>
Configuring that causes NHibernate to not only load related entities/collections when they are accessed, but also to include in the loading query up to batch-size - 1
related entities/collections it has tracked in its session first level cache. Of course, adjust batch-size
values for matching your usual cases loading cardinalities.
This is a very powerful mechanism. It causes most of what would have been subsequent lazy-load calls to be already there, loaded from a single call to the DB, usable without additional round-trips to DB.
(Only in some extreme corner cases where the session is badly used, causing it to reference many entities unrelated to your current work and having pending lazy-loads, lazy-loading batching can be badly defeated. This occurs since in such a situation, it may initialize too many pending lazy-loads unrelated to your work.)
You may globally configure a default batch size for all lazy loads of collections and entities with the global configuration parameter default_batch_fetch_size
(to be put in hibernate.cfg.xml file, or to set through configuration.SetProperty(Environment.DefaultBatchFetchSize, ...)
).
With current NHibernate state, this incurs a memory cost, because NHibernate prepares the batching queries at session factory built. (An option has been ported from Hibernate for building those queries "on demand" rather than preparing them in advance, available since NHibernate 5.4. See #2959, use batch_fetch_style
option value Dynamic
if needed, or configuration.SetProperty(Environment.BatchFetchStyle, BatchFetchStyle.Dynamic.ToString())
.)
In contrast, eager-loading can quickly incur "bloat code" and additional work for fine tuning and maintaining the required eager loads for each case. And failing to maintain them optimized surely lead to worse performances than lazy-loading with NHibernate. Even optimized eager loading may cause a lot more data than required to be loaded.
EF up to its 6 version was doing that. (I have not checked its Core version.) Its eager-loading querying strategy was causing eager-loaded result-sets to contain duplicated data, as soon as the "root" entities set were having many references to the same eager-loaded entity children instances. (And all this though in my current knowledge state, I tend to consider EF to be more usable than NHibernate about eager-loading. But that is quite a long time I have not considered and studied eager-loading with NHibernate, its lazy-loading being way more efficient than EF's one.)
(To be fair, lazy-loading may also cause some amount of bloat code: if the entities are to be used after closing the session (which is not recommended: consider using DTO/viewmodels/whatever), pending lazy loads may cause failures. To avoid them, NHibernateUtil.Initialize
should be called on the entities lazy associations which are needed, before closing the session. And if you want to leverage async, you will also have to call the async version of Initialize
before accessing the associations.)
NHibernate features a build in support of second level caching. Second level caching allows caching data and sharing them among different NHibernate sessions.
With eager-loading, the second level cache can not be leveraged for loading your dependent entities from memory (in case you use a memory cache provider for second level cache). Second level cache is best exploited with lazy-loading.
It is a full featured data cache, handling invalidation of data automatically. (Provided you work with transactions. If deadlocks deter you from doing so, maybe should you consider to enable read committed snapshot
mode on SQL Server, but this is a bit off-topic. Without explicit transactions, the cache will be disabled as soon as you start updating entities in your application.)
You only need to enable it in global configuration (cache.provider_class
, cache.use_second_level_cache
), and declare in your mapping what is cacheable (on entities and/or entity collections, with <cache usage="..." />
tag). Use cache regions for setting expiry. You may even cache queries (cache.use_query_cache
, and specifying on queries if they are cacheable). See here for an example.
Of course, for your case, if your data is not eligible to caching, this feature is not useful. (It may be the case if other processes do update your data, while you do not wish to use and configure the SysCache2 provider which can get notified by sql server of any data changes.)
A well accepted solution to your trouble implies quite more work. Ideally your front application should work with a de-normalized copy of your data, easy and efficient to query, while your back-office keeps a normalized database.