Entity Framework: is there any performance difference between querying in the repository versus the service layer?
Does Entity Framework have a query optimizer like SQL Server's? Does it know that all three of the following pseudo-code cases amount to a single DB call, or does it first actually fetch the rows matching someFilterConditionA and then apply the second condition?
CASE A)
clients = clients.Where(someFilterConditionA);
clients = clients.Where(someFilterConditionB);
return clients;
CASE B)
clients = clients.Where(someFilterConditionA).Where(someFilterConditionB);
return clients;
CASE C)
clients = clients.Where(someFilterConditionA && someFilterConditionB);
return clients;
EF does defer and compose queries, so there should be no difference at all between your three cases: each Where call only adds to an expression tree, and nothing is sent to the database until the query is actually executed. You can easily run SQL Profiler to inspect any query that EF sends through to the DB.
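To make the deferred composition concrete, here is a minimal sketch using LINQ-to-Objects in place of a real DbContext (the data and names are illustrative, not from the original question). Chained Where calls and a single combined predicate produce the same composed query and the same rows:

```csharp
using System;
using System.Linq;

// IQueryable composes an expression tree; nothing executes until the
// query is enumerated (ToList, Count, foreach, etc.).
var clients = new[]
{
    new { Name = "Ann", Active = true,  Region = "EU" },
    new { Name = "Bob", Active = true,  Region = "US" },
    new { Name = "Cat", Active = false, Region = "EU" },
}.AsQueryable();

// CASE A/B: two chained Where calls -- still one composed query.
var chained = clients.Where(c => c.Active).Where(c => c.Region == "EU");

// CASE C: one Where with a combined predicate.
var combined = clients.Where(c => c.Active && c.Region == "EU");

// Both yield the same rows once executed.
Console.WriteLine(string.Join(",", chained.Select(c => c.Name)));  // Ann
Console.WriteLine(string.Join(",", combined.Select(c => c.Name))); // Ann
```

Against a real DbContext the same composition happens, except execution produces a single SQL statement with both conditions in the WHERE clause.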
As far as execution goes: the cost is incurred at execution, regardless of where that is triggered. That is commonly wherever the code calls .ToList() / .First() / .Any() / .Count() etc. The biggest performance mistake I see people make looks like:
// Inside the repository:
var results = dbContext.ParentEntities.Where(x => x.SomeCriteria == true).ToList();

// later, down the call stack...
var results = repository.GetParents(criteria);
if (results.Count > 0)
{
    // ... never actually uses results, or only does something like
    // results.FirstOrDefault(), etc.
}
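For contrast, a sketch of the cheaper alternative (again LINQ-to-Objects standing in for an EF IQueryable; the entity shape is made up): checking for existence with Any() instead of materializing a list just to read its Count.

```csharp
using System;
using System.Linq;

var parents = Enumerable.Range(1, 1_000_000)
    .Select(i => new { Id = i, SomeCriteria = i % 2 == 0 })
    .AsQueryable();

// Wasteful: pulls every matching row into memory just to check the count.
// var anyMatch = parents.Where(x => x.SomeCriteria).ToList().Count > 0;

// Better: against EF this translates to an EXISTS-style query;
// in memory it stops at the first match.
var anyMatch = parents.Where(x => x.SomeCriteria).Any();

Console.WriteLine(anyMatch); // True
```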
Then there are the sins of lazy loading, or of eager loading inappropriately: returning entire entity graphs where only a few fields are actually used.
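The usual fix for the entity-graph problem is projection. A minimal sketch (illustrative entity and field names, LINQ-to-Objects in place of a DbSet):

```csharp
using System;
using System.Linq;

var orders = new[]
{
    new { Id = 1, CustomerName = "Ann", Total = 100m, Notes = "large blob..." },
    new { Id = 2, CustomerName = "Bob", Total = 250m, Notes = "large blob..." },
}.AsQueryable();

// Select only what the caller needs. Against EF this becomes
// SELECT Id, CustomerName -- the Notes column and any related
// entities are never fetched.
var summaries = orders
    .Select(o => new { o.Id, o.CustomerName })
    .ToList();

Console.WriteLine(summaries.Count); // 2
```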
The pattern I follow is to use deferred execution with a repository pattern. What this means is that my repositories, with few exceptions, always return IQueryable<TEntity>, even from methods where only one entity is expected (e.g. GetById()). The repository manages low-level rules such as authorization, IsActive checks for soft-delete systems, and temporal considerations for historical data, but I let the business logic further refine the query as needed: adding criteria, defining which fields will be selected, and deciding how the data will be retrieved before executing the deferred query.
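A rough sketch of that shape, with LINQ-to-Objects standing in for a DbContext and a soft-delete rule as the example low-level concern (all names here are illustrative):

```csharp
using System;
using System.Linq;

var clients = new[]
{
    (Id: 1, Name: "Ann", IsActive: true,  Region: "EU"),
    (Id: 2, Name: "Bob", IsActive: false, Region: "EU"),
    (Id: 3, Name: "Cat", IsActive: true,  Region: "US"),
}.AsQueryable();

// "Repository" method: always returns IQueryable, with the
// soft-delete rule already applied.
IQueryable<(int Id, string Name, bool IsActive, string Region)> GetClients()
    => clients.Where(c => c.IsActive);

// Business logic refines the deferred query: extra criteria, the
// projection, and the execution call all live with the caller.
var euNames = GetClients()
    .Where(c => c.Region == "EU")
    .Select(c => c.Name)
    .ToList();

Console.WriteLine(string.Join(",", euNames)); // Ann
```

Note that the soft-deleted row (Bob) never reaches the caller, while the caller still controls filtering, projection, and when the query runs.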
The benefits of this approach are that my repository serves as an easy cut-off point for my unit tests: I just need to return things like lists of entities from a mocked repository to exercise my business logic. From a maintenance point of view, my repository is very thin: no need for lots of similar methods for different criteria or expected return types, or for passing vague expression trees into functions to relay filtering. From a performance point of view, giving the business logic control over how the data is filtered and consumed gives me a lot of flexibility and helps ensure that queries are easily optimized. My business logic can append additional criteria, define which fields will be selected, and decide how the data will be retrieved (Any(), Count(), First(), ToList(), etc.), all without touching the repository definition. The repository becomes a thin service class that facilitates testing.
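The testing cut-off looks roughly like this: because the business logic only depends on an IQueryable, a unit test can feed it a plain in-memory list via AsQueryable() instead of a database (the method and data below are illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A piece of "business logic" that accepts whatever the repository returns.
int CountActiveInRegion(
    IQueryable<(string Name, bool IsActive, string Region)> clients,
    string region)
    => clients.Count(c => c.IsActive && c.Region == region);

// "Mocked repository": just a list wrapped in AsQueryable().
var fake = new List<(string Name, bool IsActive, string Region)>
{
    ("Ann", true, "EU"),
    ("Bob", false, "EU"),
    ("Cat", true, "EU"),
}.AsQueryable();

Console.WriteLine(CountActiveInRegion(fake, "EU")); // 2
```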
The argument made against this pattern is that a repository with explicit, static queries makes a more deliberate target for index tuning. My counter is that index tuning should be based on how the system is actually used, not something to optimize for early. Static repositories and queries mean a lot of extra code to maintain and, in many cases, sub-optimal queries (i.e. returning IEnumerable<Entity>): either many methods that do essentially the same thing but return different data structures, or fewer methods returning entire entities where only a few details were actually needed. IMO: build the system, use the system, tune the system.