I use joins less and less these days, and rely on navigation properties as much as possible.
Here is a very basic schema:
Users:
Id int
Surname string
FirstName string
Enrolments:
Id int
UserId int (foreign key for Users.Id)
EnrolmentName string
StartDate datetime
EndDate datetime
A user may have 0, 1, or many enrolments related to them in the Enrolments
table.
Now in a query, I want to select all user rows, and also the EnrolmentName column from their first enrolment. I like to keep my queries as lean as possible, and only select what I need to from the database. I don't like returning entire entities if I don't have to.
Here is my query (where I select the relevant data directly into a view model).
IList<UserVm> rows = db.Users
.Select(
x => new UserVm
{
Id = x.Id,
Surname = x.Surname,
FirstName = x.FirstName,
FirstEnrolmentName = x.Enrolments.OrderBy(o => o.StartDate).FirstOrDefault().EnrolmentName
}
)
.ToList();
The issue I have is that it works, but I think it should fail where I encounter a User which has no enrolments. I would expect the following line to complain that the EnrolmentName
column could not be found on a null object.
FirstEnrolmentName = x.Enrolments.OrderBy(o => o.StartDate).FirstOrDefault().EnrolmentName
What actually happens is that it will leave the EnrolmentName
column as NULL where there are no Enrolment
records for that User.
I'd like to know:
Why this query works and doesn't cause an error for a student with 0 Enrolments
.
Is there a cleaner way of writing the query so that it's still only 1 hit to the database, and still only selects the required subset of columns, and not all of them.
You're right, you should select only the properties you actually plan to use, thus transporting as few data as possible to your local process.
LINQ knows two kinds of statements: the ones that compose the query and the ones that will execute the query. The composers are LINQ methods that return an IQueryable<...>
(or IEnumerable<...>
), the executors are the functions that do not return an IQueryable
, but a TResult
. Examples are ToList
, FirstOrDefault
, Any
, Max
,...
You should always make sure that the executor is the last one of your statement, unless you are absolutely sure that the statements that follow the executor won't limit the amount of data.
db.Users.Select(user => new UserVm
{
...
FirstEnrolMentName = user.Enrolments
.OrderBy(enrolment => enrolment.StartDate)
.FirstOrDefault()
.EnrolmentName,
});
I wonder if this would also work if you've got a User without any Enrolments
. My first guess would be that FirstOrDefault
would return null, and thus you would have an ArgumentNullException
when you want to get the EnrolmentName
Another problem is that you first select one complete Enrolment
, after which you throw away all Enrolment properties except the name. It would be better to select only the properties you plan to use and then use FirstOrDefault
:
var result = db.Users.Select(user => new UserVm
{
...
FirstEnrolMentName = user.Enrolments
.OrderBy(enrolment => enrolment.StartDate)
.Select(enrolment => enrolment.EnrolmentName)
.FirstOrDefault(),
});