Search code examples
c#entity-frameworklinq

Join tables that contain no relationship


I have the following database:

enter image description here

  1. Table Messages:

    • Responsible for storing messages sent do different email addresses.
  2. Table Emails:

  3. Table Contacts:

    • Used to store user's contacts. (stores mark, John, Tom, etc...)
  4. Table ContactEmails

    • Need this table because a contact may have multiple email addresses. (Stores the ids found in table Emails)

Anyways here is the question: I want to create a query that will select all the messages that are sent during specific dates. I want to include the contact name in the query if it exists. I have crated the following query but it is to slow:

Func<string, Contact> tryGetContact = (email)=>{
    var contactEmail = db.ContactEmails.FirstOrDefault(x=>x.IdEmail==email);
    if(contactEmail==null)
       return null;
    return contactEmail.Contact; // navigational property created by entity framework.
};

var query = from msg in db.Messages
            join email in db.Emails on msg.ToEmail equals email.Email
            where msg.Date < "some date" && msg.Date > "some other date" 
            select new
            {
                MessageSubject = msg.Subject,
                ToEmail = email.Email,
                Contact = tryGetContact(email.Email) // this slows down the query!
            };

In order to make my query execute faster do I need to store all the contacts in a dictionary and separate this query into 2 queries?

Storing all the contacts in a dictionary will make things much more efficient. But retriving all the contacts from the database where I do not need most of them makes me feel I am wasting resources.


Solution

  • It would be easier to determine what is slowing down the query if we could see the actual SQL generated by your Linq query. However, I guess that it might have something to do with your tryGetContact Func that is not sharing the same context as the main part of the query.

    So if I am right, every time you call tryGetContact(email.Email) a new complete query will be executed because of this line:

    var contactEmail = db.ContactEmails.FirstOrDefault(x=>x.IdEmail==email);
    

    In that case the db.ContactEmails is not part of the join in the SQL query so it is reexecuted each time.

    So what I would do is to add another join to include the ContactEmails on the Linq (and subsequent SQL) query. This should looks like that:

    var query = from msg in db.Messages
                join email in db.Emails on msg.ToEmail equals email.Email
                join contactEmail in db.ContactEmails on contactEmail.IdEmail equals email.Email
                where msg.Date < "some date" && msg.Date > "some other date" 
                select new
                {
                    MessageSubject = msg.Subject,
                    ToEmail = email.Email,
                    Contact = (contactEmail==null) ? contactEmail.Contact : null,
                };
    

    If this doesn't work you might want to execute the db.ContactEmails only once for all your tryGetContact calls and store the result in a Collection (or HashSet for better performance).