sql asp.net-mvc repository disconnected-environment

MVC 3/EF repository pattern and proper data access

Being rather new to MVC 3 and EF, I'm trying to understand the best architectural approach to developing an application for my company. The application will be a large-scale application that potentially handles hundreds of users at the same time, so I want to make sure I understand and am following proper procedures. So far, I've determined that a simple repository pattern (such as Controller -> Repository -> EF) approach is the best and easiest to implement, but I'm not sure if that is definitely the best way to do things. The application will basically return data that is shown to a user in a devexpress grid and they can modify this data/add to it etc.

I found this article and it is rather confusing for me at this time, so I'm wondering if there is any reason to attempt to work with a disconnected EF and why you would even want to do so: http://www.codeproject.com/Articles/81543/Finally-Entity-Framework-working-in-fully-disconne?msg=3717432#xx3717432xx

So to summarize my question(s):

Is the code below acceptable?
Should it work fine for a large-scale MVC application?
Is there a better way?
Will unnecessary connections to SQL remain open from EF? (SQL Profiler makes it look like it stays open a while even after the using statement has exited)
Is the disconnected framework idea a better one and why would you even want to do that? I don't believe we'll need to track data across tiers ...

Note: The repository implements IDisposable and has the dispose method listed below. It creates a new instance of the entity context in the repository constructor.

Example Usage:

Controller (LogOn using Custom Membership Provider):

if (MembershipService.ValidateUser(model.UserName, model.Password))
{
    User newUser = new User();                    

    using (AccountRepository repo = new AccountRepository())
    {
         newUser = repo.GetUser(model.UserName);
         ...
    }
}

Membership Provider ValidateUser:

public override bool ValidateUser(string username, string password)
{
    using (AccountRepository repo = new AccountRepository())
    {
        try
        {
            if (string.IsNullOrEmpty(password.Trim()) || string.IsNullOrEmpty(username.Trim()))
                return false;

            string hash = FormsAuthentication.HashPasswordForStoringInConfigFile(password.Trim(), "md5");

            bool exists = false;

            exists = repo.UserExists(username, hash);

            return exists;
        }catch{
            return false;
        }
    }
}

Account Repository Methods for GetUser & UserExists:

Get User:

public User GetUser(string userName)
    {
        try
        {
            return entities.Users.SingleOrDefault(user => user.UserName == userName);
        }
        catch (Exception Ex)
        {
            throw new Exception("An error occurred: " + Ex.Message);
        }           
    }

User Exists:

 public bool UserExists(string userName, string userPassword)
 {
        if (userName == "" || userPassword == "")
            throw new ArgumentException(InvalidUsernamePassword);

        try
        {
            bool exists = (entities.Users.SingleOrDefault(u => u.UserName == userName && u.Password == userPassword) != null);
            return exists;
        }
        catch (Exception Ex)
        {
            throw new Exception("An error occurred: " + Ex.Message);
        } 
    }

Repository Snippets (Constructor, Dispose etc):

    public class AccountRepository : IDisposable
    {
         private DbContext entities;

         public AccountRepository()
         {
            entities = new DbContext();
         }

         ...

         public void Dispose()
         {
            entities.Dispose();
         }
    }

Solution

What's acceptable is pretty subjective, but if you want to do proper data access I suggest you do NOT use the repository pattern, as it breaks down as your application gets more complex.

The biggest reason is minimizing database access. So for example look at your repository and notice the GetUser() method. Now take a step back from the code and think about how your application is going to be used. Now think about how often you are going to request data from the user table without any additional data. The answer is almost always going to be "rarely" unless you are creating a basic data entry application.

You say it your application will show a lot of grids. What data is in that Grid? I'm assuming (without knowing your application domain) that the grids will combine user data with other information that's relevant for that user. If that's the case, how do you do it with your repositories?

One way is to call on each repository's method individually, like so:

var user = userRepository.GetUser("KallDrexx");
var companies = companyRepository.GetCompaniesForUser(user.Id);

This now means you have 2 database calls for what really should be just one. As your screens get more and more complex, this will cause the number of database hits to increase and increase, and if your application gets significant traffic this will cause performance issues. The only real way to do this in the repository pattern is to add special methods to your repositories to do that specific query, like:

public class UserRepository
{
    public User GetUser(string userName)
    {
        // GetUser code          
    }

    public User GetUserWithCompanies(string userName)
    {
        // query code here
    }
}

So now what happens if you need users and say their contact data in one query. Now you have to add another method to your user repository. Now say you need to do another query that also returns the number of clients each company has, so you need to add yet another method (or add an optional parameter). Now say you want to add a query that returns all companies and what users they contain. Now you need a new query method but then comes the question of do you put that in the User repository or the Company repository? How do you keep track of which one it's in and make it simple to choose between GetUserWithCompany and GetCompanyWithUsers when you need it later?

Everything gets very complex from that point on, and it's those situations that have made me drop the repository pattern. What I do now for data access is I create individual query and command classes, each class represents 1 (and only 1) query or data update command to the database. Each query class returns a view model that only contains the data I need for one specific user usage scenario. There are other data access patterns that will work too (specification pattern, some good devs even say you should just do your data access in your controllers since EF is your data access layer).

The key to doing data access successfully is good planning. Do you know what your screens are going to look like? Do you know how users are going to use your system? Do you know all the data that is actually going to be on each screen? If the answer to any of these is no, then you need to take a step back and forget about the data layer, because the data layer is (or should be for a good application) determined based on how the application is actually going to be used, the UI and the screens should not be dependent on how the data layer was designed. If you don't take your UI needs and user usage scenarios into account when developing the data access, your application will not scale well and will not be performant. Sometimes that's not an issue if you don't plan on your site being big, but it never hurts to keep those things in mind.