Tags: load-balancing, sharding, high-availability

Sharding and load balancing: how does it work?


I think I'm confused about some concepts, so I'd like to ask for your help with this:

We have a big web app used by many users (companies), which is currently deployed on-premises at each user's site. We are now going SaaS, so we are making some adaptations to set the app up to behave that way.

To handle our users, we plan to work like this: we will have one database per user. We certainly need load balancing because we need many servers, so I proposed a "sharded" architecture. My idea is to have web servers that are each completely independent from one another, so all our users' data would be split across, say, 10 servers. When a user logs in, he would in fact be connecting to, for example, server 4. To maintain availability, each of those servers would actually be a mini-cluster of two or three servers whose databases are replicated among them. We use memcache in each "cluster". We could even have load balancing at this level too, but we don't think we need it because the data/users are already split.

Some questions:

  1. Is this sharding? Notice that each cluster serves a given group of users and there is no relationship between clusters. We don't have a main DB with federated DBs in each cluster; every cluster has the same DB structure, it's just that the data is split among servers.

  2. How do I redirect users when they arrive for the first time (not authenticated)? Isn't this where load balancing applies? But what if users' data is split among servers? I've been thinking that we'd have a "public/authentication cluster" here that would handle unauthenticated users, i.e. the "public" part of the site. Based on a very simple memcache DB, it would redirect users to the respective cluster where their data lives. If so...

  3. How do I redirect them? The only way I can see is to send them to something like explained here. It's just that I didn't want server123.mysite.com.

I think the "public/authentication cluster" is not well designed, because I'd have 2-3 servers merely serving the main site to all users (before authentication), while in the back I'd have 5-6 clusters, some of which might be idle. Or the opposite: a heavily loaded cluster while the public one sits idle, because its sole task is to show the main page and handle the redirect-to-login flow.
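The user-to-cluster lookup described above could be as simple as a key-value map consulted at login. A minimal sketch, where the cluster names are illustrative assumptions and a plain dict stands in for the "very simple memcache db":

```python
# Stand-in for the memcache-backed lookup: in production this map would
# live in memcached, but a plain dict keeps the sketch self-contained.
USER_CLUSTER = {
    "acme-corp": "cluster-4",   # hypothetical company identifiers
    "globex": "cluster-2",
}

DEFAULT_CLUSTER = "cluster-1"   # where newly provisioned users would land

def cluster_for_user(company_id):
    """Return the cluster holding this company's data, looked up at login."""
    return USER_CLUSTER.get(company_id, DEFAULT_CLUSTER)
```

The authentication cluster would call this after a successful login and issue a redirect (or set routing state) pointing the user at the returned cluster.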

If all this works,

  1. Is this structure OK? Please assume that each user is heavy (in fact we have more than just PHP running; there are also .NET and other services working, etc.). I don't think of it as overkill, just a structure to handle multiple users. Do you have other ideas?

Thanks for your help.


Solution

  • Basically I would say, yes, this is sharding.

    To avoid needing some "global" knowledge of which user lives on which cluster (the memcache DB you mentioned), you can use some form of (consistent) hashing.
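    A minimal sketch of consistent hashing, assuming hypothetical cluster names; each cluster is placed at many points on a hash ring, and a user is mapped to the first point at or after the hash of their ID:

    ```python
    import hashlib
    from bisect import bisect

    class ConsistentHashRing:
        """Map user IDs to clusters deterministically, with no lookup table."""

        def __init__(self, clusters, replicas=100):
            # Each cluster gets `replicas` points on the ring for even spread.
            self._ring = []
            for cluster in clusters:
                for i in range(replicas):
                    self._ring.append((self._hash("%s:%d" % (cluster, i)), cluster))
            self._ring.sort()
            self._points = [p for p, _ in self._ring]

        @staticmethod
        def _hash(key):
            return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

        def cluster_for(self, user_id):
            # First ring point clockwise from the user's hash (wraps around).
            idx = bisect(self._points, self._hash(user_id)) % len(self._ring)
            return self._ring[idx][1]

    ring = ConsistentHashRing(["cluster-1", "cluster-2", "cluster-3"])
    ```

    The advantage over a plain `hash(user) % n` scheme is that adding or removing a cluster only remaps a small fraction of users instead of nearly all of them.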

    To redirect users to the right cluster without requiring separate hostnames, you can send the client a cookie after successful authentication that contains the identifier of the user's cluster. The load balancer(s) can evaluate that cookie and forward all further requests to the right cluster. This is common practice for "session persistence" in load balancing.
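    A sketch of the routing decision such a front-end balancer could make; the cookie name, cluster IDs, and backend addresses are all assumptions for illustration:

    ```python
    # Hypothetical cluster map: cluster ID (stored in the cookie at login)
    # to the backend hosts of that mini-cluster.
    CLUSTERS = {
        "c1": ["10.0.1.10", "10.0.1.11"],
        "c2": ["10.0.2.10", "10.0.2.11"],
    }
    PUBLIC_CLUSTER = "c1"  # unauthenticated traffic goes here

    def backends_for(cookies):
        """Pick the backend hosts for a request based on its cookies."""
        cluster_id = cookies.get("app_cluster", PUBLIC_CLUSTER)
        # Fall back to the public cluster if the cookie is stale/unknown.
        return CLUSTERS.get(cluster_id, CLUSTERS[PUBLIC_CLUSTER])
    ```

    In practice you would not write this yourself: HAProxy and nginx both support routing on a cookie value out of the box, so the application only has to set the cookie once at login.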

    I think the term load balancing only applies if there really is more than one (active/active) server that is a valid candidate to serve the requests of a given user.