I'm relatively new to Java EE and have already began to hear about the many different types of systems that can be clustered:
Wikipedia defines "clustering" as:
A computer cluster consists of a set of loosely connected computers that work together so that in many respects they can be viewed as a single system.
I'm wondering how clustering works for each of these "cluster types/methods" (mentioned above) and how they relate to one another.
For instance, if one could benefit from having a clustered application, he/she would probably put them on a clustered app server and then throw a cluster manager into the mix (again, like Terracotta).
But because the phrase "clustering" seems to be used in vague/ambiguous ways, I'm not seeing how each of these ties into the others ones, or if they even do. Thanks in advance to any brave StackOverflowers out there who can help me make sense of this interwoven terminology!
To me, clustering implies a number of qualities to a system but it boils down to fault tolerance -- server, networking, and data persistence. There are both loosely and tightly coupled systems and all flavors in between. Tightly coupled systems have the clustering performed at level close to the hardware. Many of the old clustering systems were more tightly coupled with the applications often not recognizing that they were clustered.
Loosely coupled systems are the norm these days with a large degree of the fault tolerance accomplished at a software level entirely. Systems in the cluster only share network connectivity to be able to accomplish fault tolerance. Usually there are some specialized load balancers which route requests to the various cluster servers using specialized hardware (sometimes just software) to accomplish this.
All of the examples you mentioned have some sort of "clustering". It is going to take a very long answer to describe the details about how each of the architectures accomplish this. For me, the differences are what comes "for free" when you use the architecture, and how much work you will have to do to get it to work optimally.
How you mix and match the solutions you've mentioned depends on what your architecture looks like and your requirements. You can have a Terracotta store for local high speed persistence and the cloud for the rest. You can use Glassfish as your application server and utilize Terracotta as your persistence layer.
Here are my thoughts about the technologies you listed:
Cloud applications are the easiest to work with obviously. Your only job from an architecture standpoint is to pick a good cluster provider. Certainly Amazon and Google will do it "right" in terms of fault tolerance and data integrity. There are many other players that probably do it "good enough" and are cheaper. You program to their APIs which come with their own set of limitations and expenses. One problem with cloud applications is that it most likely will be very hard to switch to a new one. Again, you might have some [large] portion of your application running on cloud servers and have some local systems for your higher latency requirements. The trend is to put most production functions in the cloud or at least start that way until you get too big or need some services they can't provide.
These 3 systems provide their own clustering capabilities. They may require you to do a lot of machine and service layer configurations to get the system running well in a production environment. I hear good things about Terracotta which is a distributed persistence layer. I've used Jgroups a lot which is under Jboss and it can be tricky to get running right but Jboss may also have some good default configurations/documentation. Oracle is most likely going to be hardest to cluster right. DBA's make a lot of money tweaking Oracle configurations.
These are the most amorphous to define in terms of clustering. Some VMs are considered "clustered" in that they share networking hardware and power backplanes but are really not clusters when compared to cloud computing certainly. As mentioned, there are some clustered hardware solutions that are very custom and will require a lot of specific domain knowledge to get running well.
I have very little experience with application servers such as Tomcat and Glassfish. We have our own clustering software on top of Jgroups and run Jetty entirely. Application servers are not, in themselves, "clustered" but packages such as Jboss and Terracotta run on top of them to provide clustering and they have internal projects which have clustering software written for them.
Hope some of this helps.