virtualization openstack openstack-nova openstack-neutron cloud-platform

OpenStack API Implementations

I have spent the last 6 hours reading through buzzword-riddled, lofty, high-level documents/blogs/articles/slideshares, trying to wrap my head around what OpenStack is, exactly. I understand that:

OpenStack is a free and open-source cloud computing software platform. Users primarily deploy it as an infrastructure as a service (IaaS) solution.

But again, that's a very lofty, high-level, gloss-over-the-details summary that doesn't really have meaning to me as an engineer.

I think I get the basic concept, but would like to bounce my understanding off of SO, and additionally I am having a tough time seeing the "forest through the trees" on the subject of OpenStack's componentry.

My understanding is that OpenStack:

Installs as an executable application on 1+ virtual machines (guest VMs); and
Somehow, all instances of your OpenStack cluster know about each other (that is, all instances running on all VMs you just installed them on) and form a collective pool of resources; and
Each OpenStack instance (again, running inside its own VM) houses the dashboard app ("Horizon") as well as 10 or so other components/modules (Nova, Cinder, Glance, etc.); and
Nova, is the OpenStack component/module that CRUDs VMs/nodes for your tenants, is somehow capable of turning the guest VM that it is running inside of into its own hypervisor, and spin up 1+ VMs inside of it (hence you have a VM inside of a VM) for any particular tenant

So please, if anything I have stated about OpenStack so far is incorrect, please begin by correcting me!

Assuming I am more or less correct, my understanding of the various OpenStack components is that they are really just APIs and require the open source community to provide concrete implementations:

Nova (VM manager)
Keystone (auth provider)
Neutron (networking manager)
Cinder (block storage manager)
etc...

Above, I believe all components are APIs. But these APIs have to have implementations that make sense for the OpenStack deployer/maintainer. So I would imagine that there are, say, multiple Neutron API providers, multipe Nova API providers, etc. However, after reviewing all of the official documentation this morning, I can find no such providers for these APIs. This leaves a sick feeling in my stomach like I am fundamentally mis-understanding OpenStack's componentry. Can someone help connect the dots for me?

Solution

Not quite.

Installs as an executable application on 1+ virtual machines (guest VMs); and

OpenStack isn't a single executable, there are many different modules, some required and some optional. You can install OpenStack on a VM (see DevStack, a distro that is friendly to VMs) but that is not the intended usage for production, you would only do that for testing or evaluation purposes.

When you are doing it for real, you install OpenStack on a cluster of physical machines. The OpenStack Install Guide recommends the following minimal structure for your cloud:

A controller node, running the core services
A network node, running the networking service
One or more compute nodes, where instances are created
Zero or more object and/or block storage nodes

But note that this is a minimal structure. For a more robust install you would have more than one controller and network nodes.

Somehow, all instances of your OpenStack cluster know about each other (that is, all instances running on all VMs you just installed them on) and form a collective pool of resources;

The OpenStack nodes (be them VMs or physical machines, it does not make a difference at this point) talk among themselves. Through configuration they all know how to reach the others.

Each OpenStack instance (again, running inside its own VM) houses the dashboard app ("Horizon") as well as 10 or so other components/modules (Nova, Cinder, Glance, etc.); and

No. In OpenStack jargon, the term "instance" is associated with the virtual machines that are created in the compute nodes. Here you meant "controller node", which does include the core services and the dashboard. And once again, these do not necessarily run on VMs.

Nova, is the OpenStack component/module that CRUDs VMs/nodes for your tenants, is somehow capable of turning the guest VM that it is running inside of into its own hypervisor, and spin up 1+ VMs inside of it (hence you have a VM inside of a VM) for any particular tenant

I think this is easier to understand if you forget about the "guest VM". In a production environment OpenStack would be installed on physical machines. The compute nodes are beefy machines that can host many VMs. The nova-compute service runs on these nodes and interfaces to a hypervisor, such as KVM, to allocate virtual machines, which OpenStack calls "instances".

If your compute nodes are hosted on VMs instead of on physical machines things work pretty much in the same way. In this setup typically the hypervisor is QEMU, which can be installed in a VM, and then can create VMs inside the VM just fine, though there is a big performance hit when compared to running the compute nodes on physical hardware.

Assuming I am more or less correct, my understanding of the various OpenStack components is that they are really just APIs

No. These services expose themselves as APIs, but that is not all they are. The APIs are also implemented.

and require the open source community to provide concrete implementations

Most services need to interface with an external service. Nova needs to talk to a hypervisor, neutron to interfaces, bridges, gateways, etc., cinder and swift to storage providers, and so on. This is really a small part of what an OpenStack service does, there is a lot more built on top that is independent of the low level external service. The OpenStack services include the support for the most common external services, and of course anybody who is interested can implement more of these.

Above, I believe all components are APIs. But these APIs have to have implementations that make sense for the OpenStack deployer/maintainer. So I would imagine that there are, say, multiple Neutron API providers, multipe Nova API providers, etc.

No. There is one Nova API implementation, and one Neutron API implementation. Based on configuration you tell each of these services how to interface with lower level services such as the hypervisor the networking stack, etc. And as I said above, support for a range of these is already implemented, so if you are using with ordinary x86 hardware for your nodes, then you should be fine.