I have been comparing Java PaaSes carefully and am really starting to like CloudBees. I only have one big concern with them, and that is their SLA/uptime.
After scouring all of their documentation, I can find only one paper on SLAs, which states:
If you are using the CloudBees PaaS without taking advantage of high availability options, then CloudBees can only offer uptime that approaches the base uptime SLA of the infrastructure cloud provider.
As the same paper also mentions, Amazon seems to offer 99.95% uptime, and I know that CloudBees itself runs largely on AWS/EC2 instances.
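To put that figure in perspective, here is a rough sketch of how much downtime 99.95% allows. It assumes the percentage is measured over a 30-day month, which the paper does not state, and the class and method names are purely illustrative, not part of any CloudBees or AWS API.

```java
import java.time.Duration;

// Illustrative helper only; not part of any CloudBees or AWS API.
final class DowntimeBudget {

    /** Returns the maximum downtime allowed in {@code period} at the given uptime percentage. */
    static Duration allowedDowntime(double uptimePercent, Duration period) {
        if (uptimePercent < 0.0 || uptimePercent > 100.0) {
            throw new IllegalArgumentException("uptimePercent must be between 0 and 100");
        }
        double downFraction = 1.0 - uptimePercent / 100.0;
        // Round to the nearest millisecond to absorb floating-point error.
        return Duration.ofMillis(Math.round(period.toMillis() * downFraction));
    }

    public static void main(String[] args) {
        // Assumption: the uptime figure is measured over a 30-day month.
        Duration month = Duration.ofDays(30);
        System.out.println(allowedDowntime(99.95, month)); // prints PT21M36S
    }
}
```

So under that assumption, a single-instance app inheriting the base figure could be down for roughly 21.6 minutes a month and still be within it.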
So this spawns a number of closely-related SLA questions:
CapabilitiesService where, before you go to use their Email API or Caching API, you first check with the master CapabilitiesService to make sure those services are operating. I'd like to do the same with CloudBees, but it seems like I'd need to build it myself. That's fine, but I'm not sure if CloudBees even offers a mechanism (API call, etc.) to determine whether a particular service partner is online or offline.
Thanks in advance!
CloudBees does not offer an SLA on availability, nor remedies in the form of credits if a particular level of uptime is not met in a month. This is, AFAIK, common for other offerings on AWS (e.g., Heroku). CloudBees does offer standard response-time based SLAs via a support agreement. As discussed in the white paper you reference, we also employ practices for our own usage of AWS and external providers that have helped to isolate our users from some specific Amazon issues.
The availability features you can make use of include:
The main point of the comment about using "high availability options" was to warn people that simply deploying an app on CloudBees does not make it highly available. If an EC2 instance fails underneath your single-instance deployment, your users will experience downtime while our internal machinery redeploys to a working instance, whereas a multi-instance deployment will likely only experience slower responses until a new instance is deployed. The same applies to single-instance databases without standbys or replicas across AZs. While this is just stating the blindingly obvious for a lot of people, you might be surprised how many people just assume some magic is happening.
Good point on the CapabilitiesService! We have some ideas kicking around in this area, but you would have to do something like this on your own for now.
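In the meantime, a do-it-yourself version could look something like the minimal sketch below. It assumes each partner service can be checked with some cheap probe of your own (a ping, a trivial read, and so on). Every name here (`ServiceCapabilities`, `register`, `status`) is hypothetical and not a CloudBees API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BooleanSupplier;

// Hypothetical home-grown registry; none of these names come from CloudBees.
final class ServiceCapabilities {

    enum Status { ENABLED, DISABLED, UNKNOWN }

    // Maps a service name to a probe that returns true when the service is usable.
    private final Map<String, BooleanSupplier> probes = new ConcurrentHashMap<>();

    /** Registers (or replaces) the probe for the named service. */
    void register(String service, BooleanSupplier probe) {
        probes.put(service, probe);
    }

    /** Runs the probe for a service; a probe that throws counts as DISABLED. */
    Status status(String service) {
        BooleanSupplier probe = probes.get(service);
        if (probe == null) {
            return Status.UNKNOWN; // nothing registered under this name
        }
        try {
            return probe.getAsBoolean() ? Status.ENABLED : Status.DISABLED;
        } catch (RuntimeException e) {
            return Status.DISABLED;
        }
    }

    public static void main(String[] args) {
        ServiceCapabilities caps = new ServiceCapabilities();
        // Stand-in probes; real ones would call the partner service.
        caps.register("email", () -> true);
        caps.register("cache", () -> { throw new IllegalStateException("timed out"); });

        if (caps.status("email") == Status.ENABLED) {
            System.out.println("email is available, safe to send");
        }
        System.out.println("cache: " + caps.status("cache"));
    }
}
```

Your application code would then check `status(...)` before calling a partner service and degrade gracefully when it is not `ENABLED`. A real version would probably cache probe results for a short time and apply timeouts, so that the check itself does not become a source of latency.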