Anatomy of a Modern Production Stack

Author: Joe Beda
A note on the term “modern”: This is my view, based on experiences at Google, for a stack that delivers what I’d want for a major production system. You can do this without containers, but I think it is hard to meet my criteria that way. The full stack here probably isn’t necessary for small applications and, as of today, is way too hard to get up and running. The qualities that I’d look for in a “modern” stack:
- Self healing and self managing. If a machine fails, I don’t want to have to think about it. The system should just work (see the reconciliation-loop sketch after this list).
- Supports microservices. The idea of breaking your app into smaller components (regardless of the name) can help you scale your engineering organization by keeping the dev team for each microservice small enough that a two-pizza team can own it.
- Efficient. I want a stack that doesn’t require a lot of hand holding to make sure I’m not wasting a ton of resources.
- Debuggable. Complex applications can be hard to debug. Having good strategies for application specific monitoring and log collection/aggregation can really help to provide insights into the stack.
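To make the “self healing” criterion concrete, here is a minimal sketch (in Go, purely illustrative; `listHealthy` and `startReplica` are invented stand-ins) of the reconciliation-loop pattern these stacks are built around: continuously compare observed state to desired state and close the gap, rather than reacting to individual failures.
```go
// A toy desired-state vs. actual-state reconciliation loop.
// All names here are illustrative; real orchestrators are far richer.
package main

import (
	"fmt"
	"time"
)

// desired is how many replicas of a workload we want running.
const desired = 3

// listHealthy stands in for asking the cluster which replicas are alive.
func listHealthy() []string {
	// In a real system this would query the container engine or node agents.
	return []string{"replica-1", "replica-2"}
}

// startReplica stands in for scheduling a new container somewhere.
func startReplica(n int) {
	fmt.Printf("starting replica-%d\n", n)
}

func main() {
	for {
		healthy := listHealthy()
		// The loop never asks "what failed?"; it only compares the
		// observed state to the desired state and closes the gap.
		for i := len(healthy); i < desired; i++ {
			startReplica(i + 1)
		}
		time.Sleep(10 * time.Second)
	}
}
```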
So, with that, here is a brain dump of the parts that make up a “modern” stack:
- Production Host OS. This is a simplified and manageable Linux distribution. Usually it is just enough to get a container engine up and running.
- Examples include CoreOS, Red Hat Project Atomic, Ubuntu Snappy, and Rancher OS.
- Bootstrapping system. Assuming you are starting with a generic VM image or bare metal hardware, something has to be able to bootstrap those machines and get them running as productive members of the cluster. This becomes very important as you are dealing with lots of machines that come and go as hardware fails.
- Cloud Foundry BOSH was created to do this for Cloud Foundry but is seeing new life as an independent product.
- The standard config tools (Puppet, Chef, Ansible, Salt) can serve this role.
- CoreOS Fleet is a lightweight clustering system that can also be used to bootstrap more comprehensive solutions.
- Container Engine. This is the system for setting up and managing containers. It is the primary management agent on the node.
- Examples include Docker Engine, CoreOS rkt, and LXC and systemd-nspawn.
- Some of these systems are more amenable to being directly controlled remotely than others.
- The Open Container Initiative is working to standardize the input into these systems – basically the root filesystem for the container along with some common parameters in a JSON file (a rough sketch of that shape follows this block).
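To give a feel for that input, here is a hedged sketch of the kind of config the OCI is circling around: a root filesystem path plus common process parameters, serialized as JSON. The field names are illustrative of the shape, not a faithful copy of the still-evolving specification.
```go
// Emit a JSON config in the spirit of an OCI runtime bundle.
// Field names illustrate the idea (rootfs + common parameters) and are not
// guaranteed to match the current spec.
package main

import (
	"encoding/json"
	"fmt"
)

type Process struct {
	Args []string `json:"args"`
	Env  []string `json:"env"`
	Cwd  string   `json:"cwd"`
}

type Root struct {
	Path     string `json:"path"`
	Readonly bool   `json:"readonly"`
}

type Config struct {
	OCIVersion string  `json:"ociVersion"`
	Root       Root    `json:"root"`
	Process    Process `json:"process"`
	Hostname   string  `json:"hostname"`
}

func main() {
	cfg := Config{
		OCIVersion: "1.0.0",
		Root:       Root{Path: "rootfs", Readonly: true},
		Process: Process{
			Args: []string{"/bin/myapp", "--port=8080"},
			Env:  []string{"PATH=/bin"},
			Cwd:  "/",
		},
		Hostname: "example",
	}
	out, err := json.MarshalIndent(cfg, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```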
- Container Image Packaging and Distribution. A Container Image is a named and cloneable chroot that can be used to create container instances. It is pretty much an efficient way to capture, name and distribute the set of files that make up a container at runtime (see the simplified sketch after this block).
- Both Docker and CoreOS rkt solve this problem. It is built into the Docker Engine but is broken out for rkt as a separate toolset called acbuild.
- Inside of Google this was done slightly differently with a file package distribution system called MPM.
- Personally, I’m hoping that we can define a widely adopted spec for this, hopefully as part of the OCI.
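As a simplified illustration of “capture, name and distribute the set of files,” the sketch below tars up a directory and names the result by its SHA-256 digest, the content-addressing idea most image formats build on. The `./rootfs` path is a placeholder, and real tools layer metadata, compression, and layering on top of this.
```go
// Capture a directory as a tarball and name it by digest: a simplified
// version of what image packaging tools do.
package main

import (
	"archive/tar"
	"crypto/sha256"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

func main() {
	rootfs := "./rootfs" // placeholder: the files that make up the container

	out, err := os.Create("image.tar")
	if err != nil {
		panic(err)
	}
	defer out.Close()

	h := sha256.New()
	// Write the tar stream to the file and the hash at the same time.
	tw := tar.NewWriter(io.MultiWriter(out, h))

	err = filepath.Walk(rootfs, func(path string, info os.FileInfo, err error) error {
		if err != nil || !info.Mode().IsRegular() {
			return err // skip directories, symlinks, etc. in this sketch
		}
		hdr, err := tar.FileInfoHeader(info, "")
		if err != nil {
			return err
		}
		hdr.Name, _ = filepath.Rel(rootfs, path)
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		_, err = io.Copy(tw, f)
		return err
	})
	if err != nil {
		panic(err)
	}
	if err := tw.Close(); err != nil {
		panic(err)
	}

	// The digest is the image's content-addressed name.
	fmt.Printf("sha256:%x\n", h.Sum(nil))
}
```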
- Container Image Registry/Repository. This is a central place to store and load Container Images.
- Hosted versions of this include the Docker Hub, Quay.io (owned by CoreOS), and Google Container Registry.
- Docker also has an open source registry called Docker Distribution.
- Personally, I’m hoping that the state of the art will evolve past centralized solutions with specialized APIs toward solutions that are simpler (working over regular HTTP) and more transport agnostic, so that protocols like BitTorrent can be used to distribute images (see the sketch after this block).
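To show why plain HTTP plus content addressing is attractive, here is a sketch of a client that pulls an image blob from any dumb HTTP server and verifies it against the digest embedded in its name. The URL and digest below are placeholders; real registries add a protocol on top of this.
```go
// Pull a content-addressed blob over plain HTTP and verify it on arrival.
// The URL and digest are placeholders, not a real registry endpoint.
package main

import (
	"crypto/sha256"
	"fmt"
	"io"
	"net/http"
	"os"
)

func main() {
	url := "http://images.example.com/blobs/sha256/abc123" // placeholder
	want := "abc123"                                       // placeholder digest

	resp, err := http.Get(url)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	f, err := os.Create("layer.tar")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	h := sha256.New()
	// Because the name *is* the hash, any transport (HTTP, BitTorrent, a
	// USB stick) works as long as we verify the bytes on arrival.
	if _, err := io.Copy(io.MultiWriter(f, h), resp.Body); err != nil {
		panic(err)
	}
	got := fmt.Sprintf("%x", h.Sum(nil))
	if got != want {
		panic("digest mismatch: " + got)
	}
	fmt.Println("verified", got)
}
```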
- Container Distribution. This is the system for structuring what is running inside of a container. Many people don’t talk about this as a separate thing but instead reuse OS distributions such as Ubuntu, Debian, or CentOS.
- Many folks are working to build minimal container distributions by either using distributions based in the embedded world (BusyBox or Alpine) or by building static binaries and not needing anything else (see the static binary sketch after this block).
- Personally, I’d love to see the idea of a Container Distribution be further developed and take advantage of features only available in the container world. I wrote a blog post on this.
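The “static binary and nothing else” approach is easy to demonstrate: a Go program like the sketch below can be built with CGO_ENABLED=0 into a single self-contained executable and dropped into an otherwise empty image, with no OS distribution underneath it at all.
```go
// A service that needs no OS distribution once compiled statically
// (e.g. CGO_ENABLED=0 go build), so the container image can be just this
// one file.
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "hello from a distribution-less container")
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```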
- Container Orchestration System. Once you have containers running on a single host, you need to get them running across multiple hosts.
- This is a super hot area of interest with lots of innovation.
- Open source deployable examples include Kubernetes, Docker Swarm, and Apache Mesos.
- Hosted systems include Google Container Engine (GKE) (based on Kubernetes), Mesosphere DCOS, and Amazon EC2 Container Service (ECS).
- Orchestration Config. Many of the orchestration systems have small granular objects. Creating and parameterizing these by hand can be difficult. In this context, an orchestration config system can take a higher level description and compile it down to the nuts and bolts that the orchestration system works with (a toy sketch follows this block).
- The Google solutions to this problem have never been made public (to my knowledge).
- AWS CloudFormation and Google Cloud Deployment Manager play this role for their respective cloud ecosystems (only).
- Hashicorp Terraform and Flabbergast look like they could be applied to container orchestration systems but haven’t been yet.
- Docker Compose is a start to a more comprehensive config system.
- The Kubernetes team (Brian Grant especially) has lots of ideas and plans for this area. There is a Kubernetes SIG being formed.
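As a toy illustration of “compiling a higher level description down to granular objects” (not how any of the tools above actually work), the sketch below expands a one-struct service description into per-replica container specs of the sort an orchestrator consumes. All of the type names are invented.
```go
// Expand a high-level service description into the granular per-replica
// objects an orchestrator consumes. Types here are invented for illustration.
package main

import (
	"encoding/json"
	"fmt"
)

// Service is the terse thing a human writes.
type Service struct {
	Name     string
	Image    string
	Replicas int
	Port     int
}

// ContainerSpec is the verbose thing the orchestrator consumes.
type ContainerSpec struct {
	Name   string            `json:"name"`
	Image  string            `json:"image"`
	Ports  []int             `json:"ports"`
	Labels map[string]string `json:"labels"`
}

func compile(s Service) []ContainerSpec {
	specs := make([]ContainerSpec, 0, s.Replicas)
	for i := 0; i < s.Replicas; i++ {
		specs = append(specs, ContainerSpec{
			Name:   fmt.Sprintf("%s-%d", s.Name, i),
			Image:  s.Image,
			Ports:  []int{s.Port},
			Labels: map[string]string{"service": s.Name},
		})
	}
	return specs
}

func main() {
	svc := Service{Name: "web", Image: "example/web:1.2", Replicas: 3, Port: 8080}
	out, err := json.MarshalIndent(compile(svc), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```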
- Network Virtualization. While not strictly necessary, clustered container systems are much easier to use if each container has full presence on the cluster network. This has been referred to as “IP per Container”.
- Without a networking solution, orchestration systems must allocate and enforce port assignments, as ports on each host are a shared resource (see the sketch after this block).
- Examples here include CoreOS Flannel, Weave, Project Calico, and Docker libnetwork (not ready for production yet). I’ve also been pointed to OpenContrail but haven’t looked deeply.
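The port problem is easy to see in miniature. Without IP per container, something has to hand out host ports as a scarce shared resource, roughly as in the sketch below; the allocator and the port range are invented for illustration. With IP per container, every instance could simply listen on the same well-known port.
```go
// A toy host-port allocator, showing why schedulers must treat ports as a
// scarce per-host resource when containers share the host's IP address.
package main

import (
	"errors"
	"fmt"
)

type hostPorts struct {
	next, max int
	inUse     map[int]string
}

func newHostPorts() *hostPorts {
	return &hostPorts{next: 30000, max: 32767, inUse: map[int]string{}}
}

func (h *hostPorts) allocate(container string) (int, error) {
	for p := h.next; p <= h.max; p++ {
		if _, taken := h.inUse[p]; !taken {
			h.inUse[p] = container
			h.next = p + 1
			return p, nil
		}
	}
	return 0, errors.New("no host ports left")
}

func main() {
	ports := newHostPorts()
	for _, c := range []string{"web-0", "web-1", "web-2"} {
		p, err := ports.allocate(c)
		if err != nil {
			panic(err)
		}
		// Each container gets a different host port, which every client
		// and load balancer now has to be told about.
		fmt.Printf("%s -> hostport %d\n", c, p)
	}
}
```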
- Container Storage Systems. As users move past special “pet” hosts, storage becomes more difficult.
- I have more to say on this that I’ll probably put into a blog post at some point in the future.
- ClusterHQ Flocker deals with migrating data between hosts (among other things).
- I know there are other folks (someone pointed me at Blockbridge) that are working on software defined storage systems that can work well in this world.
- Discovery Service. Discovery is a fancy term for naming. Once you launch a bunch of containers, you need to figure out where they are so you can talk to them.
- DNS is often used as a solution here but can cause issues in highly dynamic environments due to aggressive caching. Java, in particular, is troublesome as it doesn’t honor DNS TTLs by default (see the re-resolving sketch after this block).
- Many people build on top of highly consistent stores (lock servers) for this. Examples include: Apache Zookeeper, CoreOS etcd, Hashicorp Consul.
- Kubernetes supports service definition and discovery (with a stable virtual IP with load balanced proxy).
- Weave has a built in DNS server that stores data locally so that TTLs can be minimal.
- Related is a system to configure a wider-facing load balancer to manage the interface between the cluster and the wider network.
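One low-tech way to live with DNS in a dynamic cluster is to re-resolve the service name right before each call instead of caching the first answer, as in the sketch below. The service name is a placeholder, and this is a pattern rather than a product recommendation.
```go
// Look the service up right before each call instead of caching the first
// answer, so endpoints that move are picked up quickly.
// "web.internal.example" is a placeholder name.
package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	for i := 0; i < 3; i++ {
		// Re-resolve every time; in a dynamic cluster the answer can change
		// between calls, which long-lived caches (the Java default) miss.
		addrs, err := net.LookupHost("web.internal.example")
		if err != nil {
			fmt.Println("lookup failed:", err)
		} else {
			fmt.Println("current endpoints:", addrs)
		}
		time.Sleep(2 * time.Second)
	}
}
```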
- Production Identity and Authentication. As clustered deployments grow, an identity system becomes necessary. When microservice A calls microservice B, microservice B needs some way to verify that it is actually microservice A calling. Note that this is for computer to computer communication within the cluster (see the mutual TLS sketch after this block).
- This is not a well understood component of the stack. I expect it to be an active area of development in the near future. Ideally the orchestration system would automatically configure the identity for each running container in a secure way.
- Related areas include secret storage and authorization.
- I’ve used the term “Authentity” to describe this area. Please use it as I’m hoping it’ll catch on.
- conjur.net is a commercial offering that can help out in this situation.
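One concrete mechanism for “B can verify that it is really A calling” is mutual TLS, where every workload holds a certificate issued by a cluster CA and the server demands one from each caller. The sketch below shows the server side of that handshake in Go; all file paths are placeholders, and nothing here is meant to describe how any product listed above works.
```go
// A server that requires callers to present a client certificate signed by
// the cluster CA, so "who is calling" is cryptographic rather than implied
// by a source IP. All file paths are placeholders.
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"net/http"
	"os"
)

func main() {
	// CA that issued identities to every workload in the cluster (placeholder path).
	caPEM, err := os.ReadFile("cluster-ca.pem")
	if err != nil {
		panic(err)
	}
	pool := x509.NewCertPool()
	pool.AppendCertsFromPEM(caPEM)

	srv := &http.Server{
		Addr: ":8443",
		TLSConfig: &tls.Config{
			ClientCAs:  pool,
			ClientAuth: tls.RequireAndVerifyClientCert, // reject anonymous callers
		},
		Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			// The verified client cert tells us which service is calling.
			caller := r.TLS.PeerCertificates[0].Subject.CommonName
			fmt.Fprintf(w, "hello, %s\n", caller)
		}),
	}
	// The server's own identity certificate and key (placeholder paths).
	panic(srv.ListenAndServeTLS("service-b.pem", "service-b-key.pem"))
}
```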
- Monitoring
Read full article: eightypercent.net/post/layers-in-the-stack