Elixir and Kubernetes: Starting Kubernetes

In the first post in this series, we took a look at Aristochat, the app we’ll use to explore developing and operating distributed systems with Elixir and Kubernetes. Today, we’ll start our Kubernetes cluster!

When it comes to starting up our Kubernetes cluster, we’ve got a lot of options. If you check out the Kubernetes setup guide, you’ll see steps for running things locally, running in the cloud, running on a single node, and running on several nodes.

Minikube is going to be our preferred way of running Kubernetes for this series. It’s easy to set up and gives us the shortest path to playing with things. Since it runs on a local virtual machine, we don’t have to worry about starting up a bunch of things manually or dealing with cloud providers. We can just concentrate on Kubernetes and what it does. If you want to follow along using a different environment (such as Google Cloud), feel free! Everything we do should be independent of the environment running Kubernetes.

To get Minikube spun up, follow the Minikube installation instructions. You’ll also install kubectl as part of the setup, which will let you interact with your cluster via the command line. Once you have both Minikube and kubectl installed, you can run the following commands to start your cluster and list its nodes:

minikube start
kubectl get nodes

So, what did we just do? The minikube start command is fairly clear: we started Minikube. It spun up a virtual machine and initialized the Kubernetes components inside it. kubectl is the command line client we use to interact with our cluster. We used it to run kubectl get nodes, which lists the nodes in our cluster. Since Minikube runs a single-node cluster, we should see only one node. kubectl can also tell us which pods are currently running, which events have recently occurred, and much more!

I mentioned that Minikube installed some Kubernetes components, but what are these components? Let’s take a look at the pieces that make up Kubernetes so we can know what’s running under the hood.

Kubernetes is a master-worker system. A master node manages the workloads running on any number of worker nodes in the cluster. The components running on the master node are collectively referred to as the Kubernetes control plane. When designing for high availability (for example, when running in a production environment), the control plane can be replicated across several master nodes. In that configuration there are multiple instances of each control plane component, but only one instance of the scheduler and of the controller manager is active as the leader at any time, while the API server instances can all serve requests.

Kubernetes stores its state in a data store called etcd. etcd is a distributed key-value store, similar to Consul or ZooKeeper. Its nodes stay in agreement with each other using the Raft consensus protocol. The Kubernetes documentation says that anyone with read/write access to the etcd cluster effectively has root access to the Kubernetes cluster. To limit etcd’s exposure, we want the API server to be the only component, Kubernetes or otherwise, that talks directly to the etcd cluster. The API server exposes the cluster state stored in etcd through a RESTful API. It validates incoming requests and handles the input and output of all state-changing operations.
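To make the "API server as the only door to etcd" idea a little more concrete, here is a minimal sketch in Python. It is a toy model rather than how Kubernetes is implemented: KeyValueStore, ApiServer, the /pods/ key layout, and the aristochat:latest image name are all made up for illustration.

```python
class KeyValueStore:
    """Stand-in for etcd: a plain in-memory key-value store (hypothetical)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class ApiServer:
    """Toy gateway: the only object that holds a reference to the store."""

    def __init__(self, store):
        self._store = store

    def create_pod(self, name, spec):
        # Validate the request before any state is changed.
        if not name:
            raise ValueError("pod needs a name")
        if "image" not in spec:
            raise ValueError("pod spec needs an image")
        key = f"/pods/{name}"  # made-up key layout, not the real etcd schema
        if self._store.get(key) is not None:
            raise ValueError(f"pod {name} already exists")
        self._store.put(key, spec)
        return key

    def get_pod(self, name):
        return self._store.get(f"/pods/{name}")


api = ApiServer(KeyValueStore())
api.create_pod("aristochat-1", {"image": "aristochat:latest"})
print(api.get_pod("aristochat-1"))
```

Every other component in this toy world would call api.create_pod or api.get_pod and never touch the store directly, which is the property we want from the real API server.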

Also part of the control plane is the controller manager. It runs several controllers, each with a specific function. The deployment controller creates and manages replica sets for deployments, and the replica set controller makes sure the number of pods specified in each replica set is actually running. You can check out the documentation for examples of others.
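The way those two controllers hand work to each other can be sketched as a pair of reconcile functions. This is a simplified illustration in Python, not the real controller code: the Cluster class, the "-rs" naming scheme, and the pod names are assumptions made up for the example.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Cluster:
    """In-memory stand-in for the state held behind the API server (hypothetical)."""
    deployments: dict = field(default_factory=dict)   # deployment name -> desired replicas
    replica_sets: dict = field(default_factory=dict)  # replica set name -> desired replicas
    pods: dict = field(default_factory=dict)          # pod name -> owning replica set


_ids = count(1)  # source of unique pod name suffixes


def deployment_controller(cluster):
    """Make sure each deployment has a replica set with the desired size."""
    for name, replicas in cluster.deployments.items():
        cluster.replica_sets[f"{name}-rs"] = replicas


def replica_set_controller(cluster):
    """Create or delete pods until each replica set owns the desired count."""
    for rs_name, desired in cluster.replica_sets.items():
        owned = [pod for pod, owner in cluster.pods.items() if owner == rs_name]
        for _ in range(desired - len(owned)):  # too few pods: create more
            cluster.pods[f"{rs_name}-{next(_ids)}"] = rs_name
        for pod in owned[desired:]:            # too many pods: delete extras
            del cluster.pods[pod]


def reconcile(cluster):
    deployment_controller(cluster)
    replica_set_controller(cluster)


cluster = Cluster(deployments={"aristochat": 3})
reconcile(cluster)
print(sorted(cluster.pods))  # three pods owned by aristochat-rs

cluster.deployments["aristochat"] = 1  # scale down
reconcile(cluster)
print(sorted(cluster.pods))  # one pod left
```

Neither function is told what changed. Each one compares the desired state with what exists and closes the gap, so running reconcile again when nothing has changed does nothing.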

As you might expect from something that schedules containers, there’s a scheduler. The Kubernetes scheduler watches the API server for unscheduled pods, meaning pods that haven’t been assigned to any node. Its algorithm has two steps: it first rules out nodes that can’t run the pod and then ranks the remainder. The pod is scheduled to the node with the highest score. A node might be ruled out because the pod has requested a different, specific node, or because the node doesn’t have enough memory remaining to accommodate the pod. Nodes get higher scores for having more available resources and for not already having other copies of the same pod scheduled to them.
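The two-step idea of filtering and then scoring can be sketched in a few lines of Python. This is a toy version under stated assumptions, not the real scheduler: the Node and Pod classes, the memory-only resource model, and the 2048 penalty per existing copy are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    free_memory_mb: int
    running_apps: list = field(default_factory=list)  # apps with a pod already on this node


@dataclass
class Pod:
    app: str
    memory_mb: int
    node_name: Optional[str] = None  # set when the pod asks for a specific node


def feasible(pod, node):
    """Step 1: rule out nodes that cannot run the pod."""
    if pod.node_name is not None and pod.node_name != node.name:
        return False
    return node.free_memory_mb >= pod.memory_mb


def score(pod, node):
    """Step 2: rank a feasible node; higher is better (made-up weights)."""
    free_after = node.free_memory_mb - pod.memory_mb
    copies = node.running_apps.count(pod.app)
    return free_after - 2048 * copies


def schedule(pod, nodes):
    candidates = [node for node in nodes if feasible(pod, node)]
    if not candidates:
        return None  # no node fits, so the pod stays unscheduled
    return max(candidates, key=lambda node: score(pod, node))


nodes = [
    Node("node-a", free_memory_mb=3072, running_apps=["aristochat"]),
    Node("node-b", free_memory_mb=2048),
    Node("node-c", free_memory_mb=256),
]
chosen = schedule(Pod(app="aristochat", memory_mb=512), nodes)
print(chosen.name)  # node-b
```

Here node-c is filtered out for lack of memory, and node-a loses to node-b despite having more free memory because it already runs a copy of the same app.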

Outside of the control plane, the star of the show is kubelet, the node agent. On the node where it runs, kubelet handles running containers and reporting their statuses. Kube-proxy also runs on each node and handles any host-level networking that needs to be done. Each node also runs a container runtime. Docker is typically used, although there are alternatives like CRI-O.

In this post, we’ve gotten our cluster started and taken a basic look at the components that make up Kubernetes. In our next installment, we’ll actually launch Aristochat into our cluster!