Introduction to Kubernetes Cluster-API Project

November 8, 2018

Kubernetes Cluster-API is an attempt to bring a declarative, Kubernetes-style API to the management of clusters and machines. It allows you to define Clusters and Machines as Kubernetes objects (based on CustomResourceDefinitions); a cloud-specific Cluster-API provider then reconciles your request, i.e. makes sure that the state you requested is what you actually have in the cloud. With the help of clusterctl, Cluster-API can also create a new Kubernetes cluster for you from scratch.

We are going to take a deep dive into Cluster-API: how it got started, how it looks and works today, and how you can get involved in its development.

Short History of Cluster-API

Provisioning and managing Kubernetes clusters has always been a challenging job. Installing all the dependencies, configuring components such as the kubelet, scheduler, and etcd, and generating certificates is all very time-consuming. On top of that, there are many different setups and many different cloud providers to support.

To make this job easier, the Kubernetes community created two tools for cluster provisioning and maintenance: kops and kubeadm.

kubeadm is a cloud-agnostic tool that initializes and configures Kubernetes, and handles cluster upgrades, for both master and worker instances. However, kubeadm is not meant to be used for anything besides setting up Kubernetes, nor to be used as an API. It is up to the operator to build the provisioning logic that creates virtual machines and other relevant resources and sets up the machines.

kops is a tool that sets up Kubernetes but also handles provisioning: creating cloud resources, installing dependencies, and configuring virtual machines. It also comes with an API and can integrate with Terraform. However, kops initially supported only AWS, with support later extended to GCE and DigitalOcean (currently in alpha).

While both tools are great and solve many problems, several important requirements are still left unmet:

We want a declarative, Kubernetes-like API, so we can manage infrastructure in the fashion we are used to,
The API should be compatible with the tooling we already have, i.e. with kubectl,
We’re not limited to a particular cloud provider: anybody can easily implement the API calls for the desired provider,
The tool installs and configures all required dependencies and then initializes the cluster,
The tool allows you to bring your own provisioning logic for setting up clusters,
The tool allows you to snapshot, scale, and upgrade the cluster.

Kris Nova made an attempt to solve this problem and deliver such a tool with kubicorn.

While kubicorn may look similar to kops, there are several major differences: it lets users easily add support for any cloud provider, use any bootstrap script to provision the cluster, use kubicorn as a Go library, and more.

And kubicorn was a great success! The community really loved the tool, as it made it easy to get a Kubernetes cluster set up just the way you wanted.

What kubicorn was missing was better integration with Kubernetes and a Kubernetes-like API. To manage your cluster, you still had to use the kubicorn CLI or use kubicorn as a Go library.

The introduction and adoption of CustomResourceDefinitions allowed us to build a tool and an API that offer a feature set similar to that of kubicorn and kops, but that also integrate with the Kubernetes API, allowing operators to manage their infrastructure using existing tools such as kubectl.

Cluster-API

The Kubernetes Cluster-API is an official Kubernetes project, shaped by the experience gained while working on kops and kubicorn. Cluster-API is a declarative API built on top of Kubernetes, making it possible to provision and manage your cluster using a tool we know very well: kubectl.

While the project has API in its name, it is not just an API. We can actually look at it as a framework. It provides an API, but it also comes with a controller that reconciles your cluster by creating, updating, and deleting resources in the cloud. Besides the controller, there is a CLI called clusterctl, which is used to create a new cluster from scratch.

Cluster-API is well liked by the community, but it has both pros and cons.

Pros:

Declarative, API-driven approach,
Creates and provisions virtual machines and sets up Kubernetes,
Can scale and handle cluster upgrades,
Works with any cloud provider and with any operating system or image,
Depending on the options available for a cloud provider, the user can define exactly how instances and clusters will be provisioned by providing a bootstrap script.

Cons:

The project is still in the prototype phase, meaning that many breaking changes may be accepted over time,
There are still missing pieces, e.g. HA setups,
Not widely adopted.

Cluster-API Concepts

We’ve seen what Cluster-API is and what led to its creation. Let’s now see how it actually works and how you can utilize it. First, we’ll take a look at what resources define your cluster and machines and how Cluster-API “converts” those resources to actual virtual machines and Kubernetes clusters.

A cluster is defined using the Cluster resource. By default, the Cluster resource Spec defines how networking is going to be set up (which CIDRs and which service domain name to use), while the Cluster resource Status stores the IP address and port of the cluster endpoint.

A machine is defined using the Machine resource. The Machine resource Spec defines which Kubernetes version is going to be used for that machine and which taints to set on that Kubernetes node, while the Machine resource Status stores the IP address of that machine.
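To make the Spec/Status split more concrete, here is a minimal Python sketch that models the two resources as plain dataclasses. All class and field names are hypothetical and only mirror the description above (CIDRs, service domain, endpoint, Kubernetes version, taints, address); they are not the actual Cluster-API schema, which is defined by the project's CustomResourceDefinitions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ClusterSpec:
    # Desired networking settings, written by the user (hypothetical field names).
    pod_cidrs: List[str]
    service_cidrs: List[str]
    service_domain: str = "cluster.local"


@dataclass
class ClusterStatus:
    # Observed state, filled in by the controller once the endpoint exists.
    endpoint_host: Optional[str] = None
    endpoint_port: Optional[int] = None


@dataclass
class Cluster:
    name: str
    spec: ClusterSpec
    status: ClusterStatus = field(default_factory=ClusterStatus)


@dataclass
class MachineSpec:
    # Desired Kubernetes version and node taints for this machine.
    kubernetes_version: str
    taints: List[str] = field(default_factory=list)


@dataclass
class MachineStatus:
    # Observed IP address, unknown until the machine has been created.
    address: Optional[str] = None


@dataclass
class Machine:
    name: str
    spec: MachineSpec
    status: MachineStatus = field(default_factory=MachineStatus)


# The user only fills in the Spec; the Status starts out empty.
cluster = Cluster(
    name="demo",
    spec=ClusterSpec(pod_cidrs=["192.168.0.0/16"], service_cidrs=["10.96.0.0/12"]),
)
machine = Machine(name="demo-node-1", spec=MachineSpec(kubernetes_version="1.12.2"))
```

The point of the split is that the Spec holds what you ask for, while the Status holds what the controller has observed, so the two can be compared during reconciliation.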

Both the Cluster and Machine resources can be extended by a specific Cluster-API implementation to include information relevant to that cloud provider. For example, an AWS Cluster-API implementation can extend the Cluster resource with information about SecurityGroups, and a DigitalOcean implementation can extend it with instructions for creating a Cloud Firewall.

Besides the main Cluster and Machine resources, there are several more resources that let you manage groups of machines: MachineSets, which are similar to ReplicaSets, and MachineDeployments, which are similar to Deployments. Both let you define how many machines you want; when you create that one resource, Cluster-API creates the underlying Machine object for each replica. MachineSets and MachineDeployments are the main resources used to scale your cluster.
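The replica logic described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name, the naming scheme for new machines, and the rule for choosing which machines to delete are all hypothetical and are not taken from the Cluster-API code base.

```python
from typing import Dict, List


def reconcile_machine_set(name: str, replicas: int, existing: List[str]) -> Dict[str, List[str]]:
    """Return the Machine names to create or delete so that `replicas` machines exist."""
    existing = sorted(existing)
    if len(existing) < replicas:
        taken = set(existing)
        to_create: List[str] = []
        index = 0
        # Pick the lowest free "<name>-<index>" names until the count matches.
        while len(existing) + len(to_create) < replicas:
            candidate = f"{name}-{index}"
            if candidate not in taken:
                to_create.append(candidate)
            index += 1
        return {"create": to_create, "delete": []}
    # Enough or too many machines: drop any surplus from the end of the sorted list.
    return {"create": [], "delete": existing[replicas:]}


# Scaling up from one machine to three, then down from three machines to one.
plan_up = reconcile_machine_set("workers", 3, ["workers-0"])
plan_down = reconcile_machine_set("workers", 1, ["workers-0", "workers-1", "workers-2"])
```

Scaling the cluster then amounts to changing a single number, the replica count, and letting the controller work out which Machine objects to add or remove.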

The question that arises here is: what happens when a Cluster or Machine resource is created? How does Cluster-API create a new cluster or machine?

The component that handles this is a controller, which reconciles all Cluster-API resources. The reconciliation process involves watching for events on all Cluster-API resources and invoking a specific function depending on the event that was triggered. For example, if a new Machine resource is created, the controller invokes the Create function, which, depending on the implementation, creates a machine, installs Kubernetes, and joins the node to a cluster.
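As a rough illustration of this event-driven dispatch, the following Python sketch maps event types to handler functions. The event type strings follow the Kubernetes watch convention (ADDED, MODIFIED, DELETED); everything else, including the RecordingActuator class and the dispatch function, is a hypothetical stand-in that only records what a real implementation would do.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Event = Tuple[str, str]  # (watch event type, Machine name)


class RecordingActuator:
    """Hypothetical stand-in for provider logic; it only records the calls it receives."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def create(self, name: str) -> None:
        # A real implementation would create a VM, install Kubernetes, and join the node.
        self.calls.append(f"create {name}")

    def update(self, name: str) -> None:
        self.calls.append(f"update {name}")

    def delete(self, name: str) -> None:
        self.calls.append(f"delete {name}")


def dispatch(events: Iterable[Event], actuator: RecordingActuator) -> None:
    """Invoke the handler that matches each event type; ignore unknown types."""
    handlers: Dict[str, Callable[[str], None]] = {
        "ADDED": actuator.create,
        "MODIFIED": actuator.update,
        "DELETED": actuator.delete,
    }
    for event_type, name in events:
        handler = handlers.get(event_type)
        if handler is not None:
            handler(name)


actuator = RecordingActuator()
dispatch(
    [("ADDED", "node-1"), ("MODIFIED", "node-1"), ("BOOKMARK", "node-1"), ("DELETED", "node-1")],
    actuator,
)
```

A production controller would read events from a watch on the API server and would also retry failed handlers; both are left out here to keep the dispatch step visible.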

Cluster-API Project Structure and Cloud Specific Implementations

Cluster-API itself is cloud-agnostic, i.e. it works with any cloud provider. But to be able to handle cloud resources, such as creating virtual machines, the API calls specific to that provider have to be implemented somewhere.

The kubernetes-sigs/cluster-api repository contains everything needed to get started with Cluster-API: the API types, the controller that reconciles resources, and the clusterctl CLI implementation.

The cloud-specific API calls and mechanisms for setting up instances are not part of the core Cluster-API. Instead, Cluster-API just exposes interfaces that must be implemented. The cloud-specific Cluster-API implementations are called Cluster-API Providers. They contain implementations of the Cluster-API interface functions, which handle creating, deleting, and upgrading clusters and machines, as well as other required mechanisms, such as one for provisioning machines.
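The split between a cloud-agnostic core and pluggable providers can be sketched as an abstract interface plus one implementation. The sketch below is in Python and uses hypothetical names throughout (MachineProvider, InMemoryProvider, ensure_machine); the real project is written in Go and defines its own interfaces, so treat this only as an illustration of the pattern.

```python
from abc import ABC, abstractmethod
from typing import Dict


class MachineProvider(ABC):
    """Hypothetical interface the core calls; each provider supplies the cloud-specific logic."""

    @abstractmethod
    def create(self, machine_name: str) -> None:
        ...

    @abstractmethod
    def delete(self, machine_name: str) -> None:
        ...

    @abstractmethod
    def exists(self, machine_name: str) -> bool:
        ...


class InMemoryProvider(MachineProvider):
    """Toy provider that 'creates' machines in a dict instead of calling a cloud API."""

    def __init__(self) -> None:
        self.machines: Dict[str, str] = {}

    def create(self, machine_name: str) -> None:
        self.machines[machine_name] = "running"

    def delete(self, machine_name: str) -> None:
        self.machines.pop(machine_name, None)

    def exists(self, machine_name: str) -> bool:
        return machine_name in self.machines


def ensure_machine(provider: MachineProvider, machine_name: str) -> bool:
    """Core-side logic: create the machine only if it is missing. Returns True if created."""
    if provider.exists(machine_name):
        return False
    provider.create(machine_name)
    return True


provider = InMemoryProvider()
first = ensure_machine(provider, "node-1")   # machine is missing, so it gets created
second = ensure_machine(provider, "node-1")  # machine already exists, nothing to do
```

Note that ensure_machine depends only on the interface, so swapping in a provider for a different cloud requires no change to the core-side logic.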

This approach has several advantages. It ensures that anybody can import Cluster-API and build a Cluster-API provider for any cloud, using any API, for any setup. It also lets the Cluster-API team focus on developing the core API, as they don’t need to spend countless hours reviewing changes to the core repository for every setup and cloud provider.

At the time of writing, several Cluster-API Providers are available, such as the AWS Provider, the GCP Provider, and the DigitalOcean Provider. The Cluster-API README lists all known Cluster-API implementations.

Getting Involved With Cluster-API Project

The Cluster-API project is still young. The core part is implemented, and there are Cluster-API Providers for the most popular cloud providers. But many features are still missing that would make it easier to implement Cluster-API Providers and open up new possibilities.

The Cluster-API project and the Cluster-API Providers welcome all kinds of contributions, including but not limited to adding and improving features, improving testing to make sure everything works correctly, and writing documentation.

To get started, head over to GitHub and check out the issue tracker. Try to find issues labeled as good first issue, as they’re usually good for beginners. Several of the Cluster-API projects have good first issues open.

If you have any questions or would like to talk to other Cluster-API users, contributors, and leads, you can join the #cluster-api channel on Kubernetes Slack.

The Cluster-API team holds weekly sync-up meetings that all users and contributors can join. At the time of writing, there are four Cluster-API meetings each week:

Cluster-API Breakout - Wednesday 5:00 PM UTC. A meeting aimed at core Cluster-API development.
Cluster-API Implementers’ Office Hours - US West Coast: Tuesday 7:00 PM UTC; EMEA: Wednesday 1:00 PM UTC. A meeting aimed at Cluster-API implementations, such as Cluster-API Providers.
Cluster-API Provider AWS Office Hours - Monday 5:00 PM UTC. A meeting focused on Cluster-API Provider AWS development.