Migrating Docker-based services to Kubernetes
06.04.2022
This post describes how I moved my basic collection of apps on my Raspberry Pis from a single-node, Docker-based container environment to Kubernetes. At first glance this shouldn't be too big of an endeavor, right? It kind of is if you don't know what to look for. But first, let's clarify the why. Why move a perfectly fine, relatively simple solution to something else that more or less does the same thing?
The biggest factor was learning. I figured that I have a good grasp of Docker and Docker Swarm, since I have been using them at work for quite some time now. However, with the questionable future of Docker Swarm, moving towards Kubernetes seemed like a natural progression.
Another point is the unification of configuration. In my Docker environment I had multiple places where I would schedule my containers, multiple places where my data would live, and multiple tools which I had to configure (e.g., traefik). Kubernetes solves almost all of this by unifying everything into a single API, from storage to networking and deployment of the containers.
Technicalities
I'll be using a few different technologies in this post, namely:
ansible, as the overarching automation tool
Kubernetes
Helm, to deploy predefined applications to my cluster
I use the following alias for brevity: alias k=kubectl
Later on in this post I'll be using ansible to deploy k8s objects with the kubernetes.core.k8s module from the kubernetes.core collection.
There are two ways to do this. The first is the ansible-native way:
- name: create namespace
  kubernetes.core.k8s:
    name: testing
    api_version: v1
    kind: Namespace
    state: present
The other option is to write a Kubernetes manifest into the definition field:
- name: create namespace
  kubernetes.core.k8s:
    state: present
    definition:
      # inside definition we write a plain manifest, so the key is apiVersion
      apiVersion: v1
      kind: Namespace
      metadata:
        name: testing
These two versions yield the same result, which leaves the choice up to personal preference. For this post I'll be sticking to the second option, since it makes it easier to extract the manifest if you do not want to use ansible.
The complete cluster preparation is contained in an ansible playbook you can find in my GitHub repository.
Before the script can run successfully there are some prerequisites that have to be met:
a control node with a somewhat current ansible version (>2.10)
the kubernetes.core collection for ansible, which can be installed on the control node via ansible-galaxy collection install kubernetes.core
ansible needs to be able to reach all our nodes and log in to them via SSH; there are some pointers in the Building an inventory section
the user ansible uses to log in needs to be able to gain root privileges: either set NOPASSWD in the sudoers file or add the --ask-become-pass option to the playbook execution
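To make the collection prerequisite reproducible, the dependency can also be written down in a requirements file. This is only a minimal sketch: the file name requirements.yml is a common convention that I am assuming here, it is not taken from the playbook in the repository.

```yaml
# requirements.yml (assumed file name), installed on the control node with:
#   ansible-galaxy collection install -r requirements.yml
collections:
  - name: kubernetes.core
```

Keep in mind that the k8s modules from this collection also need the Python kubernetes package on the host where the module actually runs.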
Cluster
Before we can look at our application and how to move it, we have some foundation we have to set up. This process is however not limited to this environment. The application part later on will most likely work with any other Kubernetes flavour or even managed Kubernetes services like EKS, AKS and GKE.
My environment consists of the following (physical) machines:
1 Raspberry Pi 3 Model B+ running Raspbian on a 32 Bit armv7l Kernel
1 Raspberry Pi 4 Model B running Raspbian on a 64 Bit aarch64 Kernel
1 Notebook with 4c/8t, 8 GB RAM running Ubuntu Server 21.10
1 QNAP NAS serving a NFS share
What's noteworthy about this setup is the mix of CPU architectures. While the Raspberry Pis are ARM-based, the notebook is a classic x86_64 machine. This is not a problem per se, but we'll have to consider it later on when choosing the images we want to base our containers on. For my setup, I made sure that the Raspberry Pis are running a 64-bit kernel. This might be necessary in the future if we want to use more advanced networking with cilium.
The first decision to be made is which flavor of Kubernetes to use.
A handful of factors influence this selection. First of all, our setup is somewhat resource constrained, and it offers none of the autoscaling capabilities we would have in private or public cloud environments.
The next consideration is simplicity. The later parts of this post, where we talk about in-cluster resources and mechanisms, are supposed to provide value independently of the reader's background, be it someone who just wants to learn or a DevOps engineer in an enterprise context who got the task of migrating some old apps to their EKS/AKS/GKE deployment.
Considering these points, I chose k3s, since it fits the criteria:
it's resource-efficient, since it's focused on IoT use-cases
it's easy to deploy using their setup script
it comes with sane defaults for storage, networking
it conveniently installs some utilities like kubectl and crictl
Building an inventory
Let's begin with an overview of the machines in our cluster. We'll separate them into two groups, servers and agents, according to the k3s naming.
The resulting inventory file looks like this in ansible ini form:
[k3s:children]
k3s-server
k3s-agent
[k3s-server]
druid ansible_host=druid.localdomain HEALTHCHECK_HOOKURL=xxx1
[k3s-agent]
monkeyrocket ansible_host=monkeyrocket.localdomain HEALTHCHECK_HOOKURL=xxx2
whitebox ansible_host=whitebox.localdomain HEALTHCHECK_HOOKURL=xxx3
This allows us to reference servers and agents separately, or all the machines as a whole, depending on what our following scripts are supposed to accomplish. The inventory file also lets us set arbitrary per-host variables like HEALTHCHECK_HOOKURL, which we'll talk about a bit later in this post.
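If you prefer YAML over the ini format, the same inventory could be sketched as follows. The group names, hosts and variables are the ones from the ini file above; only the layout differs. Depending on your ansible version, group names containing hyphens may trigger a warning about invalid characters, so treat the names as they are used in this post rather than as a recommendation.

```yaml
# inventory.yml (assumed file name), equivalent to the ini inventory
all:
  children:
    k3s:
      children:
        k3s-server:
          hosts:
            druid:
              ansible_host: druid.localdomain
              HEALTHCHECK_HOOKURL: xxx1
        k3s-agent:
          hosts:
            monkeyrocket:
              ansible_host: monkeyrocket.localdomain
              HEALTHCHECK_HOOKURL: xxx2
            whitebox:
              ansible_host: whitebox.localdomain
              HEALTHCHECK_HOOKURL: xxx3
```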
If you are curious where the names come from, I chose to name all my devices after NSA surveillance tools.
Building the cluster
At this point we are ready to get started setting up our cluster. k3s provides us with a handy setup script, which simplifies our deployment immensely.
Initializing the cluster
To begin initializing the cluster we fetch the k3s-install script and run it, passing in our installation options via environment variables.
In this case I extend the NodePort range k3s can use so that it starts at 1000, since I need some lower ports for my services.
- name: get the installer script
  ansible.builtin.get_url:
    url: https://get.k3s.io
    dest: /root/k3s.sh
    mode: '0500'
- name: run the installer script
  ansible.builtin.command: sh k3s.sh
  args:
    chdir: /root
    creates: /var/lib/rancher/k3s/server/node-token
  environment:
    INSTALL_K3S_EXEC: "--service-node-port-range 1000-32767"
When this is done successfully, we read a token from the server node. This token will be used subsequently to join agent nodes to the cluster.
Note that the ansible.builtin.slurp module returns base64-encoded content, so we have to decode it using the b64decode filter.
- name: get cluster token
  ansible.builtin.slurp:
    src: /var/lib/rancher/k3s/server/node-token
  register: node_token_raw
- name: store the decoded token as a host fact
  ansible.builtin.set_fact:
    node_token: "{{ node_token_raw['content'] | b64decode | trim }}"
Joining agents to the cluster
Now that everything on the server side is set up, we can continue by adding the agent nodes.
To start the process, we download the k3s install script just like we did on our server. The execution of the script also largely stays the same; we just need to add two environment variables, K3S_URL and K3S_TOKEN. The URL points to our server node's hostname, which we can look up in the k3s-server group we defined in our ansible inventory. The second value, the join token, can be accessed via the ansible hostvars we set in the previous play while initializing the cluster.
- name: get the installer script
  ansible.builtin.get_url:
    url: https://get.k3s.io
    dest: /root/k3s.sh
    mode: '0500'
- name: run join script
  ansible.builtin.command: sh k3s.sh
  args:
    chdir: /root
  environment:
    K3S_URL: "https://{{ groups['k3s-server'] | first }}:6443"
    K3S_TOKEN: "{{ hostvars[groups['k3s-server'] | first]['node_token'] }}"
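Unlike the server installation, the join task has no creates guard, so it runs again on every playbook execution. A possible variation is sketched below, followed by a quick check that the agents actually registered. Two things here are assumptions on my side: that the installer leaves a systemd unit at /etc/systemd/system/k3s-agent.service on agents, and that the KUBECONFIG variable used later in this post is already available in this play.

```yaml
# Sketch only: the "creates" path assumes the installer registers a
# systemd unit named k3s-agent.service on agent nodes.
- name: run join script (skipped once the agent service exists)
  ansible.builtin.command: sh k3s.sh
  args:
    chdir: /root
    creates: /etc/systemd/system/k3s-agent.service
  environment:
    K3S_URL: "https://{{ groups['k3s-server'] | first }}:6443"
    K3S_TOKEN: "{{ hostvars[groups['k3s-server'] | first]['node_token'] }}"

# Ask the API for all Node objects; this runs once, on the server,
# because that is where the kubeconfig lives.
- name: collect cluster nodes
  kubernetes.core.k8s_info:
    kubeconfig: "{{ KUBECONFIG }}"
    kind: Node
  register: cluster_nodes
  delegate_to: "{{ groups['k3s-server'] | first }}"
  run_once: true

- name: show node names
  ansible.builtin.debug:
    msg: "{{ cluster_nodes.resources | map(attribute='metadata.name') | list }}"
  run_once: true
```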
Updating nodes
One open point regarding the nodes still remains: patching the base OS. There are some approaches which might be explored in future blog posts:
Rancher System Upgrade Controller: https://github.com/rancher/system-upgrade-controller (k8s)
Ansible from cronjobs (ansible)
Ansible from AWX (ansible)
unattended-upgrades (apt)
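To give an idea of what the ansible-based approaches could look like, here is a hypothetical patch play. It is not part of my playbook. It assumes apt-based nodes (as with Raspbian and Ubuntu) and that the OS signals a pending reboot through the file /var/run/reboot-required, which may not be the case on every image.

```yaml
# Hypothetical patch play: upgrades apt packages one node at a time
# and reboots only when the OS asks for it.
- hosts: k3s
  become: true
  serial: 1
  tasks:
    - name: upgrade all packages
      ansible.builtin.apt:
        update_cache: true
        upgrade: dist

    - name: check whether a reboot is required
      ansible.builtin.stat:
        path: /var/run/reboot-required
      register: reboot_required

    - name: reboot the node if needed
      ansible.builtin.reboot:
      when: reboot_required.stat.exists
```

The serial: 1 setting keeps the rest of the cluster up while one node is patched. The sketch does not drain a node before rebooting it, which you would probably want for anything more important than a home lab.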
Monitoring
At this point we probably want to know when things go wrong so we can react to it. My environment does not currently have any established monitoring solution like Prometheus, ELK Stack or PRTG. I have opted for a simple solution utilizing healthchecks.io. There is a free offering which allows me to "monitor" 20 different entities. It works by receiving a webhook at a specified time and alerting if the hook is not called within a certain grace period.
If you need more scale or prefer a self-hosted solution, you can use the open-source project to get the same functionality on your own infrastructure.
The monitoring is implemented as CronJobs which are bound to specific nodes, which looks like this in the ansible script:
- hosts: k3s
  gather_facts: false
  vars_files:
    - variables.yml
  tasks:
    - name: deploy healthchecks
      kubernetes.core.k8s:
        state: present
        kubeconfig: "{{ KUBECONFIG }}"
        definition:
          apiVersion: batch/v1
          kind: CronJob
          metadata:
            name: "{{ inventory_hostname }}-healthcheck"
            namespace: default
          spec:
            successfulJobsHistoryLimit: 1
            failedJobsHistoryLimit: 1
            schedule: "*/5 * * * *"
            jobTemplate:
              spec:
                template:
                  spec:
                    nodeSelector:
                      kubernetes.io/hostname: "{{ inventory_hostname }}"
                    containers:
                      - name: curlimage
                        image: curlimages/curl
                        imagePullPolicy: IfNotPresent
                        command:
                          - sh
                          - -c
                        args:
                          - curl $SERVICE_URL
                        env:
                          - name: SERVICE_URL
                            value: "{{ HEALTHCHECK_HOOKURL }}"
                    restartPolicy: OnFailure
      delegate_to: "{{ groups['k3s-server'] | first }}"
It looks like there is a lot going on, so let's break it down a bit. We run the play against all nodes in the cluster, hence the hosts: k3s. But by delegating the task to a server via delegate_to: "{{ groups['k3s-server'] | first }}", we perform all the actions on a server node which has a kubeconfig and thus permissions to create objects in the cluster.
The created object is a simple CronJob, bound to a specific node via a node selector, whose purpose is to call the health check URL. This is also the point where the health check URL from the inventory file is used.
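The play loads its settings from variables.yml, which is not shown in this post. A sketch of what such a file might contain is below. The variable names are the ones referenced by the tasks throughout this post; every value is a placeholder or an assumption, with the exception of the two well-known defaults noted in the comments.

```yaml
# variables.yml (sketch): values in angle brackets are placeholders
KUBECONFIG: /etc/rancher/k3s/k3s.yaml   # default kubeconfig location on a k3s server
KWATCH_VERSION: "<kwatch release tag>"
TELEGRAM_TOKEN: "<telegram bot token>"
TELEGRAM_CHATID: "<telegram chat id>"
CF_USER_EMAIL: "<cloudflare account email>"
CF_USER_TOKEN: "<cloudflare api token>"
ACME_USER_EMAIL: "<contact email for let's encrypt>"
ACME_SERVER: https://acme-v02.api.letsencrypt.org/directory   # Let's Encrypt production endpoint
NFS_SERVER_HOST: "<nas hostname>"
NFS_SERVER_PATH: "<exported path on the nas>"
```

Since several of these values are secrets, it is worth considering ansible-vault for this file instead of keeping it in plain text.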
kwatch
The health checks we just created tell us something about the nodes, but what about things that happen inside the cluster? For this I use another simple tool called kwatch. kwatch is deployed inside the cluster and watches for error events. In case of an error it posts a message to one of several supported messaging services, such as Slack, Telegram or Microsoft Teams. The deployment looks like this:
- name: Ensure the kwatch namespace is present
  kubernetes.core.k8s:
    kubeconfig: "{{ KUBECONFIG }}"
    definition:
      apiVersion: v1
      kind: Namespace
      metadata:
        name: kwatch
- name: Ensure the ConfigMap is present
  kubernetes.core.k8s:
    kubeconfig: "{{ KUBECONFIG }}"
    definition:
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: kwatch
        namespace: kwatch
      data:
        config.yaml: |
          maxRecentLogLines: 50
          ignoreFailedGracefulShutdown: true
          alert:
            telegram:
              token: "{{ TELEGRAM_TOKEN }}"
              chatId: "{{ TELEGRAM_CHATID }}"
- name: Ensure the deployment manifest is present
  ansible.builtin.get_url:
    url: "https://raw.githubusercontent.com/abahmed/kwatch/{{ KWATCH_VERSION }}/deploy/deploy.yaml"
    dest: /root/kwatch_deploy.yaml
    mode: "0664"
- name: Ensure the Deployment manifest is deployed
  kubernetes.core.k8s:
    kubeconfig: "{{ KUBECONFIG }}"
    state: present
    src: /root/kwatch_deploy.yaml
It creates a namespace, a config map and the kwatch deployment.
To check if kwatch is working correctly, we can create a test pod and see if the notifications arrive. My test pod looks like this:
---
apiVersion: v1
kind: Pod
metadata:
  name: termination-demo
  namespace: default
spec:
  containers:
    - name: termination-demo-container
      image: alpine
      command:
        - /bin/sh
      args:
        - -c
        # sh -c expects the whole script as one string; as separate list
        # items only "sleep 5" would run and the container would exit cleanly
        - 'sleep 5; echo "im doing stuff"; terraform init' # This will fail because terraform is not installed
This is an alpine container whose command ends up calling a binary which is not there. Logically the container fails, and we get some Telegram messages from kwatch.
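Once the notifications have arrived, the test pod can be removed again so it does not keep restarting and alerting. A minimal sketch of such a cleanup task, using the ansible-native form of the k8s module and assuming the same KUBECONFIG variable as in the other tasks:

```yaml
- name: remove the kwatch test pod again
  kubernetes.core.k8s:
    kubeconfig: "{{ KUBECONFIG }}"
    state: absent
    api_version: v1
    kind: Pod
    name: termination-demo
    namespace: default
```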
Cert-Manager
Since manual certificate management is a tedious, unscalable and rather expensive task, we'll be using cert-manager and Let's Encrypt certificates to fix all those shortcomings. cert-manager simplifies working with in-cluster certificates tremendously by representing the whole certificate workflow as Kubernetes objects.
Certificate Creation
The full workflow using cert-manager for Let's Encrypt certificates, including the ACME backend, can be found in the cert-manager documentation. The simplified version, with the parts which are relevant to us, is as follows:
an Ingress resource is created with a reference to a TLS secret
cert-manager checks if the referenced TLS secret is present
if not, a Certificate Request is created, referencing a Cluster Issuer
the Cluster Issuer creates a new Order
the Order results in a new Challenge
if the Challenge is verified, the Certificate Request is fulfilled
which finally results in the issuance of a Certificate
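The first step of this workflow could look like the following Ingress. This is a sketch under assumptions: the names, the host app.example.com and the backend service are hypothetical, and the annotation refers to a Cluster Issuer called letsencrypt-prod, which is the name used for the issuer in this post.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app                 # hypothetical name
  namespace: default
  annotations:
    # tells cert-manager which Cluster Issuer should handle this Ingress
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts:
        - app.example.com           # hypothetical host
      secretName: example-app-tls   # cert-manager creates this secret if it is missing
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-app   # hypothetical service
                port:
                  number: 80
```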
Certificate Renewal
This is one huge benefit of cert-manager. If we have everything set up as described previously, we don't have to do any additional work and get certificate renewal out of the box. The inner workings of how this is automated can be found in the cert-manager documentation.
Installing cert manager
The installation is done relatively simply via the official Helm chart.
The section in the ansible playbook looks like this:
- name: ensure the cert-manager helm repository is installed
  kubernetes.core.helm_repository:
    repo_name: jetstack
    repo_url: https://charts.jetstack.io
    repo_state: present
- name: deploy the cert-manager
  kubernetes.core.helm:
    name: cert-manager
    chart_ref: jetstack/cert-manager
    wait: true
    kubeconfig: "{{ KUBECONFIG }}"
    release_namespace: cert-manager
    create_namespace: true
    values:
      installCRDs: true
The only customization made to the Helm installation is the flag to install the CustomResourceDefinitions.
Cloudflare secret
At this point it's time to talk about how we prove to Let's Encrypt that we are allowed to issue certificates for the domain we want to use. Since I use Cloudflare for my DNS, I will use the Cloudflare DNS01 challenge solver for ownership verification. However, there are many more DNS01 and HTTP01 challenge options available. In the simplest terms, this step creates a DNS record on our end so that the Let's Encrypt backend can verify us as the owner.
To use the Cloudflare DNS backend we need to create an API token in our Cloudflare management console, which is described here.
The secret resource in the script looks like this:
- name: create cf token
  kubernetes.core.k8s:
    kubeconfig: "{{ KUBECONFIG }}"
    definition:
      apiVersion: v1
      kind: Secret
      metadata:
        namespace: cert-manager
        name: cloudflare-api-token-secret
      type: Opaque
      stringData:
        api-token: "{{ CF_USER_TOKEN }}"
Deploying a cluster issuer
The last component the setup is lacking now is a Cluster Issuer. It serves as an abstraction for the Let's Encrypt backend inside our cluster.
- name: create ACME cluster issuer (prod)
  kubernetes.core.k8s:
    kubeconfig: "{{ KUBECONFIG }}"
    definition:
      apiVersion: cert-manager.io/v1
      kind: ClusterIssuer
      metadata:
        name: letsencrypt-prod
      spec:
        acme:
          email: "{{ ACME_USER_EMAIL }}"
          server: "{{ ACME_SERVER }}"
          privateKeySecretRef:
            name: prod-issuer-account-key
          solvers:
            - dns01:
                cloudflare:
                  email: "{{ CF_USER_EMAIL }}"
                  apiTokenSecretRef:
                    name: cloudflare-api-token-secret
                    key: api-token
To check the status of our issuer we can describe it via
k describe clusterissuers.cert-manager.io letsencrypt-prod
In the output, the Ready condition under Status should be True.
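While experimenting, it can be useful to have a second issuer pointing at the Let's Encrypt staging environment, which has much more generous rate limits but issues certificates that browsers do not trust. The following is a sketch and not part of my playbook. The names letsencrypt-staging and staging-issuer-account-key are made up for this example, while the solver section simply reuses the Cloudflare secret from above.

```yaml
- name: create ACME cluster issuer (staging)
  kubernetes.core.k8s:
    kubeconfig: "{{ KUBECONFIG }}"
    definition:
      apiVersion: cert-manager.io/v1
      kind: ClusterIssuer
      metadata:
        name: letsencrypt-staging          # hypothetical name
      spec:
        acme:
          email: "{{ ACME_USER_EMAIL }}"
          # Let's Encrypt staging endpoint instead of the production one
          server: https://acme-staging-v02.api.letsencrypt.org/directory
          privateKeySecretRef:
            name: staging-issuer-account-key   # hypothetical name
          solvers:
            - dns01:
                cloudflare:
                  email: "{{ CF_USER_EMAIL }}"
                  apiTokenSecretRef:
                    name: cloudflare-api-token-secret
                    key: api-token
```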
(optional) Creating a certificate
Let's look at an example of what it might look like if we want to order a certificate explicitly.
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: dev-wildcard
  namespace: dev
spec:
  secretName: dev-tls
  dnsNames:
    - "*.dev.xc-cloud.net"
  privateKey:
    algorithm: RSA
    encoding: PKCS1
    size: 4096
  usages:
    - server auth
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
Applying this manifest will create a certificate for us, which we can inspect via
k describe certificate.cert-manager.io dev-wildcard -n dev
We will have to reference it later in our ingresses, though. This is additional work we don't really have to do. It might, however, be interesting for wildcard certificates, which can be used in multiple ingresses.
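Referencing an explicitly ordered certificate from an Ingress could look like the sketch below. It points at the dev-tls secret from the Certificate above and therefore needs no cert-manager annotation. The Ingress name, the host whoami.dev.xc-cloud.net and the backend service are hypothetical and only serve as an illustration.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: whoami                      # hypothetical name
  namespace: dev                    # must be the namespace that holds the dev-tls secret
spec:
  tls:
    - hosts:
        - whoami.dev.xc-cloud.net   # hypothetical host covered by the wildcard
      secretName: dev-tls
  rules:
    - host: whoami.dev.xc-cloud.net
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: whoami        # hypothetical service
                port:
                  number: 80
```

Because an Ingress can only use TLS secrets from its own namespace, a wildcard certificate like this is shared between ingresses of one namespace, not across the whole cluster.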
NFS Persistent Volume Controller
The storage I will be using is NFS, served by a QNAP NAS. To use the NFS share we utilize the NFS Subdir External Provisioner.
The configuration we give it is relatively straightforward. First we have to make sure the Helm repository is added, and then we deploy the chart with the hostname (nfs.server) and the path (nfs.path) of our NFS share. In an enterprise environment you will probably have to include authentication to the NFS share here.
- name: Ensure the nfs-provisioner helm chart is installed
  kubernetes.core.helm_repository:
    repo_name: nfs-subdir-external-provisioner
    repo_url: https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
    repo_state: present
- name: Ensure the nfs-provisioner is deployed
  kubernetes.core.helm:
    name: nfs-subdir-external-provisioner
    chart_ref: nfs-subdir-external-provisioner/nfs-subdir-external-provisioner
    release_namespace: kube-system
    wait: true
    kubeconfig: "{{ KUBECONFIG }}"
    values:
      nfs:
        server: "{{ NFS_SERVER_HOST }}"
        path: "{{ NFS_SERVER_PATH }}"
This gives us two objects. One is an external provisioner pod, which we don't really have to interact with in any way. More importantly, it creates a Storage Class named nfs-client. If we create a Persistent Volume Claim referencing this Storage Class, the following happens:
A new folder on the NFS Share is created
A new Persistent Volume in our cluster is created pointing to the folder on the NFS Share
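A claim that triggers these two steps could be sketched like this. The claim name, namespace and requested size are placeholders I picked for illustration; only the storageClassName is given by the setup above.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data           # hypothetical name
  namespace: default
spec:
  storageClassName: nfs-client # the Storage Class created by the provisioner chart
  accessModes:
    - ReadWriteMany            # NFS allows several pods to mount the volume at once
  resources:
    requests:
      storage: 1Gi             # placeholder size
```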
With that all out of the way we can continue to work on porting over our applications, so stay tuned for part two!