Buildlog genesis - Single control-plane Kubernetes homelab
2024-11-06
So I decided to build a Kubernetes cluster at home. Sure, I could say it's mainly for self-hosting plus some ML workloads I want to experiment with (I'll need to add a GPU node for those), but honestly, I've always wanted a homelab to tinker with. Having a proper test environment to break things in beats trying stuff out in production, and it's plenty satisfying.
The plan
- 1 control plane node (future HA, but let’s not get ahead of ourselves)
- 3 worker nodes (because why not)
- IP address management (MetalLB)
- I plan to use Traefik later as the ingress controller
- A way to manage it all (enter Terraform/Tofu and Ansible)
The Build
VM specifications
These are just what I chose for now; I'll adjust them as needed.
The control plane node has:
- num_cpus = 4
- memory = 4GB
- disk = 50GB
The worker nodes have:
- num_cpus = 4
- memory = 8GB
- disk = 100GB
Infrastructure Time
I use OpenTofu (you can replace it with Terraform) and a custom provider that runs a bunch of CLI commands to provision the instances.
In tofu land:
module "cp_nodes" {
source = "./modules"
cpus = var.cp_compute_properties.cp_num_cpus
memory = var.cp_compute_properties.cp_memory
disk = var.cp_compute_properties.cp_disk
instance_addresses = var.control_plane_instance.cp_ip_addresses
instance_nodes = var.control_plane_instance.cp_instance
cloudinit_file = "${path.module}/setup/user_data.cfg"
hosts = "${path.module}/setup/hosts"
setup_script = "${path.module}/setup/setup.sh"
image = var.ubuntu_image
}
module "worker_nodes" {
source = "./modules"
cpus = var.worker_compute_properties.worker_num_cpus
memory = var.worker_compute_properties.worker_memory
disk = var.worker_compute_properties.worker_disk
instance_addresses = var.worker_nodes_instance.worker_ip_addresses
instance_nodes = var.worker_nodes_instance.worker_instance
cloudinit_file = "${path.module}/setup/user_data.cfg"
hosts = "${path.module}/setup/hosts"
setup_script = "${path.module}/setup/setup.sh"
image = var.ubuntu_image
}
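Both modules point at the same cloud-init file for first-boot setup. I'm not pasting my real user_data.cfg here, but a minimal sketch of what it needs to do (assuming the ubuntu user with passwordless sudo and a placeholder SSH key, which the Ansible steps below depend on) would look something like:

#cloud-config
# Sketch only - my actual user_data.cfg may differ; the key below is a placeholder
package_update: true
users:
  - name: ubuntu
    groups: sudo
    shell: /bin/bash
    sudo: ALL=(ALL) NOPASSWD:ALL
    ssh_authorized_keys:
      - ssh-ed25519 AAAA...placeholder-public-key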
Kubernetes Bootstrap - Enter Ansible
First sign of life - nodes responding (always start with a ping test):
➜ ~ ansible all -m ping -i inventory.ini
node02 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
controlplane01 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
node03 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
node01 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
After manually setting it up (running into issues and eventually figuring them out; I'll probably write those up somewhere one day) and tearing it down a couple of times, the next logical step was to have Ansible set this all up for me.
My inventory:
[controlplanes]
controlplane01
[nodes]
node01
node02
node03
and the playbook:
- name: Install and set up kubeadm on the control plane
  hosts: controlplanes
  tasks:
    - name: Setup kubeadm on controlplane
      ansible.builtin.script: kubeadm.sh -controlplane
      register: results
    # - debug:
    #     var: results.stdout
    - name: Fetch the join command to Ansible controller
      ansible.builtin.fetch:
        src: /tmp/kubeadm_join_cmd.sh
        dest: "{{ playbook_dir }}/files/kubeadm_join_cmd.sh"
        flat: yes

- name: Install and set up kubeadm on the worker nodes
  hosts: nodes
  tasks:
    - name: Setup kubeadm on worker node
      ansible.builtin.script: kubeadm.sh
      register: results
    # - debug:
    #     var: results.stdout
    - name: Copy the join command to worker nodes
      ansible.builtin.copy:
        src: "{{ playbook_dir }}/files/kubeadm_join_cmd.sh"
        dest: /tmp/kubeadm_join_cmd.sh
        mode: '0755'
    - name: Execute join command on worker nodes
      ansible.builtin.command:
        cmd: sudo /tmp/kubeadm_join_cmd.sh
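A small thing I haven't done yet: ansible.builtin.script supports a creates argument, so reruns can skip nodes that are already bootstrapped. A sketch for the worker play, assuming /etc/kubernetes/kubelet.conf is a good enough "already joined" marker:

    - name: Setup kubeadm on worker node
      ansible.builtin.script: kubeadm.sh
      args:
        creates: /etc/kubernetes/kubelet.conf
      register: results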
And the shell script, kubeadm.sh (I am not particularly good at shell scripts):
#!/bin/bash
ARG=$1

# This node's primary IP, used for the kubelet --node-ip and the API server advertise address
PRIMARY_IP=$(hostname -I | awk '{print $1}')

sudo swapoff -a

# Load required kernel modules
sudo modprobe overlay
sudo modprobe br_netfilter

# Persist modules between restarts
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

# Set required networking parameters
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF

# Apply sysctl params without reboot
sudo sysctl --system

# Install containerd and switch it to the systemd cgroup driver
sudo apt-get install -y containerd
sudo mkdir -p /etc/containerd
containerd config default | sed 's/SystemdCgroup = false/SystemdCgroup = true/' | sudo tee /etc/containerd/config.toml
sudo systemctl restart containerd

# Install kubelet, kubeadm and kubectl from the latest stable minor release repo
KUBE_LATEST=$(curl -L -s https://dl.k8s.io/release/stable.txt | awk 'BEGIN { FS="." } { printf "%s.%s", $1, $2 }')
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/${KUBE_LATEST}/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg --yes
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/${KUBE_LATEST}/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

# Point crictl at containerd
sudo crictl config \
  --set runtime-endpoint=unix:///run/containerd/containerd.sock \
  --set image-endpoint=unix:///run/containerd/containerd.sock

# Pin the kubelet to the node's primary IP
cat <<EOF | sudo tee /etc/default/kubelet
KUBELET_EXTRA_ARGS='--node-ip ${PRIMARY_IP}'
EOF

if [ "$ARG" = "-controlplane" ]; then
  POD_CIDR=10.244.0.0/16
  SERVICE_CIDR=10.96.0.0/16
  LB_IP=$(getent hosts lb | awk '{ print $1 }')  # not used yet; reserved for the HA setup
  NODE_HOSTNAME=$(hostname)

  # Initialize only if it's the primary control plane
  if [ "$NODE_HOSTNAME" = "controlplane01" ]; then
    sudo kubeadm init --pod-network-cidr $POD_CIDR \
      --service-cidr $SERVICE_CIDR --apiserver-advertise-address $PRIMARY_IP \
      --apiserver-cert-extra-sans lb > /tmp/kubeadm_init_output.txt

    # Capture the join command from the init output (fetched by Ansible and run on the workers)
    cat /tmp/kubeadm_init_output.txt | grep "kubeadm join" -A 1 > /tmp/kubeadm_join_cmd.sh
    chmod +x /tmp/kubeadm_join_cmd.sh

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
  else
    # For additional control planes later
    echo "TODO: modify the join command to join as a control plane"
  fi
fi
Running ansible-playbook -i inventory.ini playbook.yaml uses the playbook and the shell script to set up the control plane and the workers.
Then I copy the kubeconfig to my laptop: scp -i ~/.ssh/labkey ubuntu@controlplane01:~/.kube/config ~/.kube/config
Finally, the moment of truth:
Current progress
- Nodes responding
- Basic connectivity working
- kubectl get nodes shows nodes (NotReady nodes but they show up)
The nodes needed a CNI solution to get them Ready. While setting up the CNI, I figured I might as well install MetalLB and ArgoCD too.
I chose Cilium after seeing it at KubeCon Paris this year; once I tried it out, I was impressed enough that it became my go-to CNI.
Instead of using kubectl apply (so 2020! Just picking a random year in the past), I went for a more everything-as-code approach with Helmfile, a declarative spec for deploying Helm charts.
The process is straightforward:
- create a helmfile.yaml
- add your charts
- run helmfile apply and wait for the magic to happen
The helmfile contents:
repositories:
  - name: argo
    url: https://argoproj.github.io/argo-helm
  - name: cilium
    url: https://helm.cilium.io/
  - name: metallb
    url: https://metallb.github.io/metallb

releases:
  - name: argocd
    namespace: argocd
    chart: argo/argo-cd
    version: 7.7.0
    values:
      - server:
          extraArgs:
            - --insecure
  - name: cilium
    namespace: kube-system
    chart: cilium/cilium
    version: 1.16.3
  - name: metallb
    namespace: metallb-system
    chart: metallb/metallb
    version: 0.14.8
A few seconds to a minute later and… she's alive!
k get nodes -o wide
NAME             STATUS   ROLES           AGE   VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
controlplane01   Ready    control-plane   11m   v1.31.2   192.168.50.100   <none>        Ubuntu 22.04.5 LTS   5.15.0-122-generic   containerd://1.7.12
node01           Ready    <none>          10m   v1.31.2   192.168.50.97    <none>        Ubuntu 22.04.5 LTS   5.15.0-122-generic   containerd://1.7.12
node02           Ready    <none>          10m   v1.31.2   192.168.50.98    <none>        Ubuntu 22.04.5 LTS   5.15.0-122-generic   containerd://1.7.12
node03           Ready    <none>          10m   v1.31.2   192.168.50.99    <none>        Ubuntu 22.04.5 LTS   5.15.0-122-generic   containerd://1.7.12
and the container creation problems from my earlier manual attempts are gone:
➜ ~ k get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
argocd argocd-application-controller-0 1/1 Running 0 2m33s
argocd argocd-applicationset-controller-858cddf5cb-pfffm 1/1 Running 0 2m34s
argocd argocd-dex-server-f7794994f-8r5sg 1/1 Running 0 2m34s
argocd argocd-notifications-controller-d988b477c-95g5m 1/1 Running 0 2m34s
argocd argocd-redis-5bd4bbb-5jkg5 1/1 Running 0 2m34s
argocd argocd-repo-server-6d9f6bd866-4kkcv 1/1 Running 0 2m34s
argocd argocd-server-7b6f64dd8d-d7rlv 1/1 Running 0 2m34s
kube-system cilium-82nxh 1/1 Running 0 3m25s
kube-system cilium-82qdp 1/1 Running 0 3m25s
kube-system cilium-8hnqm 1/1 Running 0 3m25s
kube-system cilium-8l9g5 1/1 Running 0 3m25s
kube-system cilium-envoy-gppg7 1/1 Running 0 3m25s
kube-system cilium-envoy-qdpwc 1/1 Running 0 3m25s
kube-system cilium-envoy-s4js7 1/1 Running 0 3m25s
kube-system cilium-envoy-vbhnv 1/1 Running 0 3m25s
kube-system cilium-operator-b4bfbfd9c-6cr4q 1/1 Running 0 3m25s
kube-system cilium-operator-b4bfbfd9c-mvhnd 1/1 Running 0 3m25s
kube-system coredns-7c65d6cfc9-4pt49 1/1 Running 0 11m
kube-system coredns-7c65d6cfc9-8qn75 1/1 Running 0 11m
kube-system etcd-controlplane01 1/1 Running 0 11m
kube-system kube-apiserver-controlplane01 1/1 Running 0 11m
kube-system kube-controller-manager-controlplane01 1/1 Running 0 11m
kube-system kube-proxy-2dr96 1/1 Running 0 10m
kube-system kube-proxy-2s96q 1/1 Running 0 10m
kube-system kube-proxy-kwn5v 1/1 Running 0 10m
kube-system kube-proxy-qlfb2 1/1 Running 0 11m
kube-system kube-scheduler-controlplane01 1/1 Running 0 11m
metallb-system metallb-controller-6c9b5955bc-vnrcb 1/1 Running 0 3m24s
metallb-system metallb-speaker-22lwz 4/4 Running 0 2m56s
metallb-system metallb-speaker-4jgnf 4/4 Running 0 2m58s
metallb-system metallb-speaker-6cm5m 4/4 Running 0 2m55s
metallb-system metallb-speaker-fxq4t 4/4 Running 0 2m58s
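One thing the helmfile alone doesn't cover: MetalLB won't hand out any addresses until it's told which IPs it owns, via an IPAddressPool and an L2Advertisement. A minimal sketch, assuming I carve a small unused range out of my LAN (the names and the 192.168.50.240-250 range below are placeholders, not my actual allocation):

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: homelab-pool          # placeholder name
  namespace: metallb-system
spec:
  addresses:
    - 192.168.50.240-192.168.50.250   # placeholder range, kept outside the DHCP scope
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: homelab-l2            # placeholder name
  namespace: metallb-system
spec:
  ipAddressPools:
    - homelab-pool

With something like that in place, any Service of type LoadBalancer (the Traefik ingress later, or the ArgoCD server via the chart's server.service.type value) should get an IP from the pool.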
Notes
- Always check container runtime config first
- Debugging:
- Network connectivity
- Container runtime
- kubelet logs
- Pod logs
- Keep join tokens somewhere safe
- Screenshot working configs (trust me)
Up Next
- Planning HA control plane
- Need to figure out:
- Load balancer for API server
- Backup strategy (probably should have thought of this earlier)
Note to readers: This is not a tutorial. Your mileage may vary, and that's the fun part!