Setting up Talos in HA Mode (Day 32)
Setting up a high-availability Kubernetes cluster with Talos
I decided to migrate away from kubeadm and Ansible playbooks and switch to Talos, mostly out of curiosity, and because it looks like an easier way to manage clusters and handle upgrades.
Why Talos?
What makes Talos interesting:
- Immutable infrastructure (no SSH, no shell)
- API-driven configuration
- Designed from the ground up for Kubernetes
- Also, I did say I would try it after this year's KubeCon EU, so...
Setting Up HA Control Plane
I didn't want to set up an external HAProxy load balancer (though I do plan to use OPNsense instead eventually, a bit different from my existing clusters), so I defaulted to Talos's built-in VIP support.
Here's how I approached it:
First, create a controlplane patch file for configuration overrides:
machine:
  network:
    interfaces:
      - interface: enp6s18 # Use talosctl -n <IP> get links --insecure
        dhcp: true
        vip:
          ip: 10.30.30.135
cluster:
  apiServer:
    certSANs:
      - 10.30.30.135
      - 10.30.30.131
      - 10.30.30.132
      - 10.30.30.133
    admissionControl:
      - name: PodSecurity
        configuration:
          defaults:
            audit: privileged
            audit-version: latest
            enforce: privileged
            enforce-version: latest
            warn: privileged
            warn-version: latest
  network:
    cni:
      name: none
    podSubnets:
      - 10.244.0.0/16
    serviceSubnets:
      - 10.96.0.0/16
  proxy:
    disabled: true
The patch disables the default CNI and kube-proxy, as I plan to use Cilium as a replacement for both later.
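As the comment in the patch hints, if you don't know the interface name yet, you can query it while a node is still in maintenance mode:

talosctl -n 10.30.30.131 get links --insecure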
Configuration Generation
Generate configs for your HA setup with the VIP:
# Use the VIP as the cluster endpoint
talosctl gen config daedalus https://10.30.30.135:6443 \
  --output-dir _out \
  --with-cluster-discovery \
  --config-patch-control-plane @controlplane.yaml \
  --config-patch-worker @worker.yaml # If you have worker patches, apply them too
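Before pushing anything to the nodes, the generated files can be sanity-checked with talosctl validate (mode metal here, assuming bare-metal or VM installs):

talosctl validate --config _out/controlplane.yaml --mode metal
talosctl validate --config _out/worker.yaml --mode metal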
Applying Configurations
Apply to control plane nodes:
talosctl apply-config --insecure --nodes 10.30.30.131 --file _out/controlplane.yaml
talosctl apply-config --insecure --nodes 10.30.30.132 --file _out/controlplane.yaml
talosctl apply-config --insecure --nodes 10.30.30.133 --file _out/controlplane.yaml
Apply to worker nodes:
talosctl apply-config --insecure --nodes 10.30.30.134 --file _out/worker.yaml
After applying the configs, the nodes reboot. Wait for them to come back up, then run the bootstrap against one of the control plane nodes.
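If you want to watch a node come back up before wiring up the talosconfig (next section), you can point talosctl at the generated credentials explicitly; something like this should work:

talosctl --talosconfig _out/talosconfig -e 10.30.30.131 -n 10.30.30.131 dmesg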
Bootstrapping
After Talos installs and reboots, run:
export TALOSCONFIG=$(pwd)/_out/talosconfig
talosctl config endpoint 10.30.30.131 10.30.30.132 10.30.30.133
talosctl config node 10.30.30.131
talosctl bootstrap
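Bootstrap runs against exactly one node; the other control plane nodes join etcd on their own. To confirm all three made it in, something like this works:

talosctl -n 10.30.30.131 etcd members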
Health Check and Kubeconfig
Check cluster health:
talosctl health
This command might stall at waiting for all k8s nodes to report ready if you set the CNI to none in your config. As long as the kubelet, apiserver, controller-manager, and scheduler are ready, you can proceed to install a CNI plugin. I went with Cilium, as always.
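For reference, here's roughly what the Cilium install looks like via Helm once the kubeconfig is in place (see below), with kube-proxy replacement enabled and the API server pointed at the VIP. Treat this as a sketch; Talos needs a couple of extra values (the cgroup settings below, plus security context capabilities), so check the Talos and Cilium docs for your versions:

helm repo add cilium https://helm.cilium.io/
helm repo update
helm install cilium cilium/cilium \
  --namespace kube-system \
  --set ipam.mode=kubernetes \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=10.30.30.135 \
  --set k8sServicePort=6443 \
  --set cgroup.autoMount.enabled=false \
  --set cgroup.hostRoot=/sys/fs/cgroup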
Generate kubeconfig:
talosctl kubeconfig --nodes 10.30.30.131 --endpoints 10.30.30.135 -f
talosctl config endpoint 10.30.30.135
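With the kubeconfig merged (it lands in ~/.kube/config by default), a quick check that the API is reachable through the VIP:

kubectl get nodes -o wide

Nodes will report NotReady until the CNI is running, which is expected with the patch above.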
Automating
I created an Ansible playbook to automate this entire process, but I just found out there's a Terraform provider for Talos, so I may switch to that instead.
UPDATE: While switching, I instead ended up with Makefiles; the amount of recreating I was doing needed something that would just run all the terragrunt, helmfile, etc. commands for me.
First Impressions
The biggest challenge was understanding the bootstrapping process and how the VIP gets managed, but once it was configured, I pointed the deployed Argo instance at the cluster and had my deployments up and running.