Spent today setting up Homepage to organize access to all the URLs I currently have.
Current dashboard
The base setup is defined in the settings.yaml file and looks like this:
title: <Your title name>
theme: dark
color: slate
background:
  image: <image>
  blur: sm
  saturate: 50
  brightness: 50
  opacity: 50
...
<rest of config including layouts>
Services are grouped logically, making it easy to find what you need.
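The grouping order and column layout come from the layout section of settings.yaml. A rough sketch (group names borrowed from my services below; the exact options are in the Homepage docs):

```yaml
layout:
  Hypervisor:
    style: row
    columns: 2
  Network:
    style: row
    columns: 2
```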
For icons, you can use icon files (e.g. proxmox.svg), Material Design Icons, or Simple Icons. When using Material Design Icons or Simple Icons, prefix the icon name accordingly, e.g. mdi-<icon-name> or si-<icon-name>.
An example definition in the services.yaml file:
- Hypervisor:
    - Avalon:
        icon: proxmox.svg
        href: <proxmox url>
        description: Main Compute Node
    - Aegis:
        icon: proxmox.svg
        href: <proxmox url>
        description: Network Node
- Network:
    - Vale:
        icon: opnsense.svg
        href: <url>
        description: OPNsense
    - DNS:
        icon: adguard.svg
        href: <url>
        description: DNS Management
The page auto-refreshes as you edit the config, so you see the changes in real time.
You can also add widgets, e.g. showing the time:
- datetime:
    text_size: xl
    format:
      timeStyle: short
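Other widgets follow the same pattern; for example, a search widget (this one is my own addition for illustration, options per the Homepage docs):

```yaml
- search:
    provider: duckduckgo
    target: _blank
```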
And bookmarks for quick access:
- Blogs:
    - Local Blog:
        - abbr: Blog
          href: <your blog url>
There are still a few things I need to add.
For now, it's a functional start that makes navigating between services easier, and the cool thing is you can set it as the default browser landing page.
With Minio running on our new storage, we've now got S3-compatible object storage in the homelab. First use case: moving Terraform state off the local filesystem.
First, head to Minio and create an access key - you'll need both the Access Key and Secret Key for the next steps.
Download the file or copy the created keys (once you close the popup you can no longer copy the keys).
After installing the AWS CLI (see the installation guide here) we need to configure it:
aws configure
AWS Access Key ID [None]: <Key ID copied from minio>
AWS Secret Access Key [None]: <Secret key copied from minio>
Default region name [None]: eu-west-1
Default output format [None]:
# Set signature version
aws configure set default.s3.signature_version s3v4
Add your Minio endpoint to the config (if you are using a reverse proxy make sure the URL is pointing to port 9000, not the GUI port):
# ~/.aws/config
[default]
region = eu-west-1
endpoint_url = https://minio-s3.<your domain name>  # <- add this line
s3 =
    signature_version = s3v4
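If you'd rather not touch the config file, the same endpoint can also be passed per command via the AWS CLI's global --endpoint-url flag (same placeholder domain as above):

```shell
aws s3 ls --endpoint-url https://minio-s3.<your domain name>
```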
Let's make sure everything works:
# Create a bucket
aws s3 mb s3://test-minio-bucket
make_bucket: test-minio-bucket
# List buckets
aws s3 ls
2025-01-29 20:33:06 test-minio-bucket
# Remove the test bucket
aws s3 rb s3://test-minio-bucket
remove_bucket: test-minio-bucket
To use this with Terragrunt, you'll need to add some config to the remote_state
block. (See example repo here).
Since we're using Minio rather than actual S3, we need to disable several S3-specific features - otherwise Terragrunt will try to do S3 specific modifications to the bucket settings:
# remote_state block
remote_state {
  backend = "s3"
  config = {
    bucket   = "terragrunt-state"
    key      = "${path_relative_to_include()}/baldr.tfstate"
    region   = local.aws_region
    endpoint = local.environment_vars.locals.endpoint_url

    # Skip various S3 features we don't need
    skip_bucket_ssencryption           = true
    skip_bucket_public_access_blocking = true
    skip_bucket_enforced_tls           = true
    skip_bucket_root_access            = true
    skip_credentials_validation        = true
    force_path_style                   = true
  }
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
}
Note: The endpoint needs to be set in the Terragrunt config even if it's in the AWS config file - I found Terragrunt doesn't pick it up from there.
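For completeness, local.environment_vars.locals.endpoint_url above implies an environment-level HCL file loaded with read_terragrunt_config(); a minimal sketch of what that might look like (the file name and structure here are assumptions, not my exact setup):

```hcl
# env.hcl (hypothetical) - loaded in terragrunt.hcl with something like:
# locals { environment_vars = read_terragrunt_config(find_in_parent_folders("env.hcl")) }
locals {
  endpoint_url = "https://minio-s3.<your domain name>"
}
```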
This is a continuation of this post; the final piece arrived today - another 12TB drive to complete the storage setup.
Created a new pool using RAIDZ1 with a 256GB SSD as ZFS L2ARC read-cache. Unlike the previous mirror setup from my testing, RAIDZ1 gives me more usable space while still protecting against a single drive failure.
Made some datasets:
backups: For system and VM backups
k8s: Kubernetes persistent storage
iscsi: Testing ground for VM storage over iSCSI
I will create more datasets as the need arises.
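I created the pool through the TrueNAS UI, but the CLI equivalent would look roughly like this (the pool name and device paths are placeholders, not my actual values):

```shell
# RAIDZ1 across the three 12TB drives, SSD as L2ARC read cache
zpool create tank raidz1 \
  /dev/disk/by-id/ata-<drive1> /dev/disk/by-id/ata-<drive2> /dev/disk/by-id/ata-<drive3>
zpool add tank cache /dev/disk/by-id/ata-<ssd>

# The datasets
zfs create tank/backups
zfs create tank/k8s
zfs create tank/iscsi
```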
Swapped out the built-in fan in the drive enclosure with a Noctua - because if you're running 24/7 storage, noise matters. The difference is noticeable (Noctua fans are awesome).
Starting the Minio setup - finally getting back to what started this whole thing.
This started as a simple "let's set up a container registry".
Instead, it turned into a deep dive into USB architecture and storage performance, and a lesson in why not all USB controllers are created equal - one that lasted a whole weekend.
I needed to set up a container registry, which meant thinking about where to store the images.
The registry actually supports using S3 as a backend for image storage. But cloud providers aren't in the budget, so the next best thing is Minio. I had recently got my hands on two 12TB drives (waiting on a third for RAIDZ1), so this seemed like a perfect use case.
But there was a catch - I'm running mini PCs. No fancy drive bays or PCIe expansion slots, just USB. And this marked the beginning of a very long rabbit hole into USB and PCIe devices and things I had no idea about.
The first bit of good news was that my external enclosure supports USB 3.2 Gen 2 and UASP (USB Attached SCSI Protocol). If you're like me, you had no idea what UASP was: it lets USB storage devices speak the SCSI command set, with command queueing, instead of the older and slower Bulk-Only Transport.
You can see this with lsusb -t:
/: Bus 10.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/1p, 10000M
| Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 10000M
| Port 1: Dev 3, If 0, Class=Mass Storage, Driver=uas, 10000M
| Port 2: Dev 4, If 0, Class=Mass Storage, Driver=uas, 10000M
| Port 4: Dev 5, If 0, Class=Mass Storage, Driver=uas, 10000M
See that Driver=uas? That means the drives are using UASP.
Digging deeper into the USB setup with lspci -k | grep -i usb we see:
06:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Rembrandt USB4 XHCI controller #3
06:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Rembrandt USB4 XHCI controller #4
07:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Rembrandt USB4 XHCI controller #8
07:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Rembrandt USB4 XHCI controller #5
07:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Rembrandt USB4 XHCI controller #6
Okay so I have USB 4 ports, cool, but little did I know these weren't all created equal...
Getting TrueNAS set up was straightforward enough - grab ISO, upload to Proxmox, create VM (using q35, UEFI, disabled memory ballooning), install.
Now the "fun" started when trying to get the drives to the VM.
First attempt: pass through the USB controller as a PCIe device.
So looking at our lsusb -t output again, all the drives hang off bus 10.
To find which PCI device controls a USB bus we do a readlink /sys/bus/usb/devices/usb10 which shows:
../../../devices/pci0000:00/0000:00:08.3/0000:07:00.4/usb10
This path shows that bus 10 is backed by the PCI device 0000:07:00.4, sitting behind bridge 0000:00:08.3.
IOMMU groups show which devices can be passed through independently.
So checking IOMMU groups with find /sys/kernel/iommu_groups/ -type l | grep "0000:07:00":
/sys/kernel/iommu_groups/26/devices/0000:07:00.0
/sys/kernel/iommu_groups/27/devices/0000:07:00.3
/sys/kernel/iommu_groups/28/devices/0000:07:00.4
So if I understand this correctly, each of these devices sits in its own group and can be passed through independently.
I passed through the device in group 28 with "all functions" enabled. The drives disappeared from Proxmox (expected), but so did a bunch of other stuff (kind of makes sense, given the PCI device has other functions attached to it, not just the USB controller).
However the VM wouldn't even boot (interesting):
error writing '1' to '/sys/bus/pci/devices/0000:07:00.0/reset': Inappropriate ioctl for device
failed to reset PCI device '0000:07:00.0', but trying to continue as not all devices need a reset
error writing '1' to '/sys/bus/pci/devices/0000:07:00.3/reset': Inappropriate ioctl for device
failed to reset PCI device '0000:07:00.3', but trying to continue as not all devices need a reset
error writing '1' to '/sys/bus/pci/devices/0000:07:00.4/reset': Inappropriate ioctl for device
failed to reset PCI device '0000:07:00.4', but trying to continue as not all devices need a reset
kvm: ../hw/pci/pci.c:1633: pci_irq_handler: Assertion `0 <= irq_num && irq_num < PCI_NUM_PINS' failed.
TASK ERROR: start failed: QEMU exited with code 1
Looking at what "all functions" is about, it basically means "pass every function of this PCI device through".
So I unchecked "all functions" (which gave the host some of the previous "disappeared" devices) and the VM started.
Created a pool, everything seemed fine until:
Critical
Pool test state is SUSPENDED: One or more devices are faulted in response to IO failures
Well, that's not good, this usually suggests disk or hardware issues.
TrueNAS UI also shows the pool is in an unhealthy state
I go to the console, check the pool status with zpool status test, and check the errors in dmesg with dmesg | grep -i error. Both show I/O errors on sdc, reads and writes alike.
I try smartctl and lsblk to see what's up with sdc, and the drive has disappeared (this also happened for the other drives later).
No sdc, yet the errors are full of references to sdc. This suggests the drive was present (as sdc) when the errors started occurring, but has since completely disappeared from the system - which could mean a dying drive, a bad cable, or something resetting on the controller side.
I do an ls -l /dev/disk/by-id/ and confirm the drives are still there, just under different names.
usb-ASMT_<serial>-0:0 -> sde
usb-ASMT_<serial>-0:0 -> sdf
usb-ASMT_<serial>-0:0 -> sdg
So remember when I said not all USB ports and controllers are made the same?
After hours of debugging it turns out that under heavy load the USB controller on the port I was using would "crash" and the drives would "shift around", getting remounted with different paths. (There's more to this.)
Not exactly ideal for a storage system.
So I went looking for ways to have TrueNAS reference drives by wwn instead of the dev path, but could not find anything that helped.
Time for Plan B. Instead of passing through the controller, pass through individual drives.
Install lshw and do an lshw -class disk -class storage to see the drives and their serial numbers.
Do an ls -l /dev/disk/by-id and copy the wwn path or the ata path of the drives.
1. Get drive info:
➜ ~ ls -l /dev/disk/by-id
total 0
lrwxrwxrwx 1 root root 9 Jan 26 20:16 ata-<name-with-serial-number> -> ../../sde
lrwxrwxrwx 1 root root 9 Jan 26 20:16 ata-<name-with-serial-number> -> ../../sdd
lrwxrwxrwx 1 root root 9 Jan 26 20:16 ata-<name-with-serial-number> -> ../../sdf
2. Set up SCSI drives: using the wwn path or the ata path, attach the drives with the following commands:
qm set <vm-id> -scsi1 /dev/disk/by-id/ata-<name-with-serial-number>
qm set <vm-id> -scsi2 /dev/disk/by-id/ata-<name-with-serial-number>
qm set <vm-id> -scsi3 /dev/disk/by-id/ata-<name-with-serial-number>
3. Add serial numbers to config:
➜ ~ vim /etc/pve/qemu-server/<vm-id>.conf
... other config
scsi1: /dev/disk/by-id/ata-<name-with-serial-number>,size=11176G,serial=<serial-number>
scsi2: /dev/disk/by-id/ata-<name-with-serial-number>,size=11176G,serial=<serial-number>
scsi3: /dev/disk/by-id/ata-<name-with-serial-number>,size=250059096K,serial=<serial-number>
... other config
On the Proxmox GUI, you should see the drives attached and the serial numbers set under the Hardware tab.
I then found that moving the USB cable to the USB-C port fixed the flakiness seen on the other (albeit still USB4) ports.
To make sure this works I did some stress tests with fio, and the results speak for themselves.
Reading a 10G file in 3.89 seconds (2632MiB/s throughput):
fio --name=test --rw=read --bs=1m --size=10g --filename=./testfile
test: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
fio-3.33
Starting 1 process
Jobs: 1 (f=1): [R(1)][75.0%][r=2639MiB/s][r=2638 IOPS][eta 00m:01s]
test: (groupid=0, jobs=1): err= 0: pid=7607: Sun Jan 26 05:46:15 2025
read: IOPS=2632, BW=2632MiB/s (2760MB/s)(10.0GiB/3890msec)
clat (usec): min=352, max=1613, avg=379.02, stdev=30.05
lat (usec): min=352, max=1613, avg=379.07, stdev=30.06
clat percentiles (usec):
| 1.00th=[ 359], 5.00th=[ 363], 10.00th=[ 367], 20.00th=[ 367],
| 30.00th=[ 371], 40.00th=[ 371], 50.00th=[ 375], 60.00th=[ 379],
| 70.00th=[ 383], 80.00th=[ 388], 90.00th=[ 396], 95.00th=[ 404],
| 99.00th=[ 441], 99.50th=[ 465], 99.90th=[ 562], 99.95th=[ 1090],
| 99.99th=[ 1467]
bw ( MiB/s): min= 2604, max= 2650, per=100.00%, avg=2633.71, stdev=16.18, samples=7
iops : min= 2604, max= 2650, avg=2633.71, stdev=16.18, samples=7
lat (usec) : 500=99.79%, 750=0.12%, 1000=0.03%
lat (msec) : 2=0.07%
cpu : usr=0.28%, sys=99.49%, ctx=86, majf=0, minf=266
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=10240,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=2632MiB/s (2760MB/s), 2632MiB/s-2632MiB/s (2760MB/s-2760MB/s), io=10.0GiB (10.7GB), run=3890-3890msec
Next, a bigger file, adding --direct=1 to the command to bypass the page cache.
Reading a 50G file in 129 seconds (396MiB/s throughput):
fio --name=test --rw=read --bs=1m --size=50g --filename=./bigtest --direct=1
test: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
fio-3.33
Starting 1 process
test: Laying out IO file (1 file / 51200MiB)
Jobs: 1 (f=1): [R(1)][100.0%][r=267MiB/s][r=267 IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=9526: Sun Jan 26 05:54:51 2025
read: IOPS=396, BW=396MiB/s (415MB/s)(50.0GiB/129239msec)
clat (usec): min=133, max=180737, avg=2522.21, stdev=3642.09
lat (usec): min=133, max=180738, avg=2522.37, stdev=3642.10
clat percentiles (usec):
| 1.00th=[ 149], 5.00th=[ 194], 10.00th=[ 212], 20.00th=[ 260],
| 30.00th=[ 799], 40.00th=[ 1500], 50.00th=[ 1876], 60.00th=[ 2245],
| 70.00th=[ 2737], 80.00th=[ 3458], 90.00th=[ 5997], 95.00th=[ 8029],
| 99.00th=[ 12387], 99.50th=[ 16712], 99.90th=[ 34341], 99.95th=[ 45876],
| 99.99th=[116917]
bw ( KiB/s): min=51200, max=2932736, per=100.00%, avg=406019.97, stdev=193023.81, samples=258
iops : min= 50, max= 2864, avg=396.50, stdev=188.50, samples=258
lat (usec) : 250=17.75%, 500=11.38%, 750=0.74%, 1000=1.46%
lat (msec) : 2=21.94%, 4=30.99%, 10=13.34%, 20=2.05%, 50=0.31%
lat (msec) : 100=0.02%, 250=0.02%
cpu : usr=0.14%, sys=10.08%, ctx=37023, majf=0, minf=269
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=51200,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=396MiB/s (415MB/s), 396MiB/s-396MiB/s (415MB/s-415MB/s), io=50.0GiB (53.7GB), run=129239-129239msec
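As a sanity check, fio's reported bandwidth is just total size divided by runtime; the numbers line up for both runs:

```shell
# 10 GiB over 3890 ms and 50 GiB over 129239 ms, in MiB/s
awk 'BEGIN { printf "%d\n", 10 * 1024 / 3.890 }'   # 2632, matches BW=2632MiB/s
awk 'BEGIN { printf "%d\n", 50 * 1024 / 129.239 }' # 396, matches BW=396MiB/s
```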
Okay, I could live with those numbers as long as they are stable and consistent.
So I do 5 runs of reading an 800G file (which does include a write during initial file creation) and writing a 900G file, with a mix of both reading and writing at the same time.
The idea is to see if something breaks, so I'm also monitoring the logs and drive temps.
I will probably never hit this kind of sustained load in one go outside of a resilver, so if this is stable, I'm good with that.
I leave these running and go have a drink with some friends; it's the weekend after all.
Reading an 800G file:
fio --name=read --rw=read --bs=1m --size=800g --filename=./bigtest
read: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
fio-3.33
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=310MiB/s][r=310 IOPS][eta 00m:01s]
read: (groupid=0, jobs=1): err= 0: pid=10516: Sun Jan 26 21:32:56 2025
read: IOPS=255, BW=256MiB/s (268MB/s)(800GiB/3205833msec)
clat (usec): min=356, max=303293, avg=3911.61, stdev=6703.53
lat (usec): min=356, max=303293, avg=3911.78, stdev=6703.52
clat percentiles (usec):
| 1.00th=[ 416], 5.00th=[ 429], 10.00th=[ 437], 20.00th=[ 457],
| 30.00th=[ 506], 40.00th=[ 603], 50.00th=[ 676], 60.00th=[ 3982],
| 70.00th=[ 4555], 80.00th=[ 5276], 90.00th=[ 11731], 95.00th=[ 12387],
| 99.00th=[ 20579], 99.50th=[ 24773], 99.90th=[ 77071], 99.95th=[125305],
| 99.99th=[198181]
bw ( KiB/s): min=43008, max=555008, per=100.00%, avg=261727.05, stdev=73610.29, samples=6411
iops : min= 42, max= 542, avg=255.54, stdev=71.87, samples=6411
lat (usec) : 500=29.19%, 750=21.95%, 1000=0.45%
lat (msec) : 2=1.53%, 4=7.32%, 10=24.63%, 20=13.79%, 50=0.95%
lat (msec) : 100=0.12%, 250=0.07%, 500=0.01%
cpu : usr=0.10%, sys=13.54%, ctx=402475, majf=0, minf=269
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=819200,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=256MiB/s (268MB/s), 256MiB/s-256MiB/s (268MB/s-268MB/s), io=800GiB (859GB), run=3205833-3205833msec
and writing a 900G file:
fio --name=write --rw=write --bs=1m --size=900g --filename=./big2testfile
write: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
fio-3.33
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][w=216MiB/s][w=216 IOPS][eta 00m:00s]
write: (groupid=0, jobs=1): err= 0: pid=24687: Sun Jan 26 23:32:36 2025
write: IOPS=179, BW=180MiB/s (188MB/s)(900GiB/5127844msec); 0 zone resets
clat (usec): min=118, max=400029, avg=5542.57, stdev=5208.15
lat (usec): min=120, max=400047, avg=5561.90, stdev=5208.03
clat percentiles (msec):
| 1.00th=[ 4], 5.00th=[ 5], 10.00th=[ 5], 20.00th=[ 5],
| 30.00th=[ 5], 40.00th=[ 5], 50.00th=[ 5], 60.00th=[ 5],
| 70.00th=[ 6], 80.00th=[ 6], 90.00th=[ 7], 95.00th=[ 8],
| 99.00th=[ 22], 99.50th=[ 35], 99.90th=[ 87], 99.95th=[ 101],
| 99.99th=[ 144]
bw ( KiB/s): min= 6144, max=2981888, per=100.00%, avg=184084.36, stdev=65812.35, samples=10254
iops : min= 6, max= 2912, avg=179.74, stdev=64.27, samples=10254
lat (usec) : 250=0.10%, 500=0.02%, 750=0.01%, 1000=0.01%
lat (msec) : 2=0.01%, 4=1.37%, 10=95.23%, 20=2.08%, 50=0.88%
lat (msec) : 100=0.20%, 250=0.07%, 500=0.01%
cpu : usr=0.45%, sys=3.40%, ctx=931630, majf=0, minf=16
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,921600,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=180MiB/s (188MB/s), 180MiB/s-180MiB/s (188MB/s-188MB/s), io=900GiB (966GB), run=5127844-5127844msec
From the TrueNAS GUI
In general, I learned a whole lot about USB, things I had no idea about before.
For monitoring drive temps during the tests, I used:
for drive in sda sdb sdc sdd; do echo "=== /dev/$drive ==="; smartctl -A /dev/$drive | grep -i temp; done

It's hard to see Argo CD mentioned and GitOps not mentioned (though to be fair, that's the point of Argo).
GitOps is a way to manage your Kubernetes clusters where your desired state lives in Git, and tools like Argo CD continuously sync this state to your cluster.
Think of it like "infrastructure as code" but for Kubernetes resources.
Well for starters, given how often I kept rebuilding everything from scratch, being able to just point Argo CD at my repo and have it apply everything was 👌.
Anyway, why GitOps?
GitOps helps keep deployments repeatable and auditable, since the cluster is continuously reconciled against what's in Git.
Before jumping into it, you need your cluster in a "usable" state, i.e. networking, ingress, and secrets in place (for secrets I use sops to decrypt and pipe to kubectl).
It's a tree structure (of sorts) where you have one root application that points to all your other applications (the "app of apps" pattern).
When Argo syncs this root app, it creates and manages everything defined in your repo.
I ended up preferring Helm charts for this, though other methods exist.
Assuming you already have Argo CD installed:
First, grab the initial admin password:
k -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d
Port forward so you can authenticate to the CLI (you can also access the UI)
k -n argocd port-forward services/argocd-server 8080:80
Login on the CLI
argocd login localhost:8080
Change the default password:
argocd account update-password --new-password "<your password>"
Add your git repo:
argocd repo add [email protected]:mrdvince/<your repo>.git --ssh-private-key-path <your ssh key path>
You can create the root app either via CLI:
argocd app create apps \
--dest-namespace argocd \
--dest-server https://kubernetes.default.svc \
--repo [email protected]:mrdvince/<your repo>.git \
--path apps/argo_apps
Then sync it:
argocd app sync apps
Or apply a manifest with kubectl:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: apps
  namespace: argocd
spec:
  project: default
  source:
    repoURL: [email protected]:mrdvince/<your repo>.git
    targetRevision: HEAD
    path: apps/argo_apps/
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
      - RespectIgnoreDifferences=true
      - ApplyOutOfSyncOnly=true
Note: path: apps/argo_apps/ is relative to the root of the repo.
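Each manifest under apps/argo_apps/ is then itself an Application pointing at a chart in the repo. A hypothetical child app (the name, chart path, and target namespace here are made up for illustration):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: traefik
  namespace: argocd
spec:
  project: default
  source:
    repoURL: <your repo url>
    targetRevision: HEAD
    path: charts/traefik
  destination:
    server: https://kubernetes.default.svc
    namespace: kube-system
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```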
Once everything is set up, the workflow is simple: commit changes to the repo and let Argo CD sync them.
Now just sit back and watch as Argo CD starts creating and managing all your applications defined in the git repo.
You should then be able to see a dashboard that looks like my screenshot below
After setting up and debugging various parts, I thought I'd share some basic tips that have helped me along the way.
Here's how to merge multiple kubeconfig files:
KUBECONFIG=~/.kube/config:~/.kube/config.cluster2 kubectl config view --flatten > ~/.kube/config.merged
cp ~/.kube/config ~/.kube/config.backup
mv ~/.kube/config.merged ~/.kube/config
You can then rename contexts for better clarity:
kubectl config rename-context default prism
kubectl config rename-context kubernetes-admin@kubernetes atlas
And set proper permissions on your kube config:
chmod 600 ~/.kube/config
If pods aren't scheduling on control plane nodes (I'm using 3 control plane nodes), check for taints:
kubectl get nodes -o json | jq '.items[].spec.taints'
To remove control-plane taints if needed:
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
In general, most issues can be found and solved by following a pattern: describe the failing resource, follow the chain to the resources it created, and read the controller logs.
An example of a certificate issue:
Follow the chain of resources when debugging cert-manager:
kubectl get certificate -n argocd
kubectl -n argocd describe certificate argocd-certificate
kubectl -n argocd describe certificaterequests.cert-manager.io argocd-certificate-1
kubectl -n argocd describe order argocd-certificate-1-1494176820
kubectl -n cert-manager logs pods/cert-manager-<some-hash>
Other times, just deleting a resource and letting it get recreated solves the issue. For example, when switching from staging to production Let's Encrypt, you may need to delete the old secrets or orders so they get recreated:
e.g. kubectl -n argocd delete secrets argocd-tls
When services aren't reachable:
Use dig or nslookup to verify DNS resolution, and tcpdump and netstat for network debugging:
# Check listening ports
netstat -tlpn
# Monitor ARP requests
tcpdump -i any -n arp
If setting up a new cluster using kubeadm (not on the cloud), use MetalLB or Cilium to hand out load balancer IP addresses.
If using Cilium, here's a sample configuration:
apiVersion: "cilium.io/v2alpha1"
kind: CiliumLoadBalancerIPPool
metadata:
name: "lb-pool"
spec:
blocks:
- cidr: "192.168.30.140/30"
---
apiVersion: "cilium.io/v2alpha1"
kind: CiliumL2AnnouncementPolicy
metadata:
name: cilium-l2-announce
spec:
externalIPs: true
loadBalancerIPs: true
interfaces:
- eth0
All services run through Traefik, so a few load balancer IPs are plenty.
To debug Argo CD applications, you can render out the chart:
helm template . -f values.yaml > rendered-app.yaml
And for helmfile:
helmfile template > rendered.yaml
Having moved to a k3s-deployed Traefik, this covers setting up routes for services running outside the cluster.
First, we need to enable external name services in Traefik (add this to your traefik ingress deployment):
set:
  - name: providers.kubernetesCRD.allowExternalNameServices
    value: true
I'm keeping all routes separate from other deployments using a dedicated routes.yaml for helmfile:
releases:
  - name: routes
    namespace: routes
    chart: ./charts/routes
Assuming you have a ClusterIssuer already configured, next we request a wildcard certificate for the services defined in the namespace.
I typically request wildcard certificates per namespace if there are multiple services needing ingress routes.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: routes-ns-ssl-cert
  namespace: routes
spec:
  commonName: "*.<your domain>"
  dnsNames:
    - "*.<your domain>"
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-dns01-issuer
  secretName: ssl-cert-prod
First, an HTTPS redirect middleware - this forces all incoming traffic to use HTTPS:
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: https-redirect
  namespace: routes
spec:
  redirectScheme:
    scheme: https
    permanent: true
For services like Proxmox and OPNsense that use self-signed certificates, we need to tell Traefik to skip verification:
apiVersion: traefik.io/v1alpha1
kind: ServersTransport
metadata:
  name: insecure-skip-verify
  namespace: routes
spec:
  insecureSkipVerify: true
Next, create a service for your endpoint:
apiVersion: v1
kind: Service
metadata:
  name: opnsense
  namespace: routes
spec:
  ports:
    - port: 443
      targetPort: 443
  type: ExternalName
  externalName: 192.168.2.1 # the IP (or domain name) of the service you want to route to
You'll need two routes - one for HTTPS and one for HTTP redirect:
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: opnsense
  namespace: routes
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`<the URL you want the service to be reached on>`)
      kind: Rule
      services:
        - name: opnsense
          port: 443
          scheme: https
          serversTransport: insecure-skip-verify
  tls:
    secretName: ssl-cert-prod
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: opnsense-redirect
  namespace: routes
spec:
  entryPoints:
    - web
  routes:
    - match: Host(`<the URL you want the service to be reached on>`)
      kind: Rule
      middlewares:
        - name: https-redirect
      services:
        - name: noop@internal
          kind: TraefikService
Add your routes to your DNS provider:
<your defined url> -> traefik loadbalancer ip
Or if your DNS provider supports wildcards:
*.<your defined url> -> traefik loadbalancer ip
I'm using AdGuard Home, so it's just a matter of going to Filters > DNS rewrites and adding either a wildcard or specific URL.
Review the changes first:
helmfile diff --file routes.yaml
Then deploy:
helmfile apply --file routes.yaml
You should now be able to access the service via the URL name used.
I'm making some changes to my service organization today.
I'm moving Grafana from my atlas main cluster (described here) over to the prism cluster.
The thinking here is that services which need to connect to multiple sources should ideally live in one centralized place (we'll see how that goes).
On a different note, I've switched from Hugo to Ghost for the blog. The growing pile of markdown files was getting a bit unwieldy, and I needed better organization.
I'm still keeping it static and hosted on Cloudflare Pages, just like the Hugo version.
It's a neat setup that gives me Ghost's content management while maintaining the benefits of static hosting.
If you're curious about setting up Ghost as a static site on Cloudflare Pages, there's a great guide by Paolo Tagliaferri that walks through the process.
This is a continuation of the previous post here.
Left off with the Traefik setup in k3s not picking up the certificate, and if you read that post one thing that immediately comes to mind is “Did you check the right TLS secret was being used?”
Since I’m working with Helm charts with nested dependencies, I decided to use helmfile template to see what’s being deployed:
helmfile template --file apps.yaml > rendered.yaml
Looking at the output, I spot the issue:
# Source: traefik-config/templates/ingress-route.yaml
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: traefik-dashboard
  namespace: kube-system
spec:
  entryPoints:
    - websecure
  routes:
    - kind: Rule
      match: Host(`ark.prism.home.mrdvince.me`)
      middlewares:
        - name: traefik-dashboard-basicauth
          namespace: kube-system
      services:
        - name: api@internal
          kind: TraefikService
  tls:
    secretName: # Aha! there you go, empty secret name
Well, that explains why Traefik is falling back to its self-signed certificate. The certificate is there and valid, but this naming mismatch in the chart config means Traefik doesn’t know about it.
After some chart refactoring and a crash course in Go templating, I’ve got everything working.
Sidenote: it really does help to debug with a fresh mind; sometimes the obvious answers become apparent after stepping away for a bit.
After the whole VM boot ordering not working the way I wanted, I finally decided to uncluster the nodes.
(Also very likely a skill issue).
To be fair having the nodes in a cluster made it easier for centralized management and easy VM migrations.
This is more of day 6 now.
Now remember this experiment running Traefik in an LXC container? Well, I thought: why not make this more interesting? Instead of systemd services, why not run it on k3s? It would make it easier to manage Traefik and run some other containers without reaching for Portainer; plus, I already have Helm charts from the other kubeadm setup.
Here's how it went (or is going, depending on when you read this):
First, the k3s install: disabling kube-proxy and using Cilium for networking (almost muscle memory at this point):
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC='--disable=traefik --disable-kube-proxy --disable-network-policy --flannel-backend=none --write-kubeconfig-mode=644 --etcd-expose-metrics=true' sh -
As expected, pods not running yet - no networking configured, easy fix (as always).
I won't bore you with the details, but let's just say it wasn't as straightforward as I thought it would be. Anyway, I got Cilium running after some debugging; all I needed was this (ipam.operator.clusterPoolIPv4PodCIDRList: "10.42.0.0/16").
Final helmfile config section:
repositories:
  - name: cilium
    url: https://helm.cilium.io/
releases:
  - name: cilium
    namespace: kube-system
    chart: cilium/cilium
    version: 1.16.5
    values:
      - ipam.operator.clusterPoolIPv4PodCIDRList: "10.42.0.0/16"
      - operator.replicas: 1
Next up was getting the secrets in place. I pulled my age keys from Bitwarden secrets manager:
export SOPS_AGE_KEY=$(bws secret get <uuid> | jq .value | xargs)
export AGE_PUBLIC_KEY=$(bws secret get <uuid> | jq ".value" | xargs)
Applied them to the cluster (well, it’s just one node, but hey I don’t know what to call it):
sops --decrypt --encrypted-regex '^(data|stringData)$' ../atlas/manifests/secrets/traefik-auth-secret.yaml | k apply -f -
sops --decrypt --encrypted-regex '^(data|stringData)$' ../atlas/manifests/secrets/cloudflare-token-secret.yaml | k apply -f -
These are needed for the Traefik auth middleware and for the cluster issuer’s Cloudflare DNS-01 solver in cert-manager.
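For context, the age public key is what sops encrypts new secrets against. A minimal .sops.yaml sketch - the path regex is an assumption, but the encrypted_regex matches the decrypt commands above:

```yaml
creation_rules:
  # Hypothetical path - adjust to wherever your secret manifests live
  - path_regex: manifests/secrets/.*\.yaml
    encrypted_regex: ^(data|stringData)$
    age: <your-age-public-key>  # the AGE_PUBLIC_KEY pulled from Bitwarden
```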
Then came the interesting part: cert-manager was failing to verify certificates with an interesting error:
Warning ErrInitIssuer 7m27s (x8 over 22m) cert-manager-clusterissuers Error initializing issuer: Get "https://acme-staging-v02.api.letsencrypt.org/directory": tls: failed to verify certificate: x509: certificate is valid for OPNsense.localdomain, not acme-staging-v02.api.letsencrypt.org
Warning ErrInitIssuer 2m27s (x2 over 2m27s) cert-manager-clusterissuers Error initializing issuer: Get "https://acme-staging-v02.api.letsencrypt.org/directory": tls: failed to verify certificate: x509: certificate is valid for 01d3241326f8a773d75e9f119eb7de02.2a95698c2bc1a035feb46fa4cfb29c0f.traefik.default, not acme-staging-v02.api.letsencrypt.org
Strange. Okay, let’s see - do I have any special filtering on my OPNsense? Hmm! And why OPNsense.localdomain and traefik.default?
Some DNS debugging:
nslookup acme-staging-v02.api.letsencrypt.org
Looked fine…
curl -v https://acme-staging-v02.api.letsencrypt.org/directory
All seems fine ….
Then, aha! It turned out to be an interesting DNS loop: the instance uses OPNsense for DNS, which I had set to forward only to AdGuard for that VLAN, and AdGuard had a rewrite rule for the domain pointing right back at Traefik. Sort of a circular dependency.
After fixing the DNS setup, the certificate was finally ready:
kubectl get certificates
NAMESPACE NAME READY SECRET AGE
kube-system traefik-certificate True traefik-tls-staging 40m
But hold up! While the certificate is issued and ready, Traefik isn’t picking it up - it’s still serving the default self-signed certificate when accessing the URL.
At this point, it is approaching midnight and I’m tired (this could be the issue).
To be continued…
Note: For this k3s setup, I’m keeping it simpler and using helmfile to deploy the charts instead of Argo CD.
After running Proxmox for about 2 months, I realized I hadn’t tried out LXC containers yet. What better way to start than setting up a reverse proxy?
Think of a reverse proxy like a bartender - you ask for a drink, and they handle getting it from the right place. More technically, it’s a server that sits between clients and your web services, forwarding requests.
LXC containers are like lightweight VMs: you get a full Linux userspace, including systemd and other traditional VM behaviors, without the overhead of full virtualization.
First, grab an Ubuntu template:
pveam download local ubuntu-22.04-standard_22.04-1_amd64.tar.zst
Then create a container with 2GB RAM and 8GB disk:
pct create 1001 local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
--hostname <your-container-name> \
--cores 2 \
--memory 2048 \
--rootfs local-lvm:8 \
--net0 name=eth0,bridge=vmbr0,ip=dhcp \
--unprivileged 1 \
--ssh-public-keys /root/.ssh/id_rsa.pub
While Traefik is popular in Docker and Kubernetes environments, it works great as a standalone binary too (you do lose the nice service auto-discovery features, though).
Here’s how I set it up:
mkdir -p ~/traefik
Create the main configuration file (traefik.yaml):

log:
  level: "DEBUG"
api:
  insecure: true # Temporarily enable dashboard for debugging
entryPoints:
  web:
    address: ":80"
  websecure:
    address: ":443"
providers:
  file:
    directory: ./config # inside the traefik directory
    watch: true
Create the dynamic configuration (~/traefik/config/services.yaml):

http:
  routers:
    opnsense:
      rule: "Host(`<your-domain>`)"
      entryPoints:
        - "websecure"
      service: opnsense
      tls: {}
    # HTTP to HTTPS redirect
    opnsense-redirect:
      rule: "Host(`<your-domain>`)"
      entryPoints:
        - "web"
      middlewares:
        - https-redirect
      service: opnsense
  middlewares:
    https-redirect:
      redirectScheme:
        scheme: https
        permanent: true
  services:
    opnsense:
      loadBalancer:
        servers:
          - url: "https://<opnsense-ip>"
        serversTransport: insecure-skip-verify
  serversTransports:
    insecure-skip-verify:
      insecureSkipVerify: true
If you’re dealing with self-signed certificates and see errors like "tls: failed to verify certificate: x509: cannot validate certificate for...", setting insecureSkipVerify to true in the serversTransport should fix that.
Launch Traefik:
traefik --configFile=traefik.yaml
This can then be converted into a systemd service for automatic startup.
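A minimal unit sketch for that - the binary location and working directory here are assumptions, adjust them to your setup:

```ini
# /etc/systemd/system/traefik.service (hypothetical paths)
[Unit]
Description=Traefik reverse proxy
After=network-online.target

[Service]
WorkingDirectory=/opt/traefik
ExecStart=/usr/local/bin/traefik --configfile=/opt/traefik/traefik.yaml
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Then systemctl daemon-reload and systemctl enable --now traefik should bring it up on every boot.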
PS: I changed the numbering of the days on the posts to go at the end of the title, felt like adding the day number at the beginning got cluttered.
It turns out that Proxmox’s quorum requirements are not as “simple” as I thought.
The initial solution of setting quorum expectations to 1 worked… sort of. Here’s what happened:
When a node booted up (remember it can’t initially “see” the other node), OPNsense would start (great!), provide DHCP and network connectivity (also great!), but then things got interesting. Once the network was up and the Proxmox nodes could talk to each other, the other VMs would fail to start with cryptic errors like:
generating cloud-init ISO TASK ERROR: start failed: command ........ <very long kvm command> failed: got timeout
The issue? Trying to set quorum to 1 in a cluster with both nodes available fails.
The solution ended up being two-fold:
- Start/Shutdown order settings in Proxmox
- A smarter quorum service that only lowers expected votes when it’s actually needed

Here’s the updated service that checks node count before trying to set quorum:
[Unit]
Description=Set Proxmox quorum expectations
After=corosync.service pve-cluster.service
Requires=corosync.service pve-cluster.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStartPre=/bin/sleep 10
ExecStart=/bin/bash -c 'if [ "$(/usr/bin/pvecm status | grep "Nodes:" | awk "{print \$2}")" = "1" ]; then /usr/bin/pvecm expected 1; fi'
[Install]
WantedBy=multi-user.target
Now the cluster behaves as expected: VMs start properly whether we’re running on one node or two, and OPNsense starts when it should.
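The conditional in that ExecStart is just grep/awk over pvecm status. Here’s the same check run against sample output (the sample text is made up for illustration - real pvecm status output has more fields):

```shell
# Simulate `pvecm status` output and run the same Nodes-count check
# the service uses.
sample='Expected votes:   2
Total votes:      1
Nodes:            1
Quorate:          No'

nodes=$(printf '%s\n' "$sample" | grep "Nodes:" | awk '{print $2}')
echo "nodes=$nodes"
if [ "$nodes" = "1" ]; then
  echo "only one node visible: would run pvecm expected 1"
fi
```

With two nodes visible, the condition fails and pvecm expected is never touched, which avoids the error from the earlier attempt.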
I got a new Netgear managed switch yesterday and so today was all about network segmentation and fighting with OPNsense. Also learned some interesting bits about Proxmox clustering that I didn’t expect to deal with.
First up, got OPNsense reinstalled on the N100 box - fresh start for the VLAN setup. The interesting bit was that OPNsense runs as a VM on one of my Proxmox nodes, but needs to handle traffic for Proxmox’s management interface itself. Not exactly a chicken-and-egg problem since they’re on different NICs, but took some time to get right.
VLANs setup:
VLAN 10: Home Network (192.168.10.0/24)
- All regular home devices (e.g. Xbox, laptops, phones, etc.)
- Another unmanaged switch connects here for more devices
VLAN 50: Core Services/Proxmox (192.168.50.0/24)
- Proxmox management IPs
- AdGuard Home
- Traefik
- Any other "always on" services
VLAN 30: Kubernetes (192.168.30.0/24)
- K8s control plane
- K8s workers
- K8s services
- etc
And then I stumbled across it: the issue with my Proxmox cluster.
When one node is offline, the other wouldn’t boot the VMs because of quorum requirements. Not ideal for a homelab where I might want to take nodes down frequently and need the OPNsense VM to be available as soon as it starts.
After some googling, I found posts suggesting things like running OPNsense on a different node - none of which were really an option for me.
So the solution was a simple systemd service that sets quorum expectations to 1 on boot:
[Unit]
Description=Set Proxmox quorum expectations
After=corosync.service pve-cluster.service
Requires=corosync.service pve-cluster.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStartPre=/bin/sleep 10
ExecStart=/usr/bin/pvecm expected 1
[Install]
WantedBy=multi-user.target
This lets me boot a single node without the cluster throwing a fit about quorum. Might not be production-grade, but perfect for a homelab where I just need the main services to be available as soon as possible.
I recently came across the #100DaysOfHomelab challenge and thought it’d be a perfect way to document stuff I’m doing. Day 1 is really a catch-up on recent changes, and there have been quite a few.
Been rebuilding the k8s cluster multiple times (as one does in a homelab). The main improvements have been around networking. Since I’m using Cilium as my CNI of choice, I made two significant changes:
I’ve also set up a bunch of applications using Argo CD’s app of apps pattern. I’ll share a list eventually.
Added a new dedicated Proxmox node specifically for services that need to be always available:
I’ve been trying to document more of these adventures - my drafts folder is growing faster than my published posts, which probably says something about my experimenting-to-documentation ratio. Hoping this 100 days challenge will help fix that balance a bit.
So you've got Proxmox running, and you’re tired of clicking through the UI to create VMs. The next logical step is to write some provisioning code.
This is how I did it.
This was the first thing I did after installing the hypervisor: I went looking for ways to drive it with either Ansible or Terraform. I kept recreating and destroying machines, so having a way to easily bring them up and down was useful.
First things first - Proxmox needs to know who we are and what we’re allowed to do. Let’s create a role with just enough permissions (courtesy of Proxmox Provider docs):
pveum role add TerraformProv -privs "Datastore.AllocateSpace Datastore.AllocateTemplate Datastore.Audit Pool.Allocate Sys.Audit Sys.Console Sys.Modify VM.Allocate VM.Audit VM.Clone VM.Config.CDROM VM.Config.Cloudinit VM.Config.CPU VM.Config.Disk VM.Config.HWType VM.Config.Memory VM.Config.Network VM.Config.Options VM.Migrate VM.Monitor VM.PowerMgmt SDN.Use"
Create a user and give them our new role:
pveum user add terraform-prov@pve --password <password>
pveum aclmod / -user terraform-prov@pve -role TerraformProv
After this, head to the Proxmox UI and create an API token. You’ll need this for Terraform authentication with Proxmox.
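One way to feed the token to Terraform is via TF_VAR_* environment variables, which Terraform maps onto matching variable declarations. The values here are placeholders and the token name is hypothetical:

```shell
# Terraform reads TF_VAR_<name> into variable "<name>".
export TF_VAR_pm_api_url="https://<proxmox-host>:8006/api2/json"
export TF_VAR_pm_api_token_id='terraform-prov@pve!mytoken'
export TF_VAR_pm_api_token_secret='<token-secret>'
env | grep -c '^TF_VAR_pm_'   # should print 3
```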
Before we can automate VM creation, we need a base image. Download your distro of choice (I went with Ubuntu 24.04 because I like it):
# Get the tools we need
sudo apt update -y && sudo apt install libguestfs-tools -y
# Download the image and add qemu-guest-agent
# (read more about the guest agent here https://pve.proxmox.com/wiki/Qemu-guest-agent)
wget <distro-cloud-image-url>
sudo virt-customize -a noble-server-cloudimg-amd64.img --install qemu-guest-agent
# Create and configure the template VM
qm create 9000 --name ubuntu-cloud --memory 2048 --cores 2 --net0 virtio,bridge=vmbr0
qm importdisk 9000 noble-server-cloudimg-amd64.img local-lvm
qm set 9000 --scsihw virtio-scsi-pci --scsi0 local-lvm:vm-9000-disk-0
qm set 9000 --agent enabled=1
Pro tip: Skip setting SSH keys in the base template. We’ll handle that in Terraform to avoid cloud-init overwriting our keys on every boot.
Now for the fun part. First, tell Terraform about our Proxmox provider:
terraform {
  required_providers {
    proxmox = {
      source  = "Telmate/proxmox"
      version = "3.0.1-rc4"
    }
  }
}

provider "proxmox" {
  pm_api_url          = var.pm_api_url          # Your Proxmox API URL
  pm_api_token_id     = var.pm_api_token_id     # API token ID from earlier
  pm_api_token_secret = var.pm_api_token_secret # The secret part
  pm_tls_insecure     = false                   # Set true if using self-signed certs
}
I created a reusable module for this - you can find it here: terraform-proxmox-qemu. Here’s an example (check the repo for more details):
module "cp" {
  source = "git@github.com:mrdvince/terraform-proxmox-qemu.git"
  count  = 3

  vmname        = "controlplane-${count.index + 1}"
  template_name = "ubuntu-cloud"
  os_type       = "cloud_init"
  target_node   = "node01"
  vmid          = "${count.index + 1 + 600}"
  ipconfig0     = "ip=192.168.50.13${count.index + 1}/24,gw=192.168.50.1"

  network = {
    bridge    = "vmbr0"
    firewall  = false
    link_down = false
    model     = "virtio"
  }

  cipassword = var.cipassword

  vm_config_map = {
    bios                   = "ovmf"
    boot                   = "c"
    bootdisk               = "scsi0"
    ciupgrade              = true
    ciuser                 = "ubuntu"
    cores                  = 4
    define_connection_info = true
    machine                = "q35"
    memory                 = 8192
    onboot                 = true
    scsihw                 = "virtio-scsi-pci"
    balloon                = 4096
  }

  disks = {
    storage    = "local-lvm"
    backup     = true
    discard    = false
    emulatessd = false
    format     = "raw"
    iothread   = false
    readonly   = false
    replicate  = false
    size       = "128G"
  }

  serial = {
    id   = 0
    type = "socket"
  }

  sshkeys = file("~/.ssh/devkey.pub")

  efidisk = {
    efitype = "4m"
    storage = "local-lvm"
  }
}
This configuration gives you a VM with:
Run terraform init (or tofu init), then terraform plan to see what’s going to happen, and finally terraform apply.
When creating multiple VMs in parallel, you might hit lock conflicts; to fix this, explicitly set the VMIDs:
vmid = "${count.index + 1 + 600}" # Starts from 601 and increments
This lets all the VMs be created in parallel without fighting over IDs or locks.
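Just to make the arithmetic concrete, here’s what that expression (and the matching ipconfig0 one) expands to for count = 3:

```shell
# Indices run 0..2, so VMIDs are 601..603 and host IPs .131...133.
for i in 0 1 2; do
  echo "vmid=$((i + 1 + 600)) ip=192.168.50.13$((i + 1))/24"
done
# vmid=601 ip=192.168.50.131/24
# vmid=602 ip=192.168.50.132/24
# vmid=603 ip=192.168.50.133/24
```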