About 6 Months Ago I Stumbled Upon Jeff Geerling’s Video On YouTube. One Thing Led To Another, And Here We Are: My First Homelab

Introduction

The journey from watching a random YouTube video to building enterprise-grade infrastructure at home is a rite of passage for many DevOps professionals. Six months ago, Jeff Geerling’s homelab content became the catalyst for my own infrastructure adventure - culminating in a fully 3D-printed rack running a high-availability Kubernetes cluster on Talos Linux. This is the story of how curiosity became capability, and why every infrastructure engineer should consider building a purpose-driven homelab.

In today’s cloud-native landscape, hands-on experience with infrastructure fundamentals separates competent engineers from true masters. A homelab provides the ultimate sandbox for:

  • Testing production-grade configurations risk-free
  • Developing muscle memory for infrastructure-as-code workflows
  • Understanding the physical constraints of virtual systems
  • Experimenting with cutting-edge technologies before enterprise adoption

This comprehensive guide will walk through building an enterprise-inspired homelab featuring:

  1. 3D-Printed Modular Rack System using Homeracker’s open-source designs
  2. High-Availability Kubernetes Cluster powered by Talos Linux
  3. Energy-Efficient Hardware configuration consuming <60W at idle
  4. Bare Metal Automation techniques for reproducible infrastructure

Whether you’re a seasoned sysadmin expanding your cloud-native skills or a DevOps engineer building your first physical cluster, this deep dive into modern homelab design will provide actionable insights for your infrastructure journey.

Understanding the Topic

The Homeracker Revolution: 3D-Printed Infrastructure

Homeracker.org represents a paradigm shift in homelab design - an open-source framework for creating custom 3D-printed server racks. Unlike traditional metal racks, Homeracker’s modular system enables:

| Feature | Benefit |
| --- | --- |
| PLA/PETG Printed Components | Cost-effective (90%+ savings vs metal racks) |
| Customizable Dimensions | Perfect fit for SFF (Small Form Factor) hardware |
| Tool-Free Assembly | Snap-fit design with minimal hardware |
| Passive Cooling Optimization | Custom vent patterns for silent operation |

Technical Specifications:

  • Layer Height: 0.2mm (structural) / 0.28mm (non-critical)
  • Infill: 25% Gyroid (balance between strength and material use)
  • Supports: Tree-style (minimal material waste)
  • Print Time: ~18 hours per RU (Rack Unit) section on Prusa P2S

Compared to commercial alternatives like StarTech 12U racks ($300+) or LackRack hacks, Homeracker provides professional-grade organization at hobbyist budgets.

Talos Linux: Kubernetes-Optimized OS

Talos Linux is a container-focused operating system designed exclusively for Kubernetes:

Key Features:

  • Immutable root filesystem (read-only /usr)
  • API-driven management (no SSH access)
  • etcd snapshot and recovery tooling built into talosctl
  • Secure by default (mutually authenticated TLS API, no shell access)
  • Dual-stack IPv4/IPv6 support

Unlike Ubuntu or CentOS-based Kubernetes setups, Talos eliminates OS maintenance overhead through:

  1. Declarative Configuration: YAML-defined machine states
  2. Atomic Updates: Rollback-safe OS upgrades
  3. Minimal Attack Surface: No package managers or unused services

High-Availability Kubernetes Architecture

The reference architecture implements:

  • 3 Control Plane Nodes: Etcd quorum for cluster state
  • 3 Worker Nodes: Workload scheduling capacity
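  • Topology Spread Constraints: Declared per workload to spread replicas across nodes
  • Virtual IP for the Kubernetes API via kube-vip (BGP mode needs a BGP-capable router; ARP mode works on a flat L2 network)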

This production-inspired design achieves:

  • No single point of failure among the cluster nodes (the one unmanaged switch remains a shared dependency)
  • Rolling update capability
  • <30s failover for control plane components
  • Persistent workload availability during maintenance
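The choice of three control plane nodes follows from etcd's majority rule. A minimal sketch of that arithmetic, with function names chosen here for illustration:

```python
def quorum(members: int) -> int:
    """Smallest majority of an etcd cluster with `members` voting members."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """Number of members that can fail while a majority still remains."""
    return members - quorum(members)


for n in (1, 2, 3, 5):
    print(f"{n} member(s): quorum={quorum(n)}, tolerated failures={fault_tolerance(n)}")
```

Two control plane nodes tolerate no failures at all, which is why a third is the smallest useful step up from one.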

Prerequisites

Hardware Requirements

Based on the successful Reddit build, we recommend:

| Component | Specification | Notes |
| --- | --- | --- |
| Compute Nodes | Lenovo ThinkCentre M910 Tiny | SFF, 35W TDP |
| Control Plane | 3x 8GB RAM / 256GB SSD | Minimal resource overhead |
| Worker Nodes | 3x 32GB RAM / 1TB SSD | Workload capacity |
| Networking | TP-Link TL-SG108 V3 | Unmanaged gigabit switch |
| Power | 65W USB-C PD | Per-node power supply |

Total Estimated Cost: $800-$1200 (used hardware market)
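To put the idle target of under 60W into perspective, the sketch below works out the per-node budget and yearly energy use. The electricity price is a hypothetical placeholder, and TDP describes the CPU rather than measured wall power, so treat the ceiling as a rough bound only.

```python
IDLE_BUDGET_W = 60     # whole-lab idle target stated in the introduction
NODE_COUNT = 6         # 3 control plane + 3 worker nodes
NODE_TDP_W = 35        # per-node TDP from the hardware table
PRICE_PER_KWH = 0.30   # hypothetical tariff; substitute your own


def annual_kwh(watts: float) -> float:
    """Energy used in one year of continuous operation at `watts`."""
    return watts * 24 * 365 / 1000


per_node_idle_w = IDLE_BUDGET_W / NODE_COUNT
tdp_ceiling_w = NODE_COUNT * NODE_TDP_W

print(f"Idle budget per node: {per_node_idle_w:.1f} W")
print(f"Combined CPU TDP ceiling: {tdp_ceiling_w} W")
print(f"Idle energy per year: {annual_kwh(IDLE_BUDGET_W):.1f} kWh")
print(f"Idle cost per year: {annual_kwh(IDLE_BUDGET_W) * PRICE_PER_KWH:.2f}")
```

The idle target leaves about 10W per node, so measuring each machine at the wall before committing to six of them is worthwhile.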

Software Requirements

  • Talos Linux: v1.6.4+ (stable channel)
  • Kubernetes: v1.29+ (via Talos)
  • Container Runtime: containerd (bundled with Talos; v1.7.x in Talos v1.6)
  • CLI Tools:
    • talosctl v1.6.4+
    • kubectl v1.29+
    • kube-vip v0.6.4+

Network Considerations

  • Subnet Planning:
    • Node IPs: 192.168.1.100-192.168.1.105 /24
    • Kubernetes Pod CIDR: 10.244.0.0/16
    • Service CIDR: 10.245.0.0/16
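  • Firewall Rules:
    • Allow TCP 6443 (Kubernetes API)
    • Allow TCP 50000-50001 (Talos apid and trustd)
    • Allow UDP 4789 (Flannel VXLAN as configured by Talos; upstream Flannel defaults to 8472)
    • Allow TCP 2379-2380 (etcd)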
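Before generating any configuration it helps to confirm that the planned ranges do not collide. A minimal sketch using Python's standard `ipaddress` module; the dictionary keys and helper names are illustrative, the addresses come from the plan above.

```python
import ipaddress
from itertools import combinations

RANGES = {
    "nodes": ipaddress.ip_network("192.168.1.0/24"),
    "pods": ipaddress.ip_network("10.244.0.0/16"),
    "services": ipaddress.ip_network("10.245.0.0/16"),
}
NODE_IPS = [ipaddress.ip_address(f"192.168.1.{host}") for host in range(100, 106)]


def overlapping_pairs(ranges):
    """Return the names of every pair of networks that share addresses."""
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(ranges.items(), 2)
        if net_a.overlaps(net_b)
    ]


def nodes_outside_subnet(ips, subnet):
    """Return node addresses that do not belong to the node subnet."""
    return [ip for ip in ips if ip not in subnet]


print("Overlapping ranges:", overlapping_pairs(RANGES))
print("Nodes outside subnet:", nodes_outside_subnet(NODE_IPS, RANGES["nodes"]))
```

Both lists should come back empty for this plan; a non-empty result points at the range to renumber.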

Installation & Setup

Step 1: 3D Printing the Rack

Using Homeracker’s open-source designs:

# Clone repository
git clone https://github.com/homeracker/homeracker.git
cd homeracker/designs/standard-19-inch

# Slice models (PrusaSlicer example; "organic" is PrusaSlicer's tree-style support)
prusa-slicer --export-gcode \
  --layer-height 0.2 \
  --fill-density 25% \
  --fill-pattern gyroid \
  --support-material \
  --support-material-style organic \
  --output ru1.gcode RU1.stl

Printing Notes:

  • Use PLA+ for structural components
  • Orient parts to minimize supports
  • Allow 24h cure time before assembly

Step 2: Bare Metal Provisioning

Talos Linux Installation:

  1. Download the ISO for the pinned release:

    curl -LO https://github.com/siderolabs/talos/releases/download/v1.6.4/metal-amd64.iso

  2. Prepare boot media (replace /dev/sdX with the USB device; dd overwrites it):

    sudo dd if=metal-amd64.iso of=/dev/sdX bs=4M status=progress conv=fsync

  3. Boot nodes sequentially with monitor/keyboard attached

Step 3: Cluster Initialization

Generate machine configurations:

# Control plane and worker configs are generated in a single run;
# a second run into the same directory would refuse to overwrite the files
talosctl gen config homelab-cluster https://192.168.1.100:6443 \
  --output-dir ./homelab-config \
  --config-patch-control-plane @control-patch.yaml \
  --config-patch-worker @worker-patch.yaml

Example control-patch.yaml:

machine:
  network:
    interfaces:
      - interface: eth0
        dhcp: true
  install:
    disk: /dev/sda
cluster:
  extraManifests:
    - https://raw.githubusercontent.com/kube-vip/kube-vip/v0.6.4/docs/manifests/rbac.yaml
  network:
    cni:
      name: flannel

Apply configurations:

# Apply the config to the first control plane (etcd is bootstrapped in the next step)
talosctl apply-config --insecure --nodes 192.168.1.100 --file homelab-config/controlplane.yaml

# Apply configs to the remaining nodes
talosctl apply-config --insecure --nodes 192.168.1.101 --file homelab-config/controlplane.yaml
talosctl apply-config --insecure --nodes 192.168.1.102 --file homelab-config/worker.yaml
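The commands above cover three machines. For the full six-node plan, a small generator avoids copy-paste slips. This is a sketch only: the role layout below (first three addresses as control planes, last three as workers) is an assumption made for illustration and differs from the three-node example above, which uses 192.168.1.102 as a worker.

```python
import shlex

CONFIG_DIR = "homelab-config"

# Assumed role layout, not confirmed by the post: adjust to match your rack.
NODES = {
    "controlplane": ["192.168.1.100", "192.168.1.101", "192.168.1.102"],
    "worker": ["192.168.1.103", "192.168.1.104", "192.168.1.105"],
}


def apply_commands(nodes: dict[str, list[str]]) -> list[str]:
    """Build one `talosctl apply-config` command line per node."""
    commands = []
    for role, ips in nodes.items():
        for ip in ips:
            argv = [
                "talosctl", "apply-config", "--insecure",
                "--nodes", ip,
                "--file", f"{CONFIG_DIR}/{role}.yaml",
            ]
            commands.append(shlex.join(argv))
    return commands


for command in apply_commands(NODES):
    print(command)
```

The script only prints the commands, so they can be reviewed before anything touches a node.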

Step 4: Cluster Bootstrap

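# Bootstrap etcd once, on the first control plane node
export TALOSCONFIG=./homelab-config/talosconfig
talosctl bootstrap --nodes 192.168.1.100 --endpoints 192.168.1.100

# Fetch kubeconfig, then verify node status
talosctl kubeconfig --nodes 192.168.1.100 --endpoints 192.168.1.100
kubectl get nodes -o wide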

Expected output:

NAME           STATUS   ROLES           AGE   VERSION   INTERNAL-IP
homelab-cp1    Ready    control-plane   5m    v1.29.2   192.168.1.100
homelab-cp2    Ready    control-plane   4m    v1.29.2   192.168.1.101
homelab-wrk1   Ready    <none>          3m    v1.29.2   192.168.1.102
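When scripting the check, the table can be parsed instead of read by eye. A minimal sketch, assuming the column layout shown above; the helper name is illustrative, and any status other than exactly `Ready` (including `Ready,SchedulingDisabled`) is reported for attention.

```python
SAMPLE = """\
NAME           STATUS   ROLES           AGE   VERSION   INTERNAL-IP
homelab-cp1    Ready    control-plane   5m    v1.29.2   192.168.1.100
homelab-cp2    Ready    control-plane   4m    v1.29.2   192.168.1.101
homelab-wrk1   Ready    <none>          3m    v1.29.2   192.168.1.102
"""


def not_ready_nodes(table: str) -> list[str]:
    """Return names of nodes whose STATUS column is not exactly 'Ready'."""
    rows = [line.split() for line in table.strip().splitlines()[1:]]
    return [row[0] for row in rows if len(row) > 1 and row[1] != "Ready"]


problems = not_ready_nodes(SAMPLE)
print("All nodes Ready" if not problems else f"Needs attention: {problems}")
```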

Configuration & Optimization

Talos Linux Hardening

  1. Enable Kubernetes Audit Policy (saved here as audit-patch.yaml, a file name chosen for illustration):

    cluster:
      apiServer:
        auditPolicy:
          apiVersion: audit.k8s.io/v1
          kind: Policy
          rules:
            - level: Metadata

  2. Apply the patch to each control plane node:

    talosctl patch machineconfig --nodes 192.168.1.100 --patch @audit-patch.yaml


Performance Tuning

Kernel Parameters (Talos has no writable /etc/sysctl.d, so set them under machine.sysctls):

machine:
  sysctls:
    # Increase TCP buffer sizes
    net.core.rmem_max: "16777216"
    net.core.wmem_max: "16777216"
    # Enable BBR congestion control
    net.ipv4.tcp_congestion_control: "bbr"
    # Optimize virtual memory
    vm.swappiness: "10"
    vm.vfs_cache_pressure: "50"

Kubelet Configuration (worker nodes):

machine:
  kubelet:
    extraArgs:
      max-pods: "250"
      kube-reserved: "cpu=500m,memory=1Gi"
      system-reserved: "cpu=1000m,memory=2Gi"
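The reservations above reduce what the scheduler can hand out to pods. The sketch below estimates allocatable resources for one 32GB worker. It assumes the kubelet's default hard eviction threshold of 100Mi and a four-core CPU (the core count is an assumption, since the post does not name the processor), and it treats installed RAM as fully visible to the kubelet, which slightly overstates real capacity.

```python
GI = 1024 ** 3
MI = 1024 ** 2


def allocatable_memory(capacity, kube_reserved, system_reserved, eviction_hard=100 * MI):
    """Memory left for pods after reservations and the eviction threshold (bytes)."""
    return capacity - kube_reserved - system_reserved - eviction_hard


def allocatable_cpu_millicores(cores, kube_reserved_m, system_reserved_m):
    """CPU left for pods after reservations (millicores)."""
    return cores * 1000 - kube_reserved_m - system_reserved_m


memory = allocatable_memory(32 * GI, kube_reserved=1 * GI, system_reserved=2 * GI)
cpu = allocatable_cpu_millicores(4, kube_reserved_m=500, system_reserved_m=1000)

print(f"Allocatable memory: {memory // MI} Mi")
print(f"Allocatable CPU: {cpu}m")
```

On a four-core machine the CPU reservations take more than a third of the node, so scaling them down may be reasonable for small hardware.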

Energy Efficiency Practices

  1. Power Management:

    # power-patch.yaml (illustrative file name): energy-aware scheduling on, NMI watchdog off
    machine:
      sysctls:
        kernel.sched_energy_aware: "1"
        kernel.nmi_watchdog: "0"

    # Apply the patch to a node
    talosctl patch machineconfig --nodes 192.168.1.100 --patch @power-patch.yaml

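  2. Hardware-Specific Tweaks:

    ```yaml
    # Lenovo ThinkCentre power profile
    machine:
      kernel:
        modules:
          - name: thinkpad_acpi
            parameters:
              - fan_control=1
    ```

    thinkpad_acpi is written for ThinkPad laptops, so check with talosctl dmesg that it actually loads on a ThinkCentre before relying on it.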

Usage & Operations

Day-to-Day Management

Common Talos Commands:

# View node status
talosctl -n 192.168.1.100 get members

# Inspect services
talosctl -n 192.168.1.100 services

# Update OS (the upgrade takes the installer image, not the runtime image)
talosctl -n 192.168.1.100 upgrade --image ghcr.io/siderolabs/installer:v1.6.4

Kubernetes Operations:

# Drain node before maintenance
kubectl drain homelab-wrk1 --ignore-daemonsets

# Apply cluster changes
talosctl -n 192.168.1.100 apply-config -f updated-config.yaml

# Monitor cluster events
kubectl get events --all-namespaces --sort-by='.metadata.creationTimestamp' -w

Backup Strategy

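Automated etcd Backups:

Talos has no backup schedule in its machine config. Take snapshots with talosctl from an admin machine and schedule the command (for example with cron) every 15 minutes:

talosctl -n 192.168.1.100 etcd snapshot ./backups/etcd-$(date -u +%Y-%m-%dT%H:%M:%SZ).snapshot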
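Backups taken every 15 minutes pile up quickly, so some retention rule is needed. A minimal sketch of one, assuming snapshot names embed an ISO-8601 UTC timestamp so that sorting by name is sorting by time; the function name and the default of 96 (one day of 15-minute snapshots) are illustrative choices.

```python
def snapshots_to_delete(names: list[str], keep: int = 96) -> list[str]:
    """Return the snapshot names to remove, keeping the `keep` newest ones."""
    ordered = sorted(names)  # lexical order equals chronological order here
    if keep <= 0:
        return ordered
    return ordered[:-keep]


names = [f"etcd-2024-05-01T12:{minute:02d}:00Z" for minute in (0, 15, 30, 45)]
print(snapshots_to_delete(names, keep=2))
```

The function only decides what to delete; wiring it to an actual `os.remove` call is left out so a dry run is the default.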

Disaster Recovery Process:

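# Wipe etcd state on each control plane node (destructive; only after quorum is lost)
talosctl -n 192.168.1.100 reset --graceful=false --reboot --system-labels-to-wipe=EPHEMERAL

# Bootstrap one node from a snapshot file stored on the machine running talosctl
talosctl -n 192.168.1.100 bootstrap \
  --recover-from=./etcd-2024-05-01T12:00:00Z.snapshot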

Troubleshooting

Common Issues & Solutions

Node Failing to Join Cluster:

# Check network connectivity (the node's Talos API should answer)
talosctl -n 192.168.1.103 version

# Inspect join logs
talosctl -n 192.168.1.103 logs controller-runtime

Kubernetes API Unresponsive:

# Verify etcd health
talosctl -n 192.168.1.100 etcd members

# Check control plane pods
kubectl -n kube-system get pods -l tier=control-plane

Certificate Expiration:

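# Talos rotates server-side certificates itself; the client certificate in
# talosconfig is the one that expires. Issue a fresh one:
talosctl -n 192.168.1.100 config new talosconfig-renewed --roles os:admin --crt-ttl 8760h

# Verify cert validity of the API server from a workstation
openssl s_client -connect 192.168.1.100:6443 </dev/null 2>/dev/null | openssl x509 -noout -dates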
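The `-dates` output includes a `notAfter=` line, which is easy to turn into a days-remaining figure for alerting. A minimal sketch, assuming the usual OpenSSL text format (for example `notAfter=May  1 12:00:00 2025 GMT`) and an English locale for month names; the function names are illustrative.

```python
from datetime import datetime, timezone


def parse_not_after(line: str) -> datetime:
    """Parse an OpenSSL 'notAfter=...' line into an aware UTC datetime."""
    value = line.strip().removeprefix("notAfter=").removesuffix(" GMT")
    parsed = datetime.strptime(value, "%b %d %H:%M:%S %Y")
    return parsed.replace(tzinfo=timezone.utc)


def days_remaining(line: str, now: datetime) -> int:
    """Whole days between `now` and the certificate's expiry."""
    return (parse_not_after(line) - now).days


sample = "notAfter=May  1 12:00:00 2025 GMT"
print(days_remaining(sample, datetime.now(timezone.utc)))
```

A negative result means the certificate has already expired.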

Conclusion

Six months after that fateful YouTube session, this homelab journey has reinforced three critical lessons for infrastructure professionals:

  1. Physical Constraints Matter: Understanding power, thermal, and space limitations informs cloud architecture decisions
  2. Declarative Infrastructure Scales: Talos Linux’s API-driven model proves that reproducibility beats manual configuration
  3. Production Patterns Apply Anywhere: HA Kubernetes on $1,000 hardware demonstrates cloud-native principles transcend environments

For those starting their homelab journey:

  1. Begin with energy-efficient hardware
  2. Standardize on immutable infrastructure patterns
  3. Implement production-grade resilience from day one

The homelab isn’t just a hobby - it’s the proving ground where theory becomes practice, failures become lessons, and engineers become architects. What will you build tomorrow?

This post is licensed under CC BY 4.0 by the author.