Finally Finished My Minilab: A DevOps Journey in Infrastructure Optimization

1. Introduction

Completing a homelab project represents a rite of passage for system administrators and DevOps engineers. After four years of meticulous hardware selection and iterative improvements, this minilab achieves enterprise-grade capabilities in a compact footprint.

Self-hosted infrastructure remains critical for professionals seeking to:

  • Test production-grade configurations without cloud costs
  • Develop infrastructure-as-code (IaC) pipelines
  • Experiment with container orchestration and network topologies
  • Build redundant storage solutions
  • Harden systems against real-world security threats

This guide deconstructs a real-world minilab deployment featuring hybrid hardware (Lenovo Tiny PCs + Minisforum enterprise gear), multi-gigabit networking, and hyperconverged storage. You’ll learn:

  • Hardware selection criteria for compact labs
  • Network architecture for 10GbE environments
  • Storage optimization techniques
  • Automation approaches for heterogeneous hardware
  • Security hardening for exposed services

For DevOps engineers, homelabs provide unparalleled opportunities to test Kubernetes failure modes, simulate distributed-system outages, and validate backup strategies, all skills directly transferable to production environments.

2. Understanding Modern Homelab Architecture

2.1 What Defines a “Minilab”?

Contemporary minilabs prioritize:

  • Density: Enterprise capabilities in <5U space
  • Efficiency: Low power consumption (<100W idle)
  • Modularity: Hot-swappable components
  • Performance: 10GbE+ networking, NVMe storage

2.2 Hardware Breakdown

Compute Nodes

| Component | Lenovo M92p Tiny | Minisforum MS-01 |
|---|---|---|
| CPU | i5-3470T (2C/4T) | i5-12600H (12C/16T) |
| RAM | 16GB DDR3 | 32GB DDR5 |
| Boot Drive | 1TB SATA SSD | 1TB NVMe SSD |
| Secondary Storage | N/A | 4x1TB Samsung SM863 (RAID 10) |
| TDP | 35W | 45W |

Networking

  • TP-Link TL-SG108S-M2: 8x2.5GbE ports + 2x10GbE SFP+ (Layer 2)
  • MikroTik CRS305: 4x10GbE SFP+ (Layer 3 capable)

2.3 Why This Architecture Works

  1. Tiered Compute:
    • Legacy nodes handle low-intensity services (DNS, monitoring)
    • Modern nodes run resource-hungry workloads (K8s, databases)
  2. Storage Stratification:
    • Boot SSDs for OS/hypervisor
    • Secondary RAID arrays for VM storage
  3. Network Segmentation:
    
    [Internet]
      |
    [Firewall]
      |
    [MikroTik CRS305] --10GbE--> [Minisforum Nodes]
      |
    [TP-Link Switch] --2.5GbE--> [Lenovo Nodes]
    

2.4 Alternatives Considered

| Component | Considered Alternatives | Why Chosen |
|---|---|---|
| Compute | Intel NUC, Dell Micro | Price/performance ratio |
| Switching | Ubiquiti, Cisco SG350 | Cost per 10GbE port |
| Storage | Synology NAS | Hyperconverged flexibility |

3. Prerequisites

3.1 Hardware Requirements

  • Minimum 16GB RAM per node
  • SSD boot drives (NVMe preferred)
  • Multi-gigabit network interfaces
  • Power-over-Ethernet (PoE) optional but recommended

3.2 Software Stack

  • Hypervisor: Proxmox VE 8.1+
  • Orchestration: Kubernetes v1.28+ / Docker Swarm
  • Storage: ZFS 2.1.12 (for RAIDZ)
  • Monitoring: Prometheus 2.47 + Grafana 10.1

3.3 Network Considerations

  • Dedicated VLANs for:
    • Management (VLAN 10)
    • Storage (VLAN 20)
    • Services (VLAN 30)
  • Firewall rules limiting cross-VLAN traffic (a minimal ruleset is sketched after this list)
  • MAC address whitelisting
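
As a concrete starting point, the cross-VLAN policy can be expressed with nftables on the firewall. This is a minimal sketch, assuming the VLAN interfaces are named vlan10/vlan20/vlan30; adapt names and rules to your topology:

# /etc/nftables.conf (fragment): default-deny forwarding between VLANs
table inet filter {
    chain forward {
        type filter hook forward priority 0; policy drop;
        ct state established,related accept
        iifname "vlan10" accept                  # management may reach all VLANs
        iifname "vlan30" oifname "vlan20" drop   # services never touch storage
    }
}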

3.4 Security Requirements

  • SSH key authentication only
  • Full-disk encryption (LUKS)
  • Automated vulnerability scanning (Trivy)
  • WireGuard VPN for remote access (minimal config sketched below)
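
For the VPN item above, here is a minimal WireGuard server config as a sketch; the keys, addresses, and port are placeholders to generate and adapt yourself:

# /etc/wireguard/wg0.conf (keys and addresses are placeholders)
[Interface]
Address = 10.99.0.1/24
ListenPort = 51820
PrivateKey = <server-private-key>

[Peer]
# one [Peer] block per remote device
PublicKey = <client-public-key>
AllowedIPs = 10.99.0.2/32

Bring the tunnel up with wg-quick up wg0, and expose only UDP 51820 through the firewall.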

4. Installation & Configuration

4.1 Proxmox VE Deployment

Step 1: Prepare boot media

# Download the ISO and verify its checksum against the value published on proxmox.com
wget https://enterprise.proxmox.com/iso/proxmox-ve_8.1-1.iso

# Create bootable USB (confirm the target device with lsblk first; dd is destructive)
sudo dd if=proxmox-ve_8.1-1.iso of=/dev/sdc bs=4M status=progress

Step 2: ZFS Configuration

Disk Layout:
- /dev/nvme0n1 (1TB) => Proxmox OS (ZFS RAID1)
- /dev/sd[a-d] (4x1TB) => VM Storage (ZFS RAID10)

Step 3: Network Bonding

# Create bond0 (LACP)
auto bond0
iface bond0 inet manual
    bond-slaves eno1 eno2
    bond-miimon 100
    bond-mode 802.3ad
    bond-xmit-hash-policy layer2+3
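
For 802.3ad to come up, the switch side needs a matching LACP aggregate. On RouterOS that looks roughly like this (a sketch; the port names are assumptions for a CRS305):

/interface bonding
add name=bond-ms01 mode=802.3ad slaves=sfp-sfpplus1,sfp-sfpplus2 \
    transmit-hash-policy=layer-2-and-3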

4.2 Kubernetes Cluster Setup

kubeadm Configuration (kubeadm-config.yaml)

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
controllerManager:
  extraArgs:
    node-cidr-mask-size: "24"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  strictARP: true

Initialize Control Plane

kubeadm init --config=kubeadm-config.yaml
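
With the control plane initialized, nodes stay NotReady until a CNI is deployed. The podSubnet above matches Flannel's default, so Flannel is a natural fit; a sketch (manifest URL current at the time of writing):

# Point kubectl at the new cluster, then install the CNI
export KUBECONFIG=/etc/kubernetes/admin.conf
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml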

4.3 Storage Configuration

ZFS Pool Creation

# Striped mirrors (RAID 10) across the four data SSDs; the by-id serial
# suffixes _A.._D are placeholders for your actual disk IDs
zpool create -f tank \
    mirror /dev/disk/by-id/ata-Samsung_SSD_860_EVO_1TB_A /dev/disk/by-id/ata-Samsung_SSD_860_EVO_1TB_B \
    mirror /dev/disk/by-id/ata-Samsung_SSD_860_EVO_1TB_C /dev/disk/by-id/ata-Samsung_SSD_860_EVO_1TB_D
zfs create -o recordsize=16K -o compression=lz4 tank/vms
zfs create -o recordsize=128K -o compression=zstd tank/containers
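
The layout and properties can be verified before moving VMs onto the pool:

# Expect two mirror vdevs striped together
zpool status tank
zfs get recordsize,compression tank/vms tank/containers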

4.4 Network Optimization

MikroTik CRS305 Configuration

# Enable hardware offloading
/interface ethernet switch port
set switch1-cpu hw-offloading=yes
set [find where name~"switch1"] l3-hw-offloading=yes

# Create VLANs
/interface vlan
add interface=bridge name=MGMT vlan-id=10
add interface=bridge name=STORAGE vlan-id=20
add interface=bridge name=SERVICES vlan-id=30
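
Tagged traffic only passes once VLAN filtering is enabled on the bridge and the trunk ports carry the tags; a sketch, with port names assumed:

/interface bridge set bridge vlan-filtering=yes
/interface bridge vlan
add bridge=bridge tagged=bridge,sfp-sfpplus1,sfp-sfpplus2 vlan-ids=10,20,30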

5. Optimization & Hardening

5.1 Kernel Tuning

/etc/sysctl.conf Optimizations

# Network stack
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.ipv4.tcp_rmem=4096 87380 16777216
net.ipv4.tcp_wmem=4096 65536 16777216

# Retain dentry/inode caches longer (complements the ZFS ARC)
vm.vfs_cache_pressure=50
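
vfs_cache_pressure only tunes the kernel's own caches; the ARC size itself is capped via a ZFS module parameter. A sketch, where the 8 GiB value is an assumption to be sized against your VMs' memory needs:

# /etc/modprobe.d/zfs.conf (apply with update-initramfs -u and a reboot)
options zfs zfs_arc_max=8589934592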

5.2 Security Hardening

AppArmor Profiles for Containers

# Generate default profile
aa-genprof /usr/bin/docker

# Enforce profile
aa-enforce /etc/apparmor.d/usr.bin.docker
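
Whether the profile is actually loaded in enforce mode can be confirmed with aa-status:

# Lists loaded profiles grouped by mode (enforce/complain)
aa-status | grep -i docker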

Pod Security Admission (Kubernetes)

apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "restricted"
      enforce-version: "latest"
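
The API server only consults this file when pointed at it; with kubeadm that means an apiServer extraArgs entry (the path below is an assumption, and the file must also be mounted into the kube-apiserver static pod):

# ClusterConfiguration fragment
apiServer:
  extraArgs:
    admission-control-config-file: /etc/kubernetes/admission/psa.yaml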

5.3 Storage Performance

ZFS Tunables

# SSD optimization
zfs set primarycache=metadata tank/vms
zfs set logbias=throughput tank/vms
zfs set sync=disabled tank/vms  # risks losing recent writes on crash/power loss; UPS and tested backups required

6. Operations & Maintenance

6.1 Daily Monitoring

Prometheus Node Exporter Queries

# CPU Utilization
100 - (avg by(instance)(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# ZFS ARC Efficiency (hit ratio over the last 5 minutes)
rate(node_zfs_arc_hits[5m]) / (rate(node_zfs_arc_hits[5m]) + rate(node_zfs_arc_misses[5m]))

6.2 Backup Strategy

Proxmox Backup Schedule

# Run from a daily cron/systemd timer; PBS deduplicates, so each run is effectively incremental
vzdump 100 --mode snapshot --compress lzo \
    --storage PBS1 --remove 1 --prune-backups 'keep-last=3'
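
A backup job is only trustworthy once restores are tested; as a first check, confirm the archives actually land on the PBS datastore:

# List backup volumes on the PBS1 storage
pvesm list PBS1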

6.3 Kubernetes Operations

Pod Management

# List pods with node assignment
kubectl get pods -o custom-columns=NAME:.metadata.name,STATUS:.status.phase,NODE:.spec.nodeName

# Drain node for maintenance
kubectl drain node02 --ignore-daemonsets --delete-emptydir-data
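
Once maintenance is done, the node has to be uncordoned or it stays unschedulable:

# Return the node to service
kubectl uncordon node02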

7. Troubleshooting Guide

7.1 Common Issues

Problem: ZFS performance degradation
Solution: Check fragmentation and ARC hit rate

zpool status -v                      # pool health and device errors
zpool list -v                        # FRAG column shows fragmentation
arc_summary | grep -i "hit ratio"    # ARC effectiveness (arc_summary ships with OpenZFS)

Problem: Kubernetes nodes NotReady
Diagnosis:

journalctl -u kubelet --since "10 minutes ago" | grep -i error
kubectl describe node $NODE_NAME | grep -i conditions: -A10

7.2 Network Diagnostics

Check Switch Port Statistics

# MikroTik SFP+ status (CRS305 data ports are named sfp-sfpplusN)
/interface ethernet monitor sfp-sfpplus1,sfp-sfpplus2,sfp-sfpplus3,sfp-sfpplus4 once

Test 10GbE Throughput

# Server A (receiver)
iperf3 -s

# Server B (transmitter)
iperf3 -c serverA -P 4 -t 30 -O 5

8. Conclusion

Completing this minilab demonstrates how modern SMB hardware can rival enterprise infrastructure when properly configured. The key takeaways:

  1. Strategic Hardware Selection: Balance compute density with power efficiency
  2. Network Segmentation: Isolate traffic types via VLANs and QoS
  3. Storage Tiering: Match media types to workload requirements
  4. Automation First: Treat lab infrastructure as production-grade

For those continuing their homelab journey, the minilab remains the ultimate proving ground for infrastructure automation, disaster recovery testing, and continuous learning in an ever-evolving DevOps landscape.

This post is licensed under CC BY 4.0 by the author.