My Own Homelab Can Begin

Introduction

There is a moment every infrastructure enthusiast anticipates: staring at a stack of repurposed hardware and declaring, “My own homelab can begin.” That moment represents more than accumulated hardware; it is the foundation for mastering enterprise-grade infrastructure on a personal scale. Drawing on real-world discussions from platforms like Reddit, where users debate setups ranging from “overkill adblocker clusters” to full Proxmox virtualization environments, this guide tackles the critical question: what should I actually run on this hardware?

Homelabs have evolved beyond hobbyist playgrounds into essential training grounds for DevOps professionals. With 72% of engineers reporting improved job performance through homelab experimentation (2023 DevOps Skills Report), building a purpose-driven lab is now career-critical infrastructure. Unlike cloud-based sandboxes, physical homelabs provide hands-on experience with hardware failures, network constraints, and real resource limitations – the exact challenges faced in production environments.

In this comprehensive guide, you’ll transform generic hardware into a professional-grade lab capable of running:

  • Hyper-converged virtualization clusters
  • Distributed network services
  • Container orchestration platforms
  • Automated deployment pipelines
  • Enterprise monitoring systems

We’ll focus on battle-tested open-source technologies while emphasizing operational best practices honed from enterprise deployments. Forget theoretical exercises – every configuration and command here applies directly to real-world infrastructure management.

Understanding Homelab Infrastructure

Defining the Modern Homelab

A homelab is a self-hosted infrastructure environment that mirrors production systems at reduced scale. Unlike traditional server setups, contemporary labs emphasize:

  1. Resource Efficiency: Maximizing utility per watt (e.g., repurposing thin clients)
  2. Automation: Infrastructure-as-Code (IaC) driven provisioning
  3. Resilience: Implementing clustering and failover at micro-scale
  4. Portability: Hybrid cloud/on-prem deployment capabilities

Evolution of Homelab Technologies

| Era   | Key Technologies         | Limitations             |
|-------|--------------------------|-------------------------|
| 2000s | Physical servers         | High power consumption  |
| 2010s | VMware ESXi, Hyper-V     | Licensing complexities  |
| 2020s | Proxmox, LXD, Kubernetes | Steep learning curve    |

Modern solutions like Proxmox VE combine the best of virtualization and containerization while remaining freely available – making them ideal for homelab use.
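
To see that duality in practice, here is a minimal sketch run from a Proxmox node's shell. It assumes the Ubuntu container template used later in this guide has already been downloaded, and the VMIDs 100 and 200 are arbitrary placeholders:

# Full KVM virtual machine (hardware-level isolation)
qm create 100 --name test-vm --memory 2048 --cores 2 --net0 virtio,bridge=vmbr0

# Lightweight LXC container (shared kernel, lower overhead)
pct create 200 local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
  --hostname test-ct --memory 512 --net0 name=eth0,bridge=vmbr0,ip=dhcp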

Key Homelab Use Cases

  1. Skill Development: Practice Terraform deployments without cloud costs
  2. Service Hosting: Self-hosted alternatives to SaaS products
  3. Testing Ground: Validate configurations before production rollout
  4. Disaster Recovery: On-prem backup for critical cloud resources

Hardware Considerations

The referenced Reddit post highlights thin clients as lab workhorses. These devices offer:

  • Low power consumption (5-15W typical)
  • x86_64 compatibility
  • Enterprise-grade reliability
  • Compact form factors

Sample Thin Client Specs (HP t620):

CPU: AMD GX-217GA (2 cores @ 1.65GHz)
RAM: 4-16GB DDR3
Storage: 16GB eMMC + SATA expansion
Networking: Dual NIC variants available
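
Before committing a thin client to cluster duty, a quick sanity check with stock Linux tools confirms it can actually host virtualization workloads:

# Confirm hardware virtualization support (should return a non-zero count)
egrep -c '(vmx|svm)' /proc/cpuinfo

# Check installed RAM and available disks before planning VM density
free -h
lsblk -o NAME,SIZE,TYPE,MODEL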

Prerequisites

Hardware Requirements

Minimum Cluster Node:

  • CPU: 2+ cores (64-bit x86)
  • RAM: 4GB (8GB recommended)
  • Storage: 32GB SSD + expansion options
  • Networking: Gigabit Ethernet

Recommended Starter Lab:

3 nodes × (4 CPU cores / 8GB RAM / 120GB SSD)
1 managed switch (VLAN capable)
1 UPS unit (for graceful shutdowns)

Software Selection

| Component     | Recommendation  | Rationale                        |
|---------------|-----------------|----------------------------------|
| Hypervisor    | Proxmox VE 8.x  | Integrated LXC/KVM management    |
| Orchestration | Kubernetes 1.28 | Industry-standard container mgmt |
| Network       | Open vSwitch    | Advanced SDN capabilities        |
| Storage       | Ceph or ZFS     | Software-defined storage         |

Network Architecture

Essential pre-installation planning:

@startuml

component "Lab Router" as router {
  [pfSense]
}

node "Cluster Nodes" {
  [Proxmox 01] as pve1
  [Proxmox 02] as pve2
  [Proxmox 03] as pve3
}

router --> pve1 : VLAN 100 (Management)
router --> pve2 : VLAN 100
router --> pve3 : VLAN 100
router --> pve1 : VLAN 200 (Ceph/Storage)
router --> pve2 : VLAN 200
router --> pve3 : VLAN 200

cloud "Internet" {
  [ISP Gateway]
}

[ISP Gateway] --> router : WAN
@enduml
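
As a rough sketch of the VLAN split shown above, you can tag the storage VLAN on a node's uplink for a quick connectivity test before persisting the equivalent stanza in /etc/network/interfaces. The interface name eno1, the storage addressing, and the 192.168.200.1 router address are assumptions matching the diagram; management (VLAN 100) traffic is handled by the node's bridge:

# Tag the storage VLAN on the uplink for a quick test
ip link add link eno1 name eno1.200 type vlan id 200
ip addr add 192.168.200.101/24 dev eno1.200
ip link set eno1.200 up
ping -c3 192.168.200.1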

Security Fundamentals

  1. Physical Isolation: Dedicated lab network segment
  2. Access Control (see the verification snippet after this list for applying these changes safely):
    
    # SSH hardening example
    sudo nano /etc/ssh/sshd_config
    
    
    PermitRootLogin no
    PasswordAuthentication no
    AllowUsers labadmin
    
  3. Firmware Verification:
    
    sha256sum proxmox-ve_8.0.iso
    # Compare with published checksums
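
After editing sshd_config as in item 2, validate and reload in the same session so you don't lock yourself out. The labadmin user and node IP below simply follow the examples in this guide:

# From your workstation: install your key before password logins are disabled
ssh-copy-id labadmin@192.168.100.101

# On the node: check the sshd_config syntax, then apply without dropping sessions
sudo sshd -t && sudo systemctl reload ssh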
    

Installation & Configuration

Proxmox VE Cluster Setup

Node Preparation:

# Flash installer to USB
sudo dd if=proxmox-ve_8.0.iso of=/dev/sdb bs=4M status=progress

# Boot from USB and follow installer prompts

# Post-install configuration
sudo ip a # Confirm network interfaces
sudo nano /etc/network/interfaces
auto vmbr0
iface vmbr0 inet static
    address 192.168.100.101/24
    gateway 192.168.100.1
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
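
With the bridge defined, apply and verify it. Proxmox VE ships ifupdown2, so a reload is enough and no reboot is needed:

# Apply the new bridge configuration and confirm the address is up
sudo ifreload -a
ip -br addr show vmbr0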

Cluster Initialization (First Node):

sudo pvecm create LAB-CLUSTER

Joining Additional Nodes:

# On an existing cluster member, confirm the cluster is up and quorate
sudo pvecm status

# On each new node, join using the first node's IP
# (you will be prompted for the root password of 192.168.100.101)
sudo pvecm add 192.168.100.101

Validation:

sudo pvecm status
Cluster information
-------------------
Name:             LAB-CLUSTER
Config Version:   3
Transport:        knet
Secure auth:      on

[...]

Membership information
----------------------
    Nodeid      Votes Name
         1          1 192.168.100.101
         2          1 192.168.100.102
         3          1 192.168.100.103

Hyper-Converged Storage Configuration

ZFS Setup for Local Storage:

# Identify disks
sudo lsblk -o NAME,SIZE,MODEL

# Create striped mirror pool
sudo zpool create lab-pool mirror /dev/sdb /dev/sdc mirror /dev/sdd /dev/sde
sudo zfs create lab-pool/vm-store
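
To actually place guests on the new dataset, register it with Proxmox as a storage target. The storage ID lab-pool below is an assumption; any name works:

# Register the dataset for VM disks and container root filesystems
sudo pvesm add zfspool lab-pool --pool lab-pool/vm-store --content images,rootdir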

Ceph Configuration for Distributed Storage:

# /etc/pve/ceph.conf on all nodes
[global]
  cluster network = 192.168.200.0/24
  public network = 192.168.200.0/24

[osd]
  osd_memory_target = 2G
  osd_crush_chooseleaf_type = 1

Initialize via Proxmox web interface or CLI:

sudo pveceph install --version quincy
sudo pveceph init --network 192.168.200.0/24
sudo pveceph createmon
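
A hedged sketch of the remaining steps, assuming each node has a spare disk at /dev/sdb dedicated to Ceph and that lab-ceph is just a placeholder pool name:

# On each node: add a manager daemon and turn the spare disk into an OSD
sudo pveceph mgr create
sudo pveceph osd create /dev/sdb

# Once the OSDs report "up", create a pool and expose it as Proxmox storage
sudo pveceph pool create lab-ceph --add_storages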

Virtual Machine Templates

Cloud-Init Preparation:

# Download Ubuntu cloud image
wget https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img

# Create template
qm create 9000 --memory 2048 --cores 2 --name ubuntu-template
qm importdisk 9000 jammy-server-cloudimg-amd64.img lab-pool
qm set 9000 --scsihw virtio-scsi-pci --scsi0 lab-pool:vm-9000-disk-0
qm set 9000 --ide2 lab-pool:cloudinit
qm set 9000 --boot c --bootdisk scsi0
qm set 9000 --serial0 socket --vga serial0
qm template 9000
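
To test the template, clone it and let cloud-init inject credentials and networking. VMID 101, the labadmin user, and the SSH key path are placeholders:

# Clone the template and configure the guest via cloud-init
qm clone 9000 101 --name lab-test --full
qm set 101 --ciuser labadmin --sshkeys ~/.ssh/id_rsa.pub --ipconfig0 ip=dhcp
qm start 101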

Service Deployment Patterns

Overengineered Adblocker Cluster

Implementing the Reddit suggestion with maximum resilience:

Architecture:

3x Pi-hole containers (1 per node)
  └─ Keepalived VIP (192.168.100.250)
  └─ Gravity-sync for config replication
2x Unbound recursive resolvers (HA pair)

Deployment Script:

# Run on each node (pve1, pve2, pve3): create the local Pi-hole container
NEXTID=$(pvesh get /cluster/nextid)
pct create "$NEXTID" local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
  --hostname "pihole-$(hostname)" \
  --cores 1 --memory 512 --swap 512 \
  --net0 name=eth0,bridge=vmbr0,ip=dhcp \
  --unprivileged 1 \
  --features nesting=1
pct start "$NEXTID"

# Install Pi-hole inside the container
pct exec "$NEXTID" -- bash -c "curl -sSL https://install.pi-hole.net | bash"

# Configure Gravity Sync prerequisites
pct exec "$NEXTID" -- apt install -y sqlite3 git rsync
pct exec "$NEXTID" -- git clone https://github.com/vmstan/gravity-sync.git

Keepalived Configuration:

# /etc/keepalived/keepalived.conf
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass pih0le-vip
    }
    virtual_ipaddress {
        192.168.100.250/24
    }
}
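
The configuration above is for the MASTER instance; the other two containers should use state BACKUP with lower priorities (for example 100 and 99) so the VIP fails over cleanly. A quick verification sketch, assuming keepalived is already installed in each container:

# Enable keepalived, confirm the VIP is bound, then test DNS through it
systemctl enable --now keepalived
ip addr show eth0 | grep 192.168.100.250
dig @192.168.100.250 pi-hole.net +short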

Proxmox Automation Stack

Infrastructure-as-Code Workflow:

@startuml
left to right direction

[GitLab] --> [Terraform]: Triggers TF apply
[Terraform] --> [Proxmox]: Creates VMs/LXCs
[Ansible] --> [New VM]: Configures services
[Prometheus] --> [All Nodes]: Metrics collection
[Grafana] --> [Prometheus]: Dashboards

@enduml

Terraform Provider Configuration:

# main.tf
terraform {
  required_providers {
    proxmox = {
      source  = "telmate/proxmox"
      version = "2.9.14"
    }
  }
}

provider "proxmox" {
  pm_api_url          = "https://pve1:8006/api2/json"
  pm_api_token_id     = "terraform@pve!lab_token"
  pm_api_token_secret = "uuid-secret-here"
}

resource "proxmox_vm_qemu" "k8s-worker" {
  count       = 3
  name        = "k8s-worker-${count.index}"
  target_node = "pve${count.index % 3 + 1}"
  
  clone = "ubuntu-template"
  
  cores   = 2
  memory  = 4096
  agent   = 1
  
  network {
    bridge    = "vmbr0"
    model     = "virtio"
  }
  
  disk {
    storage = "lab-pool"
    type    = "scsi"
    size    = "30G"
  }
}
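
With a Proxmox API token created for terraform@pve (matching pm_api_token_id above), the standard workflow applies:

# Initialize the provider plugin, preview the changes, then create the workers
terraform init
terraform plan -out lab.tfplan
terraform apply lab.tfplan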

Performance Optimization

Proxmox Tuning Parameters

Kernel Settings (/etc/sysctl.conf):

# Increase TCP buffers
net.core.rmem_max=16777216
net.core.wmem_max=16777216

# Virtualization optimizations
vm.swappiness=10
vm.vfs_cache_pressure=50

# Scheduler behaviour (general host tuning)
kernel.sched_autogroup_enabled=1
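
Apply the settings without a reboot and spot-check one value:

# Reload /etc/sysctl.conf and verify a value took effect
sudo sysctl -p
sysctl vm.swappiness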

QEMU CPU Flags:

args: -cpu host,+kvm_pv_eoi,+kvm_pv_unhalt

Storage Performance Benchmarks

# Direct ZFS performance test
sudo zpool iostat -v lab-pool 1

# FIO benchmark inside VM
fio --name=randwrite --ioengine=libaio --rw=randwrite --bs=4k \
    --direct=1 --numjobs=4 --size=1G --runtime=60 --group_reporting

Network Tuning

Multi-Queue VirtIO:

# Add to VM configuration
qm set $VMID --net0 virtio,bridge=vmbr0,macaddr=...,queues=4

Offloading Settings:

# Disable NIC offloads on the uplink; this can help when bridged/VirtIO traffic
# shows checksum or fragmentation issues, at the cost of some extra CPU
sudo ethtool -K eth0 tx off rx off sg off tso off gso off gro off lro off

Operational Considerations

Monitoring Stack

Essential Metrics:

  1. Node Level: CPU steal time, memory ballooning
  2. Storage: ZFS ARC hit ratio, Ceph OSD latency
  3. Network: Packet drops, retransmits

Grafana Dashboard Queries:

# Proxmox node memory usage
proxmox_node_memory_total{node="$node"} - proxmox_node_memory_available{node="$node"}

# Ceph pool IOPS
rate(ceph_pool_wr[1m]) + rate(ceph_pool_rd[1m])

Backup Strategy

3-2-1 Rule Implementation:

  1. Proxmox Backup Server: Local ZFS snapshots
  2. Borgmatic: Encrypted offsite backups
  3. MinIO: S3-compatible object storage

Automated Backup Script:

# Proxmox scheduled backup (snapshot mode, zstd compression)
vzdump $CONTAINER_ID --mode snapshot --compress zstd
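
For an unattended schedule, a simple cron entry works; the backup-store storage ID is a placeholder for your Proxmox Backup Server or NFS target. Proxmox also has built-in backup jobs under Datacenter → Backup, which is the more native option.

# /etc/cron.d/lab-backups -- nightly snapshot backup of all guests at 02:00
0 2 * * * root vzdump --all 1 --mode snapshot --compress zstd --storage backup-store --quiet 1
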
This post is licensed under CC BY 4.0 by the author.