My First Homelab Setup: A DevOps Engineer’s Practical Guide

Introduction

The Reddit post showing a homelab running on “a mix of Ethernet and ethanol” perfectly captures the essence of this journey. As DevOps engineers and sysadmins, our professional expertise often collides with the chaotic reality of personal infrastructure projects. This guide documents a practical, production-grade approach to building your first homelab - complete with the inevitable troubleshooting whiskey bottle within arm’s reach.

Homelabs serve as critical learning environments where we experiment with technologies too risky for production systems. They’re sandboxes for mastering infrastructure-as-code, container orchestration, network segmentation, and high-availability configurations. Unlike cloud playgrounds that disappear with a billing cycle, physical homelabs provide tangible experience with hardware limitations, thermal constraints, and real-world failure scenarios.

In this guide, you’ll learn:

  1. Hardware selection balancing performance and power efficiency
  2. Enterprise-grade virtualization using Proxmox VE
  3. Container orchestration with Docker Swarm (Kubernetes-light)
  4. Network segmentation and security best practices
  5. Monitoring and alerting stack implementation
  6. Automated backup strategies for bare-metal recovery
  7. Power management and operational cost optimization

We’ll implement these solutions while acknowledging the “Fireball whiskey” reality of homelab operations - where high availability sometimes means having spare hardware in the closet and backups might consist of external drives in a fireproof safe.

Understanding Homelab Fundamentals

What Exactly is a Homelab?

A homelab is a personal technology sandbox that replicates enterprise infrastructure environments at a smaller scale. Unlike corporate data centers, homelabs typically:

  • Run on consumer-grade or decommissioned enterprise hardware
  • Prioritize learning over uptime (though we pretend otherwise)
  • Combine production services (media servers, file shares) with experimental setups
  • Operate within residential power and thermal constraints

Key Homelab Components

| Component | Enterprise Equivalent | Homelab Reality |
| --- | --- | --- |
| Compute | VMware ESXi cluster | Used Intel NUC / ProLiant DL |
| Storage | SAN/NAS with SSD caching | ZFS array in an old PC case |
| Networking | Cisco/Juniper stack | Ubiquiti Dream Machine SE |
| Backup | Veeam/Commvault | Rclone to Backblaze B2 |
| Monitoring | Datadog/New Relic | Prometheus + Grafana VM |
| High Availability | Redundant power/network | Single PSU with UPS backup |

The Homelab Evolution Cycle

  1. Phase 1 - The Accidental Server: Old desktop running Plex
  2. Phase 2 - Virtualization Enlightenment: Proxmox/Hyper-V cluster
  3. Phase 3 - Network Segmentation: VLANs and pfSense firewall
  4. Phase 4 - Infrastructure as Code: Terraform/Ansible adoption
  5. Phase 5 - The Whiskey Phase: Realization that Kubernetes on Raspberry Pis was a bad idea

Why Homelabs Matter for DevOps Professionals

  1. Risk-Free Experimentation: Test kernel updates, breaking changes, and security patches without career consequences
  2. Deep Technology Understanding: Learn how storage actually works when you lose a ZFS vdev
  3. Troubleshooting Skills: Develop patience when debugging why NFS shares disappear after reboots
  4. Budget Constraints Creativity: Implement HAProxy because you can’t afford F5 BIG-IP

Prerequisites

Hardware Requirements (Minimum)

| Component | Specification | Notes |
| --- | --- | --- |
| Host Machine | Intel i5 8th Gen / Ryzen 5 3600 | AES-NI for encryption, VT-d/AMD-V |
| RAM | 16GB DDR4 | ECC recommended for ZFS |
| Storage | 2x 500GB SSD (boot/VMs) + 2x 4TB HDD | Separate boot and media storage |
| Network | Dual Gigabit NIC | VLAN separation for management/data |
| Power | UPS, 650VA | Protect against dirty power |

Software Requirements

  • Hypervisor: Proxmox VE 8.x
  • Containers: Docker 24.x
  • Networking: Open vSwitch 3.1.x
  • Monitoring: Prometheus 2.47 + Grafana 10.1

Network Planning

Create this VLAN structure before installation:

| VLAN ID | Purpose | Subnet | DHCP Scope |
| --- | --- | --- | --- |
| 10 | Management | 192.168.10.0/24 | .100-.200 |
| 20 | Services | 192.168.20.0/24 | .100-.200 |
| 30 | IoT | 192.168.30.0/24 | .100-.200 |
| 40 | Guest | 192.168.40.0/24 | .100-.200 |
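Small as this plan is, it pays to expand it into concrete addresses before touching the router. A minimal bash sketch (the gateways at `.1` are an assumption; the IDs, subnets, and DHCP scopes come from the table):

```shell
# Print the planned addressing for each VLAN so the scheme can be
# reviewed before it is configured on the router/switch.
printf '%-8s %-18s %-16s %s\n' "VLAN" "Subnet" "Gateway" "DHCP scope"
for vlan in 10 20 30 40; do
    subnet="192.168.${vlan}.0/24"
    gateway="192.168.${vlan}.1"         # assumed gateway convention
    scope="192.168.${vlan}.100 - 192.168.${vlan}.200"
    printf '%-8s %-18s %-16s %s\n' "$vlan" "$subnet" "$gateway" "$scope"
done
```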

Security Checklist

  1. Physically secure server location (no “basement flooding” disasters)
  2. Disable IPMI default credentials
  3. Plan firewall rules between VLANs
  4. Generate SSH keys for host access
  5. Note down all MAC addresses for port security
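For item 4, ed25519 keys are a sensible default. A minimal sketch (the key path and comment are examples, not a convention this guide prescribes):

```shell
# Generate an ed25519 keypair for host access.
# The file name and comment below are examples -- adjust to taste.
mkdir -p "$HOME/.ssh"
ssh-keygen -t ed25519 -a 100 -C "homelab-admin" \
    -f "$HOME/.ssh/homelab_ed25519" -N ""

# Then push the public key to each host, e.g.:
# ssh-copy-id -i ~/.ssh/homelab_ed25519.pub root@192.168.10.101
```

Using a dedicated key for the homelab keeps it easy to revoke later without touching keys used elsewhere.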

Installation & Configuration

Proxmox VE Installation

```shell
# Download latest ISO
wget https://download.proxmox.com/iso/proxmox-ve_8.1-1.iso

# Create bootable USB (Linux example -- double-check the target device first!)
sudo dd if=proxmox-ve_8.1-1.iso of=/dev/sdb bs=4M status=progress conv=fdatasync

# Installation steps:
# 1. Select "Install Proxmox VE"
# 2. Set country, timezone, and keyboard layout
# 3. Set the root password (use a long, unique password)
# 4. Configure the management interface on VLAN 10
# 5. Use a ZFS mirror for the boot drives
```
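Before writing the USB, verify the ISO against the checksum Proxmox publishes on its download page. The pattern looks like this (demonstrated on a stand-in file here, since the real checksum belongs to the real ISO):

```shell
# Stand-in for the downloaded ISO; in practice you already have the
# real file from wget and the published checksum from the website.
printf 'stand-in iso contents\n' > proxmox-ve_8.1-1.iso

# The SHA256SUMS file would normally be the published checksum list.
sha256sum proxmox-ve_8.1-1.iso > SHA256SUMS

# Verify -- prints "<file>: OK" when the checksum matches.
sha256sum -c SHA256SUMS
```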

Post-install configuration:

```shell
# Disable the enterprise repo (it requires a subscription)
echo "# deb https://enterprise.proxmox.com/debian/pve bookworm pve-enterprise" > /etc/apt/sources.list.d/pve-enterprise.list

# Add the no-subscription repo
echo "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription" > /etc/apt/sources.list.d/pve-no-subscription.list

# Update and upgrade
apt update && apt dist-upgrade -y

# Install common tools
apt install tmux zsh git curl net-tools openvswitch-switch -y
```

Docker Swarm Setup

On first manager node:

```shell
# Install Docker
curl -fsSL https://get.docker.com | sh

# Initialize swarm
docker swarm init --advertise-addr 192.168.10.101

# Get join token
docker swarm join-token worker

# Sample output:
# docker swarm join --token SWMTKN-1-49nj1cmql0jkz5s954yi3oex3nedyz0fb0xx14ie39trti4wxv-8vxv8rssmk743ojnwacrr2e7c 192.168.10.101:2377
```

On worker nodes:

```shell
docker swarm join --token <TOKEN> 192.168.10.101:2377
```

Verify cluster status:

```shell
docker node ls

# Expected output (IDs and versions will differ):
# ID                            HOSTNAME   STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
# l4gku8f7a8j5zq6z9x3x9x3x9     node1      Ready     Active         Leader           24.0.7
# 3x9x3x9x3x9x3x9x3x9x3x9x3     node2      Ready     Active         Reachable        24.0.7
```
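Because the column layout of `docker node ls` is stable, the health check is easy to script. A sketch, run here against the sample output (on a live swarm, feed it `docker node ls` instead of the canned `$sample`):

```shell
# Sample docker node ls output, used so the sketch runs offline.
sample='ID                            HOSTNAME   STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
l4gku8f7a8j5zq6z9x3x9x3x9     node1      Ready     Active         Leader           24.0.7
3x9x3x9x3x9x3x9x3x9x3x9x3     node2      Ready     Active         Reachable        24.0.7'

# STATUS is the third whitespace-separated column; skip the header row.
ready=$(printf '%s\n' "$sample" | awk 'NR > 1 && $3 == "Ready"' | wc -l)
echo "nodes ready: $ready"
```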

Network Configuration with Open vSwitch

```shell
# Create bridge for VM traffic
ovs-vsctl add-br vmbr0
ovs-vsctl add-port vmbr0 eno1 tag=10 vlan_mode=native-untagged
ovs-vsctl add-port vmbr0 eno2 trunks=10,20,30,40

# Create internal VLAN interfaces on the bridge
ovs-vsctl add-port vmbr0 mgmt0 tag=10 -- set Interface mgmt0 type=internal
ovs-vsctl add-port vmbr0 services0 tag=20 -- set Interface services0 type=internal

# Persistent configuration
cat << EOF > /etc/network/interfaces.d/ovs
auto vmbr0
iface vmbr0 inet manual
    ovs_type OVSBridge
    ovs_ports eno1 eno2 mgmt0 services0

allow-vmbr0 mgmt0
iface mgmt0 inet static
    address 192.168.10.101/24
    gateway 192.168.10.1
    ovs_type OVSIntPort
    ovs_bridge vmbr0
    ovs_options tag=10

allow-vmbr0 services0
iface services0 inet manual
    ovs_type OVSIntPort
    ovs_bridge vmbr0
    ovs_options tag=20
EOF
```
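A typo in this file means no management network after the next reboot, so a quick grep-based sanity check is cheap insurance. A sketch, run here against a local copy of the fragment (on the host, point it at `/etc/network/interfaces.d/ovs`):

```shell
# Write a copy of the fragment to a scratch file, then confirm each
# expected internal port stanza is present before restarting networking.
cfg=ovs-interfaces-check.conf
cat << 'EOF' > "$cfg"
allow-vmbr0 mgmt0
iface mgmt0 inet static
    ovs_type OVSIntPort
    ovs_options tag=10

allow-vmbr0 services0
iface services0 inet manual
    ovs_type OVSIntPort
    ovs_options tag=20
EOF

for port in mgmt0 services0; do
    grep -q "^allow-vmbr0 ${port}" "$cfg" && echo "${port}: defined"
done
```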

Configuration & Optimization

Proxmox Resource Allocation Best Practices

  1. CPU Pinning: Assign dedicated cores to critical VMs (PVE 7.3+):

    ```shell
    qm set 100 --affinity 0-3
    ```

  2. Memory Ballooning: Allow dynamic memory allocation down to a minimum target:

    ```shell
    qm set 100 --balloon 1024
    ```
  3. Storage Tiering:
    • SSD: VM operating systems and databases
    • HDD: Media storage and backups
    • NVMe: Ceph or ZFS caching

Docker Swarm Security Hardening

  1. Enable autolock, so the keys that encrypt swarm state must be unlocked after a restart:

    ```shell
    docker swarm update --task-history-limit 50 --autolock=true
    ```

  2. Create container security profiles:

    ```json
    {
      "defaults": {
        "user": "nobody",
        "no-new-privileges": true
      },
      "sysctls": {
        "net.ipv4.tcp_syncookies": "1",
        "net.ipv4.conf.all.rp_filter": "1"
      }
    }
    ```

  3. Implement resource constraints:

    ```yaml
    services:
      nginx:
        deploy:
          resources:
            limits:
              cpus: '0.50'
              memory: 512M
            reservations:
              cpus: '0.25'
              memory: 256M
    ```

Monitoring Stack Implementation

Prometheus configuration for Proxmox:

```yaml
# prometheus.yml -- scrape the pve-exporter on the Proxmox host
scrape_configs:
  - job_name: 'proxmox'
    metrics_path: '/pve'
    params:
      module: [proxmox]
    static_configs:
      - targets: ['192.168.10.101:9221']
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.10.101:9221
```

Grafana dashboard import via docker-compose:

```yaml
version: '3.8'

services:
  grafana:
    image: grafana/grafana:10.1.0
    volumes:
      - grafana_data:/var/lib/grafana
      - ./dashboards:/etc/grafana/provisioning/dashboards
    deploy:
      mode: replicated
      replicas: 1
      resources:
        limits:
          memory: 512M
    networks:
      - metrics

volumes:
  grafana_data:

networks:
  metrics:
    driver: overlay
    attachable: true
```
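The `./dashboards` bind mount only takes effect if Grafana finds a provider definition in that directory. A minimal sketch of a `dashboards.yml` to drop next to the exported JSON dashboards (the provider name and folder are example values):

```yaml
# ./dashboards/dashboards.yml -- dashboard provisioning provider
apiVersion: 1
providers:
  - name: homelab          # example provider name
    folder: Homelab        # Grafana folder the dashboards land in
    type: file
    disableDeletion: false
    updateIntervalSeconds: 30
    options:
      path: /etc/grafana/provisioning/dashboards
```

With this in place, any dashboard JSON dropped into `./dashboards` appears in Grafana without clicking through the import wizard.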

Operations & Maintenance

Daily Operations Checklist

  1. Resource Monitoring:

    ```shell
    # Check Proxmox cluster status
    pvecm status

    # Docker task state on every node
    docker node ps $(docker node ls -q)

    # ZFS pool status
    zpool status -v
    ```

  2. Backup Verification:

    ```shell
    # List Proxmox backups
    pvesm list local-backup

    # Test Docker volume restore
    docker run --rm -v backup_verify:/data alpine ls -l /data
    ```

  3. Security Updates:

    ```shell
    # Proxmox updates
    apt update && apt dist-upgrade -y

    # Docker image updates
    docker images | awk '(NR>1) && ($2!="<none>") {print $1":"$2}' | xargs -L1 docker pull
    ```
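None of these checks run themselves, so wiring them into cron keeps the daily list honest. A sketch of an `/etc/cron.d` fragment (the script path, log path, and schedule are examples):

```
# /etc/cron.d/homelab-checks -- run the daily checklist script at 07:00
# m  h  dom mon dow  user  command
0  7  *   *   *    root  /usr/local/sbin/homelab-daily-checks.sh >> /var/log/homelab-checks.log 2>&1
```

The script itself can simply be the three blocks above concatenated, with non-zero exits mailed to you by cron.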

Backup Strategy Implementation

Three-tier backup approach:

  1. Local Snapshots (15-minute intervals):

    ```shell
    # ZFS automated snapshots (e.g. via zfs-auto-snapshot)
    zfs set com.sun:auto-snapshot=true rpool/data
    ```

  2. NAS Replication (nightly):

    ```shell
    # ZFS send/receive to backup server
    zfs send rpool/data@snap-20240501 | ssh backup-host zfs recv backup/data
    ```

  3. Cloud Backup (weekly):

    ```shell
    # Rclone encrypted backup to B2
    rclone sync /backup b2:homelab-backup --b2-hard-delete --transfers 16
    ```
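A backup that has never been restored is a rumor, not a backup. The restore drill can be made mechanical with a checksum manifest; a minimal sketch using stand-in directories (replace `backup-src` and `restore-test` with a real backup source and restore target):

```shell
# Stand-in source with one file; in practice this is your backup tree.
src=backup-src
dst=restore-test
mkdir -p "$src" "$dst"
printf 'vm-100 config\n' > "$src/vm-100.conf"

# 1. Build a checksum manifest of the source.
(cd "$src" && find . -type f -exec sha256sum {} + > ../manifest.sha256)

# 2. Restore (simulated here with a plain copy).
cp -a "$src/." "$dst/"

# 3. Verify every file in the restore against the manifest.
(cd "$dst" && sha256sum -c ../manifest.sha256)
```

Running this against the nightly ZFS replica or a scratch rclone pull turns "the backup job succeeded" into "the data actually came back".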

Scaling Considerations

When your whiskey collection outgrows your rack space:

  1. Vertical Scaling:
    • Add NVMe caching layer
    • Upgrade to 10Gbps networking
    • Implement ECC memory
  2. Horizontal Scaling:
    • Add compute nodes with identical specs
    • Deploy Ceph distributed storage
    • Implement load balancing with HAProxy
  3. Density Optimization:
    • Replace towers with rack-mounted servers
    • Implement PoE switches for low-power devices
    • Consolidate services through containerization

Troubleshooting

Common Issues and Resolutions

Problem: All VMs lose network connectivity after switch reboot
Solution:

```shell
# Reinitialize OVS bridges
systemctl restart openvswitch-switch

# Verify bridge mappings
ovs-vsctl show
```

Problem: Docker swarm nodes show “Unreachable” status
Debugging:

```shell
# Check swarm manager logs
journalctl -u docker.service --since "10 minutes ago"

# Verify firewall rules
iptables -L DOCKER-USER -v -n

# Test overlay network
docker network create -d overlay --attachable test-net
```

Problem: ZFS pool reports “corrupted data” errors
Recovery:

```shell
# Scrub the pool and let ZFS repair what it can
zpool scrub tank

# Check errors
zpool status -v

# Roll back to a known-good snapshot if needed
zfs rollback tank/data@snap-pre-corruption
```

Performance Tuning

  1. Disk I/O Optimization:

    ```shell
    # Cap the ZFS ARC at 16 GiB
    echo $((16 * 1024 * 1024 * 1024)) > /sys/module/zfs/parameters/zfs_arc_max

    # Switch the VM disk's I/O scheduler
    echo kyber > /sys/block/sda/queue/scheduler
    ```
This post is licensed under CC BY 4.0 by the author.