I Heard You Like Thinkcentre Clusters

Introduction

The challenge of building enterprise-grade infrastructure on a budget has never been more relevant. When a Reddit user showcased their compact Proxmox cluster built from repurposed Lenovo Thinkcentre M710q devices, it resonated with DevOps engineers worldwide. This unconventional approach to infrastructure management demonstrates how professionals are leveraging retired enterprise hardware to create powerful, cost-effective clusters for development, testing, and even production workloads.

For system administrators and DevOps practitioners, understanding how to design and maintain such systems provides critical advantages:

  1. Cost efficiency: 75-90% savings compared to new enterprise hardware
  2. Energy consciousness: Ultra-compact devices consuming 15-45W under load
  3. Scalability: Modular expansion through commodity hardware
  4. Real-world skill development: Enterprise-grade cluster management on accessible hardware

This comprehensive guide examines the technical implementation of Thinkcentre-based clusters, focusing on the M710q platform mentioned in the original Reddit post. We’ll explore hardware selection, Proxmox configuration, networking considerations, and cluster management techniques that transform these mini-PCs into enterprise-grade infrastructure.

Understanding Thinkcentre Clusters

What Are Thinkcentre Clusters?

Thinkcentre clusters are distributed computing systems built from multiple Lenovo Thinkcentre ultra-compact form factor (UCFF) devices. These 1-liter PCs, originally designed for office productivity, become powerful infrastructure components when:

  • Combined in groups of 3+ nodes
  • Connected via high-speed networking
  • Managed through cluster-aware virtualization platforms

The M710q generation (6th/7th Gen Intel Core processors) represents the current sweet spot for price/performance on the used market, typically available at $150-300 per unit with enterprise-grade components.

Technical Advantages

| Feature | Benefit |
|---|---|
| Intel vPro support | Out-of-band management (similar to IPMI) |
| PCIe M.2 expansion | Network/storage upgrades without USB bottlenecks |
| 35W TDP processors | Cluster density (5-10 nodes per 1U equivalent) |
| Dual-channel DDR4 SODIMMs | 32GB+ of memory per node (non-ECC) |

Comparison to Alternatives

| Platform | Thinkcentre M710q | Raspberry Pi 4 | Enterprise Server |
|---|---|---|---|
| Cost/node | $200-400 | $100-150 | $2000+ |
| Power Draw | 15-45W | 5-15W | 150-300W |
| x86 Compatibility | Native | ARM (limited) | Native |
| Expandability | PCIe/M.2 | USB-only | Full expansion |
| Production Viability | Yes | Limited | Yes |

Real-World Use Cases

  1. Development/Testing Environments:
    # Create disposable Kubernetes node
    pct create 1001 local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
      --hostname k8s-test-node \
      --cores 2 --memory 2048 --swap 512 \
      --net0 name=eth0,bridge=vmbr0,ip=dhcp
    
  2. Network Services Cluster:
    HAProxy load balancers + Keepalived VIPs across nodes (see the sketch after this list)

  3. Distributed Storage:
    Ceph OSDs leveraging M.2 NVMe drives

  4. CI/CD Infrastructure:
    GitLab runners with LXC container isolation
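
For the network services use case above, a minimal Keepalived sketch gives an idea of how a VIP floats across nodes in front of HAProxy. The interface name, VIP, and priorities below are placeholders, not values from the original post:

# /etc/keepalived/keepalived.conf (hypothetical VIP fronting HAProxy)
vrrp_instance HAPROXY_VIP {
    state MASTER                 # BACKUP on the other nodes
    interface eth0               # adjust to the node's bridge/NIC
    virtual_router_id 51
    priority 150                 # lower value on BACKUP nodes
    advert_int 1
    virtual_ipaddress {
        192.168.1.50/24          # floating VIP for HAProxy
    }
}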

Prerequisites

Hardware Requirements

Base Configuration (Per Node):

  • Lenovo M710q (i5-6500T or better)
  • 32GB DDR4 RAM (2x16GB 2400MHz)
  • 512GB NVMe SSD (Samsung PM981/Intel P4510)
  • 2.5Gbps Ethernet Adapter (Intel i225-based M.2 NIC)

Networking:

  • Managed switch with 2.5Gbps ports (MokerLink 8-port recommended)
  • 10Gbps uplink capability (for cluster backbone)
  • VLAN support for network segmentation

Software Requirements

| Component | Version | Notes |
|---|---|---|
| Proxmox VE | 7.4+ | Debian 11 (Bullseye) base |
| Ceph | Quincy (17.2.5) | For storage clusters |
| Corosync | 3.1.5+ | Cluster communication |
| Linux Kernel | 5.15+ | Required for Intel i225 NICs |

Security Pre-Checks

  1. BIOS Configuration:
    • Enable Intel VT-x/VT-d (see the verification check after this list)
    • Set secure boot policy
    • Configure vPro AMT if available
  2. Network Segmentation:
    # /etc/network/interfaces
    auto vmbr0
    iface vmbr0 inet static
        address 192.168.1.10/24
        gateway 192.168.1.1
        bridge-ports enp3s0
        bridge-stp off
        bridge-fd 0
           
    # Management VLAN
    auto vmbr0.10
    iface vmbr0.10 inet static
        address 10.10.10.2/24
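
Once VT-d is enabled in the BIOS, it is worth confirming from the running system that the IOMMU actually initialized. A quick check (assuming the Intel platform used here) is:

# Confirm the Intel IOMMU (VT-d) was detected at boot
dmesg | grep -i -e DMAR -e IOMMU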
    

Pre-Installation Checklist

  1. Validate CPU flags:
    grep -E 'vmx|svm' /proc/cpuinfo
    
  2. Confirm NIC compatibility:
    lspci -nn | grep -i ethernet
    
  3. Verify memory integrity:
    memtester 4G 1
    
  4. Check SSD wear level:
    nvme smart-log /dev/nvme0 | grep percentage_used
    

Installation & Setup

Proxmox Cluster Deployment

Step 1: Base Installation

Download Proxmox VE 7.4 ISO and create bootable USB:

dd if=proxmox-ve_7.4-1.iso of=/dev/sdX bs=4M status=progress conv=fdatasync

Installation parameters:

  • Filesystem: ext4 (ZFS requires additional RAM)
  • Management interface: Dedicated 1Gbps port
  • Storage: Entire NVMe drive

Step 2: Post-Install Configuration

Update repositories:

sed -i 's/^deb/#deb/' /etc/apt/sources.list.d/pve-enterprise.list
echo "deb https://download.proxmox.com/debian/pve bullseye pve-no-subscription" > /etc/apt/sources.list.d/pve-public.list
apt update && apt full-upgrade -y

Step 3: Cluster Initialization

On first node:

pvecm create CLUSTER-01

On subsequent nodes:

pvecm add 192.168.1.10

Verify cluster status:

pvecm status

Expected output:

Quorum information
------------------
Date:             Tue Aug 15 10:45:22 2023
Quorum provider:  corosync_votequorum
Nodes:            4
Node ID:          0x00000001
Ring ID:          1.1234
Quorate:          Yes

Networking Configuration

2.5Gbps NIC Setup:

  1. Identify NIC interface:
    lshw -class network -businfo
    
  2. Configure the bridge for the cluster network:
    # /etc/network/interfaces
    auto enp4s0
    iface enp4s0 inet manual
       
    auto vmbr1
    iface vmbr1 inet static
        address  10.20.30.1/24
        bridge-ports enp4s0
        bridge-stp off
        bridge-fd 0
        mtu 9000
    
  3. Enable jumbo frames:
    ip link set dev enp4s0 mtu 9000
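
To confirm jumbo frames survive end to end (including the switch ports), a do-not-fragment ping sized for a 9000-byte MTU works well; 10.20.30.2 is an assumed peer on the cluster network:

# 8972 bytes = 9000 MTU minus 28 bytes of IP/ICMP headers
ping -M do -s 8972 -c 3 10.20.30.2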
    

VLAN Configuration for Services:

# /etc/network/interfaces.d/vlans
auto vmbr1.100
iface vmbr1.100 inet static
    address 10.100.0.1/24
    mtu 9000

Storage Configuration

Ceph Cluster Setup:

  1. Install Ceph on all nodes:
    pveceph install --version quincy
    
  2. Initialize the Ceph network and create the first monitor:
    pveceph init --network 10.20.30.0/24
    pveceph mon create
    
  3. Create OSDs:
    pveceph osd create /dev/nvme0n1 --crush-device-class nvme
    

Performance Verification:

rados bench -p testpool 10 write --no-cleanup
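
Because the write benchmark above keeps its objects (--no-cleanup), a read pass and cleanup can follow. This sketch assumes the testpool pool from the command above; create it first if it does not exist:

ceph osd pool create testpool 32   # only if the pool does not already exist
rados bench -p testpool 10 seq     # sequential reads against the benchmark objects
rados -p testpool cleanup          # remove leftover benchmark objects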

Configuration & Optimization

Proxmox Tuning

CPU Configuration for Performance:

# /etc/pve/qemu-server/100.conf
args: -cpu host,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt
cores: 4
cpu: host
numa: 1

Memory Management:

# /etc/sysctl.conf
vm.swappiness=10
vm.vfs_cache_pressure=50
vm.dirty_ratio=10
vm.dirty_background_ratio=5

Security Hardening

Firewall Configuration:

# /etc/pve/firewall/cluster.fw
[OPTIONS]
enable: 1
policy_in: DROP
policy_out: ACCEPT

[RULES]
IN ACCEPT -p tcp -dport 8006 # Proxmox web UI
IN ACCEPT -p udp -dport 5404:5405 # Corosync cluster traffic

SSH Hardening:

# /etc/ssh/sshd_config
Protocol 2
PermitRootLogin prohibit-password
MaxAuthTries 3
LoginGraceTime 20
ClientAliveInterval 300
ClientAliveCountMax 2

Performance Optimization

Network Stack Tuning:

# /etc/sysctl.d/10-network.conf
net.core.rmem_max=268435456
net.core.wmem_max=268435456
net.ipv4.tcp_rmem=4096 87380 268435456
net.ipv4.tcp_wmem=4096 65536 268435456
net.core.netdev_max_backlog=300000

Storage I/O Optimization:

# /etc/udev/rules.d/60-ssd-scheduler.rules
ACTION=="add|change", KERNEL=="nvme[0-9]n[0-9]", ATTR{queue/scheduler}="none"
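
After adding the rule, reloading udev and reading back the active scheduler confirms it took effect (nvme0n1 is assumed as the namespace name):

udevadm control --reload-rules && udevadm trigger
cat /sys/block/nvme0n1/queue/scheduler   # should report [none]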

Usage & Operations

Common Cluster Operations

Live Migration:

qm migrate 101 node02 --online --with-local-disks

Resource Monitoring:

pvesh get /cluster/resources --type vm

Automated Backups:

# /etc/cron.d/proxmox-backup
0 2 * * 0 root vzdump 100 101 --mode snapshot --compress zstd --storage nas01
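
Restores are worth rehearsing as well. A sketch of restoring one of these dumps to a fresh VMID follows; the archive path and target storage are assumptions based on the cron job above:

# Restore a weekly dump of VM 100 into a new VMID 200
qmrestore /mnt/pve/nas01/dump/vzdump-qemu-100-2023_08_13-02_00_01.vma.zst 200 --storage local-lvm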

Capacity Planning

Resource Utilization Metrics:

| Resource | Recommended Limit | Monitoring Command |
|---|---|---|
| CPU | 85% sustained | mpstat -P ALL 1 5 |
| Memory | 90% utilization | free -m |
| Storage | 80% capacity | df -h |
| Network | 70% bandwidth | iftop -i vmbr1 |
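
As a lightweight way to act on these limits, a cron-friendly one-liner can flag filesystems crossing the 80% storage threshold from the table; the threshold and scope are illustrative:

# Warn on any filesystem above 80% capacity
df --output=target,pcent | awk 'NR>1 && $2+0 > 80 {print "WARN: " $1 " above 80% capacity"}'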

Scaling Considerations

Horizontal Expansion:

  1. Add new node to cluster
  2. Rebalance Ceph OSDs:
    ceph osd crush reweight-all
    
  3. Adjust HA groups:
    pvesh set /cluster/ha/groups/ha-pool --nodes node01,node02,node03,node04
    

Vertical Limitations:

  • Max 64GB RAM per M710q (non-ECC)
  • PCIe x4 bottleneck for multiple NVMe drives
  • Thermal constraints under sustained load
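
Thermal behavior is easy to keep an eye on from the OS. One simple approach, assuming lm-sensors is acceptable on the hosts:

apt install -y lm-sensors
modprobe coretemp       # Intel core/package temperature sensor
watch -n 2 sensors      # observe temperatures under sustained load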

Troubleshooting

Common Issues

Corosync Network Problems:

corosync-cfgtool -s
systemctl status corosync
journalctl -u corosync --since "10 minutes ago"

PCIe Device Passthrough Errors:

  1. Confirm IOMMU groups:
    for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU Group %s ' "$n"; lspci -nns "${d##*/}"; done
    
  2. Validate kernel parameters:
    grep -E '(vmx|svm)' /proc/cpuinfo
    grep iommu /proc/cmdline
    

Storage Performance Degradation:

# Check Ceph status
ceph -s
ceph osd perf

# NVMe health
nvme smart-log /dev/nvme0

Debugging Methodology

  1. Cluster Health Check:
    pvecm status
    pveversion -v
    systemctl list-units --failed
    
  2. Network Validation:
    iperf3 -c 10.20.30.2 -P 4 -t 30
    
  3. Storage Verification:
    fio --name=randwrite --ioengine=libaio --iodepth=32 \
      --rw=randwrite --bs=4k --direct=1 --size=1G --numjobs=4 \
      --runtime=60 --time_based --group_reporting
    

Recovery Procedures

Failed Node Replacement:

  1. Remove from cluster:
    pvecm delnode node03
    
  2. Reinstall Proxmox
  3. Rejoin cluster
  4. Rebalance resources
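
A sketch of steps 2-4, reusing addresses and IDs from earlier sections (adjust node names and VMIDs to your cluster):

# On the freshly reinstalled node, join an existing cluster member
pvecm add 192.168.1.10
# Once the node is back and quorate, migrate workloads onto it
qm migrate 101 node03 --online --with-local-disks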

Ceph Recovery:

ceph osd out osd.3
ceph osd crush remove osd.3
ceph auth del osd.3
ceph osd rm osd.3
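
Once the failed disk has been physically replaced, the OSD can be recreated on the new device; the device path below is an assumption matching the earlier OSD creation step:

# Recreate the OSD on the replacement NVMe drive
pveceph osd create /dev/nvme0n1 --crush-device-class nvme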

Conclusion

Building enterprise-grade infrastructure with Thinkcentre clusters demonstrates how DevOps principles apply beyond traditional data centers. These compact systems offer:

  • Operational efficiency: 1/10th power consumption of rack servers
  • Financial viability: <$1,000 for 4-node cluster with 128GB RAM
  • Technical capability: 40+ container deployments per cluster

For further exploration:

  1. Proxmox Cluster Manager Documentation
  2. Ceph Storage Architectures
  3. Intel NIC Compatibility Matrix
  4. Lenovo ThinkCentre M710q hardware documentation
This post is licensed under CC BY 4.0 by the author.