What Would You Do With Access to an Abundance of M.2 SSDs (256GB-1TB)?

INTRODUCTION

Imagine staring at a stack of 50+ M.2 SSDs ranging from 256GB to 1TB - all wiped, tested, and ready for deployment. This is the reality for many homelab enthusiasts and DevOps engineers who acquire surplus hardware. But what strategic advantage does this abundance provide in modern infrastructure management?

In the era of cloud dominance, physical storage arrays present unique opportunities for performance optimization, cost reduction, and architectural experimentation. This guide explores practical implementations of SSD fleets in self-hosted environments, covering:

  1. Hyperconverged infrastructure design
  2. Distributed storage systems
  3. Massive caching layers
  4. Ephemeral workload orchestration
  5. Disaster recovery solutions

We’ll examine real-world configurations using proven open-source tools like Ceph, ZFS, and Kubernetes - transforming idle hardware into high-performance infrastructure components. By the end, you’ll understand how to leverage SSD abundance for tangible performance gains while maintaining enterprise-grade reliability.

UNDERSTANDING THE TOPIC

What Are M.2 SSDs?

M.2 SSDs (the form factor was formerly known as Next Generation Form Factor, or NGFF) most commonly provide NVMe storage over PCIe lanes, offering significant advantages over traditional SATA drives:

| Characteristic | SATA SSD | NVMe M.2 SSD |
|---|---|---|
| Max Bandwidth | 600 MB/s | 3,500-7,000 MB/s |
| Interface | AHCI | PCIe 3.0/4.0 |
| Latency | 50-100 μs | 10-20 μs |
| Form Factor | 2.5" | 22 mm width |
| Power Consumption | 3-5 W | 4.5-8.5 W |
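The NVMe bandwidth figures in the table fall directly out of PCIe link math. As a rough sanity check (per-lane rates are the commonly cited effective numbers after 128b/130b encoding):

```python
# Sketch: where the 3,500-7,000 MB/s figures in the table come from.
# A PCIe 3.0 lane carries ~985 MB/s after encoding overhead; PCIe 4.0 doubles that.
LANE_MB_S = {"PCIe 3.0": 985, "PCIe 4.0": 1969}
LANES = 4  # M.2 NVMe drives typically use an x4 link

for gen, per_lane in LANE_MB_S.items():
    print(f"{gen} x4: ~{per_lane * LANES} MB/s")
```

Real drives land a little under these ceilings due to controller and NAND limits.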

Key Advantages in Homelabs

  1. Density: 20+ drives in 2U chassis
  2. Power Efficiency: 1/3 the wattage of HDDs
  3. Performance: Ideal for metadata-heavy operations
  4. Silent Operation: No moving parts

Strategic Use Cases

Distributed Object Storage
Create Ceph clusters with all-NVMe OSD nodes for high-performance object storage:

# OSD configuration for NVMe optimization
[osd]
osd_memory_target = 4G
bluestore_min_alloc_size = 4096
bluestore_prefer_deferred_size = 0
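Before committing a drive stack to Ceph, it helps to work out what replication costs you. A minimal sketch, assuming a hypothetical mixed fleet and Ceph's default 3x replication (adjust the counts to your inventory):

```python
# Sketch: usable capacity of a hypothetical mixed SSD fleet under
# Ceph's default 3x replication. The drive counts below are made up.
fleet = {256: 20, 512: 20, 1024: 10}   # drive size in GB -> number of drives

raw_gb = sum(size * count for size, count in fleet.items())
usable_gb = raw_gb // 3                # each object is stored three times

print(raw_gb, usable_gb)
```

Erasure coding trades some of that overhead back for CPU cost, which all-NVMe OSD nodes can usually afford.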

ZFS Special Device
Accelerate metadata operations in large storage pools:

zpool create fastpool mirror nvme0n1 nvme1n1 special mirror nvme2n1 nvme3n1

Kubernetes Local Storage
Provision local PVs for stateful workloads:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-nvme
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

PREREQUISITES

Hardware Requirements

| Component | Minimum Specification | Recommended |
|---|---|---|
| Host Platform | PCIe 3.0 x4 slots | PCIe 4.0 x4 slots |
| Adapter Cards | M.2 to PCIe x4 (bifurcation support) | Asus Hyper M.2 x16 Card |
| Cooling | Passive heatsinks | Active cooling |
| Power Supply | 80+ Bronze 500W | 80+ Platinum 750W |
| Network | 1Gbps Ethernet | 10Gbps SFP+ |
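The PSU recommendation is easy to sanity-check against the per-drive figures from the comparison table. A rough sketch, where the drive count and platform draw are hypothetical:

```python
# Sketch: rough power budget for a dense NVMe node, using the worst-case
# per-drive draw from the comparison table. The platform figure is a guess.
drives = 20
per_drive_w = 8.5      # peak NVMe draw from the table above
platform_w = 250       # hypothetical CPU, motherboard, NIC, and fan budget

total_w = drives * per_drive_w + platform_w
print(total_w)
```

Even a fully loaded 20-drive node sits well inside a 750 W unit, leaving headroom for burst loads.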

Software Requirements

  • Linux Kernel 5.4+ (for NVMe-oF support)
  • mdadm 4.1+ or ZFS 2.1.5+
  • Docker 20.10+ or containerd 1.6+
  • SMART monitoring tools (nvme-cli, smartmontools)

Security Considerations

  1. Secure Erasure:
    
    nvme format /dev/nvme0n1 -s 1 -n 1
    
  2. Encryption:
    
    cryptsetup luksFormat --type luks2 /dev/nvme0n1p1
    

INSTALLATION & SETUP

RAID Configuration

Software RAID (mdadm):

mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/nvme[0-3]n1
mkfs.xfs -f -L fast_array /dev/md0
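A four-drive RAID 10 halves raw capacity but tolerates failures asymmetrically: after one drive dies, only its mirror partner is critical. A small sketch of that reasoning (the 512 GB size is a stand-in):

```python
# Sketch: capacity and second-failure survival odds for the 4-drive
# RAID 10 array created above, assuming four equal 512 GB drives.
drives, size_gb = 4, 512
usable_gb = (drives // 2) * size_gb    # mirrored stripes halve capacity

# After one failure, a second random failure is fatal only if it hits
# the dead drive's mirror partner (1 of the remaining 3 drives).
p_survive_second = (drives - 2) / (drives - 1)

print(usable_gb, round(p_survive_second, 2))
```

This is why wider arrays of many small mirrors degrade more gracefully than a single large parity group.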

ZFS Pool:

zpool create -o ashift=12 tank mirror nvme0n1 nvme1n1 mirror nvme2n1 nvme3n1
zfs set compression=lz4 atime=off recordsize=1M tank

Kubernetes Local Volume Provisioning

  1. Create the StorageClass and provisioner ConfigMap:
    
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: local-nvme
    provisioner: kubernetes.io/no-provisioner
    volumeBindingMode: WaitForFirstConsumer
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: local-provisioner-config
      namespace: kube-system
    data:
      storageClassMap: |
        local-nvme:
          hostDir: /mnt/fast-disks
          mountDir: /mnt/fast-disks
    
  2. Deploy local volume provisioner:
    
    helm install local-provisioner \
      --set nodeSelector.node-type=storage \
      --set storageClasses[0].name=local-nvme \
      --set storageClasses[0].hostDir=/mnt/fast-disks \
      --namespace kube-system \
      sig-storage/local-volume-provisioner
    

CONFIGURATION & OPTIMIZATION

NVMe-oF Target Configuration

  1. Install target CLI:
    
    dnf install nvmetcli
    
  2. Create NVMe subsystem:
    
    nvmetcli
    > cd /
    > create subsys nqn.2023-09.usmanmasoodashraf:storage
    > cd /subsystems/nqn.2023-09.usmanmasoodashraf:storage
    > create namespaces 1
    > cd namespaces/1
    > set device path=/dev/nvme0n1
    > cd /
    > create ports 1
    > cd ports/1
    > set addr traddr=192.168.1.100 trsvcid=4420 trtype=tcp
    

Ceph OSD Tuning

/etc/ceph/ceph.conf optimizations:

[osd]
osd_memory_target = 4G
bluestore_cache_size = 2G
bluestore_min_alloc_size = 4096
bluestore_prefer_deferred_size = 0
bluestore_rocksdb_options = compression=kNoCompression

ZFS Special Device Allocation

zpool add tank special mirror nvme4n1 nvme5n1
zfs set special_small_blocks=128K tank
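With `special_small_blocks=128K`, records of 128K or smaller (plus all metadata) are written to the special vdev, so its required size depends on your data's block-size mix. A sizing sketch, assuming an entirely hypothetical histogram:

```python
# Sketch: estimate how much data lands on the special vdev given
# special_small_blocks=128K. The block-size histogram is hypothetical.
KIB = 1024
special_small_blocks = 128 * KIB       # matches the zfs set command above

# record size in bytes -> data stored at that size, in GiB
histogram = {16 * KIB: 200, 128 * KIB: 300, 1024 * KIB: 9500}

to_special_gib = sum(g for size, g in histogram.items()
                     if size <= special_small_blocks)
total_gib = sum(histogram.values())
print(to_special_gib, total_gib)
```

Undersizing the special vdev is safe (overflow spills to the main pool), but capacity planning like this keeps small-block I/O on flash.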

USAGE & OPERATIONS

Monitoring SSD Health

nvme smart-log /dev/nvme0
Critical Warning:                   0x00
Temperature:                        38 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    3%
Data Units Read:                    15,123,456
Data Units Written:                 8,765,432
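The "Data Units" counters are not bytes: per the NVMe specification, one data unit is 1,000 units of 512 bytes (512,000 bytes). Converting the sample figure above to total bytes written:

```python
# Sketch: convert the smart-log "Data Units Written" counter to bytes.
# Per the NVMe spec, one data unit = 1,000 * 512 bytes = 512,000 bytes.
units_written = 8_765_432              # value from the sample output above
bytes_written = units_written * 512_000
print(round(bytes_written / 1e12, 2))  # terabytes written so far
```

Tracking this figure over time against the drive's rated TBW gives a concrete wear estimate for surplus drives.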

Applying NVMe Features via udev

Wear leveling itself is handled inside the drive's controller, but udev can apply NVMe feature settings automatically as disks appear; the rule below sets feature 0x04 (the temperature threshold feature) on each new non-rotational block device. Create a udev rule (/etc/udev/rules.d/99-nvme-rotation.rules):

ACTION=="add", SUBSYSTEM=="block", ENV{DEVTYPE}=="disk", ATTRS{queue/rotational}=="0", RUN+="/usr/bin/nvme set-feature /dev/$kernel -f 0x04 -v 0x01"

Kubernetes Storage Scheduling

Pod specification with topology constraints:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nvme-pvc
spec:
  storageClassName: local-nvme
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 500Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: nvme-consumer
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - mountPath: "/data"
      name: nvme-vol
  volumes:
  - name: nvme-vol
    persistentVolumeClaim:
      claimName: nvme-pvc
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - rack-a

TROUBLESHOOTING

Common Issues and Solutions

Problem: NVMe drive not detected
Solution:

# Rescan PCIe bus
echo 1 > /sys/bus/pci/rescan
nvme list

Problem: High latency during parallel writes
Solution: Relax completion affinity so I/O completions can run on any CPU (blk-mq is already the default path for NVMe):

echo 0 > /sys/block/nvme0n1/queue/rq_affinity

Problem: ZFS pool degradation
Solution: Check and replace faulty drive:

zpool status -v
zpool replace tank nvme0n1 nvme4n1

CONCLUSION

An abundance of M.2 SSDs unlocks architectural possibilities typically reserved for enterprise environments. By implementing distributed storage systems, accelerating metadata operations, and creating high-performance ephemeral storage layers, DevOps engineers can achieve:

  1. Roughly 5-9x sequential throughput over SATA SSDs
  2. Latency in the 10-20μs range, versus 50-100μs for SATA
  3. Around two-thirds less power consumption than comparable HDD arrays

The strategic deployment of surplus SSDs transforms idle hardware into enterprise-grade infrastructure - proving that in the right hands, even discarded components can power world-class systems.

This post is licensed under CC BY 4.0 by the author.