My Husband Works In IT And He Knows Way More Than You Do: Infrastructure Management in Modern DevOps
Introduction
The infamous phrase “My husband works in IT and knows way more than you do” echoes through helpdesk logs and sysadmin war stories like a bad joke with serious consequences. This attitude – whether from end-users, stakeholders, or even colleagues – represents a fundamental challenge in infrastructure management: the disconnect between perceived technical competence and actual systems knowledge.
The Reddit thread “Umm, I’m Gen Z. I know how to use computers” perfectly encapsulates this modern paradox. While younger generations grow up with intuitive UIs and cloud services, foundational infrastructure knowledge – the kind that kept a hotel’s bonded T1 lines and corporate Exchange servers running 15 years ago – remains critical. Today’s DevOps engineers face similar challenges when managing distributed systems, container orchestration, and hybrid cloud environments.
This comprehensive guide examines infrastructure management through the lens of modern DevOps practices, covering:
- Infrastructure-as-Code (IaC) implementation
- Container orchestration best practices
- Network performance optimization
- Legacy system modernization strategies
- Cross-generational knowledge transfer techniques
Whether you’re managing a homelab Kubernetes cluster or enterprise-grade cloud infrastructure, these battle-tested approaches will help you avoid becoming the subject of someone else’s “my spouse knows better” horror story.
Understanding Modern Infrastructure Management
Evolution from Sysadmin to DevOps
The shift from traditional system administration to DevOps represents more than just a title change. Consider these key differences:
| Traditional Sysadmin | Modern DevOps |
|---|---|
| Manual server provisioning | Infrastructure-as-Code |
| Static environments | Ephemeral containers |
| Physical hardware focus | Cloud-native architectures |
| Reactive troubleshooting | Proactive monitoring |
| Siloed responsibilities | Cross-functional collaboration |
This transition began in earnest with tools like Puppet (2005) and Chef (2009), accelerating with Docker’s release in 2013. The bonded T1 lines from our hotel story would now be replaced by SD-WAN solutions, while on-prem Exchange servers have largely migrated to cloud-based services like Microsoft 365.
Core Components of Modern Infrastructure
- Orchestration Systems: Kubernetes, Docker Swarm, Nomad
- Configuration Management: Ansible, Terraform, Pulumi
- Monitoring Stack: Prometheus, Grafana, ELK
- Networking: Service meshes (Istio, Linkerd), API gateways
- Security: Zero-trust architectures, SPIFFE/SPIRE
The Knowledge Gap Challenge
The original hotel scenario highlights three persistent infrastructure challenges:
- Bandwidth Management: From T1 lines to 5G/WiFi6
- Centralized Services: Exchange → Office 365 → Hybrid models
- User Expectations: “Always-on” mentality vs physical constraints
Modern solutions address these through:
```yaml
# Example: Kubernetes Network Policies
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: bandwidth-limit
spec:
  podSelector:
    matchLabels:
      app: exchange
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: outlook
      ports:
        - protocol: TCP
          port: 443
```
This policy restricts ingress to the Exchange pods so that only Outlook-labeled clients can reach them over HTTPS, mimicking the controlled access of legacy systems with modern tooling. Note that, despite the example's name, a NetworkPolicy governs which connections are allowed, not how much bandwidth they consume.
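For actual per-pod rate limiting, CNIs that bundle the bandwidth plugin honor the `kubernetes.io/ingress-bandwidth` and `kubernetes.io/egress-bandwidth` pod annotations. A config sketch, where the pod name, image, and 10M limits are illustrative values rather than recommendations:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: exchange-0
  annotations:
    kubernetes.io/ingress-bandwidth: 10M   # illustrative cap
    kubernetes.io/egress-bandwidth: 10M
spec:
  containers:
    - name: exchange
      image: registry.example.com/exchange:latest   # hypothetical image
```

This only works if your CNI chains the bandwidth plugin; check your CNI's documentation before relying on it.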
Prerequisites for Modern Infrastructure
Hardware Requirements
While cloud providers abstract hardware concerns, on-prem/homelab setups require careful planning:
| Component | Minimum | Recommended | Enterprise |
|---|---|---|---|
| CPU | 4 cores | 8 cores | 16+ cores |
| RAM | 8GB | 32GB | 128GB+ |
| Storage | 100GB | 1TB NVMe | SAN/NAS |
| Network | 1GbE | 10GbE | 25/100GbE |
Software Requirements
- Operating System: RHEL 8+/Ubuntu 20.04 LTS+
- Container Runtime: containerd 1.6+, Docker Engine 24.0+
- Orchestration: Kubernetes 1.27+, Nomad 1.5+
- Automation: Ansible Core 2.14+, Terraform 1.5+
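Before proceeding, it helps to confirm the tooling above is actually installed; a minimal sketch (version minimums still need checking by hand):

```shell
# Report which of the required CLIs are on PATH. Tool names come from the
# software requirements list above; extend as needed.
check_tools() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
    fi
  done
}

check_tools ansible terraform kubectl kubeadm containerd
```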
Security Preconfiguration
Before installation:
- Configure SSH key authentication:
```bash
ssh-keygen -t ed25519 -C "admin@example.com"
ssh-copy-id -i ~/.ssh/id_ed25519.pub user@server
```
- Harden SSH configuration (`/etc/ssh/sshd_config`):

```
PermitRootLogin no
PasswordAuthentication no
MaxAuthTries 3
ClientAliveInterval 300
```
- Set up basic firewall rules:
```bash
sudo ufw default deny incoming
sudo ufw allow 22/tcp
sudo ufw allow 6443/tcp  # Kubernetes API
sudo ufw enable
```
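The SSH hardening above can be verified mechanically; a sketch that checks for exact directive lines (`sudo sshd -t` remains the authoritative syntax check):

```shell
# Audit an sshd_config for the hardening directives set above.
# Matches exact lines only; commented or indented variants are not detected.
audit_sshd() {
  conf="$1"
  for directive in "PermitRootLogin no" "PasswordAuthentication no" "MaxAuthTries 3"; do
    if grep -qx "$directive" "$conf"; then
      echo "OK: $directive"
    else
      echo "MISSING: $directive"
    fi
  done
}

# Usage on a real host:
# audit_sshd /etc/ssh/sshd_config
```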
Installation & Setup: Kubernetes Cluster
Base System Configuration
- Update all packages:
```bash
sudo apt update && sudo apt upgrade -y
```
- Install container runtime:
```bash
sudo apt install -y containerd runc
sudo containerd config default | sudo tee /etc/containerd/config.toml
sudo systemctl restart containerd
```
- Disable swap:
```bash
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
```
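One containerd detail worth handling before installing Kubernetes: on systemd-based distros, kubeadm expects the runtime to use the systemd cgroup driver, but the config generated by `containerd config default` sets `SystemdCgroup = false`. A sketch of the fix (run on the node itself, then restart containerd):

```shell
# Flip containerd's runc cgroup driver to systemd, as the Kubernetes docs
# recommend for kubeadm on systemd distros.
enable_systemd_cgroup() {
  sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' "$1"
}

# On the node (requires root):
if [ -w /etc/containerd/config.toml ]; then
  enable_systemd_cgroup /etc/containerd/config.toml
  systemctl restart containerd
fi
```

Skipping this typically shows up later as kubelet/pod instability rather than an immediate install failure, which makes it easy to miss.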
Kubernetes Installation
- Add repository:
```bash
sudo apt-get install -y apt-transport-https ca-certificates curl
sudo mkdir -p -m 755 /etc/apt/keyrings  # ensure the keyring directory exists
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
```
- Install components:
```bash
sudo apt-get update
sudo apt-get install -y kubelet=1.28.5-1.1 kubeadm=1.28.5-1.1 kubectl=1.28.5-1.1
sudo apt-mark hold kubelet kubeadm kubectl
```
- Initialize control plane:
```bash
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 \
  --apiserver-advertise-address=192.168.1.100 \
  --control-plane-endpoint=cluster.example.com:6443
```
Network Plugin Setup
Install Calico CNI:
```bash
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/calico.yaml
```
Verify node status:
```bash
kubectl get nodes -o wide
```
Expected output:

```
NAME       STATUS   ROLES           AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
master01   Ready    control-plane   5m    v1.28.5   192.168.1.100   <none>        Ubuntu 22.04.3 LTS   5.15.0-78-generic   containerd://1.6.21
```
Configuration & Optimization
Cluster Configuration Best Practices
- Resource Quotas: Prevent namespace resource exhaustion
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
```
- Pod Security Standards:
```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
  - name: PodSecurity
    configuration:
      apiVersion: pod-security.admission.config.k8s.io/v1
      kind: PodSecurityConfiguration
      defaults:
        enforce: "restricted"
        enforce-version: "latest"
```
Performance Optimization
- Kubelet Configuration (`/var/lib/kubelet/config.yaml`):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
systemReserved:
  cpu: 500m
  memory: 1Gi
evictionHard:
  memory.available: "500Mi"
  nodefs.available: "10%"
```
- ETCD Tuning (via a systemd drop-in):

```bash
sudo systemctl edit etcd.service
```

Add to the override file:

```
[Service]
Environment="ETCD_HEARTBEAT_INTERVAL=100"
Environment="ETCD_ELECTION_TIMEOUT=500"
Environment="ETCD_SNAPSHOT_COUNT=10000"
```
Security Hardening
1. Enable Pod Security Admission:

```bash
kubectl label --overwrite ns default \
  pod-security.kubernetes.io/enforce=baseline \
  pod-security.kubernetes.io/warn=restricted
```
- Network Policy Enforcement:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
```
Usage & Operations
Daily Management Tasks
- Cluster status check:
```bash
kubectl get componentstatuses   # deprecated since v1.19; prefer: kubectl get --raw='/readyz?verbose'
kubectl top nodes
```
- Pod management:

```bash
# List pods with extended information
kubectl get pods -o wide --sort-by='.status.startTime'

# Debugging pod issues
kubectl describe pod $pod_name
kubectl logs $pod_name --previous
```
Backup Procedures
1. ETCD snapshot:

```bash
sudo ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /backup/etcd-snapshot-$(date +%Y%m%d).db
```
- Resource manifests backup:
```bash
kubectl get all --all-namespaces -o yaml > cluster-state-$(date +%Y%m%d).yaml
```
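Dated snapshots accumulate quickly; a retention sketch to pair with the backups above (the `/backup` path matches the snapshot command, while the 14-day retention is an assumption to tune against your recovery policy):

```shell
# Delete etcd snapshot files older than a given number of days.
prune_snapshots() {
  dir="$1"
  days="$2"
  find "$dir" -name 'etcd-snapshot-*.db' -type f -mtime +"$days" -print -delete
}

# e.g. from a daily cron job on the control-plane node:
if [ -d /backup ]; then
  prune_snapshots /backup 14
fi
```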
Monitoring Setup
Prometheus installation via Helm:
```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace \
  --set alertmanager.enabled=false \
  --set grafana.enabled=true
```
Access Grafana dashboard:
```bash
kubectl port-forward -n monitoring svc/prometheus-grafana 3000:80
```
Troubleshooting Common Issues
Network Connectivity Problems
- Check CNI plugin status:
```bash
kubectl get pods -n kube-system -l k8s-app=calico-node
```
- Validate DNS resolution:
```bash
kubectl run dns-test --image=busybox:1.36 --rm -it --restart=Never \
  -- nslookup kubernetes.default
```
Resource Constraints
- Identify resource hogs:
```bash
kubectl top pods --sort-by=memory --containers
```
- Analyze pod evictions:
```bash
kubectl get events --field-selector=reason=Evicted
```
Persistent Volume Issues
- Check storage class:
```bash
kubectl get storageclass
```
- Verify volume attachments:
```bash
kubectl describe pvc $pvc_name
kubectl describe pv $pv_name
```
Conclusion
The “my husband knows IT” mentality fails precisely where modern infrastructure management succeeds – in recognizing that true expertise requires continuous learning and validation through hands-on practice. The hotel’s bonded T1 lines have been replaced by 100GbE fiber, and Exchange servers have evolved into Kubernetes-hosted microservices, but the core challenge remains: managing complex systems while bridging knowledge gaps between generations of technologists.
Key takeaways from our deep dive:
- Infrastructure-as-Code eliminates configuration drift and “works on my machine” scenarios
- Container orchestration enables reproducible environments across generations
- Modern monitoring provides visibility that legacy systems lacked
- Security must be baked in, not bolted on
To continue your infrastructure management journey, keep building, breaking, and documenting real systems. And the next time someone claims superior knowledge through association, remember: real expertise comes from understanding the systems, not just using them. Now go deploy something properly documented.