She Wants To Become A Sysadmin When She Grows Up: The Definitive Infrastructure Management Guide

INTRODUCTION

The whimsical image of a kitten inspecting network cables sparks a serious conversation about system administration fundamentals. In an era of abstracted cloud services and ephemeral containers, core sysadmin skills remain critical for DevOps practitioners managing modern infrastructure.

This guide covers the infrastructure management techniques every professional needs, whether managing enterprise environments or self-hosted homelabs. We’ll explore why foundational system administration remains relevant even as Kubernetes clusters and serverless architectures absorb more operational complexity.

Why This Matters for DevOps Professionals:

  • 72% of production outages stem from configuration errors (Gartner, 2023)
  • Infrastructure-as-Code (IaC) requires deep understanding of underlying systems
  • Cloud costs spiral when resource management fundamentals are neglected
  • Security breaches often exploit basic hardening oversights

You’ll master:

  • Linux system administration core concepts
  • Network configuration and debugging
  • Security hardening techniques
  • Performance optimization strategies
  • Automation patterns for infrastructure management

UNDERSTANDING SYSTEM ADMINISTRATION IN DEVOPS CONTEXTS

The Evolution of Sysadmin Responsibilities

Traditional system administration focused on:

  1. Physical server maintenance
  2. OS installation and patching
  3. Manual configuration management
  4. Reactive troubleshooting

Modern DevOps-integrated sysadmin roles encompass:

  • Infrastructure automation (Terraform, Ansible)
  • Container orchestration (Kubernetes, Docker Swarm)
  • Cloud resource optimization
  • Security compliance automation
  • Monitoring and observability implementation

Key Technical Competencies

Essential Linux Skills:

# Process management
$ top -p "$(pgrep -d, nginx)"   # top expects a comma-separated PID list

# Filesystem troubleshooting
$ df -Th /var/lib/docker
$ lsof +L1

# Network diagnostics
$ tcpdump -i eth0 -nn 'port 80'
$ ss -tulpn

Network Fundamentals:

| Protocol | Port | Tool | Use Case |
|----------|------|------|----------|
| SSH | 22 | ssh-audit | Secure remote access |
| HTTP/S | 80/443 | curl -v | Web service validation |
| DNS | 53 | dig +trace | Name resolution debug |
| ICMP | N/A | mtr | Network path analysis |

Security Hardening Checklist:

  1. Implement mandatory access control (AppArmor/SELinux)
  2. Configure unattended-upgrades
  3. Enforce SSH key authentication
  4. Set up firewall rules with UFW/iptables
  5. Enable auditd for system monitoring
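
Item 3 of the checklist can be spot-checked with a few lines of shell. The sketch below is a minimal example rather than a full audit: the function name `sshd_password_auth_disabled` is hypothetical, and it assumes a single config file with whitespace-separated keywords, ignoring `Include` directives and `Match` blocks.

```shell
# Hypothetical helper: succeeds (exit 0) only if the given sshd_config
# explicitly sets "PasswordAuthentication no".
# Assumption: only the first uncommented occurrence counts, since sshd
# uses the first value it finds for a global keyword.
sshd_password_auth_disabled() {
    config_file=$1
    # Commented lines start with "#", so their first field never matches.
    value=$(awk 'tolower($1) == "passwordauthentication" { print tolower($2); exit }' "$config_file")
    [ "$value" = "no" ]
}

# Usage on a live system:
#   sshd_password_auth_disabled /etc/ssh/sshd_config || echo "password logins still allowed"
```

On a real host, `sshd -T` prints the effective configuration and is the more reliable source; the function above is only meant for quick checks of a single file.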

DevOps vs Traditional Sysadmin: Key Differences

Workflow Comparison:

| Aspect | Traditional Sysadmin | DevOps Engineer |
|--------|----------------------|-----------------|
| Deployment | Manual | CI/CD pipelines |
| Configuration | Script-based | Declarative IaC |
| Scaling | Vertical (bigger servers) | Horizontal (auto-scaling) |
| Monitoring | Reactive alerts | Proactive observability |
| Change Management | Change review boards | GitOps workflows |

PREREQUISITES FOR MODERN INFRASTRUCTURE MANAGEMENT

Hardware Requirements

Minimum Homelab Specs:

  • CPU: x86_64 4 cores (Intel VT-x/AMD-V support)
  • RAM: 16GB DDR4
  • Storage: 256GB SSD + 2TB HDD
  • Network: 1Gbps NIC

Production Recommendations:

  • Separate management network
  • RAID-10 storage configuration
  • Dual power supplies
  • IPMI/iDRAC for remote management
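
The VT-x/AMD-V requirement in the homelab specs can be checked from the CPU flags. This is a rough sketch: the function name `has_virt_extensions` is made up for illustration, and it assumes x86 `cpuinfo` formatting where Intel VT-x appears as `vmx` and AMD-V as `svm`.

```shell
# Hypothetical helper: reads cpuinfo-formatted text on stdin and succeeds
# if any "flags" line lists vmx (Intel VT-x) or svm (AMD-V).
has_virt_extensions() {
    awk -F: '
        $1 ~ /^flags/ {
            n = split($2, flag, " ")
            for (i = 1; i <= n; i++)
                if (flag[i] == "vmx" || flag[i] == "svm") found = 1
        }
        END { exit !found }
    '
}

# Usage on a live system:
#   has_virt_extensions < /proc/cpuinfo && echo "hardware virtualization available"
```

A missing flag can also mean the feature is disabled in firmware, so a negative result is worth confirming in the BIOS/UEFI settings.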

Software Dependencies

Core Components:

# Ubuntu 22.04 base
$ lsb_release -a

# Docker 24.x+
$ docker version --format '{{.Server.Version}}'

# Python 3.10+
$ python3 --version

# Kernel requirements
$ uname -r # 5.15+
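
The minimum versions listed above can be compared mechanically instead of by eye. The following is a small sketch, assuming a `sort` that supports `-V` (version sort, as in GNU coreutils); `version_at_least` is a hypothetical name.

```shell
# Hypothetical helper: version_at_least ACTUAL MINIMUM
# Succeeds if ACTUAL is greater than or equal to MINIMUM.
# With version sort, MINIMUM comes out first exactly in that case.
version_at_least() {
    actual=$1
    minimum=$2
    [ "$(printf '%s\n%s\n' "$minimum" "$actual" | sort -V | head -n 1)" = "$minimum" ]
}

# Usage on a live system (cut drops suffixes such as "-generic"):
#   version_at_least "$(uname -r | cut -d- -f1)" 5.15 || echo "kernel too old"
```

A plain string comparison would rank 3.9 above 3.10, which is why the sketch leans on version sort.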

Security Tools:

  1. fail2ban for intrusion prevention
  2. ClamAV for malware scanning
  3. rkhunter for rootkit detection
  4. Lynis for system auditing

INSTALLATION & CONFIGURATION WALKTHROUGH

Base System Setup

Partitioning Scheme (GPT):

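# WARNING: mklabel destroys the existing partition table on /dev/sda
$ parted --script --align optimal /dev/sda mklabel gpt
$ parted --script --align optimal /dev/sda mkpart primary 1MiB 512MiB   # /boot
$ parted --script --align optimal /dev/sda mkpart primary 512MiB 100%   # LVM physical volume
$ parted --script /dev/sda set 2 lvm on                                 # flag partition 2 for LVM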

LVM Configuration:

$ pvcreate /dev/sda2
$ vgcreate vg0 /dev/sda2
$ lvcreate -n lv_root -L 50G vg0
$ lvcreate -n lv_var -L 30G vg0
$ lvcreate -n lv_docker -L 100G vg0
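
Before running `lvcreate`, it helps to confirm the planned volumes fit the volume group. The sketch below does the arithmetic only; `lvs_fit` is a hypothetical helper, and the 237 GiB figure is an assumption (roughly what a 256GB SSD offers after the 512MiB /boot partition).

```shell
# Hypothetical helper: lvs_fit VG_SIZE_GIB LV_SIZE_GIB...
# Prints the unallocated remainder in GiB and succeeds only if the
# listed logical volumes fit inside the volume group.
lvs_fit() {
    vg_size=$1
    shift
    total=0
    for lv_size in "$@"; do
        total=$((total + lv_size))
    done
    echo $((vg_size - total))
    [ "$total" -le "$vg_size" ]
}

# Sizes from the lvcreate commands above, against an assumed 237 GiB VG:
#   lvs_fit 237 50 30 100   # prints 57
```

Leaving some space unallocated is deliberate in many LVM layouts, since it allows volumes to be grown or snapshotted later.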

Automated Deployment with Ansible

playbook.yml:

---
- name: Configure base system
  hosts: all
  become: true
  tasks:
    - name: Update apt cache
      apt:
        update_cache: yes
        cache_valid_time: 3600

    - name: Install security packages
      apt:
        name:
          - unattended-upgrades
          - fail2ban
          - auditd
        state: present

    - name: Configure automatic updates
      copy:
        src: files/50unattended-upgrades
        dest: /etc/apt/apt.conf.d/50unattended-upgrades
        owner: root
        group: root
        mode: "0644"

Docker Runtime Configuration

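daemon.json Optimization:

This configuration assumes /var/lib/docker sits on a Btrfs filesystem (use overlay2 on ext4 or XFS instead). With "iptables" set to false, Docker no longer manages NAT and forwarding rules, so you must maintain them yourself.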

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  },
  "storage-driver": "btrfs",
  "live-restore": true,
  "iptables": false,
  "userns-remap": "default"
}
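
The log options above cap disk usage per container, and the worst case for a whole host is a simple product. As a rough sketch (the helper name and the container count are assumptions for illustration):

```shell
# Hypothetical helper: max_log_usage_mb MAX_SIZE_MB MAX_FILE CONTAINERS
# Worst-case json-file log usage in MB: each container keeps at most
# MAX_FILE files of MAX_SIZE_MB each.
max_log_usage_mb() {
    echo $(( $1 * $2 * $3 ))
}

# With max-size 100m, max-file 3, and an assumed 20 containers:
#   max_log_usage_mb 100 3 20   # prints 6000
```

This figure is worth comparing against the size of the volume that holds /var/lib/docker.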

Container Management Commands:

# List containers using Go template placeholders
$ docker ps -a --format 'table {{.ID}}\t{{.Image}}\t{{.Status}}'

# Resource constraints
$ docker run -it --cpus 0.5 --memory 512m nginx:alpine

SECURITY HARDENING & OPTIMIZATION

Kernel Parameter Tuning

/etc/sysctl.d/99-hardening.conf:

# Prevent SYN flood attacks
net.ipv4.tcp_syncookies = 1

# Disable IP forwarding (keep this at 1 on Docker hosts and routers, which rely on forwarding)
net.ipv4.ip_forward = 0

# Restrict core dumps
kernel.core_pattern = |/bin/false

# ASLR protection
kernel.randomize_va_space = 2
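
To verify that a hardening file actually declares the values you expect, the file can be parsed and compared with the running kernel. This is a minimal sketch: `sysctl_conf_value` is a hypothetical helper, and it assumes the simple `key = value` syntax shown above.

```shell
# Hypothetical helper: sysctl_conf_value FILE KEY
# Prints the value assigned to KEY in a sysctl.d-style file (the last
# assignment in the file wins); prints nothing if KEY is absent.
sysctl_conf_value() {
    awk -v key="$2" '
        /^[ \t]*[#;]/ { next }             # skip comment lines
        index($0, "=") == 0 { next }       # skip lines without an assignment
        {
            k = substr($0, 1, index($0, "=") - 1)
            v = substr($0, index($0, "=") + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)  # trim whitespace around the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)  # trim whitespace around the value
            if (k == key) value = v
        }
        END { if (value != "") print value }
    ' "$1"
}

# Usage on a live system: compare the file with the running kernel.
#   want=$(sysctl_conf_value /etc/sysctl.d/99-hardening.conf net.ipv4.tcp_syncookies)
#   [ "$want" = "$(sysctl -n net.ipv4.tcp_syncookies)" ] || echo "drift detected"
```

Settings in the file only take effect after `sysctl --system` or a reboot, which is the usual cause of drift between the file and the live value.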

Mandatory Access Control

AppArmor Docker Profile:

#include <tunables/global>

profile docker-default flags=(attach_disconnected,mediate_deleted) {
  # Prevent container escapes
  deny /proc/[0-9]*/** wklx,
  deny /sys/[^f]*/** wklx,
  deny /sys/f[^s]*/** wklx,
  deny /sys/fs/[^c]*/** wklx,
  
  # Allow standard operations
  capability chown,
  capability dac_override,
  capability net_bind_service,
  
  # Container-specific paths
  /var/lib/docker/** rw,
}

OPERATIONAL WORKFLOWS & MAINTENANCE

Backup Strategy Implementation

BorgBackup Script:

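#!/bin/bash
set -euo pipefail   # stop on errors, unset variables, and failed pipelines

# Placeholder only: in practice, load the passphrase from a root-only file or BORG_PASSCOMMAND
export BORG_PASSPHRASE='strongpassphrase'
DATE=$(date +%Y-%m-%d_%H-%M-%S)

# Options go before the archive name and paths, matching borg's documented syntax
borg create --stats --progress \
  --exclude '/var/cache' \
  --exclude '/var/tmp' \
  "/backup/hostname::system-$DATE" \
  /etc /var /home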

# Retention policy
borg prune -v --list --keep-daily 7 --keep-weekly 4 /backup/hostname

Performance Monitoring Stack

Prometheus Alert Rules:

groups:
- name: system-alerts
  rules:
  - alert: HighLoad
    expr: node_load15 > on(instance) (count by (instance)(node_cpu_seconds_total{mode="idle"}) * 0.8)
    for: 10m
    labels:
      severity: critical
    annotations:
      summary: "High system load (instance {{ $labels.instance }})"
      description: "Load average is > 80% of CPU cores"
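
The same threshold can be checked on a single host without Prometheus, which is handy when validating the rule. The sketch below mirrors the comparison in the alert (load above 80% of the core count); `load_exceeds_threshold` is a hypothetical name.

```shell
# Hypothetical helper: load_exceeds_threshold LOAD15 CORES
# Succeeds when the 15-minute load average is above 80% of the core
# count. awk handles the comparison because the load is fractional.
load_exceeds_threshold() {
    awk -v load="$1" -v cores="$2" 'BEGIN { exit !(load > cores * 0.8) }'
}

# Usage on a live system (field 3 of /proc/loadavg is the 15-minute load):
#   load_exceeds_threshold "$(cut -d' ' -f3 /proc/loadavg)" "$(nproc)" && echo "high load"
```

Unlike the alert rule, this one-off check has no equivalent of `for: 10m`, so it reflects a single sample rather than a sustained condition.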

TROUBLESHOOTING METHODOLOGY

Systematic Debugging Approach

  1. Resource Analysis:
    $ dstat -tcmsdn --top-cpu --top-mem
    $ pidstat 1 5
    
  2. Network Inspection:
    $ tcpretrans -c   # bcc tool (tcpretrans-bpfcc on Ubuntu); it has no interface option
    $ nstat -az | grep -i retrans
    
  3. Storage I/O Analysis:
    $ iostat -dxm 1
    $ iotop -oPa
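
Raw retransmission counters are easier to judge as a ratio. As a small sketch, the function below turns `nstat -az` output into a percentage of sent segments; the name `tcp_retrans_percent` is hypothetical, and it assumes the `TcpOutSegs` and `TcpRetransSegs` counters appear with the value in the second column.

```shell
# Hypothetical helper: reads nstat-style output on stdin and prints
# retransmitted segments as a percentage of all segments sent.
tcp_retrans_percent() {
    awk '
        $1 == "TcpOutSegs"     { out = $2 }
        $1 == "TcpRetransSegs" { retrans = $2 }
        END {
            if (out > 0) printf "%.2f\n", 100 * retrans / out
            else print "0.00"
        }
    '
}

# Usage on a live system:
#   nstat -az | tcp_retrans_percent
```

The counters are cumulative since boot, so the result is a long-run average; sampling twice and comparing gives a better view of a current problem.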
    

Common Issues and Resolutions

Problem: Docker container network connectivity failures
Diagnosis:

$ nsenter -t "$(docker inspect -f '{{.State.Pid}}' "$CONTAINER_ID")" -n ip addr
$ iptables -t nat -L -n -v   # use iptables-legacy only if the host still runs the legacy backend

Solution: Disable firewalld/UFW conflicting with docker iptables rules

Problem: High system load with CPU steal
Diagnosis:

$ mpstat -P ALL 1 5
$ virsh list # Run on the hypervisor host to check for noisy neighbors

Solution: Migrate to dedicated hardware or limit VM CPU allocation
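
The steal figure that `mpstat` reports can also be derived directly from two samples of the aggregate `cpu` line in /proc/stat. This is a sketch under the assumption of the standard field order (user, nice, system, idle, iowait, irq, softirq, steal); `steal_percent` is a hypothetical name.

```shell
# Hypothetical helper: steal_percent "CPU_LINE_BEFORE" "CPU_LINE_AFTER"
# Prints the share of CPU time stolen by the hypervisor between the two
# samples. Fields 2-9 are user..steal; guest time is already part of user.
steal_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            for (i = 2; i <= 9; i++) total += $i
            steal[NR] = $9
            sum[NR] = total
        }
        END {
            elapsed = sum[2] - sum[1]
            if (elapsed > 0) printf "%.1f\n", 100 * (steal[2] - steal[1]) / elapsed
            else print "0.0"
        }
    '
}

# Usage inside the guest:
#   before=$(head -n 1 /proc/stat); sleep 5; after=$(head -n 1 /proc/stat)
#   steal_percent "$before" "$after"
```

Sustained steal of more than a few percent is commonly treated as a sign of host contention, though the acceptable level depends on the workload.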

CONCLUSION

Mastering system administration fundamentals remains essential despite cloud abstractions and automation tooling. This guide has equipped you with practical techniques for:

  • Secure Linux environment configuration
  • Infrastructure automation patterns
  • Performance troubleshooting methodologies
  • Container runtime hardening

Continue Your Learning Journey:

  1. Linux Performance Analysis by Brendan Gregg
  2. CIS Benchmarks for security hardening
  3. Sysadmin Craftsmanship philosophy

The most effective DevOps practitioners combine infrastructure expertise with automation skills. Whether managing enterprise clusters or homelab experiments, these system administration fundamentals form the foundation of reliable, secure, and performant systems.

This post is licensed under CC BY 4.0 by the author.