My First 10‑Inch Rack With a Local LLM: No More Spotify, Google Home, Netflix, or ChatGPT

INTRODUCTION

The rapid convergence of cheap hardware, open‑source tooling, and a growing appetite for self‑hosted services has turned the humble 10‑inch rack into a viable centerpiece for modern homelabs. In this post we unpack the exact scenario described by the title – a compact rack that runs a local large language model (LLM), replaces commercial streaming endpoints such as Spotify, Google Home, Netflix, and ChatGPT, and still manages to host a full smart‑home stack, a Kiwix‑based offline Wikipedia mirror, and a lightweight NAS. The guide is written for seasoned sysadmins and DevOps engineers who already understand the fundamentals of infrastructure automation but are looking to tighten the feedback loop between hardware sizing, service orchestration, and cost‑effective operation.

Why does this matter? Traditional cloud‑first architectures often hide the true operational cost of always‑on services behind managed APIs. When you replace those APIs with locally deployed equivalents you gain:

  • Predictable latency and bandwidth usage
  • Full control over data privacy and retention policies
  • The ability to continue serving critical functionality during internet or cellular outages – a reality for many remote or disaster‑prone locations

Readers will walk away with a concrete blueprint for:

  1. Selecting hardware that fits a 10‑inch footprint while delivering enough headroom for a local LLM and multiple containers
  2. Installing and configuring a stack that includes Docker, Portainer, Kiwix, and a locally hosted LLM inference server
  3. Hardening the environment against common security pitfalls
  4. Monitoring, backing up, and scaling the services without introducing unnecessary complexity

The following sections break down each of these topics in depth, using real‑world commands, configuration snippets, and best‑practice recommendations that you can copy‑paste into your own lab.


UNDERSTANDING THE TOPIC

What is a “local LLM,” and how can it replace Spotify, Google Home, Netflix, and ChatGPT? A large language model (LLM) is a neural network trained on massive text corpora to generate human‑like responses. When deployed locally, you can expose it through an API that your own applications query instead of calling external services. In the context of the title, the local LLM acts as a unified replacement for:

  • Spotify – by serving personalized music recommendations and playlist generation through a self‑hosted recommendation engine
  • Google Home – by handling voice‑triggered commands locally, turning smart‑home devices on/off without relying on Google’s cloud
  • Netflix – by providing a catalog of locally cached media metadata and on‑demand transcoding when paired with a Plex‑style backend
  • ChatGPT – by offering a private, offline conversational interface that never sends user data to third‑party servers

The key benefit is data sovereignty: all interactions stay inside your rack, eliminating the need for outbound API calls that consume bandwidth and expose you to service‑level outages.
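
To make this concrete, here is how an application on your LAN might query the local inference server instead of a cloud API. This is a minimal sketch assuming Ollama (installed later in this guide) is listening on its default port 11434 and that a model has already been pulled; the model name llama3 is illustrative, not prescribed by the original setup.

```bash
# Ask the local LLM for a playlist suggestion -- no data leaves the rack
# (assumes `ollama pull llama3` has been run on the host)
curl -s http://localhost:11434/api/generate \
  -d '{"model": "llama3", "prompt": "Suggest a 10-song dinner playlist.", "stream": false}' \
  | jq -r '.response'
```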

Historical context

The concept of “self‑hosted entertainment” dates back to the early 2000s, when projects like MPlayer and Xine allowed users to play local media without streaming services. The rise of Docker in 2013 made it feasible to package these tools alongside modern web‑based front‑ends, while the advent of open‑source LLM frameworks such as llama.cpp, text‑generation‑webui, and Ollama in 2022–2023 closed the gap between research prototypes and production‑ready inference services. Today, a 10‑inch rack can comfortably run a 7B‑parameter model on a modest CPU‑only platform, thanks to quantized inference and efficient memory management.

Core components of the described stack

| Component | Role | Typical Software | Why it fits a 10‑inch rack |
| --- | --- | --- | --- |
| Beelink ME Mini | Bare‑metal host | Ubuntu Server 22.04 LTS | Low power (≈ 70 W idle), compact dimensions, enough SATA/NVMe slots for NAS and SSD |
| Docker Engine | Container runtime | Docker CE 24.x | Provides isolation, easy image distribution, and resource constraints |
| Portainer | UI for container management | Portainer CE 2.11 | Simplifies monitoring and troubleshooting without SSH fatigue |
| Kiwix | Offline Wikipedia mirror | kiwix‑serve 0.10 | Stores massive text corpora locally, serving via HTTP with minimal CPU load |
| Local LLM inference server | Replaces ChatGPT and other cloud LLMs | Ollama or text‑generation‑webui with a GGML‑quantized model | Runs on CPU with 4‑bit quantization, fitting within an 8 GB RAM envelope |
| Plex / Jellyfin | Media streaming backend | Jellyfin 10.9 | Open‑source alternative to Netflix‑style on‑demand streaming |
| Home Assistant | Smart‑home orchestration | Home Assistant 2024.9 | Controls Google Home‑compatible devices locally |
| NAS layer | Persistent storage | TrueNAS SCALE or OpenMediaVault | Provides SMB/NFS shares for media and backup data |

Each of these pieces can be containerized, allowing you to scale, update, or replace them independently. The rack’s small footprint means you can mount it on a wall, a shelf, or a standard 19‑inch rackmount kit without needing a dedicated server room.
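
Because each service lives in its own Compose file, you can, for instance, upgrade the media backend without disturbing the LLM or the NAS layer. A minimal sketch, assuming a hypothetical jellyfin.yml stack file in the same style as the Portainer and Kiwix files defined later:

```bash
# Pull the newest Jellyfin image and recreate only that container;
# every other service keeps running untouched (jellyfin.yml is hypothetical)
docker compose -f jellyfin.yml pull
docker compose -f jellyfin.yml up -d
```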

Pros and cons of this approach

Pros

  • Energy efficiency – Idle power stays under 70 W, which caps consumption at roughly 600 kWh per year of continuous operation (70 W × 8,760 h ≈ 613 kWh).
  • Resilience – Kiwix and the local LLM continue to serve data during ISP outages.
  • Cost control – No recurring subscription fees for music, video, or AI APIs.
  • Learning opportunity – Hands‑on experience with Docker, network routing, and storage provisioning.

Cons

  • Initial setup time – Requires careful planning of hardware compatibility and network segmentation.
  • Limited scalability – CPU‑only LLMs cannot match the throughput of dedicated GPU clouds for heavy traffic.
  • Feature gaps – Some premium features (e.g., high‑resolution HDR streaming) may be unavailable without commercial licences.

Overall, the trade‑off leans heavily toward self‑sufficiency for hobbyist and semi‑professional use cases.


PREREQUISITES

Before you begin, verify that your hardware meets the minimum specifications. The following checklist assumes a compact mini‑PC such as the Beelink ME Mini mounted in a 10‑inch chassis, but the same principles apply to any low‑profile server.

Hardware requirements

| Item | Minimum | Recommended |
| --- | --- | --- |
| CPU | 4‑core ARMv8 (e.g., Rockchip RK3568) | 8‑core x86‑64 (Intel or AMD) with AES‑NI for faster encryption |
| RAM | 8 GB | 16 GB (to accommodate multiple containers and a 4‑bit LLM) |
| Storage | 256 GB NVMe SSD | 1 TB NVMe SSD (fast random I/O for Kiwix and media cache) |
| Network | Gigabit Ethernet | 2.5 GbE or 10 GbE for future media‑heavy workloads |
| Power supply | 12 V 5 A | 12 V 10 A with UPS integration for outage tolerance |

Software dependencies

| Dependency | Version | Installation source |
| --- | --- | --- |
| Ubuntu Server | 22.04 LTS | Official Ubuntu mirrors |
| Docker Engine | 24.0.x | Docker apt repository |
| Docker Compose | 2.20.x | Bundled with Docker CE |
| Git | 2.34.x | Ubuntu packages |
| curl | 7.88.x | Ubuntu packages |
| jq | 1.6.x | Ubuntu packages |
| ffmpeg | 5.1.x | Ubuntu packages (for media transcoding) |
| Jellyfin | 10.9.x | Official Jellyfin repository |
| Ollama | 0.1.45 | Ollama download page |
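
Everything in the table that ships with Ubuntu can be installed in one pass; Docker, Jellyfin, and Ollama come from their own repositories and are handled in the installation section below.

```bash
# Install the dependencies available from the stock Ubuntu 22.04 archives
sudo apt-get update
sudo apt-get install -y git curl jq ffmpeg
```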

Network and security considerations

  • Static IP assignment – Reserve a DHCP lease or configure a static address (e.g., 192.168.1.10) to simplify firewall rules.
  • Port isolation – Keep management interfaces (Portainer, Home Assistant) on a separate VLAN or a dedicated subnet.
  • Outbound firewall – Block all outbound traffic except DNS and NTP; this enforces the “no external API” philosophy.
  • TLS termination – Use Let’s Encrypt certificates via certbot for any publicly exposed services (e.g., the Kiwix front‑end).
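
To make the outbound‑firewall rule above concrete, here is a minimal ufw sketch. It assumes the 192.168.1.0/24 LAN from the static‑IP example; note that you will need to temporarily allow HTTP/HTTPS egress whenever you pull packages or container images.

```bash
# Default-deny in both directions, then re-open only what the rack needs
sudo ufw default deny incoming
sudo ufw default deny outgoing
sudo ufw allow from 192.168.1.0/24   # LAN clients may reach hosted services
sudo ufw allow out 53                # DNS
sudo ufw allow out 123/udp           # NTP
sudo ufw enable
```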

User permissions

Create a dedicated system user for Docker operations to avoid running containers as root:

```bash
sudo adduser --system --group --no-create-home dockeradmin
sudo usermod -aG docker dockeradmin
```

All subsequent Docker commands in this guide assume you are logged in as dockeradmin or have sudo privileges.
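
Once Docker Engine is installed (step 1 below, which also creates the docker group), a quick check confirms that the new account can reach the daemon without root; a minimal sketch:

```bash
# The docker group membership lets dockeradmin talk to the daemon
id dockeradmin                      # should list the "docker" group
sudo -u dockeradmin docker ps       # should return an (initially empty) list
```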


INSTALLATION & SETUP

The following sections walk you through each layer of the stack, from Docker foundation to the final LLM endpoint. Every command is annotated with explanations of why it is needed and how it interacts with the surrounding components.

1. Install Docker Engine

```bash
# Add Docker’s official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg

# Set up the stable repository
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] \
  https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Update package index and install Docker
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io

# Verify installation
docker version
```

Why this matters: Using the official repository ensures you receive security patches promptly. The docker version output confirms that the daemon (dockerd) and client (docker) are correctly installed.

2. Enable and start Docker service

```bash
sudo systemctl enable --now docker
sudo systemctl status docker
```

Explanation: Enabling the service guarantees that containers start on boot, which is essential for a 24/7 homelab. The status check confirms the daemon is healthy.

3. Install Docker Compose (v2 plugin)

```bash
DOCKER_COMPOSE_VERSION="v2.20.0"
sudo mkdir -p /usr/local/lib/docker/cli-plugins
sudo curl -SL "https://github.com/docker/compose/releases/download/${DOCKER_COMPOSE_VERSION}/docker-compose-linux-$(uname -m)" \
  -o /usr/local/lib/docker/cli-plugins/docker-compose
sudo chmod +x /usr/local/lib/docker/cli-plugins/docker-compose
docker compose version
```

Why: Compose v2 integrates natively with the Docker CLI, allowing you to manage multi‑container stacks with a single docker compose command.

4. Deploy Portainer for visual management

```yaml
# Save as portainer.yml
version: "3.8"
services:
  portainer:
    image: portainer/portainer-ce:latest
    container_name: portainer
    restart: unless-stopped
    ports:
      - "9000:9000"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - portainer_data:/data

volumes:
  portainer_data:
```

```bash
# Create the named volume
docker volume create portainer_data

# Launch Portainer
docker compose -f portainer.yml up -d
```

Key points:

  • container_name: portainer gives the container a stable, human‑readable name, so scripts and logs never have to reference auto‑generated container IDs.
  • The Docker‑socket bind mount (/var/run/docker.sock) grants Portainer direct access to the Docker API, enabling full container management from the web UI.
  • Persistent storage (portainer_data) ensures settings survive container recreation.

After the stack starts, navigate to http://<your‑ip>:9000 and complete the initial admin setup.
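
Before moving on, it is also worth confirming from the CLI that the container is healthy; a quick sketch using the stable name assigned above:

```bash
# Show Portainer's status and make sure port 9000 answers
docker ps --filter name=portainer --format '{{.Names}}: {{.Status}}'
curl -sI http://localhost:9000 | head -n 1
```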

5. Deploy Kiwix for offline Wikipedia

```yaml
# Save as kiwix.yml
version: "3.8"
services:
  kiwix:
    image: kiwix/kiwix-serve:latest
    container_name: kiwix
    restart: unless-stopped
    ports:
      - "8080:8080"
    volumes:
      - /srv/kiwix:/data:ro
    environment:
      - KIWIX_SITES=wiki/en
    command: ["kiwix-serve", "/data/wiki/en/*.zim", "--port", "8080", "--loglevel", "info"]
```

```bash
# Create a directory for the ZIM files
sudo mkdir -p /srv/kiwix/wiki/en

# Download a Wikipedia ZIM archive (example: the English “maxi” build)
curl -L https://download.kiwix.org/zim/wikipedia/enwiki/latest/wikipedia_en_all_maxi.zim \
  -o /srv/kiwix/wiki/en/wikipedia_en_all_maxi.zim
```
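
With a ZIM file in place, the Kiwix stack launches the same way Portainer did; a minimal smoke test might look like this:

```bash
# Start Kiwix and confirm the HTTP endpoint responds
docker compose -f kiwix.yml up -d
curl -sI http://localhost:8080 | head -n 1
```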