Asked Our Head of Sales If Putting Client Addresses in ChatGPT Was Data Sharing. She Looked at Me Like I Was the Idiot

INTRODUCTION

In a modern homelab or self‑hosted environment, the line between personal productivity tools and enterprise‑grade infrastructure can blur surprisingly quickly. Imagine a sales executive polishing client outreach emails with an AI assistant, only to discover that the very same assistant has been fed a prompt containing a client’s full name, deal size, internal pricing strategy, and even a residential address. The immediate reaction is often a dismissive “I’m just asking for help with wording,” but the underlying question — is this data sharing? — has profound implications for security, compliance, and overall infrastructure design.

This guide unpacks that scenario from a DevOps perspective, focusing on how to architect a safe, auditable, and performant AI workflow that respects data boundaries while still delivering the productivity gains that teams expect. Readers will explore the fundamentals of data classification, the mechanics of secure model deployment, and practical steps to harden a homelab stack against accidental leakage. By the end, you will have a clear roadmap for turning a casual conversation into a concrete, policy‑driven practice that aligns with both technical best practices and regulatory expectations.

Key takeaways include:

  • Understanding why seemingly innocuous prompts can constitute data sharing in a regulated context.
  • Learning how to select and deploy open‑source language models that run entirely within your own network.
  • Implementing configuration patterns that isolate sensitive inputs from external services.
  • Applying security hardening techniques to prevent accidental exposure of client‑specific data.
  • Leveraging monitoring and logging to maintain visibility over AI‑driven processes.

The following sections provide a deep dive into the technology, prerequisites, installation, configuration, and operational workflows needed to embed privacy‑first AI into a homelab or small‑scale production environment.

UNDERSTANDING THE TOPIC

What Is “Data Sharing” in the Context of AI‑Assisted Communication?

When a user supplies a prompt that contains personally identifiable information (PII) or proprietary business data to a third‑party AI service, the service may process that data for its own training, caching, or analytics purposes. Even if the service claims to keep the data confidential, the act of transmitting it outside of your controlled environment qualifies as data sharing under most corporate policies and data‑privacy regulations such as GDPR, CCPA, or industry‑specific standards.

In the scenario described, the sales head used a public‑facing ChatGPT interface to refine email copy. The prompt included:

  • Full client name
  • Deal size (confidential revenue figures)
  • Internal pricing strategy (confidential discount rules)
  • Client home address (PII)

Each of these elements belongs to a distinct classification tier:

| Data Type | Classification | Typical Handling |
| --- | --- | --- |
| Full Name | PII (Personally Identifiable Information) | Encrypted at rest, limited access |
| Deal Size | Confidential Business Information | Restricted to finance/leadership |
| Pricing Strategy | Proprietary Intellectual Property | Guarded as trade secret |
| Home Address | PII (Sensitive) | Must be masked or omitted |

When these items are sent to an external model, they become shared with the service provider, potentially violating internal data‑handling policies. The head of sales’ dismissal stems from a misunderstanding of the technical flow: the model does not simply “polish” text; it ingests the entire prompt, processes it through a neural network, and may store intermediate representations for future use.
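
One concrete mitigation is a local scrub step that masks obvious PII before a prompt goes anywhere. The POSIX shell sketch below is illustrative only: the regexes are assumptions and nowhere near a complete PII detector, but they show the principle of sanitizing a prompt on your own machine before it reaches any external service.

```bash
#!/bin/sh
# scrub_prompt: mask obvious PII patterns in a prompt before it leaves
# the machine. The patterns are illustrative, not exhaustive.
scrub_prompt() {
  printf '%s\n' "$1" \
    | sed -E 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/[EMAIL]/g' \
    | sed -E 's/\$[0-9][0-9,]*(\.[0-9]{2})?/[AMOUNT]/g' \
    | sed -E 's/[0-9]+ [A-Za-z]+ (Street|St|Avenue|Ave|Road|Rd)/[ADDRESS]/g'
}

scrub_prompt "Email jane@acme.com about the \$120,000 deal at 42 Oak Street"
# prints: Email [EMAIL] about the [AMOUNT] deal at [ADDRESS]
```

In a real pipeline this step would sit in front of whatever client sends the prompt, with patterns tuned to your own data (client names, account numbers, and so on).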

Historical Perspective on Secure AI Deployment

The concept of running AI models locally is not new. Open‑source initiatives such as GPT‑Neo, and later llama.cpp, demonstrated that large language models could be quantized and executed on commodity hardware. However, the explosion of interest in conversational AI over the past few years has shifted the focus toward hosted APIs, which offer convenience at the cost of data sovereignty.

Key milestones:

  • 2019 – Release of GPT‑2 and subsequent community forks, sparking experimentation with local inference.
  • 2023 – Introduction of Ollama, a lightweight tool that packages models behind a single command, making local deployment accessible to sysadmins.
  • 2023 – Emergence of Text Generation WebUI, a web‑based interface that supports multiple model back‑ends and provides fine‑grained control over context windows and memory.
  • 2024 – Standardization of Quantization‑Aware Training (QAT) and GGUF formats, enabling 4‑bit and 5‑bit inference on ARM‑based homelab nodes.

These developments have converged on a single practical goal: run the model where the data lives. By doing so, you eliminate the need to transmit sensitive prompts to external endpoints, thereby preventing inadvertent data sharing.

Key Features of a Self‑Hosted AI Stack

| Feature | Benefit | Typical Implementation |
| --- | --- | --- |
| On‑Premise Execution | No external network traffic for prompts | Deploy via Docker or systemd services on local nodes |
| Model Quantization | Reduces RAM/CPU footprint, enables low‑power hardware | Use GGUF or AWQ formats, load with llama.cpp or text-generation-webui |
| Context Isolation | Guarantees that each request operates on a fresh context | Reset conversation history after each inference |
| Fine‑Grained Access Control | Limits who can invoke the model and with what parameters | Integrate with LDAP or OAuth2 proxies, enforce role‑based policies |
| Audit Logging | Provides traceability for every inference request | Forward logs to centralized syslog or Loki, tag with request IDs |

Understanding these capabilities helps translate the abstract notion of “data sharing” into concrete technical controls that can be enforced at the infrastructure level.
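
To make the audit‑logging row concrete, here is a minimal sketch of a wrapper that stamps every inference call with a request ID before forwarding it. The endpoint URL and model name are placeholders for your own deployment, and `logger` writes the audit line to the local syslog:

```bash
#!/bin/sh
# audit_tag: build the log line recorded for one inference request:
# a unique-ish request ID, the calling user, and the prompt length.
audit_tag() {
  printf 'req=%s-%s user=%s chars=%s' "$(date +%s)" "$$" "$(id -un)" "${#1}"
}

# infer: log the request, then forward the prompt to a local endpoint.
# The URL and model name are placeholders, not part of the original guide.
infer() {
  logger -t ai-audit "$(audit_tag "$1")"
  curl -s -X POST http://localhost:11434/api/generate \
    -H "Content-Type: application/json" \
    -d "{\"model\": \"llama3\", \"prompt\": \"$1\"}"
}
```

Forwarding these tagged lines to Loki or a central syslog server then gives per‑request traceability without ever recording the prompt contents themselves.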

Comparison with Alternative Approaches

| Approach | Data Exposure | Operational Complexity | Cost | Typical Use Case |
| --- | --- | --- | --- | --- |
| Public API (e.g., ChatGPT) | High – data leaves premises | Low – just an HTTP call | Pay‑per‑token | Quick prototyping, non‑sensitive tasks |
| Managed Cloud LLM (e.g., Azure OpenAI) | Medium – governed by service SLA | Medium – VNet integration required | Subscription | Enterprise‑wide analytics with compliance agreements |
| Self‑Hosted Model (Ollama, Text Generation WebUI) | Low – data never leaves network | High – requires DevOps expertise | Variable (hardware‑dependent) | Regulated environments, PII handling |

The self‑hosted route is the only approach that keeps prompts entirely within your own network, making it the safest choice when dealing with client‑specific details.

PREREQUISITES

Before embarking on a secure AI deployment, verify that your environment meets the following baseline requirements.

| Requirement | Minimum Specification | Recommended Specification |
| --- | --- | --- |
| CPU | 4‑core modern x86_64 (e.g., Intel i5‑12400) | 8‑core AMD Ryzen 7 or Intel i7‑12700K |
| RAM | 8 GB | 32 GB (for 7B‑parameter models in 4‑bit) |
| GPU | None (CPU‑only inference) | NVIDIA RTX 3060 12 GB or better (CUDA 12.x) |
| Storage | 100 GB SSD | 500 GB NVMe SSD (fast model loading) |
| OS | Ubuntu 22.04 LTS or Debian 12 | Ubuntu 24.04 LTS (latest kernel) |
| Docker | Docker Engine 24.0+ | Docker Engine 24.0+ with containerd 1.7+ |
| Network | Outbound internet for model pulls (once) | Fully isolated LAN for production workloads |
| User Permissions | Non‑root user with sudo rights | Dedicated aiadmin group with restricted sudo |

Dependency Checklist

  1. Docker Engine – Install from the official Docker repository.
  2. Docker Compose – Version 2.20+ for multi‑service orchestration.
  3. git – To clone model repositories.
  4. curl – For health‑checks and API calls.
  5. jq – JSON processing in scripts.
  6. systemd – For service persistence (optional but recommended).
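
The checklist above can be confirmed in one pass with a short script. This is a sketch that only checks the binaries are on the `PATH`; Compose v2 ships as a Docker CLI plugin, so it is probed separately:

```bash
#!/bin/sh
# check_tools: report which of the given commands are installed;
# returns non-zero if anything is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "MISSING: $tool"
      missing=1
    fi
  done
  return $missing
}

check_tools docker git curl jq || echo "install the missing tools before continuing"
# Compose v2 is a Docker CLI plugin, not a standalone binary
docker compose version >/dev/null 2>&1 || echo "MISSING: docker compose plugin"
```

Running this before installation avoids discovering a missing dependency halfway through the container setup.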

Network & Security Considerations

  • Outbound Access: Allow HTTP/HTTPS to model registries (e.g., `ghcr.io` and the Ollama registry). Once models are cached, block all outbound traffic to prevent accidental leaks.
  • Inbound Access: Restrict ports to localhost or a trusted reverse‑proxy subnet. Typical exposure: 8080 for the web UI, 8081 for the API endpoint.
  • Firewall Rules: Use ufw or iptables to enforce a default deny stance on all interfaces except those explicitly opened.
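
Under those constraints, a `ufw` baseline might look like the following sketch. The subnet is a placeholder for your trusted reverse‑proxy network, and outbound access stays open only until the models are cached:

```bash
# Default stance: deny all inbound, allow outbound for now
# (switch outbound to deny once models are cached locally)
sudo ufw default deny incoming
sudo ufw default allow outgoing

# Web UI (8080) and API (8081) reachable only from the trusted
# reverse-proxy subnet; 192.168.10.0/24 is a placeholder range
sudo ufw allow from 192.168.10.0/24 to any port 8080 proto tcp
sudo ufw allow from 192.168.10.0/24 to any port 8081 proto tcp

sudo ufw enable
```

The same policy can be expressed directly in iptables or nftables if you prefer; the essential property is that nothing is reachable unless a rule explicitly opens it.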

User & Permission Setup

```bash
# Create a dedicated group for AI services
sudo groupadd aiadmin

# Add your admin user to the group
sudo usermod -aG aiadmin $USER

# Ensure the Docker socket is accessible to the group
sudo chgrp aiadmin /var/run/docker.sock
sudo chmod 770 /var/run/docker.sock
```
After these steps, any member of aiadmin can run Docker commands without sudo, while remaining separate from regular system users. Bear in mind that access to the Docker socket is effectively root‑equivalent, so grant aiadmin membership with the same caution as sudo rights.

INSTALLATION & SETUP

The following sections walk through a complete, reproducible installation of a self‑hosted AI stack using Ollama and Text Generation WebUI. Both tools are open‑source, actively maintained, and support a wide range of model formats.

### 1. Pulling the Base Images

```bash
# Pull the official Ollama image (latest stable)
docker pull ollama/ollama:latest

# Pull the Text Generation WebUI image (includes a lightweight web server)
docker pull ghcr.io/oobabooga/text-generation-webui:latest
```

> **Note**: Both images bundle the runtime dependencies they need, so no additional packages are required on the host.

### 2. Creating a Persistent Data Volume  

```bash
# Create a Docker volume for model caches and configuration
docker volume create ai_models
```

The volume will be mounted into the containers at `/models` (Ollama) and `/text-generation-webui/models` (WebUI).  

### 3. Running Ollama in a Restricted Mode

```bash
docker run -d \
  --name ollama \
  --restart unless-stopped \
  --cpus="4.0" \
  --memory="8g" \
  --volume ai_models:/models \
  --env OLLAMA_MODELS=/models \
  --network host \
  --env OLLAMA_HOST=0.0.0.0 \
  --env OLLAMA_ORIGINS="*" \
  ollama/ollama:latest \
  serve
```

  • `--network host` binds the container to the host network, allowing the web UI to reach the Ollama API via localhost (default port 11434).
  • `--cpus` and `--memory` limit resource consumption, preventing the container from monopolizing the host.
  • `OLLAMA_MODELS=/models` points Ollama at the mounted volume so downloaded models persist across container rebuilds.
  • `--restart unless-stopped` ensures the service survives reboots.

### 4. Deploying Text Generation WebUI
```bash
docker run -d \
  --name webui \
  --restart unless-stopped \
  --cpus="4.0" \
  --memory="12g" \
  --volume ai_models:/text-generation-webui/models \
  --volume ./webui-config:/text-generation-webui/config \
  --network host \
  ghcr.io/oobabooga/text-generation-webui:latest \
  --api \
  --listen \
  --port 8080 \
  --log-level info
```

  • `--api` enables the REST endpoint at `/api/v1/generate`.
  • `--listen` binds the service to all interfaces so other machines on the LAN, or a reverse proxy, can reach it.
  • `--port 8080` exposes the UI on port 8080; you can front it with Nginx for TLS termination if needed.

### 5. Verifying the Deployment

```bash
# Check container status
docker ps --filter "name=ollama" --filter "name=webui"

# Test Ollama inference (replace "vicuna" with your chosen model)
curl -X POST http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{"model": "vicuna", "prompt": "Draft a one-sentence greeting."}'
```
This post is licensed under CC BY 4.0 by the author.