Self-Hosted Security: The 6-Layer Model

When you host AI services yourself, you are responsible for security. No cloud provider catches mistakes for you. Here are the 6 layers that protect your infrastructure.

Reading time: 15 min · Last updated: March 2026
📋 At a Glance

Self-hosting means full control, but also full responsibility. Security is not a single feature but a layered model: 6 layers from physical infrastructure to monitoring. Each layer stops something different. None is sufficient on its own.

The 6 Security Layers

Each layer addresses a different attack surface. If Layer 1 fails, Layer 2 must hold. This is Defense in Depth.

[Figure: 6-Layer Security Model. Security in layers: each level protects against different attack vectors.]

| Layer | Protects Against | Tools |
| --- | --- | --- |
| 1. Network | Unauthorized external access | Firewall (UFW/iptables), VLAN, VPN |
| 2. SSH & Authentication | Brute force, weak passwords | SSH key-only, fail2ban, 2FA |
| 3. Host Operating System | Outdated software, kernel exploits | Unattended upgrades, CIS Benchmark |
| 4. Containers & Services | Privilege escalation, unsecured APIs | Rootless containers, read-only FS, secrets |
| 5. Application | Prompt injection, data leakage | Input validation, output sanitizer, rate limits |
| 6. Monitoring & Response | Undetected intrusions | Loki, Grafana alerts, audit logs |

Layer 1: Network Segmentation

Your local AI stack should NOT be directly reachable from the internet. The most important rule: Default Deny β€” everything is blocked, you only open what is needed.

UFW Basic Configuration

# Block everything (Default Deny)
sudo ufw default deny incoming
sudo ufw default allow outgoing

# Allow SSH only from the local network (example subnet; adjust to your LAN)
sudo ufw allow from 10.40.10.0/24 to any port 22

# Reverse Proxy (HTTPS)
sudo ufw allow 443/tcp

# Enable firewall
sudo ufw enable
sudo ufw status verbose
⚠️ Do NOT expose ports publicly

Services like Ollama (11434), n8n (5678), Grafana (3000), PostgreSQL (5432) do NOT belong on the internet. If you need external access: VPN or reverse proxy with authentication (e.g., Cloudflare Tunnel or Traefik with BasicAuth).
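If you do put a service behind a reverse proxy with BasicAuth, the gate can be expressed as compose labels. A minimal sketch for Traefik v2; the hostname, service, and user hash are illustrative (generate your own with htpasswd -nB youruser, and remember that $ must be doubled to $$ inside compose files):

```yaml
# docker-compose.yml (excerpt) — BasicAuth middleware in front of Grafana
services:
  grafana:
    image: grafana/grafana
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.grafana.rule=Host(`grafana.example.com`)"
      - "traefik.http.routers.grafana.entrypoints=websecure"
      - "traefik.http.routers.grafana.middlewares=grafana-auth"
      # Replace with your own htpasswd output ($ doubled to $$)
      - "traefik.http.middlewares.grafana-auth.basicauth.users=admin:$$apr1$$examplehash"
```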

Layer 2: SSH & Authentication

SSH is the main access point to your servers. Misconfigured, it becomes the biggest attack vector.

Hardening SSH (/etc/ssh/sshd_config)

# Disable password login
PasswordAuthentication no

# Root login with keys only ("no" disables root login entirely)
PermitRootLogin prohibit-password

# Allow specific users only
AllowUsers joe admin

# Disconnect unresponsive sessions after ~5 minutes
# (a CountMax of 0 would disable termination; keepalives check
# connection liveness, not user idleness)
ClientAliveInterval 300
ClientAliveCountMax 1

# Validate, then restart: sudo sshd -t && sudo systemctl restart sshd

Install fail2ban

# Installation
sudo apt install fail2ban

# Create a local override, then enable the SSH jail there
sudo cp /etc/fail2ban/jail.conf /etc/fail2ban/jail.local

# In jail.local:
# [sshd]
# enabled = true
# maxretry = 3
# bantime = 3600

sudo systemctl enable fail2ban
sudo systemctl start fail2ban

# Check status
sudo fail2ban-client status sshd
ℹ️ Generate SSH Keys

If you do not have SSH keys yet: ssh-keygen -t ed25519 -C "you@example.com". Copy the public key to the server: ssh-copy-id user@server. Then disable password login.

Layer 3 & 4: Host & Container Security

Host System

| Measure | Command / Configuration | Why |
| --- | --- | --- |
| Auto-updates | sudo apt install unattended-upgrades | Automatically apply security patches |
| Non-root user | sudo adduser deploy && sudo usermod -aG docker deploy | Day-to-day work without root (note: docker group membership is itself root-equivalent) |
| Kernel updates | sudo apt upgrade linux-generic | Close kernel exploits |
| Unnecessary services | sudo systemctl disable bluetooth cups | Reduce attack surface |
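The auto-update row relies on two apt settings. A minimal sketch of /etc/apt/apt.conf.d/20auto-upgrades, the file that dpkg-reconfigure unattended-upgrades generates; the values shown are the common defaults:

```
# /etc/apt/apt.conf.d/20auto-upgrades
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
```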

Container Security

| Principle | Implementation | Why |
| --- | --- | --- |
| Read-only root | read_only: true in compose | Prevents filesystem manipulation |
| No root in container | user: "1000:1000" in compose | Container runs as a normal user |
| Secrets management | Docker Secrets or Vault | No credentials in environment variables |
| Resource limits | mem_limit: 4g, cpus: "2.0" | Container cannot overwhelm the host |
| Network isolation | Dedicated Docker networks per stack | Services only see what they need |
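These principles can be sketched in a single compose service. The service and image names are hypothetical; tune the limits to your workload:

```yaml
# docker-compose.yml (excerpt) — container hardening sketch
services:
  api:
    image: my-api:latest        # illustrative image name
    user: "1000:1000"           # run as an unprivileged user
    read_only: true             # root filesystem cannot be modified
    tmpfs:
      - /tmp                    # writable scratch space despite read-only root
    mem_limit: 4g               # container cannot exhaust host memory
    cpus: "2.0"                 # cap CPU usage
    networks:
      - api-net                 # dedicated network for this stack
networks:
  api-net: {}
```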
⚠️ Swagger/Docs in Production

Many frameworks (FastAPI, Express) serve API documentation at /docs or /swagger automatically. In production, disable it; in FastAPI, pass docs_url=None, redoc_url=None, and openapi_url=None. Attackers use these endpoints to map your API structure.

Layer 5: Application Security for AI

AI applications have unique security risks. LLMs can be manipulated (prompt injection) and leak sensitive data in their responses.

| Risk | Description | Countermeasure |
| --- | --- | --- |
| Prompt injection | User manipulates LLM instructions | System prompt isolation, input validation |
| Data leakage | LLM outputs environment variables or secrets | Output sanitizer (mandatory) |
| Token theft | API keys intercepted | Vault, token rotation, rate limiting |
| Model poisoning | Manipulated models from insecure sources | Only official sources (ollama.com, HuggingFace verified) |
| Resource exhaustion | Excessively long prompts/contexts | Max token limits, request timeouts |
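The input-validation and token-limit rows can be sketched as a small pre-flight check. This is a deliberately naive deny-list heuristic, not a complete injection defense; the phrase list, length limit, and function name are illustrative, and regex filters alone are bypassable:

```python
import re

MAX_PROMPT_CHARS = 8000  # illustrative limit; tune to your model's context window

# Naive deny-list of instruction-override phrases
SUSPICIOUS = re.compile(
    r"(ignore (all )?previous instructions|disregard the system prompt)",
    re.IGNORECASE,
)

def validate_prompt(prompt: str) -> tuple[bool, str]:
    """Return (ok, reason); reject oversized or obviously malicious input."""
    if len(prompt) > MAX_PROMPT_CHARS:
        return False, "prompt too long"
    if SUSPICIOUS.search(prompt):
        return False, "possible prompt injection"
    return True, "ok"
```

In practice this runs before the request ever reaches the model, so oversized or hostile prompts are rejected without spending tokens.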
⚠️ Output Sanitizer is Mandatory

LLMs can be tricked into outputting environment variables, API keys, or system information. EVERY response must pass through a sanitizer that detects and removes patterns like API keys, IP addresses, and file paths. This is not a nice-to-have; it is mandatory for production.
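A sanitizer of this kind can be sketched as a chain of redaction patterns. The patterns below are illustrative starting points, not a complete secret-detection suite, and the path pattern is deliberately aggressive (it will also hit URLs):

```python
import re

# Illustrative redaction patterns: (regex, replacement)
PATTERNS = [
    (re.compile(r"sk-[A-Za-z0-9]{20,}"), "[REDACTED_API_KEY]"),     # OpenAI-style keys
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "[REDACTED_IP]"),  # IPv4 addresses
    (re.compile(r"(?:/[\w.-]+){2,}"), "[REDACTED_PATH]"),           # Unix file paths
]

def sanitize_output(text: str) -> str:
    """Redact API-key-, IP-, and path-like strings from an LLM response."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Run every model response through this function before it leaves your backend, regardless of which client asked.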

Layer 6: Monitoring & Incident Response

Security without monitoring is blind. You need to know what happens on your systems, in real time.

| What to Monitor | Tool | Alert Condition |
| --- | --- | --- |
| SSH login attempts | fail2ban + Grafana | More than 5 failed logins/hour |
| Container health | Docker health checks + Uptime Kuma | Container unhealthy or stopped |
| Disk usage | Prometheus Node Exporter | Disk > 85% full |
| API errors | Loki log queries | HTTP 5xx rate > 1% of requests |
| Unexpected processes | Audit logs (auditd) | New process from unknown user |
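The disk-usage row can be expressed as a Prometheus alerting rule. A sketch assuming Node Exporter metrics; the threshold, duration, and label filters are illustrative:

```yaml
# alerts.yml — fire when a filesystem is more than 85% full for 10 minutes
groups:
  - name: security
    rules:
      - alert: DiskAlmostFull
        expr: (node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes) < 0.15
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Disk more than 85% full on {{ $labels.instance }}"
```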
💡 Security Dashboard in Grafana

Create a dedicated security dashboard with fail2ban bans, SSH logins, API error rates, and disk usage at a glance. You can see in 5 seconds if something is off. Our Grafana Monitoring Guide covers setting up security panels.

Key Takeaways

  • ✓ Security is a layered model: 6 layers, each stops something different. None is sufficient alone.
  • ✓ Default Deny at the firewall. Only open what is needed. AI ports do NOT belong on the internet.
  • ✓ SSH key-only + fail2ban. No password login, automatic ban after 3 failed attempts.
  • ✓ Output sanitizer is mandatory for AI applications. LLMs will leak secrets and system info otherwise.
  • ✓ Monitoring with alerting. Without monitoring, you will not know if someone is already inside.
