RUNLOCALAIv38
Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.coIndependently operated
Local AI on Linux

07. Systemd Service for AI

Chapter 7 of 15 · 20 min
KEY INSIGHT

Systemd turns a manual process run into a managed service with automatic restart, resource limits, and log collection—essential for server deployments.

Running Ollama or llama.cpp as a background process with nohup is fragile. Systemd manages the process lifecycle, restarts it on failure, collects logs, and enforces resource limits.

Create a systemd service for Ollama:

sudo nano /etc/systemd/system/ollama.service

Paste the following unit definition and save the file:

[Unit]
Description=Ollama Service
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=ollama
Group=ollama
ExecStart=/usr/local/bin/ollama serve
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_NUM_PARALLEL=4"
Environment="OLLAMA_GPU_OVERHEAD=0"
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target

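Create the user before enabling. Ollama keeps its keys and models under the service user's home directory (~/.ollama), so the account needs a home it can write to:

sudo useradd --system --user-group --create-home --home-dir /usr/share/ollama --shell /usr/sbin/nologin ollama
sudo systemctl daemon-reload
sudo systemctl enable ollama
sudo systemctl start ollama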
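
A running unit does not guarantee that the HTTP API is answering yet. The sketch below polls the server until it responds. It assumes the default port from the unit file above and Ollama's /api/version endpoint; the function name wait_for_ollama and its arguments are illustrative, not part of Ollama or systemd.

```shell
# Poll the Ollama HTTP API until it answers or the attempts run out.
# Arguments (all optional): URL, number of attempts, delay in seconds.
wait_for_ollama() {
  url="${1:-http://127.0.0.1:11434/api/version}"
  tries="${2:-10}"
  delay="${3:-1}"
  i=1
  while [ "$i" -le "$tries" ]; do
    # -f makes curl fail on HTTP errors, -sS hides progress but keeps errors
    if curl -fsS --max-time 2 "$url" >/dev/null 2>&1; then
      echo "ollama answered after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "no answer from $url after $tries attempts" >&2
  return 1
}

# Usage, right after "sudo systemctl start ollama":
#   wait_for_ollama
```

Polling with a bounded number of attempts avoids a fixed sleep that is either too short on a slow machine or wastefully long on a fast one.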

Verify:

sudo systemctl status ollama
# ● ollama.service - Ollama Service
#    Loaded: loaded (/etc/systemd/system/ollama.service; enabled; vendor preset: enabled)
#    Active: active (running) since Mon 2026-05-25 10:00:00 UTC; 5s ago
journalctl -u ollama -f  # follow logs

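Failure mode: The service fails to start and systemctl status shows status=217/USER. The User=ollama directive references a user that does not exist. Create it first or temporarily use User=root for debugging. A status of 203/EXEC means something different: systemd could not execute the ExecStart binary, usually because the path is wrong or the file is not executable. Confirm the location with command -v ollama and update ExecStart to match.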
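
To tell these start-up failures apart, read the exit status systemd recorded for the main process. The helper below is a minimal sketch that translates a few systemd-specific statuses, as documented in systemd.exec(5), into plain language; the function name explain_exec_status is hypothetical and the list is deliberately incomplete.

```shell
# Map a few systemd-specific exit statuses to their likely cause.
explain_exec_status() {
  case "${1:-}" in
    0)   echo "0: success" ;;
    203) echo "203/EXEC: ExecStart binary missing or not executable" ;;
    216) echo "216/GROUP: Group= could not be resolved or applied" ;;
    217) echo "217/USER: User= could not be resolved or applied" ;;
    *)   echo "${1:-}: not covered here; check systemd.exec(5) and the journal" ;;
  esac
}

# Usage against the real unit:
#   explain_exec_status "$(systemctl show ollama -p ExecMainStatus --value)"
#   getent passwd ollama || echo "user ollama does not exist"
```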

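Failure mode: journalctl -u ollama shows the process was killed by the OOM killer. systemd's MemoryMax defaults to unlimited, so without an explicit limit the kernel only steps in once the whole machine is out of memory, and it may kill other processes as well. If the system has limited RAM and you want to protect other services, add MemoryMax=32G under [Service], sized below physical RAM. The service is then killed when it exceeds its own cap rather than when the host is already starved.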
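
One way to apply the limit without editing the main unit file is a drop-in under ollama.service.d. The sketch below only prints the drop-in text so it can be piped to a root-owned file; the function name memory_dropin and the file name memory.conf are assumptions chosen for illustration.

```shell
# Print a drop-in that caps the service's memory. LIMIT is e.g. 32G.
memory_dropin() {
  limit="${1:-}"
  if [ -z "$limit" ]; then
    echo "usage: memory_dropin LIMIT" >&2
    return 2
  fi
  printf '[Service]\nMemoryMax=%s\n' "$limit"
}

# Usage (requires root):
#   sudo mkdir -p /etc/systemd/system/ollama.service.d
#   memory_dropin 32G | sudo tee /etc/systemd/system/ollama.service.d/memory.conf
#   sudo systemctl daemon-reload && sudo systemctl restart ollama
#   systemctl show ollama -p MemoryMax    # confirm the limit was picked up
```

A drop-in keeps the resource limit separate from the unit definition, so the limit survives if the unit file is later replaced.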

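Failure mode: Service restarts repeatedly with Restart=always. The underlying cause (e.g., port already in use, CUDA out of memory) persists across restarts, so every new start fails the same way. Read the most recent errors with journalctl -u ollama -n 50 and fix the cause first. RestartSec=10 already spaces out restart attempts; adding ExecStartPre=/bin/sleep 2 gives a further short delay before each start if a dependency needs time to settle.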
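
Before reading logs, it helps to confirm that the unit really is looping. The sketch below reads two properties that systemd exposes for services, NRestarts and Result, and prints them on one line. It assumes a systemd version recent enough to report NRestarts; the function name restart_summary is hypothetical.

```shell
# Print how often systemd has restarted a unit and how its last run ended.
restart_summary() {
  unit="${1:-ollama}"
  restarts=$(systemctl show "$unit" -p NRestarts --value)
  result=$(systemctl show "$unit" -p Result --value)
  echo "unit=$unit restarts=$restarts last_result=$result"
}

# Usage:
#   restart_summary ollama
#   journalctl -u ollama -n 50 --no-pager
```

A steadily climbing restart count together with a result other than success points at a persistent fault rather than a one-off crash.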

Local verification checkpoint

Run the smallest example from this chapter on a local machine and record the Ollama version, the systemd version (systemctl --version), the model storage path, and the observed output. If the result depends on model size, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.

EXERCISE

Create a systemd service for Ollama, start it, verify it is running with systemctl status, view logs with journalctl -u ollama, then trigger a simulated crash with kill -9 and observe the automatic restart.
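
The crash step of the exercise can be scripted as follows. This is a sketch, not a reference solution: it compares the unit's MainPID before and after a kill -9, and assumes the wait time exceeds the RestartSec=10 configured earlier. The function name crash_test is illustrative.

```shell
# Kill the unit's main process and report whether systemd started a new one.
# Arguments (optional): unit name, seconds to wait for the restart.
crash_test() {
  unit="${1:-ollama}"
  wait_s="${2:-15}"   # should be longer than RestartSec
  old_pid=$(systemctl show "$unit" -p MainPID --value)
  if [ -z "$old_pid" ] || [ "$old_pid" = "0" ]; then
    echo "$unit is not running" >&2
    return 1
  fi
  sudo kill -9 "$old_pid"
  sleep "$wait_s"
  new_pid=$(systemctl show "$unit" -p MainPID --value)
  if [ -n "$new_pid" ] && [ "$new_pid" != "0" ] && [ "$new_pid" != "$old_pid" ]; then
    echo "restarted: PID $old_pid -> $new_pid"
  else
    echo "not restarted (MainPID=$new_pid)" >&2
    return 1
  fi
}

# Usage:
#   crash_test ollama 15
#   journalctl -u ollama -n 20 --no-pager   # shows the kill and the restart
```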
