01. Why Security Matters
Local AI systems occupy a strange position: they run on-premises (suggesting control) but often expose APIs, process untrusted inputs, and load third-party models. This combination creates a distinct threat landscape.
The local AI threat model differs from cloud AI in three ways:
First, perimeter controls are weaker. Cloud providers maintain network isolation, DDoS mitigation, and WAF rules. A local Ollama instance exposed on a LAN has none of this unless you configure it.
Second, update cycles are longer. Managed cloud services are patched by the provider, often within hours of a fix becoming available. Local deployments may run vulnerable versions for months when no automated update mechanism is in place.
Third, data gravity is higher. Data stays local, which protects confidentiality, but it also means that a single breach exposes everything stored on that system.
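The first point largely comes down to which address a service binds to. As a minimal sketch, the helper below classifies a host:port string as loopback-only or exposed. The name is_loopback_bind is hypothetical and not part of Ollama or any library, and the code assumes a plain host:port form with IPv6 hosts in brackets. Anything it cannot parse is treated as exposed.

```python
import ipaddress


def is_loopback_bind(bind: str) -> bool:
    """Return True if a 'host:port' string names a loopback-only listener.

    Hypothetical helper for illustration; assumes 'host:port' input with
    IPv6 hosts in brackets, e.g. '[::1]:11434'.
    """
    host, sep, _port = bind.rpartition(":")
    if not sep:
        # No port given: treat the whole string as the host.
        host = bind
    host = host.strip("[]")
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Empty or unparseable hosts are treated as exposed, to be safe.
        return False


if __name__ == "__main__":
    for candidate in ("127.0.0.1:11434", "0.0.0.0:11434", "[::1]:11434"):
        status = "loopback-only" if is_loopback_bind(candidate) else "exposed"
        print(candidate, status)
```

Ollama, for example, reads its bind address from the OLLAMA_HOST environment variable. A value such as 0.0.0.0 makes the API reachable from the LAN, which is the situation described in the first point.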
Key incidents shape the threat landscape:
The 2023 Prompt Injection Challenge demonstrated that instruction-following models can be subverted through adversarial prefixes. An attacker who can influence input text (via shared documents, user queries, or retrieval augmentation) can alter model behavior.
Model weight poisoning has been documented in several cases in which pre-trained models contained backdoored behavior activated by specific trigger phrases.
API endpoint exposure leads to unauthorized usage. Unprotected local AI endpoints have been scanned and abused for cryptomining, spam generation, and exfiltration of conversation history.
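For the model weight case, one partial check is to compare a downloaded model file's SHA-256 digest with the value its publisher lists. The sketch below uses only the standard library; the function names are illustrative assumptions. A matching digest shows that the file is the one the publisher released. It does not show that the publisher's weights are free of backdoors.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, published_hex: str) -> bool:
    """Compare the file's digest with a publisher-supplied hex string."""
    # Normalise whitespace and case, since published digests vary in format.
    return sha256_of_file(path) == published_hex.strip().lower()
```

A load script could call matches_published_digest before handing the file to the runtime and refuse to continue on a mismatch.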
The defender's advantage is real. Local deployments have no public-facing attack surface unless you expose them. Closing unused ports, restricting API access, and validating inputs blocks most opportunistic attacks. Security here can be more tractable than in cloud environments because you control the host, the network configuration, and the software that runs on it.
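Restricting API access can be as simple as requiring a shared secret on every request. The following is a minimal sketch, assuming the endpoint sits behind a wrapper or proxy you control that can inspect request headers. The function name is_authorized and the bearer-token scheme are illustrative choices, not a feature of any particular local AI server.

```python
import hmac


def is_authorized(headers: dict, expected_token: str) -> bool:
    """Return True only if the request carries the expected bearer token."""
    if not expected_token:
        # Refuse everything when no token is configured.
        return False
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of a guessed token were correct.
    return hmac.compare_digest(token.encode(), expected_token.encode())
```

The token itself would typically come from an environment variable or a secrets file rather than from source code.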
Local verification checkpoint
Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.
List every network service your local AI stack exposes. Run nmap -sT -p- localhost (a TCP connect scan of all 65,535 ports) and document which ports are listening. For each open port, write one sentence explaining why it must remain open.
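If nmap is not installed, a rough equivalent can be sketched with Python's standard library. The function open_tcp_ports below is a hypothetical helper that attempts a TCP connection to each port and records the ones that accept. It covers only TCP over IPv4 on the given host, so services bound solely to IPv6 or to another interface will not appear.

```python
import socket


def open_tcp_ports(host="127.0.0.1", ports=range(1, 65536), timeout=0.2):
    """Return the ports on host that accept a TCP connection."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an error.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found


if __name__ == "__main__":
    for port in open_tcp_ports():
        print(f"port {port} is listening")
```

Record the output next to the nmap result. Any difference between the two is worth explaining in your notes.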