02. Threat Modeling for AI

Chapter 2 of 16 · 15 min

Threat modeling provides a structured way to identify what you're protecting and against whom. For local AI, the process compresses into four questions.

What are you protecting? A local AI system stores model weights, configuration, conversation history, retrieved documents, and API keys. Classify each asset by sensitivity: public, internal, confidential, or restricted.
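
One way to keep that classification honest is to write it down as data. The sketch below is illustrative only: the tier assigned to each asset is an assumption made for the example, and the names `Sensitivity`, `Asset`, and `at_or_above` are hypothetical helpers, not part of any library.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    """The four tiers from the text, ordered from least to most sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


@dataclass(frozen=True)
class Asset:
    name: str
    sensitivity: Sensitivity


# Assumed classification for illustration; assign tiers to match your own setup.
ASSETS = [
    Asset("model weights", Sensitivity.INTERNAL),
    Asset("configuration", Sensitivity.INTERNAL),
    Asset("conversation history", Sensitivity.CONFIDENTIAL),
    Asset("retrieved documents", Sensitivity.CONFIDENTIAL),
    Asset("API keys", Sensitivity.RESTRICTED),
]


def at_or_above(assets, floor):
    """Return names of assets at tier `floor` or higher, most sensitive first."""
    hits = [a for a in assets if a.sensitivity >= floor]
    ranked = sorted(hits, key=lambda a: a.sensitivity, reverse=True)
    return [a.name for a in ranked]


print(at_or_above(ASSETS, Sensitivity.CONFIDENTIAL))
```

Listing everything at or above "confidential" gives a short first-pass list of what the rest of the threat model should focus on.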

Who wants it? Threat actors for local AI fall into three categories. Opportunists run automated scanners looking for open AI APIs whose compute they can abuse. Targeted attackers probe specific organizations for data theft or competitive advantage. Insiders, such as employees and contractors, may access systems beyond their authorization.

How will they attack? MITRE's ATT&CK framework, adapted for AI systems as MITRE ATLAS, identifies the primary vectors: prompt injection (manipulating inputs), model evasion (adversarial examples), data poisoning (corrupting training or retrieval data), and exfiltration (extracting model weights or conversation logs).

What controls exist? Map each control to the threat it addresses. Rate limiting caps how much compute an abusive caller can consume. Authentication keeps unidentified callers out. Encryption protects data at rest and in transit. Input validation blocks known attack patterns.
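
As a rough illustration of two of these controls working together, the sketch below checks an API key and then applies a per-caller token bucket. It is a minimal example under stated assumptions, not a production design: `API_KEYS`, `authenticate`, `TokenBucket`, and `admit` are hypothetical names, and the in-memory key table stands in for whatever secret store you actually use.

```python
import hmac
import time

# Hypothetical key table for illustration; keep real keys outside source code.
API_KEYS = {"alice": "example-key-alice"}


def authenticate(user, presented_key):
    """Compare keys in constant time so timing does not leak key contents."""
    expected = API_KEYS.get(user)
    if expected is None:
        return False
    return hmac.compare_digest(expected.encode(), presented_key.encode())


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


_buckets = {}


def admit(user, presented_key, rate=1.0, capacity=5):
    """Authenticate first, then charge the caller's own bucket."""
    if not authenticate(user, presented_key):
        return False
    if user not in _buckets:
        _buckets[user] = TokenBucket(rate, capacity)
    return _buckets[user].allow()
```

Authenticating before rate limiting means each bucket is tied to a known caller, so one abusive key cannot exhaust the allowance of another.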

A practical framework for local AI:

Start with a data flow diagram. Show how user input enters the system, where it touches the model, what external data sources participate, and where outputs go. Annotate each step with its trust level by asking: what happens if this component is compromised?
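
A data flow diagram can also be captured as a small graph, which makes it easy to list the flows that cross a trust boundary. The components, flows, and trust labels below are assumptions invented for this example (including treating the document store as untrusted), and `Component`, `Flow`, and `boundary_crossings` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    trusted: bool  # True if the component sits inside your own trust boundary


@dataclass(frozen=True)
class Flow:
    source: str
    target: str
    data: str


# Assumed layout of a small local AI setup; replace with your own diagram.
COMPONENTS = {c.name: c for c in [
    Component("user", trusted=False),
    Component("api gateway", trusted=True),
    Component("model runtime", trusted=True),
    Component("document store", trusted=False),
    Component("response log", trusted=True),
]}

FLOWS = [
    Flow("user", "api gateway", "prompt"),
    Flow("api gateway", "model runtime", "validated prompt"),
    Flow("document store", "model runtime", "retrieved documents"),
    Flow("model runtime", "api gateway", "completion"),
    Flow("api gateway", "response log", "prompt and completion"),
    Flow("api gateway", "user", "completion"),
]


def boundary_crossings(components, flows):
    """Return the flows whose two endpoints have different trust levels."""
    return [
        f for f in flows
        if components[f.source].trusted != components[f.target].trusted
    ]


for f in boundary_crossings(COMPONENTS, FLOWS):
    print(f"{f.source} -> {f.target}: {f.data}")
```

The flows this prints are the ones that deserve the closest look, since each carries data between components with different trust levels.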

STRIDE works well for AI systems:

  • Spoofing: Can an attacker impersonate a valid user or API key?
  • Tampering: Can input manipulation alter model behavior or system state?
  • Repudiation: Can someone deny sending a prompt that caused problems?
  • Information disclosure: Can unauthorized parties access conversation history or model weights?
  • Denial of service: Can attackers consume resources and block legitimate use?
  • Elevation of privilege: Can a standard user gain admin access to the AI service?
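
The six questions above can be recorded as a simple worksheet. In this sketch the components and scores are made up for illustration, and scoring each threat as likelihood times impact (1 to 3 each) is an assumption added here, not something STRIDE itself prescribes. `WORKSHEET` and `highest_risk` are hypothetical names.

```python
STRIDE = {
    "S": "Spoofing",
    "T": "Tampering",
    "R": "Repudiation",
    "I": "Information disclosure",
    "D": "Denial of service",
    "E": "Elevation of privilege",
}

# Hypothetical worksheet: component -> {STRIDE letter: likelihood * impact}.
# Only the categories judged applicable to a component are listed.
WORKSHEET = {
    "api gateway": {"S": 6, "R": 2, "D": 6},
    "model runtime": {"T": 6, "D": 4},
    "conversation store": {"T": 3, "I": 9},
}


def highest_risk(worksheet):
    """Return (component, category name, score) for the top-scoring entry."""
    component, letter, score = max(
        (
            (comp, cat, value)
            for comp, threats in worksheet.items()
            for cat, value in threats.items()
        ),
        key=lambda entry: entry[2],
    )
    return component, STRIDE[letter], score


print(highest_risk(WORKSHEET))
```

With these example scores the top entry is information disclosure from the conversation store, which is where a mitigation plan would start.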

EXERCISE

Draw a data flow diagram for your local AI setup. Label each component with STRIDE threat categories that apply. Identify the single highest-risk component and write one sentence explaining your mitigation plan.