RUNLOCALAIv38
->Will it run?Best GPUCompareTroubleshootStartLearnPulseModelsHardwareToolsBench
Run check
RUNLOCALAI

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
DIR
  • Models
  • Hardware
  • Tools
  • Benchmarks
TOOLS
  • Will it run?
  • Compare hardware
  • Cost vs cloud
  • Choose my GPU
  • Prompting kits
  • Quick answers
REF
  • All buyer guides
  • Learn local AI
  • Methodology
  • Glossary
  • Errors KB
  • Trust
EDITOR
  • About
  • Author
  • How we make money
  • Editorial policy
  • Contact
LEGAL
  • Privacy
  • Terms
  • Sitemap
MAIL · MONTHLY DIGEST
Get monthly local AI changes
Monthly recap. No spam.
DISCLOSURE

Some links on this site are affiliate links (Amazon Associates and other first-class retailers). When you buy through them, we earn a small commission at no extra cost to you. Affiliate links do not influence our verdicts — there are cards we rate highly that we don't have affiliate relationships with, and cards that sell well that we refuse to recommend. Read more →

© 2026 runlocalai.coIndependently operated
RUNLOCALAI · v38
Glossary / Specialized domains / Game AI
Specialized domains

Game AI

Game AI refers to the algorithms and systems that control non-player characters (NPCs), opponents, and procedural content in video games. Unlike general AI (e.g., LLMs), game AI is optimized for real-time performance, determinism, and low latency, often using finite state machines, behavior trees, or pathfinding (A*). Operators running local AI may encounter game AI when using LLMs to generate dialogue or narratives, but core game AI remains separate—it runs on CPU/GPU with strict frame-time budgets (e.g., <1 ms per frame).

Deeper dive

Game AI has evolved from simple rule-based systems (e.g., Pac-Man ghosts) to complex behavior trees (e.g., Halo's Elites) and reinforcement learning (e.g., AlphaStar for StarCraft II). However, most commercial games still rely on deterministic, lightweight techniques because they must run at 30-60 FPS on diverse hardware. Modern trends include using LLMs for dynamic dialogue (e.g., in-game NPCs powered by local models like Llama 3.1), but this is distinct from traditional game AI. Operators running local AI for gaming should note that LLM inference (even quantized) adds 100-500 ms latency, which is too slow for real-time combat but acceptable for turn-based or narrative-driven interactions.

Practical example

An operator running a local LLM (e.g., Llama 3.1 8B Q4 on an RTX 4090) to generate NPC dialogue in a Skyrim mod will see ~30-50 tok/s, translating to 2-3 seconds per response. This is acceptable for dialogue but not for real-time enemy behavior, which still uses the game's built-in AI (e.g., behavior trees). The operator must manage VRAM: the LLM uses ~5 GB, leaving room for the game itself.

Workflow example

When integrating a local LLM into a game via LM Studio, the operator sets up an HTTP server (e.g., lm studio serve --port 1234) and the game mod sends dialogue prompts. The runtime loads the model into VRAM, and each inference call blocks the game thread until a response is received. Operators must monitor VRAM usage to avoid crashes—e.g., an RTX 3060 12 GB can run a 7B Q4 model alongside most games, but a 70B Q4 (~40 GB) requires offloading to system RAM, causing multi-second delays.

Reviewed by Fredoline Eruo. See our editorial policy.

Buyer guides
  • Best GPU for local AI →
  • Best laptop for local AI →
  • Best Mac for local AI →
When it doesn't work
  • CUDA out of memory →
  • Ollama running slowly →
  • ROCm not detected →