Local AI runtime health

A single-glance answer for every major local AI inference engine: is the project active, how much of our benchmark corpus touches it, and what is the failure mode if you deploy it? Live counts are pulled from the database; cadence labels are derived from real timestamps only.

See the runtime-health methodology for how labels are derived, what we measure, and what we don't.
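
The cadence labels on each card (for example `active · 16d`) appear to pair a status with the number of days since the last recorded activity. The sketch below illustrates one way such a label could be derived from timestamps. It assumes a hypothetical 90-day cutoff for `stalled`; the real thresholds are defined in the methodology, and `cadence_label` is an illustrative name, not the site's code.

```python
from datetime import datetime, timezone

# Hypothetical cutoff for illustration; the methodology defines the real one.
STALLED_AFTER_DAYS = 90


def cadence_label(last_activity: datetime, now: datetime) -> str:
    """Build a label such as 'active · 16d' from two timestamps."""
    age_days = (now - last_activity).days
    if age_days < 0:
        raise ValueError("last_activity is later than now")
    status = "active" if age_days < STALLED_AFTER_DAYS else "stalled"
    return f"{status} · {age_days}d"


now = datetime(2026, 1, 17, tzinfo=timezone.utc)
print(cadence_label(datetime(2026, 1, 1, tzinfo=timezone.utc), now))  # active · 16d
```

Passing `now` explicitly keeps the function deterministic, so the same stored timestamp always yields the same label for a given snapshot.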

Runtimes tracked · Active · Stalled · Reproduced runs · Eval-harness OK

SGLang

active · 16d

Setup: high

server · 0 editorial benchmarks · 0 reproduced community runs

Best workloads

· Heavy structured-output / function-calling agent loops
· Shared-prefix batched workloads (RadixAttention)
· Multi-architecture serving

Avoid if

· Want largest community / Stack Overflow surface
· macOS host
· Need day-zero support for new architectures

Common failure modes

· Smaller community = error messages with no Stack Overflow hits
· Architecture-specific kernel gaps
· Less mature observability — silent failures harder to spot

OS support

Linux

Hardware

NVIDIA

Compared with: vLLM vs SGLang

Text Generation Inference (TGI)

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Llamafile

active · 16d

runner · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Text Generation WebUI (oobabooga)

active · 16d

gui · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Jan

active · 16d

gui · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

MLX-LM

active · 16d

runner · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Open Interpreter

active · 16d

orchestrator · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

ExLlamaV2

active · 16d

runner · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Hugging Face Hub CLI

active · 16d

quantizer · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Pinokio

active · 16d

orchestrator · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Axolotl

active · 16d

finetuner · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Unsloth

active · 16d

finetuner · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Cursor

active · 16d

ide · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

OpenCode

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

vLLM

active · 16d

Setup: high

server · 0 editorial benchmarks · 0 reproduced community runs

Best workloads

· Production multi-user serving
· Tensor-parallel multi-GPU
· OpenAI-compatible API serving

Avoid if

· macOS host (unsupported)
· Single-user hobby — operator burden too high
· Fast-moving experimental architectures (lag at day-zero)

Common failure modes

· Flash-attention pinning incompatibilities
· OOM on long contexts when KV cache isn't pre-sized
· WSL2 GPU passthrough breakage on Windows kernel updates

OS support

Linux · Windows (WSL2)

Hardware

NVIDIA · AMD ROCm

Compared with: vLLM vs SGLang · vLLM vs llama.cpp · TensorRT-LLM vs vLLM
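
The long-context OOM failure mode listed for vLLM comes down to KV cache arithmetic: each cached token stores one key vector and one value vector per layer per KV head. The rough estimate below assumes an illustrative model shape (32 layers, 8 KV heads, head dimension 128, fp16) that is not taken from any benchmark on this page. `kv_cache_bytes` is a hypothetical helper, not a vLLM API.

```python
def kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int = 1,
    bytes_per_elem: int = 2,  # 2 bytes per element for fp16/bf16
) -> int:
    """Estimate KV cache size in bytes for a decoder-only transformer."""
    # Factor of 2: one key tensor and one value tensor per layer.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * batch_size * bytes_per_elem


# Assumed shape for illustration only.
one_sequence = kv_cache_bytes(32, 8, 128, seq_len=32_768)
eight_sequences = kv_cache_bytes(32, 8, 128, seq_len=32_768, batch_size=8)

print(one_sequence / 2**30)     # 4.0 (GiB)
print(eight_sequences / 2**30)  # 32.0 (GiB)
```

Comparing this figure with the GPU memory left after loading weights gives a first-order check before raising the context length. In vLLM the relevant knobs are `--max-model-len` and `--gpu-memory-utilization`.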

Devin

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Kilo Code

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

OpenAI Codex

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Cline

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Windsurf (Codeium)

active · 16d

ide · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Droid (Factory)

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Replit Agent 3

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

JetBrains AI Assistant

active · 16d

ide · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

OpenHands

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Roo Code (sunsetting May 15, 2026)

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Msty

active · 16d

gui · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Sourcegraph Cody

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Zed (with AI)

active · 16d

ide · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Pi (Inflection AI)

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Claude Desktop

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Qdrant

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Weaviate

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Model Context Protocol (MCP)

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Graphiti (Zep)

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

LanceDB

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Redis (vector search)

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Milvus

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Chroma

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Neo4j GraphRAG

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Zep (memory platform)

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

LangSmith

active · 16d

orchestrator · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Phoenix (Arize AI)

active · 16d

orchestrator · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

MCP Brave Search Server

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

IPEX-LLM

active · 16d

runner · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

MCP Filesystem Server

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

MCP PostgreSQL Server

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Playwright MCP

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Firecrawl MCP

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

MCP Fetch Server

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

MCP GitHub Server

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

MCP Sequential Thinking

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

MCP Memory Server

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

LibreChat

active · 16d

gui · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Ray Serve

active · 16d

orchestrator · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

MCP Git Server

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Continue

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Intel OpenVINO

active · 16d

runner · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

TabbyAPI

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Petals

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

GitHub Copilot

active · 16d

ide · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

KoboldCPP

active · 16d

gui · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

LangChain

active · 16d

orchestrator · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Mem0 (agent memory API)

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

DirectML

active · 16d

runner · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Hyperspace (P2P inference network)

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

llama-cpp-python

active · 16d

runner · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Aphrodite Engine

active · 16d

runner · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

ONNX Runtime Mobile

active · 16d

runner · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

ExecuTorch

active · 16d

runner · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

MLX Swift

active · 16d

runner · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

MLC LLM

active · 16d

runner · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Qualcomm AI Hub

active · 16d

runner · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

TensorRT-LLM

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

SillyTavern

active · 16d

gui · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

ROCm

active · 16d

runner · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

AnythingLLM

active · 16d

gui · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

ONNX Runtime

active · 16d

runner · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

LM Studio

active · 16d

Setup: low

gui · 0 editorial benchmarks · 0 reproduced community runs

Best workloads

· Desktop chat interface for non-developers
· Browsing the Hugging Face model library in-app
· Running local AI without a terminal

Avoid if

· Headless servers / homelab
· Embedded inference in scripts (use Ollama instead)
· Reproducibility requirements

Common failure modes

· Electron memory bloat on long sessions
· GUI updates can silently change inference defaults
· Server mode requires the app foregrounded on some OSes

OS support

macOS · Windows · Linux

Hardware

NVIDIA · Apple Metal · Vulkan

Compared with: Ollama vs LM Studio

Open WebUI

active · 16d

gui · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Ollama

active · 16d

Setup: low

runner · 0 editorial benchmarks · 0 reproduced community runs

Best workloads

· First local-AI deployment
· Single-user personal inference
· Drop-in OpenAI-compatible API

Avoid if

· Custom build flags / experimental kernels needed
· Multi-user serving at scale
· Reproducibility requires exact runtime version pinning

Common failure modes

· Auto-update can ship llama.cpp regressions
· WSL backend flakiness on Windows GPU
· Daemon restart loses concurrent state

OS support

Linux · macOS · Windows

Hardware

NVIDIA · AMD ROCm · Apple Metal

Compared with: Ollama vs llama.cpp · Ollama vs LM Studio

GPT4All

active · 16d

gui · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

LlamaIndex

active · 16d

orchestrator · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

ComfyUI

active · 16d

gui · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

llama.cpp

active · 16d

Setup: moderate

runner · 0 editorial benchmarks · 0 reproduced community runs

Best workloads

· Cross-platform single-user inference
· Mobile and edge devices (iOS, Android, Raspberry Pi)
· Reproducible pinned-commit deployments

Avoid if

· Concurrent multi-user serving — sequential by default
· Production agent loops with parallel tool calls

Common failure modes

· GGUF format drift after major schema changes
· Metal kernel issues on macOS major-version transitions
· Vulkan support varies wildly by Intel/AMD driver

OS support

Linux · macOS · Windows · iOS · Android

Hardware

NVIDIA CUDA · Apple Metal · Vulkan (any) · CPU-only

Compared with: Ollama vs llama.cpp · vLLM vs llama.cpp · MLX vs llama.cpp

TurboVec

active · 10d

orchestrator · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Stable Diffusion WebUI (AUTOMATIC1111)

active · 16d

gui · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Exo

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

LocalAI

active · 16d

server · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

CTranslate2

active · 16d

runner · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Claude Code

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Codex CLI

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Aider

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Goose

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Letta (memory framework)

active · 16d

agent · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

OpenClaw

active · 16d

orchestrator · 0 editorial benchmarks · 0 reproduced community runs

Editorial guidance pending. See the tool detail page for current information.

Next recommended step

See engine head-to-heads

Or: Local AI engine choice matrix · Browse benchmarks