Official Python SDK for Ollama. Async, streaming, typed — the right primitive for scripts.
Editorial verdict: “Foundational primitive for Python scripts against Ollama. Official, maintained, typed.”
Which runtime + OS combos this app works against. Source of truth for "will it run on my setup?"
For solo Ollama users writing Python scripts or small services, this is the SDK to reach for. It bridges directly to a local Ollama server with async, sync, and streaming support, plus typed responses that save you from guessing at dict keys. Fully offline by design — no API keys, no cloud dependency. Maintained by the Ollama team, so it stays current with vision, tools, and embeddings as they land. Just know it's Ollama-specific: if you need to swap in OpenAI or Anthropic backends later, LiteLLM is the better foundation. No batch endpoint helpers either, so you'll roll your own for bulk inference.
Thin SDK / proxy / compatibility layer.
Best universal LLM proxy. Foundational layer for multi-provider deployments.
Foundational primitive for Node + browser apps against Ollama. ESM-native, typed.
Cloud-only LLM router. Useful category, novel pricing. The 'unlimited' math has a known failure mode at heavy usage.
Pre-filled with this app's recommended use case + budget tier. Get the full rig + runtime + model picks.
The full directory — filter by category, runtime, OS, privacy posture, or VRAM.
What this app talks to: Ollama, vLLM, llama.cpp, MLX, LM Studio. The upstream layer.
Did this app work for you on a specific rig? Submit the benchmark — it powers the model + hardware pages.