Ollama REST API — Ollama — Installation to Mastery (Chapter 6)

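The REST API exposes Ollama's functionality over HTTP. By default, the server listens on port 11434. Endpoints that take a request body expect JSON, and responses are JSON: either a single object or, when streaming is enabled, a sequence of newline-delimited objects.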

Core Endpoints

Generate endpoint (POST /api/generate):

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2:1b",
  "prompt": "Write a haiku about programming",
  "stream": false
}'

The stream parameter defaults to true. Setting it to false returns the complete response in one JSON object. Streaming returns newline-delimited JSON objects as tokens are generated.
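The same non-streaming call can be made from a program. The following is a minimal Python sketch using only the standard library; the function names build_generate_request and generate, and the 120-second timeout, are illustrative assumptions rather than part of Ollama.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default address used in the curl examples


def build_generate_request(model, prompt, stream=False, base_url=OLLAMA_URL):
    """Build (but do not send) a POST request for /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": stream}
    return urllib.request.Request(
        url=f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt, base_url=OLLAMA_URL, timeout=120):
    """Send a non-streaming request and return the generated text.

    Needs a running Ollama server with the model already pulled.
    """
    request = build_generate_request(model, prompt, stream=False, base_url=base_url)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        body = json.load(reply)  # one complete JSON object because stream is false
    return body["response"]
```

With a server running, generate("llama3.2:1b", "Write a haiku about programming") returns the text of the response field. Building the request separately from sending it keeps the payload easy to inspect without a network connection.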

Chat endpoint (POST /api/chat):

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2:1b",
  "messages": [
    {"role": "user", "content": "What is a kernel?"},
    {"role": "assistant", "content": "A kernel is the core of an operating system."},
    {"role": "user", "content": "Give an example"}
  ],
  "stream": false
}'

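The chat endpoint is stateless: the server keeps no conversation history between requests, so the client must send the full conversation in the messages array every time. For a single-turn exchange with no prior context, send only the latest message.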
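Because the history travels in the request body, the client has to keep it. A minimal Python sketch of that bookkeeping follows; the Conversation class is a hypothetical helper, not something Ollama provides, and it only builds the request body without sending it.

```python
import json


class Conversation:
    """Client-side message history for /api/chat (hypothetical helper)."""

    def __init__(self, model):
        self.model = model
        self.messages = []

    def add(self, role, content):
        """Append one turn; role is "user" or "assistant" as in the curl example."""
        self.messages.append({"role": role, "content": content})

    def payload(self, stream=False):
        """Return the JSON body carrying the whole conversation so far."""
        return json.dumps(
            {"model": self.model, "messages": list(self.messages), "stream": stream}
        )


chat = Conversation("llama3.2:1b")
chat.add("user", "What is a kernel?")
chat.add("assistant", "A kernel is the core of an operating system.")
chat.add("user", "Give an example")
body = chat.payload()  # POST this string to /api/chat
```

After each reply, the assistant's text would be appended with chat.add("assistant", ...) before the next user turn, so that the following request carries the complete exchange.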

Embeddings endpoint (POST /api/embeddings):

curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "The quick brown fox jumps over the lazy dog"
}'

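Returns a JSON object whose embedding field holds a vector of floating-point numbers representing the semantic embedding of the input text. Newer Ollama releases also provide /api/embed, which takes an input field and supersedes this endpoint.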
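A common use of such vectors is to compare two texts by the angle between their embeddings. The sketch below computes cosine similarity in plain Python; the function name is an assumption for illustration, and the short vectors in the usage line are toy values, not real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for the embedding arrays of two API responses.
score = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Values near 1 indicate vectors pointing in nearly the same direction, which for embeddings usually means semantically similar text. Both vectors should come from the same model, since different models produce vectors of different lengths and meanings.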

API Server Configuration

By default, the API binds to 127.0.0.1:11434. To make it accessible from other machines, set the OLLAMA_HOST environment variable:

# Linux/macOS - bind to all interfaces
export OLLAMA_HOST=0.0.0.0
ollama serve

# Windows PowerShell
$env:OLLAMA_HOST = "0.0.0.0"
ollama serve

This exposes the API on all network interfaces. The API has no built-in authentication, so anyone who can reach the port can use it. For production deployments, put the service behind a reverse proxy that provides TLS and authentication.
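Once the server is reachable from other machines, clients need its address rather than localhost. Note that 0.0.0.0 is a bind address for the server; a client connects to the server's actual hostname or IP. The following Python sketch turns a host or host:port string into a base URL, falling back to the default port; client_base_url is a hypothetical helper and 192.168.1.20 is a made-up address.

```python
from urllib.parse import urlsplit

DEFAULT_PORT = 11434  # default port stated in the text


def client_base_url(host):
    """Turn 'host' or 'host:port' into a base URL for HTTP clients.

    Hypothetical helper: a value that already includes a scheme is
    returned unchanged apart from a trailing slash. IPv6 literals are
    not handled.
    """
    host = host.strip()
    if not host:
        return f"http://127.0.0.1:{DEFAULT_PORT}"
    if "://" in host:
        return host.rstrip("/")
    parts = urlsplit(f"http://{host}")
    port = parts.port or DEFAULT_PORT
    return f"http://{parts.hostname}:{port}"


base = client_base_url("192.168.1.20")  # address of the machine running ollama serve
```

Requests from the client machine would then target paths such as base + "/api/generate".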

Streaming Responses

Streaming returns data incrementally. Each chunk is a JSON object:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2:1b",
  "prompt": "Count to 5",
  "stream": true
}'

Output (abbreviated; the last object has done set to true and ends the stream):

{"model":"llama3.2:1b","created_at":"2024-11-15T10:00:00Z","response":"1","done":false}
{"model":"llama3.2:1b","created_at":"2024-11-15T10:00:00Z","response":"2","done":false}
{"model":"llama3.2:1b","created_at":"2024-11-15T10:00:00Z","response":"3","done":false}
{"model":"llama3.2:1b","created_at":"2024-11-15T10:00:00Z","response":"...4...5","done":true}
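A client consumes this stream by parsing one JSON object per line and concatenating the response fragments until it sees done set to true. The sketch below shows that loop in Python; iter_chunks and collect_response are illustrative names, and the sample lines are hand-written stand-ins rather than captured output.

```python
import json


def iter_chunks(lines):
    """Yield one dict per non-empty line of newline-delimited JSON."""
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        line = line.strip()
        if line:
            yield json.loads(line)


def collect_response(lines):
    """Join the 'response' fragments up to and including the done chunk."""
    parts = []
    for chunk in iter_chunks(lines):
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)


# Hand-written sample in the same shape as the streamed output.
sample = [
    '{"model":"llama3.2:1b","response":"Hello","done":false}',
    '{"model":"llama3.2:1b","response":", world","done":false}',
    '{"model":"llama3.2:1b","response":"","done":true}',
]
text = collect_response(sample)
```

The response object returned by urllib.request.urlopen can be iterated line by line, so it can be passed to collect_response directly. An interactive client would print each fragment as it arrives instead of joining them at the end.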