16. Continue.dev Integration
Continue (continue.dev) is a VS Code extension that can use a local Ollama server for code completion and chat. It integrates directly into the IDE, providing context-aware assistance while you write code.
Installation
- Install VS Code
- Open Extensions (Ctrl+Shift+X)
- Search for "Continue" and install the "Continue" extension
Configuration
After installation, open the Continue settings (click the Continue icon in the sidebar or press Ctrl+Shift+=). Configure the Ollama connection:
{
  "models": [
    {
      "title": "Codellama",
      "provider": "ollama",
      "model": "codellama:7b",
      "apiBase": "http://localhost:11434"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Codellama",
    "provider": "ollama",
    "model": "codellama:7b",
    "apiBase": "http://localhost:11434"
  }
}
The models array configures chat models, while tabAutocompleteModel sets the model used for inline code completion.
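A malformed settings file is a common reason for a model not showing up, so it can help to sanity-check the shape of the configuration before restarting the editor. The sketch below is only an illustration: it treats the three fields that appear in every entry of the example above (title, provider, model) as required, which is an assumption rather than Continue's official schema, and the function names are hypothetical.

```python
import json
from pathlib import Path

# Assumption: these are the fields every entry in the example config carries.
REQUIRED_KEYS = ("title", "provider", "model")


def check_continue_config(config: dict) -> list[str]:
    """Return a list of readable problems; an empty list means the shape looks fine."""
    problems = []

    models = config.get("models")
    if not isinstance(models, list) or not models:
        problems.append("'models' should be a non-empty array of chat models")
        models = []

    # Collect every entry to inspect, labelled by where it lives in the file.
    entries = [(f"models[{i}]", entry) for i, entry in enumerate(models)]
    if "tabAutocompleteModel" in config:
        entries.append(("tabAutocompleteModel", config["tabAutocompleteModel"]))
    else:
        problems.append("'tabAutocompleteModel' is missing, so inline completion has no model")

    for label, entry in entries:
        if not isinstance(entry, dict):
            problems.append(f"{label} should be an object")
            continue
        for key in REQUIRED_KEYS:
            if not entry.get(key):
                problems.append(f"{label} is missing '{key}'")
    return problems


def check_config_file(path: Path) -> list[str]:
    """Load a JSON settings file from disk and run the same checks on it."""
    return check_continue_config(json.loads(path.read_text(encoding="utf-8")))
```

Calling check_config_file with the path to your Continue settings file (wherever your installation stores it) returns an empty list when both sections are present and complete.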
Features
- Inline completion - Autocomplete suggestions as you type, powered by the tabAutocompleteModel.
- Chat panel - Ask questions about your codebase in the sidebar.
- Context retrieval - Continue automatically includes relevant file contents in prompts.
Model Selection
Continue works best with code-focused models:
- codellama:7b - General code completion and explanation
- codellama:13b - More capable, requires more VRAM
- deepseek-coder:6.7b - Specialized for code generation
For autocomplete specifically, smaller models (around 7B parameters or fewer) respond faster and work well for single-line completions.
Troubleshooting
No completions appearing:
- Verify Ollama is running:
  curl http://localhost:11434
- Check that the API base URL in the Continue settings matches your Ollama URL
- Try a different model; some models are not well tuned for code completion
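The curl check above can also be scripted, which is handy if you want to run it from the same environment the editor uses. This is a minimal sketch using only the standard library; the function name ollama_is_up is hypothetical, and it assumes that a healthy server answers a plain GET on its base URL with HTTP 200.

```python
import http.client
import urllib.error
import urllib.request


def ollama_is_up(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if a GET on base_url is answered with HTTP status 200."""
    # The server is expected to be local, so ignore any system-wide proxy settings.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx status.
        return False
```

Pass the same URL that appears in your Continue settings, for example ollama_is_up("http://localhost:11434"). A False result points at the server or the URL rather than at the model.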
Slow responses:
- Reduce the tabAutocompleteModel size (try a smaller model such as deepseek-coder:1.3b instead of a 7B model; codellama has no 3B variant)
- Limit context by adjusting Continue's retrieval settings
- Ensure GPU acceleration is working (see Chapter 8)
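To judge whether a smaller model actually helps, it is useful to measure instead of guessing. The sketch below sends one non-streaming request to Ollama's /api/generate endpoint and reports wall-clock time plus generation speed, computed from the eval_count and eval_duration (nanoseconds) fields of the response. The function names and the 120-second timeout are illustrative choices, not part of Continue or Ollama.

```python
import json
import time
import urllib.request


def build_generate_request(model: str, prompt: str,
                           base_url: str = "http://localhost:11434") -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def tokens_per_second(result: dict) -> float:
    """Generation speed: eval_count tokens over eval_duration nanoseconds."""
    duration_ns = result.get("eval_duration", 0)
    if duration_ns <= 0:
        return 0.0
    return result.get("eval_count", 0) / (duration_ns / 1e9)


def time_completion(model: str, prompt: str,
                    base_url: str = "http://localhost:11434") -> tuple[float, float]:
    """Return (wall-clock seconds, tokens per second) for one completion."""
    request = build_generate_request(model, prompt, base_url)
    started = time.perf_counter()
    with urllib.request.urlopen(request, timeout=120) as response:
        result = json.load(response)
    return time.perf_counter() - started, tokens_per_second(result)
```

For example, time_completion("codellama:7b", "def fibonacci(n):") can be run once per candidate model. The first call for a model is likely to include the time needed to load it into memory, so comparing second or third runs gives a fairer picture.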
Model not found:
- Pull the model:
  ollama pull codellama:7b
- Check that the model name in the Continue settings matches exactly (including the version tag)
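Name mismatches are easy to miss by eye, so the comparison can be automated. The following sketch compares the model names in a Continue configuration against the payload returned by Ollama's GET /api/tags endpoint, which lists installed models under a "models" array with a "name" field per entry. The helper names are hypothetical, and the sketch assumes that a name without a tag refers to the "latest" tag.

```python
def normalize(name: str) -> str:
    """Append ':latest' when the final path segment carries no tag (assumed default)."""
    last_segment = name.rsplit("/", 1)[-1]
    return name if ":" in last_segment else f"{name}:latest"


def installed_names(tags_payload: dict) -> set[str]:
    """Extract normalized model names from a /api/tags response body."""
    return {normalize(m["name"]) for m in tags_payload.get("models", [])}


def missing_models(config: dict, tags_payload: dict) -> list[str]:
    """List configured model names that are not installed, in config order."""
    wanted = [m.get("model", "") for m in config.get("models", [])]
    tab_model = config.get("tabAutocompleteModel")
    if tab_model:
        wanted.append(tab_model.get("model", ""))

    installed = installed_names(tags_payload)
    missing = []
    for name in wanted:
        # Skip empty names and avoid reporting the same model twice.
        if name and normalize(name) not in installed and name not in missing:
            missing.append(name)
    return missing
```

Each name returned by missing_models is a candidate for ollama pull, or a sign that the tag in the settings is misspelled.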
Install Continue, configure it with codellama:7b, and verify that inline completions work in a Python file. If completion is slow, switch to a smaller model such as deepseek-coder:1.3b and compare response times.