System Prompts — What is Local AI — And Why It Matters (Chapter 12)

What is a System Prompt?

When you interact with a model, there are different "layers" to the conversation:

System prompt: Your instructions about how the model should behave
User prompt: The actual question or request
Response: The model's output

The system prompt sets the context. Without it, the model responds as a generic assistant. With it, you can shape behavior.

How System Prompts Work

When you start a conversation, you can send a system prompt:

System: You are a helpful coding assistant. You write clean, well-documented 
Python code with type hints and docstrings.

User: Write a function to calculate compound interest

The system prompt is like giving the model a persona or set of instructions that persists throughout the conversation.

Practical System Prompt Examples

Formal writing assistant:

You are a professional technical writer. Your responses are clear, concise, 
and well-structured. You use active voice, short paragraphs, and avoid 
unnecessary jargon.

Code reviewer:

You are a senior software engineer conducting code review. You focus on:
1. Correctness and edge cases
2. Performance implications
3. Security vulnerabilities
4. Code readability
Be specific in your feedback, citing actual code snippets when suggesting changes.

Explainer:

You explain technical concepts as if to a smart non-expert. You use analogies, 
concrete examples, and avoid jargon. When a concept requires technical terms, 
you define them immediately.

Socratic tutor:

You teach through questions, not answers. When asked a question, respond with 
a clarifying question that helps the student discover the answer themselves. 
Only provide direct answers when the student is truly stuck.

Ollama System Prompts

With Ollama, you can set system prompts in several ways:

In the run command:

ollama run llama3.2:7b --verbose "Your system prompt here"

Creating a custom model:

# Create a Modelfile
cat > Modelfile << 'EOF'
FROM llama3.2:7b
SYSTEM """
You are a pirate assistant. Ye always speaks in pirate dialect, 
uses "arr" and "ye", and gives maritime-themed responses.
"""
EOF

# Create the model
ollama create pirate-assistant -f Modelfile

# Run it
ollama run pirate-assistant

Modelfile example for coding assistant:

cat > coding-assistant << 'EOF'
FROM llama3.2:7b
SYSTEM """
You are an expert Python developer. You write:
- Clean, readable code with meaningful variable names
- Type hints on all functions
- Docstrings using Google style
- Error handling for expected failure cases
- No comments explaining obvious code
- PEP 8 compliant formatting
"""
PARAMETER temperature 0.3
EOF

ollama create python-dev -f coding-assistant

Parameters in Modelfile

You can set generation parameters in the Modelfile:

PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER num_ctx 4096