RUNLOCALAIv38
->Will it run?Best GPUCompareTroubleshootStartLearnPulseModelsHardwareToolsBench
Run check
RUNLOCALAI

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
DIR
  • Models
  • Hardware
  • Tools
  • Benchmarks
TOOLS
  • Will it run?
  • Compare hardware
  • Cost vs cloud
  • Choose my GPU
  • Prompting kits
  • Quick answers
REF
  • All buyer guides
  • Learn local AI
  • Methodology
  • Glossary
  • Errors KB
  • Trust
EDITOR
  • About
  • Author
  • How we make money
  • Editorial policy
  • Contact
LEGAL
  • Privacy
  • Terms
  • Sitemap
MAIL · MONTHLY DIGEST
Get monthly local AI changes
Monthly recap. No spam.
DISCLOSURE

Some links on this site are affiliate links (Amazon Associates and other first-class retailers). When you buy through them, we earn a small commission at no extra cost to you. Affiliate links do not influence our verdicts — there are cards we rate highly that we don't have affiliate relationships with, and cards that sell well that we refuse to recommend. Read more →

© 2026 runlocalai.coIndependently operated
RUNLOCALAI · v38
  1. >
  2. Home
  3. /Learn
  4. /Courses
  5. /Understanding AI Models
  6. /Ch. 16
Understanding AI Models

16. Instruct vs Base Models

Chapter 16 of 20 · 20 min
KEY INSIGHT

Base models give you maximum flexibility for fine-tuning; instruct models give you ready-to-use instruction following out of the box.

Base models and instruct models serve different purposes. Understanding when to use each helps you select the right model variant.

What base models are:

Base models (also called "pretrained" or "raw") are trained on text completion. Given "The capital of France is", they continue the sequence with high probability next tokens.

# Base model behavior
prompt = "The capital of France is"
response = base_model.complete(prompt)
# Output: "Paris, located in the Ile-de-France region..."

Base models are useful for:

  • Code completion (fill in the middle)
  • Text continuation with specific style
  • Fine-tuning on your own data

What instruct models are:

Instruct models are fine-tuned to follow instructions. Given the same prompt, they respond differently:

# Instruct model behavior
prompt = "The capital of France is"
response = instruct_model.generate(prompt)
# Output: "The capital of France is Paris."

Training methodology difference:

# Base model training: Next token prediction
# Document: "The capital of France is Paris."
# Target: "capital of France is Paris."

# Instruct fine-tuning: Instruction-response pairs
# Input: "What is the capital of France?"
# Target: "Paris."

The spectrum:

Not all models fit binary categories:

  1. Base only: No instruction tuning (Llama base)
  2. SFT (Supervised Fine-Tuned): Trained on instruction pairs
  3. RLHF-tuned: Further tuned with reinforcement learning
  4. DPO-tuned: Direct Preference Optimization
  5. Chat-tuned: System prompt optimized with personality

When to use base models:

  • You are fine-tuning on your own data
  • You need code completion (fill in the middle)
  • You want maximum control over generation behavior
  • You are building a custom system with your own prompting

When to use instruct models:

  • Direct user interaction
  • Standard chat interfaces
  • When you lack fine-tuning infrastructure
  • When you want predictable instruction following

Conversion possibility:

You can turn a base model into an instruct-like model with strong prompting:

system_prompt = """You are a helpful assistant. Answer questions directly.
If you need to reason through something, show your thinking step by step.
Format answers clearly with bullet points or numbered lists as appropriate."""

def query_base_as_instruct(base_model, question, system_prompt):
    full_prompt = f"{system_prompt}\
\
Question: {question}\
Answer:"
    return base_model.generate(full_prompt)

This works for some tasks but cannot fully replicate fine-tuned instruction following.

EXERCISE

Take a base model and create a system prompt that makes it behave like an instruct model. Compare responses to the same query with the actual instruct version of that model.

← Chapter 15
Model Selection for Reasoning
Chapter 17 →
Tokenizer Impact on Quality