RUNLOCALAIv38
->Will it run?Best GPUCompareTroubleshootStartLearnPulseModelsHardwareToolsBench
Run check
RUNLOCALAI

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
DIR
  • Models
  • Hardware
  • Tools
  • Benchmarks
TOOLS
  • Will it run?
  • Compare hardware
  • Cost vs cloud
  • Choose my GPU
  • Prompting kits
  • Quick answers
REF
  • All buyer guides
  • Learn local AI
  • Methodology
  • Glossary
  • Errors KB
  • Trust
EDITOR
  • About
  • Author
  • How we make money
  • Editorial policy
  • Contact
LEGAL
  • Privacy
  • Terms
  • Sitemap
MAIL · MONTHLY DIGEST
Get monthly local AI changes
Monthly recap. No spam.
DISCLOSURE

Some links on this site are affiliate links (Amazon Associates and other first-class retailers). When you buy through them, we earn a small commission at no extra cost to you. Affiliate links do not influence our verdicts — there are cards we rate highly that we don't have affiliate relationships with, and cards that sell well that we refuse to recommend. Read more →

© 2026 runlocalai.coIndependently operated
RUNLOCALAI · v38
  1. >
  2. Home
  3. /Learn
  4. /Courses
  5. /Fine-Tuning with LoRA and QLoRA
  6. /Ch. 2
Fine-Tuning with LoRA and QLoRA

02. Fine-Tuning vs RAG

Chapter 2 of 24 · 15 min
KEY INSIGHT

RAG adds external knowledge at inference time; fine-tuning instills behavioral patterns in model weights.

Retrieval-Augmented Generation and fine-tuning represent complementary approaches to improving model outputs, each addressing different failure modes. RAG augments inference by retrieving relevant documents from an external corpus and including them in the context window. Fine-tuning modifies model weights to change the underlying behavior.

RAG excels when the model needs access to information that changes frequently, exceeds context window capacity, or requires verification against authoritative sources. A legal research assistant benefits from RAG because case law updates continuously and the system must cite specific precedents. RAG provides traceability—users can verify exactly which documents influenced the response.

Fine-tuning excels when the model needs to learn patterns, formats, or behaviors that cannot be easily specified in prompts. A customer service chatbot might require fine-tuning to adopt a consistent brand voice, handle objection responses fluently, or follow company-specific escalation protocols. These behaviors involve learned patterns rather than retrieved facts.

Combining both approaches often yields best results. A support system might use fine-tuning to handle conversation flow and tone while using RAG to retrieve current product documentation and policy details. This architecture separates concerns: fine-tuning handles "how to respond" while RAG handles "what information to include."

Memory and compute requirements differ substantially. RAG requires maintaining a vector database and retrieval infrastructure but uses no additional training compute. Fine-tuning requires training compute proportional to model size but adds no inference-time infrastructure overhead beyond storing adapter weights.

EXERCISE

Analyze an existing application and categorize each capability as better suited for RAG, fine-tuning, or both. Document the reasoning for each classification.

# capability_classifier.py
from enum import Enum

class CapabilityType(Enum):
    FACT_RETRIEVAL = "retrieval"
    PATTERN_LEARNING = "finetuning"
    HYBRID = "both"

def classify_capability(description: str) -> CapabilityType:
    """Classify a capability based on its characteristics."""
    retrieval_indicators = ["up-to-date", "specific document", "citation needed"]
    finetuning_indicators = ["consistent format", "style", "tone", "protocol"]
    
    retrieval_score = sum(1 for ind in retrieval_indicators if ind in description.lower())
    finetuning_score = sum(1 for ind in finetuning_indicators if ind in description.lower())
    
    if retrieval_score > finetuning_score:
        return CapabilityType.FACT_RETRIEVAL
    elif finetuning_score > retrieval_score:
        return CapabilityType.PATTERN_LEARNING
    return CapabilityType.HYBRID
← Chapter 1
Why Fine-Tune?
Chapter 3 →
LoRA Theory