RUNLOCALAIv38
->Will it run?Best GPUCompareTroubleshootStartLearnPulseModelsHardwareToolsBench
Run check
RUNLOCALAI

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
DIR
  • Models
  • Hardware
  • Tools
  • Benchmarks
TOOLS
  • Will it run?
  • Compare hardware
  • Cost vs cloud
  • Choose my GPU
  • Prompting kits
  • Quick answers
REF
  • All buyer guides
  • Learn local AI
  • Methodology
  • Glossary
  • Errors KB
  • Trust
EDITOR
  • About
  • Author
  • How we make money
  • Editorial policy
  • Contact
LEGAL
  • Privacy
  • Terms
  • Sitemap
MAIL · MONTHLY DIGEST
Get monthly local AI changes
Monthly recap. No spam.
DISCLOSURE

Some links on this site are affiliate links (Amazon Associates and other first-class retailers). When you buy through them, we earn a small commission at no extra cost to you. Affiliate links do not influence our verdicts — there are cards we rate highly that we don't have affiliate relationships with, and cards that sell well that we refuse to recommend. Read more →

© 2026 runlocalai.coIndependently operated
RUNLOCALAI · v38
  1. >
  2. Home
  3. /Learn
  4. /Courses
  5. /Fine-Tuning with LoRA and QLoRA
  6. /Ch. 23
Fine-Tuning with LoRA and QLoRA

23. Domain Adaptation

Chapter 23 of 24 · 20 min
KEY INSIGHT

Domain adaptation tailors a pre-trained model to a specific domain's conventions, vocabulary, and reasoning patterns. Unlike task-specific fine-tuning (which adds new capabilities), domain adaptation refines existing capabilities for new contexts. The approach depends on data availability and domain distance: **Low-resource domain adaptation**: ```python def adapt_with_retrieval_augmentation(model, domain_knowledge_base, query): """ Augment generation with retrieved domain knowledge. No fine-tuning required. """ retrieved_chunks = domain_knowledge_base.search(query, top_k=5) context = "\n".join(retrieved_chunks) prompt = f"""Based on the following domain knowledge: {context} Answer this query: {query}""" return model.generate(prompt) ``` **Moderate-resource adaptation** with LoRA: ```python def prepare_domain_dataset(domain_texts, tokenizer, block_size=512): """Format domain corpus for causal language modeling.""" def tokenize_function(examples): return tokenizer( examples["text"], truncation=True, max_length=block_size, padding="max_length", return_tensors=None ) dataset = datasets.Dataset.from_dict({"text": domain_texts}) dataset = dataset.map(tokenize_function, batched=True, remove_columns=["text"]) dataset = dataset.train_test_split(test_size=0.1) return dataset ``` **High-resource adaptation** with full fine-tuning of last layers: ```python def freeze_domain_adaptation(model, freeze_layers=24): """ Freeze bottom layers, fine-tune top layers for domain. """ # Freeze embedding and early transformer layers for name, param in model.named_parameters(): layer_num = extract_layer_number(name) if layer_num is not None and layer_num < freeze_layers: param.requires_grad = False elif "lm_head" in name or "layer_norm" in name: # Keep output layers trainable param.requires_grad = True else: param.requires_grad = True trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad) total_params = sum(p.numel() for p in model.parameters()) print(f"Trainable: {trainable_params:,} / {total_params:,} ({100*trainable_params/total_params:.1f}%)") ```

Local verification checkpoint

Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.

EXERCISE

: Domain Vocabulary Analysis

def analyze_domain_vocabulary(domain_texts, base_vocab):
    """Identify domain-specific terms missing from base vocabulary."""
    from collections import Counter
    import re
    
    # Tokenize and count
    all_tokens = []
    for text in domain_texts:
        all_tokens.extend(re.findall(r'\b\w+\b', text.lower()))
    
    freq = Counter(all_tokens)
    
    # Find terms absent from base vocab or rare
    oov_terms = []
    for term, count in freq.most_common(5000):
        if term not in base_vocab and count > 50:
            oov_terms.append((term, count))
    
    return oov_terms
← Chapter 22
Multi-Task Fine-Tuning
Chapter 24 →
Domain Fine-Tune Project