RUNLOCALAI

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
Custom Training Pipelines

13. Hyperparameter Search

Chapter 13 of 18 · 15 min
KEY INSIGHT

At an equal trial budget, random search typically finds better hyperparameters than grid search. When each evaluation is expensive, use Bayesian optimization.

Hyperparameter tuning is search, not guesswork. Systematic search beats intuition for any non-trivial problem.

Grid vs. Random Search

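Random search tends to outperform grid search when a few hyperparameters matter much more than the rest, because a grid spends most of its trials varying the dimensions that barely affect the result: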

import itertools
import random

def grid_search(param_grid, n_trials=None):
    """Grid search: walks the grid in order, so trials are wasted on insensitive dims."""
    keys, values = zip(*param_grid.items())
    # Cap the walk at n_trials combinations; None enumerates the whole grid.
    for combination in itertools.islice(itertools.product(*values), n_trials):
        yield dict(zip(keys, combination))

def random_search(param_grid, n_trials):
    """Random search: more efficient for high-dimensional spaces."""
    for _ in range(n_trials):
        # Sample each hyperparameter independently; repeated configs are possible.
        yield {k: random.choice(v) for k, v in param_grid.items()}
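To see why this matters, the sketch below compares how many distinct values of one important hyperparameter each strategy tries under the same 9-trial budget. It assumes a hypothetical two-parameter space (`lr` and `dropout`) with illustrative bounds, and it draws the random trials from continuous ranges rather than from fixed lists of values.

```python
import itertools
import random

def grid_trials(lr_values, dropout_values):
    """Every combination of the listed values (a 3x3 grid gives 9 trials)."""
    return [{"lr": lr, "dropout": d}
            for lr, d in itertools.product(lr_values, dropout_values)]

def random_trials(n_trials, rng):
    """n_trials independent draws from continuous ranges (hypothetical bounds)."""
    return [{"lr": rng.uniform(1e-4, 1e-1), "dropout": rng.uniform(0.0, 0.5)}
            for _ in range(n_trials)]

def distinct_values(trials, key):
    """Number of different values of one hyperparameter across all trials."""
    return len({t[key] for t in trials})

rng = random.Random(0)  # seeded so the run is repeatable
grid = grid_trials([1e-4, 1e-3, 1e-2], [0.0, 0.25, 0.5])
rand = random_trials(len(grid), rng)

print("grid:  ", len(grid), "trials,", distinct_values(grid, "lr"), "distinct lr values")
print("random:", len(rand), "trials,", distinct_values(rand, "lr"), "distinct lr values")
```

The grid tries only three learning rates across its nine trials, while every random trial lands on a new one. If `lr` is the hyperparameter that matters, the random budget probes it three times as densely.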

Optuna for Bayesian Optimization

import optuna

def objective(trial):
    # Each suggest_* call defines one dimension of the search space.
    config = {
        'lr': trial.suggest_float('lr', 1e-5, 1e-2, log=True),
        'batch_size': trial.suggest_categorical('batch_size', [16, 32, 64, 128]),
        'weight_decay': trial.suggest_float('weight_decay', 1e-6, 1e-2, log=True),
        'num_layers': trial.suggest_int('num_layers', 2, 8),
        'hidden_dim': trial.suggest_categorical('hidden_dim', [256, 512, 1024]),
    }

    # train_model, evaluate, and val_loader are assumed to be defined elsewhere.
    model = train_model(config)
    val_loss = evaluate(model, val_loader)

    return val_loss  # the value Optuna minimizes

study = optuna.create_study(direction='minimize')
# Stops after 50 trials or 1 hour, whichever comes first.
study.optimize(objective, n_trials=50, timeout=3600)
print(study.best_params, study.best_value)
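The `log=True` arguments above ask for sampling that is uniform in the logarithm of the value, which spreads trials evenly across orders of magnitude. A minimal standard-library sketch of that idea follows. `sample_log_uniform` is a hypothetical helper for illustration, not part of Optuna, and Optuna's own samplers do more than plain random draws.

```python
import math
import random

def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

def fraction_below(samples, threshold):
    """Share of samples that fall below the threshold."""
    return sum(s < threshold for s in samples) / len(samples)

rng = random.Random(0)  # seeded so the run is repeatable
low, high = 1e-5, 1e-2  # the lr range used in the objective above

linear = [rng.uniform(low, high) for _ in range(10_000)]
logged = [sample_log_uniform(low, high, rng) for _ in range(10_000)]

# Linear sampling puts roughly 1% of draws below 1e-4; log sampling puts
# roughly a third there (one of the three decades in the range).
print("linear, share below 1e-4:", fraction_below(linear, 1e-4))
print("log,    share below 1e-4:", fraction_below(logged, 1e-4))
```

For a learning rate spanning three decades, linear sampling would leave the lowest decade almost unexplored, which is why scale-like hyperparameters such as `lr` and `weight_decay` are usually searched on a log scale.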

Asynchronous Hyperparameter Tuning

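# Ray Tune for distributed tuning
import pytorch_lightning as pl
from ray import tune
# The import path and class name vary across Ray releases (older releases
# expose TuneReportCallback from the same module); check your installed version.
from ray.tune.integration.pytorch_lightning import TuneReportCheckpointCallback

def train_with_tune(config):
    # build_model and data_module are assumed to be defined elsewhere;
    # the model must log a metric named "val_loss" during validation.
    model = build_model(config)
    trainer = pl.Trainer(
        max_epochs=10,
        callbacks=[
            # Report val_loss to the scheduler after each validation run
            TuneReportCheckpointCallback({"val_loss": "val_loss"}, on="validation_end"),
        ],
    )
    trainer.fit(model, datamodule=data_module)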
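The reported `val_loss` is what lets a scheduler stop weak trials early. As an illustration of that idea, here is a minimal synchronous successive-halving sketch in plain Python. It is not Ray Tune's implementation (Ray Tune's ASHA scheduler is an asynchronous variant of this technique), and `toy_val_loss` is a hypothetical stand-in for one training-plus-validation run.

```python
import random

def toy_val_loss(config, epochs):
    """Hypothetical stand-in for training: loss falls with epochs, lowest near lr=1e-3."""
    distance = abs(config["lr"] - 1e-3) * 100
    return distance + 1.0 / epochs

def successive_halving(configs, min_epochs=1, eta=2):
    """Train all configs briefly, keep the best 1/eta, multiply the budget by eta, repeat."""
    survivors = list(configs)
    epochs = min_epochs
    while len(survivors) > 1:
        # Rank every surviving config by its loss at the current epoch budget.
        ranked = sorted(survivors, key=lambda c: toy_val_loss(c, epochs))
        survivors = ranked[:max(1, len(ranked) // eta)]
        epochs *= eta
    return survivors[0]

rng = random.Random(0)  # seeded so the run is repeatable
configs = [{"lr": 10 ** rng.uniform(-5, -1)} for _ in range(16)]
best = successive_halving(configs)
print("best lr:", best["lr"])
```

With 16 configs and `eta=2`, the rounds keep 8, 4, 2, and then 1 config while the epoch budget doubles each round, so most of the compute goes to the promising trials. In real training the ranking can change as epochs increase, which is the risk that early stopping trades against the savings.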

Local verification checkpoint

Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.
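One way to capture that record is a small script that collects the environment details alongside the result. This is a sketch using only the standard library. The `record_run` helper, the package list, the `data/val.csv` path, and the output field names are assumptions chosen for illustration.

```python
import json
import platform
import sys
import time
from importlib import metadata

def package_version(name):
    """Installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None

def record_run(packages, data_path, output, started_at):
    """Collect the details the checkpoint asks for into one dictionary."""
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": {name: package_version(name) for name in packages},
        "data_path": data_path,
        "runtime_seconds": round(time.perf_counter() - started_at, 3),
        "observed_output": output,
    }

started = time.perf_counter()
# ... run the chapter example here and fill in its result below ...
record = record_run(
    packages=["optuna", "ray", "torch"],  # hypothetical list; adjust to your stack
    data_path="data/val.csv",             # hypothetical path
    output={"best_val_loss": None},       # placeholder until you have a real value
    started_at=started,
)
print(json.dumps(record, indent=2))
```

Saving the printed JSON next to the exercise keeps the package versions and hardware platform attached to the number they produced.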

EXERCISE

Set up Optuna with 20 trials on your current model. Log the best configuration. Compare to your manually-tuned baseline.
