RUNLOCALAI · v38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo

© 2026 runlocalai.co · Independently operated
MLOps for Local AI

02. Experiment Tracking

Chapter 2 of 24 · 15 min
KEY INSIGHT

Parameters and metrics are the visible layer. The invisible layer is the computational graph: your code, dependencies, and data. Capture enough context to reproduce a run without depending on institutional memory.

Parameters flow in both directions. Hyperparameters (learning rate, batch size) are inputs you control. Learned parameters (weights) are outputs that training produces. But the real value emerges when you track every input that can change the result: data paths, feature flags, random seeds, hardware configurations. The same configuration run with a different seed can produce noticeably different results, so a run with an unlogged seed is hard to reproduce.

Artifacts are the outputs worth keeping: model binaries, processed datasets, visualizations, serialized preprocessors. MLflow and similar tools store each run's artifacts in a location tied to that run, so you can retrieve them by run instead of searching the filesystem.

```python
# Minimal experiment tracking with MLflow
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Placeholder data (an assumption so the script runs on its own):
# replace with your real spam features and labels.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

mlflow.set_experiment("spam-classifier-v2")

with mlflow.start_run(run_name="baseline-rf"):
    # Log parameters
    mlflow.log_param("n_estimators", 100)
    mlflow.log_param("max_depth", 10)
    mlflow.log_param("random_seed", 42)

    # Train
    model = RandomForestClassifier(n_estimators=100, max_depth=10, random_state=42)
    model.fit(X_train, y_train)

    # Evaluate and log metrics
    preds = model.predict(X_test)
    accuracy = accuracy_score(y_test, preds)
    mlflow.log_metric("accuracy", accuracy)

    # Log model artifact
    mlflow.sklearn.log_model(model, "model")
```

This pattern (log params, train, evaluate, log metrics, save model) is the foundation of most experiment tracking workflows.
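The "invisible layer" (code version, dependencies, data, seed, hardware) can be captured with the standard library alone. The following is a minimal sketch, not a prescribed method: the function names `capture_context`, `file_sha256`, `git_commit`, and `installed_packages` are hypothetical, and it assumes the training data is a single file and that the code may or may not live in a Git repository.

```python
import hashlib
import importlib.metadata
import json
import platform
import subprocess
import sys
import tempfile
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def git_commit():
    """Return the current Git commit hash, or None if it cannot be read."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()


def installed_packages():
    """Return sorted 'name==version' strings for installed distributions."""
    found = set()
    for dist in importlib.metadata.distributions():
        name = dist.metadata.get("Name") if dist.metadata else None
        if name:
            found.add(f"{name}=={dist.version}")
    return sorted(found)


def capture_context(data_path, seed):
    """Collect run context as a JSON-serializable dict."""
    return {
        "git_commit": git_commit(),
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": installed_packages(),
        "data_path": str(data_path),
        "data_sha256": file_sha256(data_path),
        "seed": seed,
    }


if __name__ == "__main__":
    # Hypothetical data file, created in a temp directory only for the demo.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample_data.csv"
        sample.write_text("text,label\nfree money,1\nsee you at noon,0\n")
        context = capture_context(sample, seed=42)
        print(json.dumps({k: v for k, v in context.items() if k != "packages"}, indent=2))
```

Hashing the data file rather than only recording its path means a silently changed dataset shows up as a different digest. Inside an active MLflow run, a dict like this could be attached with `mlflow.log_dict(context, "context.json")`.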

Experiment tracking captures the context of machine learning development. Without it, you cannot reliably compare runs, reproduce successes, or diagnose failures. Every training run is an experiment, and experiments need logs.

The fundamental unit is the run: a single execution of training code that produces metrics, artifacts, and metadata. A run captures what you trained (parameters), how well it trained (metrics), what it produced (model artifacts), and the context (data version, environment). Later, you can query runs to find the best-performing model for a given scenario.
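To make the shape of a run concrete, here is a tool-agnostic sketch of a run record and two simple queries over a list of runs. It is an illustration under assumptions, not how MLflow stores runs internally: `RunRecord`, `find_runs`, and `best_run` are hypothetical names, and the metric values in the demo are made up.

```python
from dataclasses import asdict, dataclass, field
import json
import time
import uuid


@dataclass
class RunRecord:
    """One execution of training code, split into the four parts of a run."""
    name: str
    params: dict = field(default_factory=dict)     # what you trained
    metrics: dict = field(default_factory=dict)    # how well it trained
    artifacts: dict = field(default_factory=dict)  # what it produced (name -> path)
    context: dict = field(default_factory=dict)    # data version, environment
    run_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    started_at: float = field(default_factory=time.time)


def find_runs(runs, **required_context):
    """Return runs whose context matches every given key/value pair."""
    return [
        run for run in runs
        if all(run.context.get(k) == v for k, v in required_context.items())
    ]


def best_run(runs, metric, higher_is_better=True):
    """Return the run with the best `metric`, skipping runs that lack it."""
    scored = [run for run in runs if metric in run.metrics]
    if not scored:
        return None
    pick = max if higher_is_better else min
    return pick(scored, key=lambda run: run.metrics[metric])


if __name__ == "__main__":
    # Made-up values purely for illustration; these are not measured results.
    runs = [
        RunRecord("rf-100", {"n_estimators": 100}, {"accuracy": 0.90},
                  context={"data_version": "v1"}),
        RunRecord("rf-300", {"n_estimators": 300}, {"accuracy": 0.92},
                  context={"data_version": "v1"}),
        RunRecord("rf-300-v2", {"n_estimators": 300}, {"accuracy": 0.95},
                  context={"data_version": "v2"}),
    ]
    winner = best_run(find_runs(runs, data_version="v1"), "accuracy")
    print(winner.name, json.dumps(asdict(winner)["metrics"]))
```

Filtering by context before ranking is what "best for a given scenario" means in practice: a run trained on a different data version is not a fair competitor, however good its score.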

Metrics are the backbone of comparison. Track loss curves, accuracy curves, and custom business metrics. The trap is tracking too many metrics without understanding what matters. Define your primary metric before training: it is your optimization target and the one number you use to choose between runs. Secondary metrics are for context and debugging, not decision-making.
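The split between primary and secondary metrics can be written down as a small decision rule. The sketch below is one possible arrangement, with assumed names throughout: `PRIMARY_METRIC`, `SECONDARY_METRICS`, `MetricHistory`, and `compare` are hypothetical, as are the metric names `val_accuracy`, `train_loss`, and `latency_ms`.

```python
# Declared before any training happens (hypothetical metric names).
PRIMARY_METRIC = "val_accuracy"
HIGHER_IS_BETTER = True
SECONDARY_METRICS = ("train_loss", "latency_ms")  # context only, never decisive


class MetricHistory:
    """Stores each metric as a list of (step, value) points, i.e. a curve."""

    def __init__(self):
        self.curves = {}

    def log(self, name, value, step):
        """Append one point to the curve for `name`."""
        self.curves.setdefault(name, []).append((step, float(value)))

    def final(self, name):
        """Return the most recently logged value of `name`."""
        return self.curves[name][-1][1]


def compare(candidate, baseline):
    """Decide on the primary metric only; report secondary deltas as context."""
    cand = candidate.final(PRIMARY_METRIC)
    base = baseline.final(PRIMARY_METRIC)
    better = cand > base if HIGHER_IS_BETTER else cand < base
    secondary = {
        name: candidate.final(name) - baseline.final(name)
        for name in SECONDARY_METRICS
        if name in candidate.curves and name in baseline.curves
    }
    return {
        "promote": better,
        "primary_delta": cand - base,
        "secondary_deltas": secondary,
    }
```

Because the decision reads only `PRIMARY_METRIC`, adding more secondary metrics later cannot change which run wins; they only enrich the report. In MLflow, the per-step curve corresponds to calling `mlflow.log_metric` with its `step` argument.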

EXERCISE

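Run the code above with MLflow installed. Start the MLflow UI with `mlflow ui` from the same working directory and locate your run. Note what MLflow recorded automatically (such as the source file name and, when available, the Git commit) alongside the parameters and metrics you logged. Modify the hyperparameters and run again, then compare the two runs in the UI.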
