MLOps & deployment

Model Versioning

Model versioning tracks the evolution of ML models over time by assigning a unique identifier to each trained artifact and its associated metadata, enabling reproducibility, rollback, and audit trails. A model version captures the exact weights (identified by a SHA-256 hash), the training data snapshot (dataset version plus split seed), hyperparameters (learning rate, batch size, optimizer), dependencies (for example Python 3.11, PyTorch 2.1.0, CUDA 12.1), evaluation metrics on holdout sets, and the git commit of the training code. When a production model serving 50M predictions/day regresses (say, precision drops from 0.93 to 0.87), the versioning system lets you roll back to the previous known-good version right away.
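As a rough illustration of what such a version record might look like, the sketch below bundles the metadata listed above into one immutable object and derives a deterministic identifier from it. The class name, field names, and sample values (dataset version, git commit, placeholder weight bytes) are all assumptions made for the example, not a standard schema or any particular registry's API.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


def sha256_of_bytes(data: bytes) -> str:
    """Return the hex SHA-256 digest of a serialized weight blob."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class ModelVersion:
    # Field names are illustrative; adapt them to your own registry.
    weights_sha256: str
    dataset_version: str
    split_seed: int
    hyperparameters: dict
    dependencies: dict
    metrics: dict
    git_commit: str

    def version_id(self) -> str:
        """Derive a short, deterministic ID from the full record."""
        canonical = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(canonical).hexdigest()[:12]


record = ModelVersion(
    weights_sha256=sha256_of_bytes(b"placeholder weight bytes"),
    dataset_version="2024-01-snapshot",  # hypothetical dataset tag
    split_seed=42,
    hyperparameters={"learning_rate": 3e-4, "batch_size": 64, "optimizer": "adamw"},
    dependencies={"python": "3.11", "torch": "2.1.0", "cuda": "12.1"},
    metrics={"precision": 0.93},
    git_commit="abc1234",  # placeholder commit hash
)
print(record.version_id())
```

Because the identifier is computed from the whole record, changing any input (a different seed, a new dependency version, retrained weights) yields a different ID, which is what makes two versions distinguishable in an audit trail.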

Practical example

Model versioning tracks different iterations of a model, much as git tags and releases do for code, but applied to weight files. Every deployed model should have a unique version identifier. Without versioning, you cannot roll back, compare versions, or debug production issues.
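To make the rollback point concrete, here is a toy in-memory registry, offered only as a sketch. The `ModelRegistry` class, its method names, and the version strings are hypothetical; a real setup would persist this state in a model registry or database rather than in a Python object.

```python
class ModelRegistry:
    """Toy in-memory registry mapping version IDs to artifact locations."""

    def __init__(self) -> None:
        self._artifacts: dict[str, str] = {}
        self._history: list[str] = []  # deployment order, newest last

    def register(self, version: str, artifact_uri: str) -> None:
        # Versions are immutable: re-registering the same ID is an error.
        if version in self._artifacts:
            raise ValueError(f"version {version!r} already registered")
        self._artifacts[version] = artifact_uri

    def deploy(self, version: str) -> None:
        if version not in self._artifacts:
            raise KeyError(f"unknown version {version!r}")
        self._history.append(version)

    def current(self) -> str:
        return self._history[-1]

    def rollback(self) -> str:
        """Drop the latest deployment and return the version now live."""
        if len(self._history) < 2:
            raise RuntimeError("no previous deployment to roll back to")
        self._history.pop()
        return self._history[-1]


registry = ModelRegistry()
registry.register("v1.2.2", "s3://models/my-model/v1.2.2/")
registry.register("v1.2.3", "s3://models/my-model/v1.2.3/")
registry.deploy("v1.2.2")
registry.deploy("v1.2.3")
previous = registry.rollback()  # v1.2.3 regressed, so return to v1.2.2
```

Rollback is only possible here because every deployment was recorded under a unique identifier; with a single overwritten "latest" artifact there would be nothing to return to.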

Workflow example

Model versioning implementation: (1) use a semantic-versioning-style scheme such as v1.0.0 (major.retrain.finetune); (2) store each model under a versioned path, e.g. s3://models/my-model/v1.2.3/; (3) write model cards that document what changed in each version; (4) have deployment configs reference a specific pinned version (for example model: my-model@v1.2.3), not model: latest; (5) for LLMs, version the base model hash, LoRA adapters, prompt, and generation config together, that is, the full deployment configuration and not just the weights.
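The steps above can be sketched in a few lines. This example assumes a pinned reference format of `name@vMAJOR.RETRAIN.FINETUNE` and a bucket called `models`; both are illustrative choices, as are the function names `artifact_path` and `deployment_fingerprint`. It resolves a pinned reference to a versioned storage path, rejects floating references such as `latest`, and fingerprints a full LLM deployment configuration.

```python
import hashlib
import json
import re

# Assumed reference format: "<name>@v<major>.<retrain>.<finetune>".
PINNED_REF = re.compile(
    r"^(?P<name>[a-z0-9-]+)@v(?P<major>\d+)\.(?P<retrain>\d+)\.(?P<finetune>\d+)$"
)


def artifact_path(ref: str, bucket: str = "models") -> str:
    """Map a pinned reference to a versioned storage path."""
    m = PINNED_REF.match(ref)
    if m is None:
        # Floating references such as "my-model@latest" are rejected.
        raise ValueError(f"not a pinned model reference: {ref!r}")
    version = f"v{m['major']}.{m['retrain']}.{m['finetune']}"
    return f"s3://{bucket}/{m['name']}/{version}/"


def deployment_fingerprint(
    base_model_hash: str,
    adapter_hashes: list[str],
    prompt: str,
    generation_config: dict,
) -> str:
    """Hash the whole LLM deployment config, not just the weights."""
    payload = {
        "base_model_hash": base_model_hash,
        "adapter_hashes": sorted(adapter_hashes),
        "prompt": prompt,
        "generation_config": generation_config,
    }
    canonical = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


print(artifact_path("my-model@v1.2.3"))
```

Sorting the adapter hashes and serializing with `sort_keys=True` keeps the fingerprint stable regardless of ordering, so it changes only when the base model, an adapter, the prompt, or a generation setting actually changes.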

Reviewed by Fredoline Eruo.