RUNLOCALAI · v38
Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
Evaluation metrics

R²

R² (coefficient of determination) measures how well a regression model's predictions match actual outcomes. It usually falls between 0 and 1, but it can be negative when a model fits worse than a constant prediction. In local AI, R² appears when evaluating fine-tuned models on regression tasks (e.g., predicting token-level latency or VRAM usage). An R² of 1 means perfect prediction; 0 means the model performs no better than always predicting the mean. Operators encounter R² in training logs or evaluation scripts when judging whether a quantized or fine-tuned model preserves predictive accuracy.

Deeper dive

R² is calculated as 1 - (SS_res / SS_tot), where SS_res is the sum of squared residuals (actual minus predicted values) and SS_tot is the sum of squared differences between the actual values and their mean. R² turns negative when SS_res exceeds SS_tot, meaning the model fits worse than a horizontal line at the mean; this sometimes happens with poorly quantized models on small datasets. In practice, R² is sensitive to outliers and does not separate bias from variance. For local AI, R² is most relevant when benchmarking quantized models on regression benchmarks (e.g., predicting perplexity or runtime). A drop in R² after quantization signals a loss of predictive fidelity.
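The formula can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation: the function name r_squared and the sample values are assumptions chosen to show the three cases described in this entry (perfect fit, mean-only prediction, and a fit worse than the mean).

```python
def r_squared(y_true, y_pred):
    """Return R² = 1 - SS_res / SS_tot for two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    mean_true = sum(y_true) / len(y_true)
    # SS_res: sum of squared prediction errors
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # SS_tot: sum of squared deviations of the actual values from their mean
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all actual values are identical")
    return 1 - ss_res / ss_tot


# Illustrative values only (not measurements)
actual = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(actual, actual)                     # every prediction exact
mean_only = r_squared(actual, [2.5, 2.5, 2.5, 2.5])     # always predict the mean
reversed_fit = r_squared(actual, [4.0, 3.0, 2.0, 1.0])  # worse than the mean

print(perfect, mean_only, reversed_fit)  # 1.0 0.0 -3.0
```

The last case shows why a fixed 0-to-1 reading can mislead: the reversed predictions give SS_res = 20 against SS_tot = 5, so R² is 1 - 4 = -3.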

Practical example

An operator fine-tunes a small regression model to predict inference latency on an RTX 4090. After applying 4-bit quantization, the R² on a held-out test set drops from 0.95 to 0.82, meaning the quantized model explains about 82% of the variance in latency instead of 95%. This helps the operator decide whether the speed gain from quantization is worth the accuracy loss.

Workflow example

In a Hugging Face Transformers training script, operators can compute R² after each epoch with the evaluate library's evaluate.load('r_squared') metric. For example, a script run as python train.py --output_dir ./results can print R² alongside loss, provided its metrics function returns it. If R² plateaus below a target such as 0.9, the operator might adjust the learning rate or increase the dataset size. llama.cpp does not compute R² directly, but operators can compute it externally from model outputs and ground truth.
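For tools that do not report R², one way to compute it externally is to save ground truth and predictions side by side and score them afterwards. The sketch below assumes a CSV layout with columns named actual and predicted; those column names, the function name r_squared_from_csv, and the sample rows are all hypothetical and stand in for whatever the operator's own export looks like.

```python
import csv
import io


def r_squared_from_csv(handle, actual_col="actual", predicted_col="predicted"):
    """Compute R² from a CSV file object holding ground truth and predictions.

    The column names are assumptions; pass different ones to match your export.
    """
    rows = list(csv.DictReader(handle))
    if not rows:
        raise ValueError("no data rows found")
    actual = [float(row[actual_col]) for row in rows]
    predicted = [float(row[predicted_col]) for row in rows]
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all actual values are identical")
    return 1 - ss_res / ss_tot


# Made-up sample data standing in for an exported results file
sample = io.StringIO("actual,predicted\n10,11\n20,19\n30,31\n40,39\n")
score = r_squared_from_csv(sample)
print(round(score, 3))  # 0.992
```

With a real export, the same function accepts an open file handle, for example open(path, newline="") for whatever path the results were written to.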

Reviewed by Fredoline Eruo. See our editorial policy.
