RUNLOCALAI · v38

Text Classification

Text classification is a natural language processing task in which a model assigns one of a set of predefined category labels to a piece of text. Operators encounter it when building systems such as spam filters, sentiment analyzers, or content moderators. The model processes the input text and outputs one score (logit) per class; a softmax layer typically converts those scores into a probability distribution over the classes. Common architectures are fine-tuned transformer encoders (e.g., BERT, RoBERTa) and smaller distilled variants such as DistilBERT. The runtime cost depends on sequence length and model size: a 110M-parameter BERT base processes ~100 tokens in ~10 ms on a modern GPU, while the ~66M-parameter DistilBERT is reported by its authors to be about 60% faster. VRAM usage scales with batch size and sequence length, typically 1-2 GB for single-sample inference.
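The logits-to-probabilities step described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not any particular model's implementation: the function names softmax and classify and the example logits are assumptions made up for the sketch.

```python
import math

def softmax(logits):
    """Convert raw class scores (logits) into probabilities that sum to 1."""
    # Subtract the largest logit first for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits, labels):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Hypothetical logits for a two-class sentiment model.
label, score = classify([-1.2, 2.3], ["negative", "positive"])
print(label, round(score, 3))
```

The predicted label is simply the class with the largest probability, so the softmax changes the scale of the scores but not their ranking.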

Practical example

A sentiment classifier trained on movie reviews labels text as 'positive' or 'negative'. Using Hugging Face Transformers, an operator loads distilbert-base-uncased-finetuned-sst-2-english (67M parameters, ~260 MB). On an RTX 3060, inference on a 50-token review takes ~5 ms and uses ~800 MB VRAM. The model outputs logits; applying softmax gives probabilities. For batch inference of 32 reviews, VRAM usage jumps to ~2 GB.
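The ~260 MB figure is consistent with simple arithmetic: parameter count times bytes per parameter. The sketch below assumes 4 bytes per weight (fp32) and 2 bytes per weight (fp16); the helper name weight_footprint_mb is hypothetical. It covers the weights only, so the gap between this number and the measured VRAM is presumably runtime overhead such as the GPU context and activations.

```python
def weight_footprint_mb(num_params, bytes_per_param=4):
    """Approximate size of the weights alone, in megabytes (10**6 bytes)."""
    return num_params * bytes_per_param / 1e6

fp32 = weight_footprint_mb(67_000_000)      # assumes 4 bytes per fp32 weight
fp16 = weight_footprint_mb(67_000_000, 2)   # assumes 2 bytes per fp16 weight
print(f"fp32: {fp32:.0f} MB, fp16: {fp16:.0f} MB")
```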

Workflow example

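Ollama does not expose a dedicated text-classification endpoint; its API serves generative models and embedding models. An operator can run ollama pull mxbai-embed-large to get an embedding model, then train a small linear head on those embeddings outside Ollama to do the classification. More commonly, operators use Hugging Face pipelines: from transformers import pipeline; classifier = pipeline('sentiment-analysis', model='distilbert-base-uncased-finetuned-sst-2-english'); classifier('I loved the movie!'). This downloads the model on first use, loads it, and returns a list containing a dict with a label and a score. Depending on the installed transformers version, the pipeline may default to the CPU; passing device=0 requests the first GPU so the model is loaded into VRAM.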
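The "linear head" idea mentioned above can be sketched without any ML library. The example below is a toy under stated assumptions: the 3-dimensional vectors are made up and only stand in for real embeddings, and the function names train_linear_head and predict are hypothetical. It fits a binary logistic-regression head with plain stochastic gradient descent.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_linear_head(vectors, targets, lr=0.5, epochs=200):
    """Fit weights and a bias for binary logistic regression with plain SGD."""
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for vec, target in zip(vectors, targets):
            pred = sigmoid(sum(w * x for w, x in zip(weights, vec)) + bias)
            error = pred - target  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * error * x for w, x in zip(weights, vec)]
            bias -= lr * error
    return weights, bias

def predict(weights, bias, vec):
    """Probability that vec belongs to class 1."""
    return sigmoid(sum(w * x for w, x in zip(weights, vec)) + bias)

# Made-up 3-dimensional vectors standing in for real embeddings.
train_vectors = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1],
                 [0.1, 0.9, 0.2], [0.0, 0.8, 0.3]]
train_targets = [1, 1, 0, 0]  # 1 = positive, 0 = negative

weights, bias = train_linear_head(train_vectors, train_targets)
prob_positive = predict(weights, bias, [0.85, 0.15, 0.05])
print("positive" if prob_positive >= 0.5 else "negative")
```

In practice the input vectors would come from an embedding model and have far more dimensions, and a library classifier would normally replace the hand-written training loop.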

Reviewed by Fredoline Eruo. See our editorial policy.