Text Classification — AI glossary

Text classification is a natural language processing task in which a model assigns one of a predefined set of category labels to a piece of text. Operators encounter it when building systems like spam filters, sentiment analyzers, or content moderators. The model processes the input text and produces one raw score (logit) per class; a softmax layer typically converts these scores into a probability distribution over the classes. Common architectures include fine-tuned transformer models (e.g., BERT, RoBERTa) or smaller, faster models like DistilBERT. The runtime cost depends on sequence length and model size: a 110M-parameter BERT base processes ~100 tokens in ~10 ms on a modern GPU, while a 66M-parameter DistilBERT is roughly 60% faster. VRAM usage scales with batch size and sequence length, typically 1-2 GB for single-sample inference.
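To make the softmax step concrete, here is a minimal sketch in plain Python. The logits and the three class names are hypothetical values chosen for illustration; a real model would produce the logits itself.

```python
import math

def softmax(logits):
    """Convert raw scores (logits) into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() numerically stable
    # without changing the resulting probabilities.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits, labels):
    """Return the most probable label together with its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Hypothetical logits for an illustrative three-class classifier.
labels = ["spam", "ham", "promo"]
label, score = classify([2.0, 0.5, -1.0], labels)
print(label, round(score, 3))
```

The predicted class is simply the index of the largest probability, which is also the index of the largest logit; softmax matters when the operator needs a confidence score rather than only a label.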

Practical example

A sentiment classifier trained on movie reviews labels text as 'positive' or 'negative'. Using Hugging Face Transformers, an operator loads distilbert-base-uncased-finetuned-sst-2-english (67M parameters, ~260 MB). On an RTX 3060, inference on a 50-token review takes ~5 ms and uses ~800 MB VRAM. The model outputs logits; applying softmax gives probabilities. For batch inference of 32 reviews, VRAM usage jumps to ~2 GB.
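The ~260 MB figure is close to what the parameter count alone predicts. The sketch below does that arithmetic, assuming 32-bit floats (4 bytes per parameter); the helper name weight_memory_mb is hypothetical. The gap between the weight size and the ~800 MB observed in VRAM is presumably runtime overhead such as the GPU context and activations, which this estimate does not model.

```python
def weight_memory_mb(num_params, bytes_per_param=4):
    """Estimate the size of the model weights in megabytes (10^6 bytes).

    Assumes every parameter is stored at the same precision:
    4 bytes for fp32, 2 bytes for fp16.
    """
    return num_params * bytes_per_param / 1e6

params = 67_000_000  # parameter count quoted in the text

fp32_mb = weight_memory_mb(params)       # full precision
fp16_mb = weight_memory_mb(params, 2)    # half precision

print(f"fp32 weights: {fp32_mb:.0f} MB")
print(f"fp16 weights: {fp16_mb:.0f} MB")
```

Halving the precision halves the weight footprint, but activation memory still grows with batch size and sequence length, which is why the batch of 32 costs more than the single review.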

Workflow example

Ollama is built around generative and embedding models rather than models with a classification head, so classification there is indirect. An operator can run ollama pull mxbai-embed-large to get an embedding model, compute an embedding for each text, and train a linear head on those embeddings to predict the label. More commonly, operators use Hugging Face pipelines: from transformers import pipeline; classifier = pipeline('sentiment-analysis', model='distilbert-base-uncased-finetuned-sst-2-english'); classifier('I loved the movie!'). This loads the model into memory (VRAM when a GPU device is selected) and returns a list with one dictionary per input, each holding a label and a score.
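The linear-head idea can be sketched without any particular serving stack. The example below trains a logistic-regression head with plain gradient descent on tiny, made-up 3-dimensional vectors that stand in for real embeddings (an embedding model would return vectors with hundreds of dimensions or more). All function names and data here are hypothetical, and fetching embeddings from a server is deliberately left out.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, vector):
    """Probability that the text behind `vector` belongs to class 1."""
    return sigmoid(sum(w * x for w, x in zip(weights, vector)) + bias)

def train_linear_head(vectors, targets, lr=0.5, epochs=200):
    """Fit a binary logistic-regression head with stochastic gradient descent."""
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for vector, target in zip(vectors, targets):
            # Gradient of the log loss with respect to the pre-activation.
            error = predict(weights, bias, vector) - target
            weights = [w - lr * error * x for w, x in zip(weights, vector)]
            bias -= lr * error
    return weights, bias

# Made-up stand-ins for embeddings: 1 = positive review, 0 = negative review.
train_vectors = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.1, 0.9, 0.2], [0.0, 0.8, 0.3]]
train_targets = [1, 1, 0, 0]

weights, bias = train_linear_head(train_vectors, train_targets)
prob_positive = predict(weights, bias, [0.85, 0.15, 0.05])
prob_negative_case = predict(weights, bias, [0.05, 0.85, 0.25])
print(prob_positive > 0.5, prob_negative_case > 0.5)
```

Only the small head is trained here; the embedding model stays frozen, which keeps training cheap but usually gives lower accuracy than fine-tuning the whole transformer.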