
Large Language Model (LLM)

A Large Language Model is a neural network with billions of parameters, trained on massive text corpora to predict the next token in a sequence. "Large" conventionally means at least ~1B parameters; modern LLMs range from ~1B (edge models) to a reported ~1.8T (proprietary frontier models).
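
To make "predict the next token" concrete, here is a minimal Python sketch. The vocabulary and logit values are invented for illustration: the model assigns a raw score (logit) to every token in its vocabulary, and softmax converts those scores into a probability distribution over what comes next.

```python
import math

# Hypothetical logits a model might assign to a tiny vocabulary given some
# context; the vocabulary and the numbers here are invented for illustration.
vocab = ["the", "cat", "sat", "on", "mat"]
logits = [2.1, 0.3, 3.5, 0.9, 1.2]

# Softmax turns raw scores into a probability distribution over next tokens.
exps = [math.exp(x) for x in logits]
probs = [e / sum(exps) for e in exps]

for token, p in sorted(zip(vocab, probs), key=lambda pair: -pair[1]):
    print(f"{token}: {p:.3f}")
# Training pushes up the probability of the actual next token by minimizing
# cross-entropy, -log(p_true), across trillions of such predictions.
```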

Modern LLMs are decoder-only transformers: autoregressive models that generate one token at a time. They are trained in two stages: pre-training (self-supervised next-token prediction over trillions of tokens) and post-training (RLHF, DPO, or similar, which teaches the model to follow instructions and avoid harmful outputs).
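
The autoregressive loop itself is simple, whatever the model's size. Below is a minimal sketch; `fake_model` is a hard-coded stand-in for a real transformer (which would return logits over a vocabulary), purely to show how each generated token is appended to the context and fed back in.

```python
# A toy stand-in for a decoder-only transformer: given the sequence so far,
# it returns one "most likely" next token from a canned lookup table.
EOS = "<eos>"

def fake_model(tokens: list[str]) -> str:
    canned = {"Hello": ",", ",": "world", "world": "!", "!": EOS}
    return canned.get(tokens[-1], EOS)

def generate(prompt: list[str], max_new_tokens: int = 10) -> list[str]:
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        next_tok = fake_model(tokens)  # greedy decoding: take the top token
        if next_tok == EOS:            # stop when the model emits end-of-sequence
            break
        tokens.append(next_tok)        # the new token becomes part of the context
    return tokens

print(generate(["Hello"]))  # ['Hello', ',', 'world', '!']
```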

For local AI: open-weight LLMs you can run yourself include the Llama, Qwen, Mistral, Phi, Gemma, and DeepSeek families. The 7B-32B range fits on consumer hardware; 70B-class models at 4-bit quantization need roughly 40GB+ of memory (multiple GPUs or Apple Silicon with 48GB+ of unified memory); 100B+ MoE models need workstation-tier setups. See the directory for what runs where.
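
A rough sizing rule of thumb (an approximation, not a benchmark): weight memory is about parameter count × bits per weight ÷ 8, before the KV cache and runtime overhead, which typically add another 10-30%. A quick sketch:

```python
# Approximate weight memory for a model at a given quantization level.
# This ignores KV cache and runtime buffers, so treat results as a floor.
def weight_gb(params_billions: float, bits_per_weight: float) -> float:
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

for name, params in [("7B", 7), ("32B", 32), ("70B", 70)]:
    gb = weight_gb(params, 4)  # 4-bit quantization, common for local inference
    print(f"{name} at 4-bit: ~{gb:.1f} GB of weights")
# 7B ≈ 3.5 GB, 32B ≈ 16 GB, 70B ≈ 35 GB, before cache and overhead.
```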
