GPT by OpenAI
OpenAI's closed-weight family: GPT-4, GPT-4o, and the GPT-5 series. It is the reference baseline for capability comparisons but is not self-hostable; RunLocalAI covers the GPT family for benchmark context only.
There is no modern open-weight GPT model for local deployment. For local open-weight alternatives: Llama 3.3 70B at Q4 (40 GB) on 2× RTX 4090 matches GPT-4o on MMLU (87%); Qwen 3 32B at Q4 (18 GB) on a single RTX 4090 matches GPT-4-Turbo on math; DeepSeek V3 at FP8 on 8× H100 SXM approaches GPT-5-class reasoning. The one historical open-weight GPT release is GPT-2 1.5B (2019, MIT license): historically significant and able to run on any hardware, but functionally obsolete for production use. GPT-4o and GPT-5 are API-only. For a self-hosted alternative with GPT-compatible tool use and function calling, serve Llama 3.3 70B behind vLLM's OpenAI-compatible API; most GPT SDKs (the Python openai package, LangChain) work against a vLLM endpoint with nothing more than a base_url change. Skip building a local "GPT replacement" without specific requirements; the API is the pragmatic choice for GPT-family models.
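The base_url change is the whole integration story. A minimal sketch, using only the Python standard library, of what an OpenAI-compatible /v1/chat/completions request to a local vLLM server looks like; the endpoint URL, API key, and model name below are illustrative assumptions, not values from this page:

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str,
                       messages: list) -> urllib.request.Request:
    """Build an OpenAI-compatible chat-completions request.

    The same payload shape works against api.openai.com and a local
    vLLM server; only the base URL (and model name) changes.
    """
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Hypothetical local endpoint; vLLM serves on port 8000 by default.
req = build_chat_request(
    "http://localhost:8000",  # swap for https://api.openai.com to hit OpenAI
    "local-key",
    "meta-llama/Llama-3.3-70B-Instruct",
    [{"role": "user", "content": "Hello"}],
)
```

In practice you would not hand-roll the request: the openai SDK accepts a base_url argument (e.g. `OpenAI(base_url="http://localhost:8000/v1", api_key=...)`), which is exactly the one-line swap described above.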
GPT models are API-only. OpenAI API pricing: GPT-4o ($2.50/1M input, $10/1M output), GPT-4o-mini ($0.15/1M input, $0.60/1M output), GPT-5 (pricing TBD). For a self-hosted API-compatible alternative: vLLM 0.6.3+ serving Llama 3.3 70B AWQ 4-bit with the --api-key and --served-model-name gpt-4 flags exposes an OpenAI-compatible /v1/chat/completions endpoint. For on-device mobile: llama.cpp serving Llama 3.1 8B Q4_0 on Snapdragon X Elite reaches ~18 tok/s with an OpenAI-compatible HTTP API. For Azure OpenAI users with data-residency requirements: deploy Llama 3.3 70B on-premises with vLLM on H100 SXM in your own VPC for GPT-4o-class quality with full data sovereignty. The GPT-2 1.5B open-weight model runs on any hardware via the Transformers AutoModelForCausalLM API (3 GB at FP32, ~40 tok/s on CPU) and is suitable for educational and research use only.
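At those list prices, the API-versus-self-hosting decision is simple arithmetic. A hedged sketch using the GPT-4o and GPT-4o-mini prices quoted above; the monthly token volumes are made-up illustration numbers, not a benchmark:

```python
def monthly_api_cost(input_tokens_m: float, output_tokens_m: float,
                     price_in_per_m: float, price_out_per_m: float) -> float:
    """Monthly API cost in USD, given token volumes in millions of
    tokens and prices in USD per million tokens."""
    return input_tokens_m * price_in_per_m + output_tokens_m * price_out_per_m


# List prices from above: GPT-4o $2.50/$10 per 1M in/out,
# GPT-4o-mini $0.15/$0.60 per 1M in/out.
# Hypothetical workload: 500M input + 100M output tokens per month.
cost_4o = monthly_api_cost(500, 100, 2.50, 10.00)   # 2250.0 USD/month
cost_mini = monthly_api_cost(500, 100, 0.15, 0.60)  # 135.0 USD/month
```

Comparing that monthly figure against the amortized hardware and power cost of a self-hosted vLLM box is the honest way to decide when local serving pays off.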
GPT (OpenAI) is API-only, so there is no local hardware to verify; before committing money, verify instead that your chosen open-weight alternative runs on your specific hardware.