RUNLOCALAI

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
DISCLOSURE

Some links on this site are affiliate links (Amazon Associates and other first-class retailers). When you buy through them, we earn a small commission at no extra cost to you. Affiliate links do not influence our verdicts — there are cards we rate highly that we don't have affiliate relationships with, and cards that sell well that we refuse to recommend. Read more →

© 2026 runlocalai.co · Independently operated
HOW-TO · SET

How to download and run a model with Ollama

beginner·10 min·By Fredoline Eruo
Target environment
  • Ubuntu 24.04 · Ollama 0.4.x
  • Windows 11 · Ollama 0.4.x
  • macOS 15 · Ollama 0.4.x
PREREQUISITES

Ollama installed and running. Run `ollama list` to confirm the service is responsive before proceeding.
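
The prerequisite check can also be scripted. The following is a minimal sketch, assuming a POSIX-compatible shell; `preflight` is a hypothetical helper name, not something Ollama ships. It only uses `ollama list`, the same command named above.

```shell
# Preflight sketch: confirm the ollama CLI exists and the service answers.
# `preflight` is a hypothetical helper name, not part of Ollama.
preflight() {
    # The CLI must be on PATH before anything else can work.
    if ! command -v ollama >/dev/null 2>&1; then
        echo "ollama not found on PATH" >&2
        return 1
    fi
    # `ollama list` is used here as the responsiveness check from the prerequisites.
    if ! ollama list >/dev/null 2>&1; then
        echo "ollama is installed but the service is not responding" >&2
        return 2
    fi
    echo "ollama is ready"
}
```

Calling `preflight` before the steps below gives an early, distinct exit code for "not installed" (1) versus "not responding" (2).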

What this does

This guide pulls a language model from the Ollama library and runs an inference session interactively. After completion, prompts sent to the model return generated responses locally with no network dependency for inference.

Steps

  1. Pull a model from the Ollama library. The pull command downloads the model files and registers the model locally. Smaller models like llama3.2:1b are fastest to download.

    ollama pull llama3.2:1b
    

    Expected output: pulling manifest followed by progress bars for each layer, ending in success.

  2. Confirm the model is available locally. The model is now registered and can be invoked without a network connection.

    ollama list
    

    Expected output: Table listing the model name, ID, size, and last-modified time (shown as a relative time). Example row: llama3.2:1b a48c9... 1.3 GB 2 minutes ago.

  3. Run an interactive inference session. This command opens a prompt loop in the terminal.

    ollama run llama3.2:1b
    

    Expected output: A >>> prompt appears. Type a question and press Enter. The model returns a generated response.

  4. Exit the interactive session. Type /bye and press Enter to terminate cleanly.

    /bye
    

    Expected output: Session terminates and returns to the shell prompt.
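
Steps 1 and 2 can be combined into a small script so that a model is pulled only when it is missing. This is a sketch under stated assumptions: `model_installed` and `ensure_model` are hypothetical helper names, and the parsing assumes that `ollama list` prints one header row followed by rows whose first column is the model name, as in the example row above.

```shell
# Sketch of steps 1 and 2 as reusable functions.
# `model_installed` and `ensure_model` are hypothetical helper names.

# Succeeds if the given name appears in the first column of `ollama list`.
# Assumes one header row, then one row per model.
model_installed() {
    ollama list | awk -v name="$1" 'NR > 1 && $1 == name { found = 1 } END { exit !found }'
}

# Pulls the model only when it is not already registered locally.
ensure_model() {
    if model_installed "$1"; then
        echo "$1 already present"
    else
        ollama pull "$1"
    fi
}
```

A typical call would be `ensure_model llama3.2:1b && ollama run llama3.2:1b`, which skips the download on repeat runs and then opens the interactive session from step 3.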

Verification

ollama run llama3.2:1b "What is 2+2?"
# Expected: A numerical or textual answer containing "4"
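
For unattended use, the same check can be wrapped in a function that reports pass or fail. A minimal sketch, assuming a POSIX-compatible shell; `verify_model` is a hypothetical helper name. It treats any reply containing the character "4" as a pass, which mirrors the expectation above and is deliberately loose.

```shell
# `verify_model` is a hypothetical helper; it wraps the one-shot prompt above.
verify_model() {
    # Capture the reply of a single non-interactive prompt.
    reply=$(ollama run "$1" "What is 2+2?") || return 2
    # Pass when the character 4 appears anywhere in the reply.
    if printf '%s\n' "$reply" | grep -q '4'; then
        echo "PASS"
    else
        echo "FAIL: $reply"
        return 1
    fi
}
```

Run it as `verify_model llama3.2:1b`. A small model may occasionally spell the answer out as a word, so a FAIL is a prompt to read the reply rather than proof that the install is broken.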

Common failures

  • Model pull fails with connection error: Network issue reaching the model registry. Check internet connectivity and DNS resolution for ollama.com.
  • Out of disk space during pull: Model download requires temporary space. Free at least the model's size plus 500 MB before retrying.
  • Out of memory during inference: The model is too large for available RAM (or VRAM when running on a GPU). Choose a smaller model variant (e.g., 1b instead of 70b) or close other memory-intensive applications.
  • ollama run hangs with no response: The model may still be loading, or the GPU may not be detected. Run `ollama list` to confirm the service responds, then `ollama ps` in a second terminal to see whether the model is loaded and whether it is running on GPU or CPU. If it stays stuck, restart the Ollama service: `sudo systemctl restart ollama` on Ubuntu, or quit and relaunch the Ollama app on Windows and macOS.
  • Model not found after pull: Pull may have been interrupted. Re-run ollama pull <model-name> to complete the download.
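
The disk-space rule of thumb (model size plus 500 MB) can be checked before pulling. The following is a sketch, not a definitive check: `has_room_for` is a hypothetical helper name, the model size must be supplied by hand, and the directory argument should point at the filesystem where Ollama stores models on your system, which this guide does not specify.

```shell
# `has_room_for` is a hypothetical helper. It applies the rule of thumb above:
# free space must be at least the model size plus 500 MB.
has_room_for() {
    model_mb=$1          # model size in MB, e.g. 1300 for a 1.3 GB model
    dir=${2:-.}          # assumption: a directory on the filesystem holding the models
    needed_kb=$(( (model_mb + 500) * 1024 ))
    # `df -Pk` prints POSIX-format output in 1 KB blocks; column 4 is free space.
    free_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    if [ "$free_kb" -ge "$needed_kb" ]; then
        echo "ok: ${free_kb} KB free, ${needed_kb} KB needed"
    else
        echo "short: ${free_kb} KB free, ${needed_kb} KB needed"
        return 1
    fi
}
```

For the 1.3 GB example model, `has_room_for 1300 "$HOME"` asks for (1300 + 500) × 1024 = 1,843,200 KB of free space.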

Related guides

  • How to create a custom Modelfile for Ollama
  • How to configure Ollama port and networking
  • Course: Ollama Deep Dive