Code Review

Reviewing PRs/MRs for bugs, style, security. Tool-use + repo-context awareness drive quality.

Setup walkthrough

  1. Install Ollama, then pull the model: ollama pull qwen2.5-coder:14b (~9 GB).
  2. To review a PR/MR locally, write the git diff to a file and pipe it to the model:
git diff main...feature-branch > diff.txt
cat diff.txt | ollama run qwen2.5-coder:14b "Review this code diff for bugs, security issues, style violations, and architectural concerns. For each issue, specify the file and line number."
  3. Expect the first review in 5-15 seconds for a typical PR (200-500 lines changed).
  4. For VS Code integration: install the Continue extension → select files → right-click → "Continue: Review Selection" → the model reviews for bugs, style, and suggestions.
  5. For automated PR review in CI/CD: set up a GitHub Action that runs on PR open → git diff → Ollama API → post review comments.
  6. For security-focused review: use a prompt emphasizing the OWASP Top 10, SQL injection, XSS, and hardcoded secrets. The model catches ~70-80% of the security issues that a human reviewer would catch.
  7. Best models: Qwen 2.5 Coder 32B > DeepSeek Coder V3 > Qwen 2.5 Coder 14B. The 32B model catches subtle logic errors the 14B misses.
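
The diff → Ollama API step from the walkthrough can be sketched in Python. This is a minimal sketch, not a finished CI integration: it assumes Ollama is serving on its default local address (http://localhost:11434) and uses the /api/generate endpoint with streaming disabled. The helper names (build_prompt, build_payload, get_diff, review) and the exact wording of the security prompt are illustrative, and posting comments back to the PR is left out.

```python
import json
import subprocess
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "qwen2.5-coder:14b"

BASE_PROMPT = (
    "Review this code diff for bugs, security issues, style violations, "
    "and architectural concerns. For each issue, specify the file and line number."
)
# Illustrative wording for the security-focused variant.
SECURITY_PROMPT = "Focus on the OWASP Top 10, SQL injection, XSS, and hardcoded secrets."


def build_prompt(diff: str, security_focus: bool = False) -> str:
    """Combine the review instructions with the diff text."""
    parts = [BASE_PROMPT]
    if security_focus:
        parts.append(SECURITY_PROMPT)
    parts.append("Diff:\n" + diff)
    return "\n\n".join(parts)


def build_payload(diff: str, security_focus: bool = False, model: str = MODEL) -> dict:
    """Build the JSON body for a single non-streaming generate request."""
    return {"model": model, "prompt": build_prompt(diff, security_focus), "stream": False}


def get_diff(base: str = "main", head: str = "feature-branch") -> str:
    """Return the same diff as `git diff main...feature-branch`."""
    result = subprocess.run(
        ["git", "diff", f"{base}...{head}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def review(diff: str, security_focus: bool = False) -> str:
    """Send the diff to the local model and return its review text."""
    body = json.dumps(build_payload(diff, security_focus)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling review(get_diff()) inside a repository returns the review as plain text. A CI job would call the same function and hand the result to whatever step posts comments on the PR.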

The cheap setup

Used RTX 3060 12 GB (~$200-250, see /hardware/rtx-3060-12gb). Runs Qwen 2.5 Coder 14B at 25-35 tok/s — reviews a 500-line diff in 10-20 seconds. The 14B model catches obvious bugs, style issues, and well-known security patterns. Pair with Ryzen 5 5600 + 16 GB DDR4 + 512 GB NVMe. Total: ~$360-405. For small-to-medium teams (1-10 devs, PRs under 500 lines): $400 handles daily code review. The 14B model is surprisingly good at flagging missing error handling, race conditions, and SQL injection — things juniors miss frequently.
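
The 10-20 second figure can be sanity-checked with simple arithmetic. The sketch below assumes the written review runs to about 400 output tokens (a made-up but plausible length) and ignores the time spent reading the diff itself, so treat it as a lower bound rather than a benchmark.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to generate the review text, ignoring prompt processing."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# Assumption: a typical review is roughly 400 tokens long.
REVIEW_TOKENS = 400

for tps in (25, 35):
    print(f"{tps} tok/s -> {generation_seconds(REVIEW_TOKENS, tps):.1f} s")
# 25 tok/s -> 16.0 s
# 35 tok/s -> 11.4 s
```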

The serious setup

Used RTX 3090 24 GB (~$700-900, see /hardware/rtx-3090). Runs Qwen 2.5 Coder 32B at 35-50 tok/s or DeepSeek Coder V3 at 15-20 tok/s — reviews a 2,000-line PR in 30-60 seconds with architectural-level understanding. For enterprise teams (50+ devs, CI/CD integration): the 32B model acts as a first-pass reviewer that catches 80%+ of bugs before human review. Total: ~$1,800-2,200. Code review quality jumps significantly at 32B — the model can reason about data flow across files and spot architectural anti-patterns.

Common beginner mistake

The mistake: Running an AI code review, seeing "no issues found," and merging without a human glance.

Why it fails: AI reviewers have false negatives: they miss novel vulnerability patterns, business logic errors, and context-dependent bugs. A model trained on open-source code won't flag proprietary business logic violations (e.g., "this discount calculation allows negative prices if the user is a premium member from a specific region").

The fix: Use AI review as a first pass (catch the obvious), then human review for domain knowledge and security. AI catches syntax errors, common bug patterns, and style violations. Humans catch business logic errors, novel exploits, and architectural decisions. AI-first, human-verified. Never AI-only for production code review. The model is a very thorough junior reviewer, not a senior architect.
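
The "AI-first, human-verified" rule can be written down as a merge gate. This is only a sketch of the policy: ReviewState and can_merge are hypothetical names, and treating unresolved AI findings as blocking is an assumption of this example (a team might let a human dismiss them instead).

```python
from dataclasses import dataclass, field


@dataclass
class ReviewState:
    # Issues flagged by the model that nobody has fixed or dismissed yet.
    unresolved_ai_findings: list = field(default_factory=list)
    # Whether a human reviewer has approved the change.
    human_approved: bool = False


def can_merge(state: ReviewState) -> bool:
    """A clean AI pass alone never allows a merge; a human must approve."""
    return state.human_approved and not state.unresolved_ai_findings


# "No issues found" from the model, but no human has looked: blocked.
print(can_merge(ReviewState()))                      # False
# Clean AI pass plus human approval: allowed.
print(can_merge(ReviewState(human_approved=True)))   # True
```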

Reality check

Code models are LLM workloads — same VRAM math applies. 16 GB runs 13-32B Q4 (Qwen 2.5 Coder, DeepSeek Coder); 24 GB unlocks 70B-class code models. The killer detail is context window — code review wants 32K+, which pushes KV cache beyond 16 GB on 70B.
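
The VRAM math mentioned above has two main parts: quantized weights and the KV cache, which grows linearly with context length. The sketch below uses the standard estimate for each. The layer count, KV head count, and head size in the example are hypothetical values chosen for illustration, not the specifications of any model named here, and the 4.5 bits per weight figure is a rough assumption for a Q4 quantization. Real usage is higher once activations and runtime overhead are included.

```python
GIB = 1024 ** 3


def weight_bytes(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the quantized weights."""
    return params_billion * 1e9 * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_tokens: int, bytes_per_value: int = 2) -> int:
    """KV cache size: one key and one value vector per layer, per KV head, per token."""
    return 2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_value


# Assumption: a 14B model at roughly 4.5 bits per weight.
print(f"weights:  {weight_bytes(14, 4.5) / GIB:.1f} GiB")

# Hypothetical architecture: 48 layers, 8 KV heads of size 128, 32K context, fp16 cache.
cache = kv_cache_bytes(n_layers=48, n_kv_heads=8, head_dim=128, context_tokens=32768)
print(f"KV cache: {cache / GIB:.1f} GiB")  # KV cache: 6.0 GiB
```

Doubling the context doubles the cache, which is why a model that fits comfortably at 8K can run out of memory at 32K.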

Common mistakes

  • Skipping context-window math (the KV cache eats VRAM at long contexts)
  • Using general-purpose instruct models for code (specialized code models are 30-50% better)
  • Running coding agent loops on 8 GB (works for 7B, but agent loops compound)
  • Forgetting that flash attention affects code workflows more than chat

What breaks first

The errors most operators hit when running code review locally. Each links to a diagnose+fix walkthrough.

Before you buy

Verify your specific hardware can handle code review before committing money.

Hardware buying guidance for Code Review

Local coding workflows live or die on time-to-first-token and 32K+ context. The guides below cover the developer-specific hardware decision.
