Legal Analysis
Contract review, case-law analysis, regulatory interpretation. Privacy + on-prem deployment is the wedge — legal data can't leave the firm. Long-context handling is critical.
Setup walkthrough
- Install LM Studio (local-first, no data ever leaves) → download Llama 3.3 70B Q4_K_M (40 GB) or Qwen 2.5 32B Q6_K (24 GB).
- For contract review: load a contract PDF into a local RAG pipeline. AnythingLLM + LM Studio — upload the contract, ask: "Identify any unusual termination clauses, limitation of liability exceeding 12 months of fees, and missing force majeure provisions."
- First analysis in 15-30 seconds. All data stays on-device — essential for attorney-client privilege.
- For case law research: index a corpus of case PDFs → semantic search over case law → LLM synthesizes relevant precedents.
- For e-discovery: embed millions of documents → retrieve relevant ones based on natural language queries → LLM reviews the top matches. Reduces document review time by 70-90%.
- Critical: Legal AI MUST run locally. Sending client documents to cloud AI services (ChatGPT, Claude API) waives attorney-client privilege in most jurisdictions. On-prem/local deployment is not a preference — it's a professional obligation for attorneys.
- For specialized legal models: SaulLM, LexiLaw, and LLaMA-2-legal fine-tunes exist but general-purpose 70B models often outperform them on reasoning tasks.
The cheap setup
$300-400 genuinely cannot do reliable legal analysis. Legal work requires: (a) 70B+ models for accurate contract interpretation (hallucinating a contract clause = malpractice), (b) long context (32K+ tokens) for full contracts, (c) strict data privacy. A 7B-14B model on 12 GB VRAM ($400 build) will hallucinate legal precedents, miss subtle contractual language, and confuse jurisdictional differences. For legal AI on a budget: used RTX 3090 24 GB + 64 GB RAM ($1,500 total) is the minimum viable setup for contract review. Below that, the risk of AI-generated legal errors exceeds the benefit. $400 buys you document classification and basic keyword extraction, not legal analysis. Be honest about this limitation.
The serious setup
Dual RTX 3090 48 GB total ($1,600, see /hardware/rtx-3090). Runs Llama 3.3 70B Q5_K_M (48 GB) — professional-grade contract analysis, case law reasoning, and regulatory interpretation. For a small-to-medium law firm (5-20 attorneys): this handles e-discovery (1M+ documents), contract review (50+ contracts/day), and legal research. Pair with Ryzen 7 7700X + 64 GB DDR5 + 4TB NVMe (legal document archives are massive). Total: ~$2,500-3,500. For AmLaw 100 firms: 4-8 GPU servers with 8× RTX 3090/4090 for firm-wide deployment. The ROI is dramatic — 70% reduction in first-pass document review pays for the hardware in one case.
Common beginner mistake
The mistake: Uploading client contracts to ChatGPT/Claude/Gemini "for a quick review" because local setup "takes too long." Why it fails: This isn't a quality problem — it's a legal ethics violation. Uploading client documents to a third-party AI service: (1) waives attorney-client privilege (the AI provider's ToS typically claim rights to process/analyze uploaded data), (2) violates data protection regulations (GDPR, CCPA, HIPAA if medical-legal), (3) creates discoverability — the uploaded data is now held by a third party and potentially subpoenaable, (4) violates most state bar ethics opinions on technology competence. Multiple state bars have issued formal opinions on this — the guidance is unanimous: don't upload client data to consumer AI services. The fix: Use local-only AI. LM Studio + AnythingLLM + Llama 3.3 70B provides contract review quality approaching GPT-4 with zero data leakage. If your firm can't afford ~$3,000 for a local AI server, use manual review until you can — the malpractice liability from a ChatGPT data breach far exceeds $3,000. Never put client data into cloud AI. Ever.
Recommended setup for legal analysis
Browse all tools for runtimes that fit this workload.
Reality check
Local AI workloads have real hardware constraints that vary by task type. VRAM ceiling decides what model fits; bandwidth decides decode speed; compute decides prefill speed. Pick the GPU tier that fits your actual workload, not the spec sheet.
Common mistakes
- Buying for spec-sheet VRAM without modeling KV cache + activation overhead
- Underestimating quantization quality loss below Q4
- Skipping flash-attention support (real perf gap on long context)
- Ignoring sustained-load thermals (laptops thermal-throttle within 30 min)
What breaks first
The errors most operators hit when running legal analysis locally. Each links to a diagnose+fix walkthrough.
Before you buy
Verify your specific hardware can handle legal analysis before committing money.