06. MVP Methodology

Chapter 6 of 24 · 15 min

KEY INSIGHT

Your MVP needs to prove that users want the outcome your AI delivers, not that you can build AI. Many operators confuse "MVP" with "minimum interesting demo." Ship something users will actually pay for, even if it's ugly.

MVP architecture for local AI products follows a three-layer stack:

1. **Interface layer** (web app, WhatsApp bot, USSD menu): Minimal viable UX
2. **Inference layer** (local model or API): Enough capability for core use case
3. **Data layer** (user data, interactions, feedback): Capture everything

The temptation is to build a beautiful interface with weak AI inside. Don't. Users forgive ugly interfaces when the AI works. They don't forgive beautiful interfaces that produce wrong outputs.

```python
# MVP architecture: Local model serving with basic web interface
# This is a minimal but functional structure

"""
MVP Stack:
- Backend: FastAPI (lightweight Python web framework)
- Model: Llama-based model via Ollama or llama.cpp
- Interface: Basic HTML/JS or Telegram bot
- Data: SQLite for MVP (upgrade to Postgres when you have traction)
"""

# mvp/app.py - Core MVP structure
import sqlite3
import time
from typing import Optional

import ollama
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()


class UserRequest(BaseModel):
    user_id: str
    query: str
    context: Optional[str] = ""


class UserResponse(BaseModel):
    response: str
    tokens_used: int
    latency_ms: int


def init_db():
    """Initialize SQLite database for MVP data capture"""
    conn = sqlite3.connect('mvp_data.db')
    c = conn.cursor()
    c.execute('''
        CREATE TABLE IF NOT EXISTS interactions (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            user_id TEXT,
            query TEXT,
            response TEXT,
            tokens_used INTEGER,
            latency_ms INTEGER,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
    ''')
    conn.commit()
    return conn


db_conn = init_db()


@app.post("/api/query", response_model=UserResponse)
async def process_query(request: UserRequest):
    """Core MVP endpoint: receive query, return AI response"""
    start = time.perf_counter()
    try:
        # Call local model. ollama.generate blocks until the model finishes,
        # which is acceptable at MVP traffic levels.
        full_prompt = f"Context: {request.context}\n\nQuery: {request.query}"
        response = ollama.generate(
            model='llama3.2:3b',  # Small model for MVP speed
            prompt=full_prompt,
            options={'num_predict': 512}  # Limit output tokens
        )
        latency = int((time.perf_counter() - start) * 1000)

        # Log interaction
        c = db_conn.cursor()
        c.execute('''
            INSERT INTO interactions
                (user_id, query, response, tokens_used, latency_ms)
            VALUES (?, ?, ?, ?, ?)
        ''', (request.user_id, request.query, response['response'],
              response['eval_count'], latency))
        db_conn.commit()

        return UserResponse(
            response=response['response'],
            tokens_used=response['eval_count'],
            latency_ms=latency
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

# Run with: uvicorn mvp.app:app --host 0.0.0.0 --port 8000
```

The MVP should answer three questions within 4-6 weeks of launch:

1. Do users actually have this problem? (Engagement data)
2. Does our solution address it? (Outcome metrics)
3. Will they pay? (Conversion to paid tier)

If you can't answer these questions by week six, the MVP isn't minimal enough: you're building features before validating.
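
The engagement and conversion questions can be roughed out directly from the `interactions` table logged above. The sketch below is one possible reading, not a prescribed metric set: the function name `validation_snapshot`, the "returned on a second day" definition of engagement, and the `paid_user_ids` argument (user IDs exported from whatever payment provider you use) are all assumptions. Question 2 is left out on purpose, because the schema above records no outcome signal.

```python
import sqlite3


def validation_snapshot(conn, paid_user_ids):
    """Summarize engagement and paid conversion from the interactions table.

    paid_user_ids is a hypothetical input: IDs of users who have paid,
    supplied from outside this database.
    """
    cur = conn.cursor()

    # Everyone who has sent at least one query
    cur.execute("SELECT DISTINCT user_id FROM interactions")
    active_users = {row[0] for row in cur.fetchall()}

    # Assumed engagement signal: users active on two or more distinct days
    cur.execute("""
        SELECT COUNT(*) FROM (
            SELECT user_id FROM interactions
            GROUP BY user_id
            HAVING COUNT(DISTINCT DATE(created_at)) >= 2
        )
    """)
    returning_users = cur.fetchone()[0]

    total = len(active_users)
    paid = len(active_users & set(paid_user_ids))
    return {
        "total_users": total,
        "returning_users": returning_users,
        "return_rate": returning_users / total if total else 0.0,
        "paid_conversion": paid / total if total else 0.0,
    }
```

Calling `validation_snapshot(sqlite3.connect('mvp_data.db'), paid_ids)` once a week gives a simple trend line to check against the week-six deadline.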

The MVP (Minimum Viable Product) methodology for local AI products borrows from lean startup principles but requires adaptation for AI-specific constraints. You need enough AI capability to validate the core value proposition, not a fully polished experience.
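
One way to make "enough AI capability" concrete is to run a handful of hand-written core use-case queries through the model before launch. The sketch below is illustrative only: `core_use_case_pass_rate`, the keyword-matching check, and the idea of passing the model in as a plain `generate` callable are assumptions, and keyword matching is a crude proxy for correctness.

```python
from typing import Callable, Iterable, List, Tuple


def core_use_case_pass_rate(
    generate: Callable[[str], str],
    cases: Iterable[Tuple[str, List[str]]],
) -> float:
    """Return the fraction of cases whose output contains every required keyword.

    generate: any function mapping a query string to the model's text output.
    cases: (query, required_keywords) pairs written by hand for the core use case.
    """
    cases = list(cases)
    if not cases:
        return 0.0

    passed = 0
    for query, required_keywords in cases:
        output = generate(query).lower()
        # A case passes only if all required keywords appear (case-insensitive)
        if all(keyword.lower() in output for keyword in required_keywords):
            passed += 1
    return passed / len(cases)
```

With the Ollama client, `generate` could be a small wrapper such as `lambda q: ollama.generate(model='llama3.2:3b', prompt=q)['response']`. A low pass rate on ten to twenty representative queries suggests the inference layer, not the interface, needs attention first.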

EXERCISE

Define your MVP feature set in three bullet points. What is the minimum viable version that tests all three validation questions above? What are you cutting that you wish you could include?