Security Audit — Capstone: Full-Stack AI App (Chapter 9)

Security audits examine the application for vulnerabilities. Automated tools scan for known issues. Manual review finds business logic flaws. Both are necessary for production systems handling user data.

The OWASP ZAP tool provides automated scanning:

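# Run ZAP baseline scan (passive checks only) against the running application.
# Reports are written to /zap/wrk inside the container, so mount a host directory there.
# --network host lets the container reach localhost on a Linux host.
# Newer ZAP releases are published as ghcr.io/zaproxy/zaproxy:stable.
docker run --rm -t --network host -v "$(pwd):/zap/wrk/:rw" \
  owasp/zap2docker-stable zap-baseline.py \
  -t http://localhost:8000 \
  -J zap_report.json

# Run full scan (spider plus active attacks), limiting spider depth to 3.
# This command does not configure authentication; only unauthenticated pages are scanned.
docker run --rm -t --network host -v "$(pwd):/zap/wrk/:rw" \
  owasp/zap2docker-stable zap-full-scan.py \
  -t http://localhost:8000 \
  -z "-config spider.maxDepth=3" \
  -J zap_full_report.json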
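
The JSON reports can be used to gate a build. The following is a minimal sketch, assuming the report follows ZAP's traditional JSON layout: a top-level "site" list whose entries carry an "alerts" list, each alert having a "name" and a numeric-string "riskcode" from 0 (informational) to 3 (high). The function names and the default threshold are illustrative choices, not part of ZAP.

```python
import json
from typing import Any, Dict, List


def load_report(path: str) -> Dict[str, Any]:
    """Read a ZAP JSON report from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def high_risk_alerts(report: Dict[str, Any], min_risk: int = 3) -> List[str]:
    """Return the names of alerts whose risk code is at or above min_risk."""
    found = []
    # Assumed layout: {"site": [{"alerts": [{"name": ..., "riskcode": "3"}]}]}
    for site in report.get("site", []):
        for alert in site.get("alerts", []):
            if int(alert.get("riskcode", 0)) >= min_risk:
                found.append(alert.get("name", "unknown"))
    return found
```

A CI step could call high_risk_alerts(load_report("zap_report.json")) and exit non-zero when the returned list is not empty.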

Code review focuses on injection vulnerabilities. User input flows through many services, and each hop is a potential injection point. Validate and sanitize at the boundaries where data enters the system, not in the middle of processing:

# Backend input validation - reject early, validate thoroughly
# Note: @validator is the Pydantic v1 API; Pydantic v2 replaces it with @field_validator.
import re

from pydantic import BaseModel, validator

# Control characters other than tab, newline, and carriage return
CONTROL_CHARS = re.compile(r'[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]')

# Canonical UUID layout: 8-4-4-4-12 hexadecimal digits
UUID_PATTERN = re.compile(
    r'[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}',
    re.IGNORECASE
)

class QuestionRequest(BaseModel):
    question: str
    document_id: str

    @validator('question')
    def validate_question(cls, v):
        # Length limit on the raw input prevents resource exhaustion
        if len(v) > 1000:
            raise ValueError('Question exceeds 1000 characters')

        # Remove control characters that could affect terminal or log output
        v = CONTROL_CHARS.sub('', v).strip()

        # Reject input that is empty once cleaned (whitespace or control characters only)
        if not v:
            raise ValueError('Question cannot be empty')

        return v

    @validator('document_id')
    def validate_document_id(cls, v):
        # UUID format validation prevents path traversal.
        # fullmatch() rejects a trailing newline, which match() with '$' would accept.
        if not UUID_PATTERN.fullmatch(v):
            raise ValueError('Invalid document ID format')
        return v.lower()
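
Boundary validation is one layer; at the database hop, the matching defense is a parameterized query, so that input is passed as data and never interpolated into SQL text. The sketch below uses the standard library's sqlite3 module. The documents table, its columns, and both function names are hypothetical, chosen only for illustration; the application's real storage layer may differ.

```python
import sqlite3
from typing import Optional


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one hypothetical document row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (id TEXT PRIMARY KEY, title TEXT)")
    conn.execute(
        "INSERT INTO documents VALUES (?, ?)",
        ("123e4567-e89b-12d3-a456-426614174000", "Report"),
    )
    conn.commit()
    return conn


def fetch_document_title(conn: sqlite3.Connection, document_id: str) -> Optional[str]:
    """Look up a title by ID using a bound parameter."""
    # The ? placeholder sends document_id as a value, so quote characters
    # in the input cannot change the structure of the statement.
    row = conn.execute(
        "SELECT title FROM documents WHERE id = ?", (document_id,)
    ).fetchone()
    return row[0] if row else None
```

With this approach, a classic payload such as ' OR '1'='1 is compared literally against the id column and matches nothing.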

File upload security requires multiple layers. Validate MIME types by checking file magic bytes, not just extensions:

# Secure file handling
import magic  # python-magic; requires the libmagic system library

ALLOWED_MIME_TYPES = {'application/pdf'}
MAX_FILE_SIZE = 50 * 1024 * 1024  # 50 MB

def validate_file_content(file_content: bytes) -> bool:
    # Check file size first: it is the cheapest test and bounds the work below
    if len(file_content) > MAX_FILE_SIZE:
        return False

    # Check magic bytes for PDF
    if not file_content.startswith(b'%PDF-'):
        return False

    # Verify the detected MIME type is on the allowlist
    detected_type = magic.from_buffer(file_content, mime=True)
    return detected_type in ALLOWED_MIME_TYPES
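
Another layer is to avoid trusting the client-supplied filename when storing an upload. The sketch below assumes uploads live under a single directory and are named with a server-generated UUID; UPLOAD_DIR, new_storage_name, and safe_join are hypothetical names introduced for illustration.

```python
import uuid
from pathlib import Path

UPLOAD_DIR = Path("/var/app/uploads")  # hypothetical storage root


def new_storage_name() -> str:
    """Generate a filename that contains no client-controlled text."""
    return f"{uuid.uuid4()}.pdf"


def safe_join(base: Path, name: str) -> Path:
    """Join name onto base and refuse any result outside base."""
    base_resolved = base.resolve()
    candidate = (base_resolved / name).resolve()
    # After resolving '..' segments, base must still be an ancestor
    if base_resolved not in candidate.parents:
        raise ValueError("Path escapes the upload directory")
    return candidate
```

The containment check matters most when a name does come from outside, for example a document ID taken from a request; for server-generated names it acts as a cheap safety net.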

Dependency scanning catches known vulnerabilities in third-party packages:

# Scan Python dependencies
pip-audit -r requirements.txt

# Scan JavaScript dependencies
npm audit --audit-level=moderate

# Scan Docker images
trivy image your-registry/your-app:latest

Common findings in security audits include exposed debug endpoints, missing rate limiting, overly long session timeouts, and verbose error messages that reveal internal implementation details. Fix each finding, then re-run the relevant scan or test to confirm the fix.
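
As an illustration of fixing one such finding, missing rate limiting, here is a minimal sliding-window limiter. It is a sketch under stated assumptions: state is kept in memory for a single process, clients are identified by an opaque string such as an API key, and the class name SlidingWindowLimiter is hypothetical. A deployment with several workers would typically need a shared store instead.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so tests can control time
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A request handler would call allow() with the caller's identifier and return HTTP 429 when it yields False.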