How to build an AI content generation pipeline
LLM endpoint, content templates, Python
What this does
A content generation pipeline transforms structured inputs (keywords, outlines, product data) into polished articles, social posts, or product descriptions. This guide covers orchestration with topic clustering, LLM generation, quality scoring, and human review gates. The pipeline is built with Python and an LLM API (OpenAI, Anthropic, or self-hosted).
Steps
1. Define input schemas
Create Pydantic models for each content type. Articles need title, target word count, tone, and key points. Social posts need platform, character limit, and hashtags. Product descriptions need SKU, features, and audience.
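The schemas described in this step might look roughly like the sketch below. The field names, the default tone, and the platform list are assumptions made for illustration; the guide only names the fields informally. The parse_article helper is hypothetical and shows one way to turn a validation failure into an error message instead of an exception.

```python
from typing import List, Literal

from pydantic import BaseModel, Field, ValidationError


class ArticleRequest(BaseModel):
    title: str
    target_word_count: int = Field(gt=0)  # must be a positive integer
    tone: str = "neutral"  # assumed default, not specified by the guide
    key_points: List[str] = Field(default_factory=list)


class SocialPostRequest(BaseModel):
    platform: Literal["x", "linkedin", "instagram"]  # example platforms only
    character_limit: int = Field(gt=0)
    hashtags: List[str] = Field(default_factory=list)


class ProductDescriptionRequest(BaseModel):
    sku: str
    features: List[str]
    audience: str


def parse_article(data: dict):
    """Return (request, None) on success or (None, error text) on failure."""
    try:
        return ArticleRequest(**data), None
    except ValidationError as exc:
        return None, str(exc)
```

Rejecting malformed requests at this boundary keeps bad input from reaching the clustering and generation steps, where a failure costs an API call.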
2. Implement topic clustering
Group incoming content requests by theme using embeddings. Store requests in a queue (Redis or in-memory list) and cluster before generation to batch similar requests, reducing API cost.
from sklearn.cluster import AgglomerativeClustering
import numpy as np

def cluster_topics(requests, embedding_model):
    # Embed each request's topic; embedding_model must expose encode()
    embeddings = [embedding_model.encode(r["topic"]) for r in requests]
    # n_clusters=None lets distance_threshold decide how many clusters form
    clusters = AgglomerativeClustering(n_clusters=None, distance_threshold=0.7)
    labels = clusters.fit_predict(np.array(embeddings))
    return labels
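Once labels are available, the requests still need to be grouped into batches. A minimal sketch of that grouping follows; batch_by_cluster is a hypothetical helper name, and it assumes labels is the array returned by cluster_topics, in the same order as requests.

```python
from collections import defaultdict


def batch_by_cluster(requests, labels):
    """Group requests by cluster label, keeping arrival order within each batch."""
    if len(requests) != len(labels):
        raise ValueError("requests and labels must have the same length")
    batches = defaultdict(list)
    for request, label in zip(requests, labels):
        # int() converts NumPy integer labels into plain dict keys
        batches[int(label)].append(request)
    return dict(batches)
```

Each batch can then be sent to the generation step together, for example with a shared system prompt or shared context.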
3. Build the generation engine
Route each request to the appropriate prompt template. Inject structured fields into the template and call the LLM. Store results with a status of pending_review.
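One way to sketch this routing uses string.Template from the standard library. The template wording, the TEMPLATES registry, and the call_llm callable are all assumptions: call_llm stands in for whichever client is in use (OpenAI, Anthropic, or self-hosted) and is expected to take a prompt string and return text.

```python
from string import Template

# Hypothetical templates; real ones would likely live in files or a CMS.
TEMPLATES = {
    "article": Template(
        "Write a ${word_count}-word article about ${topic} in a ${tone} tone."
    ),
    "social_post": Template(
        "Write a ${platform} post about ${topic} "
        "in under ${character_limit} characters."
    ),
}


def build_prompt(request: dict) -> str:
    """Pick the template for the request type and fill in its fields."""
    template = TEMPLATES.get(request["type"])
    if template is None:
        raise ValueError(f"no template for content type {request['type']!r}")
    # substitute() raises KeyError if the request lacks a required field
    return template.substitute(request)


def generate(request: dict, call_llm) -> dict:
    """Build the prompt, call the model, and wrap the result for review."""
    prompt = build_prompt(request)
    text = call_llm(prompt)
    return {
        "request": request,
        "prompt": prompt,
        "text": text,
        "status": "pending_review",  # status name taken from the step text
    }
```

Passing the client in as a callable keeps the engine testable without network access: a stub such as lambda prompt: "draft" is enough for unit tests.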
4. Add quality scoring
Score outputs on coherence (LLM-judged), keyword coverage, and length compliance. Assign a 0–100 score and flag outputs below the threshold for human review.
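The three signals can be combined in many ways; the sketch below uses a weighted sum. The weights, the 20% length tolerance, and the threshold of 70 are illustrative assumptions, not values from this guide. The coherence argument is assumed to be a 0.0 to 1.0 rating already obtained from a judge model.

```python
def keyword_coverage(text: str, keywords: list) -> float:
    """Fraction of keywords that appear in the text (case-insensitive)."""
    if not keywords:
        return 1.0
    lowered = text.lower()
    hits = sum(1 for kw in keywords if kw.lower() in lowered)
    return hits / len(keywords)


def length_compliance(text: str, target_words: int, tolerance: float = 0.2) -> float:
    """1.0 inside the tolerance band, falling linearly to 0.0 outside it."""
    actual = len(text.split())
    deviation = abs(actual - target_words) / target_words
    if deviation <= tolerance:
        return 1.0
    return max(0.0, 1.0 - (deviation - tolerance))


def quality_score(text, keywords, target_words, coherence,
                  weights=(0.5, 0.3, 0.2)) -> int:
    """Combine coherence, keyword coverage, and length into a 0-100 score."""
    w_coherence, w_keywords, w_length = weights  # assumed weights
    combined = (
        w_coherence * coherence
        + w_keywords * keyword_coverage(text, keywords)
        + w_length * length_compliance(text, target_words)
    )
    return round(100 * combined)


def needs_review(score: int, threshold: int = 70) -> bool:
    """Flag outputs that score below the threshold."""
    return score < threshold
```

Keeping each signal as its own function makes it easier to log the components alongside the final score, which helps when tuning the threshold.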
5. Implement review gates
Route content at or above the score threshold to publication. Route flagged content to a review queue. Use a simple status enum: generated → reviewed → approved → published.
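The status flow and the routing rule could be sketched as follows. The transition table is an assumption: it lets auto-approved content move from generated straight to approved, skipping reviewed. Treating a score equal to the threshold as passing, and the threshold value of 70, are assumptions as well.

```python
from enum import Enum


class Status(Enum):
    GENERATED = "generated"
    REVIEWED = "reviewed"
    APPROVED = "approved"
    PUBLISHED = "published"


# Assumed transitions: auto-approved content skips the REVIEWED state.
ALLOWED = {
    Status.GENERATED: {Status.REVIEWED, Status.APPROVED},
    Status.REVIEWED: {Status.APPROVED},
    Status.APPROVED: {Status.PUBLISHED},
    Status.PUBLISHED: set(),
}


def advance(current: Status, target: Status) -> Status:
    """Return the new status, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


def route(score: int, threshold: int = 70) -> str:
    """Send passing content to publication and the rest to human review."""
    return "publish" if score >= threshold else "review_queue"
```

An explicit transition table makes invalid jumps, such as publishing content that was never approved, fail loudly instead of silently.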
6. Persist and serve
Store completed content in a database or CMS with metadata (source request ID, cluster, score, timestamps).
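As a minimal illustration of the persistence step, the sketch below uses SQLite from the standard library. The table layout and the function names are assumptions; a production pipeline might write to a CMS or another database instead.

```python
import sqlite3
from datetime import datetime, timezone

# Assumed schema covering the metadata listed in this step.
SCHEMA = """
CREATE TABLE IF NOT EXISTS content (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    request_id TEXT NOT NULL,
    cluster INTEGER,
    score INTEGER,
    status TEXT NOT NULL,
    body TEXT NOT NULL,
    created_at TEXT NOT NULL
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the table if it does not exist."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_content(conn, request_id, cluster, score, status, body) -> int:
    """Insert one content row and return its generated id."""
    created_at = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT INTO content "
            "(request_id, cluster, score, status, body, created_at) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (request_id, cluster, score, status, body, created_at),
        )
    return cursor.lastrowid


def load_content(conn, content_id: int):
    """Return the stored row as a tuple, or None if the id is unknown."""
    return conn.execute(
        "SELECT request_id, cluster, score, status, body, created_at "
        "FROM content WHERE id = ?",
        (content_id,),
    ).fetchone()
```

Parameterized queries (the ? placeholders) keep generated text from being interpreted as SQL.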
7. Record run evidence
Save the exact command, runtime or package version, model name if applicable, and observed output so the result can be reproduced later.
Verification
Run the pipeline with a test request:
python -m pipeline.run --input '{"type":"article","topic":"renewable energy","word_count":500}'
Expected output:
[cluster:2] Generating article on 'renewable energy'
Quality score: 84/100 — auto-approved
Status: approved -> published
Run quality gate tests:
pytest tests/test_pipeline.py -k quality
Expected: 3 passed, 0 failed
Common failures
- LLM rate limits: Implement exponential backoff and request queuing to avoid throttling.
- Low-quality clusters: Embedding model choice matters. Switch to a domain-specific embedding model if clustering results are noisy.
- Review gate bottleneck: If human reviewers fall behind, temporarily lower the quality threshold so that more content is auto-approved, or add more reviewers.
- Template injection: Always validate and escape user-provided fields before injecting into prompts to prevent prompt injection attacks.
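The exponential backoff mentioned for rate limits can be sketched as a small retry wrapper. RateLimitError is a placeholder for whatever exception the chosen LLM client raises when throttled, and the retry count, delays, and jitter range are assumed values.

```python
import random
import time


class RateLimitError(Exception):
    """Placeholder for the exception an LLM client raises when throttled."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponentially growing waits."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted; surface the error to the caller
            # Double the delay on each attempt, capped at max_delay
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter so parallel workers do not retry in lockstep
            sleep(delay + random.uniform(0, delay / 10))
```

The sleep parameter exists so tests can pass a stub and run instantly; in normal use it can be left at its default.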
Related guides
- How to implement AI-powered data extraction from PDFs — Structured output handling complements content generation for multi-format pipelines.
- How to set up agent scheduling with cron and triggers — Scheduling nightly batch runs keeps the pipeline continuously fed with fresh content.