RUNLOCALAIv38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.coIndependently operated

How to build an AI content generation pipeline

advanced·35 min·By Fredoline Eruo
Target environment
Ubuntu 24.04 · Ollama 0.4.x
PREREQUISITES

LLM endpoint, content templates, Python

What this does

A content generation pipeline transforms structured inputs (keywords, outlines, product data) into polished articles, social posts, or product descriptions. This guide covers the orchestration around the model: topic clustering, LLM generation, quality scoring, and human review gates. The pipeline is built with Python and an LLM API (OpenAI, Anthropic, or self-hosted).

Steps

1. Define input schemas. Create Pydantic models for each content type. Articles need a title, target word count, tone, and key points. Social posts need a platform, character limit, and hashtags. Product descriptions need a SKU, features, and target audience.
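
The schemas described in this step could look like the following. This is a minimal sketch, not the guide's actual models: the class names, field names, the `type` discriminator, the `topic` field (added because the clustering step reads `r["topic"]`), and the default tone are all assumptions made for illustration.

```python
from typing import List, Literal

from pydantic import BaseModel, Field, ValidationError


class ArticleRequest(BaseModel):
    # Hypothetical field names; adjust to your own content templates.
    type: Literal["article"] = "article"
    title: str
    topic: str                      # assumed: used later for clustering
    word_count: int = Field(gt=0)   # target word count, must be positive
    tone: str = "neutral"           # assumed default
    key_points: List[str] = Field(default_factory=list)


class SocialPostRequest(BaseModel):
    type: Literal["social_post"] = "social_post"
    platform: str
    char_limit: int = Field(gt=0)
    hashtags: List[str] = Field(default_factory=list)


class ProductDescriptionRequest(BaseModel):
    type: Literal["product_description"] = "product_description"
    sku: str
    features: List[str]
    audience: str


# Valid input is accepted; a non-positive word count raises ValidationError.
article = ArticleRequest(title="Renewable energy basics",
                         topic="renewable energy", word_count=500)
```

Validating at the boundary means later stages can rely on every request having the fields its template expects.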

2. Implement topic clustering. Group incoming content requests by theme using embeddings. Store requests in a queue (Redis or an in-memory list) and cluster them before generation so that similar requests are batched together, which reduces API cost.

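import numpy as np
from sklearn.cluster import AgglomerativeClustering

def cluster_topics(requests, embedding_model):
    """Return one cluster label per request, grouped by topic similarity."""
    # Agglomerative clustering needs at least two samples.
    if len(requests) < 2:
        return np.zeros(len(requests), dtype=int)
    embeddings = np.array([embedding_model.encode(r["topic"]) for r in requests])
    # n_clusters=None lets distance_threshold decide how many clusters form.
    # The default is Ward linkage on Euclidean distance, so the right
    # threshold depends on the scale of the embeddings.
    clusterer = AgglomerativeClustering(n_clusters=None, distance_threshold=0.7)
    return clusterer.fit_predict(embeddings)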
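
Once labels are available, the requests still have to be grouped into batches. The sketch below shows one way to do that, assuming `labels` has the same order and length as `requests`. The helper name `group_by_cluster` is hypothetical, and the labels in the example are made up for illustration rather than real clustering output.

```python
from collections import defaultdict


def group_by_cluster(requests, labels):
    """Group requests into batches keyed by cluster label."""
    if len(requests) != len(labels):
        raise ValueError("requests and labels must have the same length")
    batches = defaultdict(list)
    for request, label in zip(requests, labels):
        # int() converts NumPy integer labels into plain dict keys.
        batches[int(label)].append(request)
    return dict(batches)


requests = [
    {"topic": "solar panels"},
    {"topic": "wind turbines"},
    {"topic": "sourdough baking"},
]
labels = [0, 0, 1]  # illustrative labels only
batches = group_by_cluster(requests, labels)
```

Each batch can then be sent through generation together, for example with shared context in the prompt.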

3. Build the generation engine. Route each request to the prompt template for its content type, inject the structured fields into the template, and call the LLM. Store each result with a status of pending_review.
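
A generation engine along these lines can be sketched with the standard library alone. The template text, the `TEMPLATES` registry, and the function names are assumptions, and `call_llm` is a placeholder for whichever client you use (a hosted API or a self-hosted endpoint); no specific provider API is implied.

```python
from string import Template
from typing import Callable, Dict

# Hypothetical prompt templates, keyed by content type.
TEMPLATES: Dict[str, Template] = {
    "article": Template(
        "Write a $tone article of about $word_count words titled '$title'. "
        "Cover these points: $key_points"
    ),
    "social_post": Template(
        "Write a $platform post under $char_limit characters. "
        "Use these hashtags: $hashtags"
    ),
}


def render_prompt(request: dict) -> str:
    """Pick the template for the request type and fill in its fields."""
    template = TEMPLATES.get(request["type"])
    if template is None:
        raise ValueError(f"no template for content type {request['type']!r}")
    # Lists are joined so they read naturally inside the prompt.
    fields = {k: ", ".join(v) if isinstance(v, list) else v
              for k, v in request.items()}
    return template.substitute(fields)  # raises KeyError on a missing field


def generate(request: dict, call_llm: Callable[[str], str]) -> dict:
    """Render the prompt, call the model, and wrap the result for review."""
    prompt = render_prompt(request)
    return {
        "request": request,
        "prompt": prompt,
        "text": call_llm(prompt),
        "status": "pending_review",
    }
```

Passing `call_llm` in as an argument keeps the engine testable: a unit test can supply a stub function instead of a real model call.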

4. Add quality scoring. Score each output on coherence (LLM-judged), keyword coverage, and length compliance. Combine these into a single 0–100 score and flag outputs below the threshold for human review.
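
One way to combine the three signals is a weighted sum. The sketch below assumes the coherence value has already been obtained from an LLM judge and normalized to the range 0 to 1. The weights, the substring-based keyword match, and the linear length penalty are illustrative choices, not values from this guide.

```python
from typing import Sequence, Tuple


def keyword_coverage(text: str, keywords: Sequence[str]) -> float:
    """Fraction of keywords found in the text (case-insensitive substring match)."""
    if not keywords:
        return 1.0
    lowered = text.lower()
    hits = sum(1 for keyword in keywords if keyword.lower() in lowered)
    return hits / len(keywords)


def length_compliance(text: str, target_words: int) -> float:
    """1.0 at the target length, falling linearly to 0.0 as deviation grows."""
    if target_words <= 0:
        raise ValueError("target_words must be positive")
    actual = len(text.split())
    deviation = abs(actual - target_words) / target_words
    return max(0.0, 1.0 - deviation)


def quality_score(text: str, keywords: Sequence[str], target_words: int,
                  coherence: float,
                  weights: Tuple[float, float, float] = (0.5, 0.3, 0.2)) -> int:
    """Combine coherence, keyword coverage, and length into a 0-100 score."""
    if not 0.0 <= coherence <= 1.0:
        raise ValueError("coherence must be between 0 and 1")
    w_coherence, w_keywords, w_length = weights  # assumed weights, sum to 1
    combined = (w_coherence * coherence
                + w_keywords * keyword_coverage(text, keywords)
                + w_length * length_compliance(text, target_words))
    return round(100 * combined)
```

Keeping the deterministic checks separate from the LLM-judged part makes it easy to test them without a model call.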

5. Implement review gates. Route content above the score threshold to publication and flagged content to a review queue. Track progress with a simple status enum: generated → reviewed → approved → published.
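
The gate and the status enum can be sketched as a small state machine. The four states come from this step. Several things are assumptions: treating step 3's pending_review label as equivalent to generated, the threshold of 80, routing a score exactly at the threshold to publication, and letting auto-approved content skip the reviewed state.

```python
from enum import Enum


class Status(str, Enum):
    GENERATED = "generated"
    REVIEWED = "reviewed"
    APPROVED = "approved"
    PUBLISHED = "published"


# Allowed transitions. Auto-approved content is assumed to skip REVIEWED.
ALLOWED = {
    Status.GENERATED: {Status.REVIEWED, Status.APPROVED},
    Status.REVIEWED: {Status.APPROVED},
    Status.APPROVED: {Status.PUBLISHED},
    Status.PUBLISHED: set(),
}


def advance(current: Status, target: Status) -> Status:
    """Return the new status, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


def route(score: int, threshold: int = 80) -> str:
    """Send content to 'publish' or 'review' (threshold is a hypothetical default)."""
    return "publish" if score >= threshold else "review"
```

Rejecting illegal transitions in one place prevents content from reaching published without passing through approved.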

6. Persist and serve. Store completed content in a database or CMS along with its metadata (source request ID, cluster, score, timestamps).
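
As a minimal sketch of the persistence step, the example below uses SQLite from the standard library. The table name, column names, and helper functions are assumptions; a production setup might use a different database or a CMS API instead.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema covering the metadata listed in this step.
SCHEMA = """
CREATE TABLE IF NOT EXISTS content (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    request_id TEXT NOT NULL,
    cluster INTEGER,
    score INTEGER,
    status TEXT NOT NULL,
    body TEXT NOT NULL,
    created_at TEXT NOT NULL
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the table if it does not exist."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_content(conn, request_id, cluster, score, status, body) -> int:
    """Insert one content row and return its primary key."""
    created_at = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT INTO content "
            "(request_id, cluster, score, status, body, created_at) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (request_id, cluster, score, status, body, created_at),
        )
    return cursor.lastrowid
```

Parameterized queries (the `?` placeholders) keep generated text from being interpreted as SQL.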

  • Record the local run evidence. Save the exact command, runtime or package version, model name if applicable, and observed output so the result can be reproduced later.

Verification

Run the pipeline with a test request:

python -m pipeline.run --input '{"type":"article","topic":"renewable energy","word_count":500}'

Expected output:

[cluster:2] Generating article on 'renewable energy'
Quality score: 84/100 — auto-approved
Status: approved -> published
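
The command above passes the request as a JSON string. How `pipeline.run` parses it is not shown in this guide; one plausible sketch, assuming `argparse` and a single `--input` flag, is:

```python
import argparse
import json


def parse_args(argv=None):
    """Parse the --input flag into a Python dict."""
    parser = argparse.ArgumentParser(prog="pipeline.run")
    # type=json.loads makes argparse report malformed JSON as a usage error.
    parser.add_argument("--input", required=True, type=json.loads,
                        help="content request as a JSON object")
    return parser.parse_args(argv)


args = parse_args(
    ["--input", '{"type":"article","topic":"renewable energy","word_count":500}']
)
request = args.input
```

The parsed dict can then be validated against the input schema before it enters the queue.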

Run quality gate tests:

pytest tests/test_pipeline.py -k quality

Expected: 3 passed, 0 failed

Common failures

  • LLM rate limits: Implement exponential backoff and request queuing to avoid throttling.
  • Low-quality clusters: Embedding model choice matters. Switch to a domain-specific embedding model if clustering results are noisy.
  • Review gate bottleneck: If human reviewers fall behind, temporarily lower the quality threshold so that more content is auto-approved, or add more reviewers.
  • Template injection: Always validate and escape user-provided fields before injecting them into prompts to reduce the risk of prompt injection attacks.
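
Two of the mitigations above can be sketched briefly: exponential backoff for rate limits and basic field cleaning before prompt templating. Both helpers are hypothetical, as are the delay values, the retry count, and the length cap. The cleaning function only strips control characters and caps length, which reduces but does not eliminate prompt injection risk.

```python
import random
import time


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func(), retrying with exponentially growing delays on failure."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))


def sanitize_field(value, max_len=500):
    """Replace non-printable characters, collapse whitespace, cap the length."""
    cleaned = "".join(ch if ch.isprintable() else " " for ch in str(value))
    cleaned = " ".join(cleaned.split())
    return cleaned[:max_len]
```

In practice, `retry_on` should be narrowed to the rate-limit exception raised by your LLM client, so that unrelated errors fail fast.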

Related guides

  • How to implement AI-powered data extraction from PDFs — Structured output handling complements content generation for multi-format pipelines.
  • How to set up agent scheduling with cron and triggers — Scheduling nightly batch runs keeps the pipeline continuously fed with fresh content.