RUNLOCALAI · v38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
Document Processing with Local AI

08. Document Summarization

Chapter 8 of 18 · 25 min
KEY INSIGHT

Extractive summarization (selecting important sentences) works without LLMs and runs fast; abstractive summarization (generating new text) requires LLMs but produces more coherent output.

### Two Summarization Approaches

Extractive summarization selects existing sentences from the document. No language generation is required, so it is faster and more reliable (it cannot invent content), but the output may read as choppy.

Abstractive summarization generates new text that paraphrases the content. It reads more coherently, but it requires an LLM and may introduce hallucinations.

### Extractive Summarization with TF-IDF

Extract the most important sentences using TF-IDF scoring:

```python
import re

import fitz  # PyMuPDF
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer


def extractive_summarize(text, num_sentences=5):
    # Split into sentences at whitespace that follows ., ! or ?
    sentences = re.split(r'(?<=[.!?])\s+', text)
    sentences = [s for s in sentences if len(s) > 20]  # Filter short sentences

    # Nothing to select if the document is already short enough
    if len(sentences) <= num_sentences:
        return text

    # TF-IDF scoring: each sentence is treated as its own "document"
    vectorizer = TfidfVectorizer(stop_words='english')
    tfidf_matrix = vectorizer.fit_transform(sentences)

    # Score each sentence by the sum of its TF-IDF values
    # (longer sentences tend to score higher under this scheme)
    sentence_scores = np.array(tfidf_matrix.sum(axis=1)).flatten()

    # Take the top-scoring sentences, then restore document order
    top_indices = sentence_scores.argsort()[-num_sentences:]
    top_indices.sort()  # Sort by position in document

    summary = ' '.join(sentences[i] for i in top_indices)
    return summary


# Usage
doc = fitz.open("document.pdf")
text = doc[0].get_text()  # First page only
doc.close()

summary = extractive_summarize(text, num_sentences=5)
print(summary)
```

### LexRank for Better Extraction

LexRank uses graph-based ranking similar to Google's PageRank.
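To make the PageRank analogy concrete, here is a simplified sketch of graph-based sentence ranking. It is not the `sumy` implementation: it assumes raw word counts rather than the IDF-weighted vectors LexRank uses, and the names `sentence_vectors`, `cosine`, and `graph_rank`, along with the `damping` and `iterations` defaults, are illustrative choices.

```python
import math
import re
from collections import Counter


def sentence_vectors(sentences):
    """Turn each sentence into a bag-of-words Counter of lowercased words."""
    return [Counter(re.findall(r"[a-z0-9']+", s.lower())) for s in sentences]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def graph_rank(sentences, damping=0.85, iterations=50):
    """Score sentences with PageRank-style power iteration over a similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    vectors = sentence_vectors(sentences)
    # Edge weight = similarity between two different sentences (no self-loops).
    weights = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    row_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for j in range(n):
            # Sentence i passes its score to j in proportion to their similarity.
            incoming = sum(
                scores[i] * weights[i][j] / row_sums[i]
                for i in range(n)
                if row_sums[i] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores
```

Sentences that resemble many other sentences collect score from their neighbours, while a sentence unrelated to the rest keeps only the baseline `(1 - damping) / n`.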
LexRank often produces more coherent summaries than summed TF-IDF scores. Install the `sumy` library:

```bash
pip install sumy
```

```python
from sumy.nlp.stemmers import Stemmer
from sumy.nlp.tokenizers import Tokenizer
from sumy.parsers.plaintext import PlaintextParser
from sumy.summarizers.lex_rank import LexRankSummarizer
from sumy.utils import get_stop_words


def lexrank_summarize(text, num_sentences=5):
    # sumy tokenizes with NLTK; if the tokenizer data is missing, the error
    # message explains how to download it
    parser = PlaintextParser.from_string(text, Tokenizer("english"))
    stemmer = Stemmer("english")

    summarizer = LexRankSummarizer(stemmer)
    summarizer.stop_words = get_stop_words("english")

    summary = summarizer(parser.document, sentences_count=num_sentences)
    return ' '.join(str(sentence) for sentence in summary)


# `text` is the document text extracted in the previous example
summary = lexrank_summarize(text)
print(summary)
```

### Abstractive Summarization with Local LLMs

For coherent, human-readable summaries, use a local LLM:

```bash
pip install llama-cpp-python
```

```python
import fitz  # PyMuPDF
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-7b-chat.gguf",
    n_ctx=4096,   # Context window in tokens
    n_threads=4
)


def summarize_llm(text, max_tokens=200):
    # Truncate to 4,000 characters (not tokens) so the prompt fits in n_ctx
    prompt = f"""Summarize the following document in 3-5 sentences:

{text[:4000]}

Summary:"""

    response = llm(prompt, max_tokens=max_tokens, temperature=0.3)
    return response['choices'][0]['text'].strip()


doc = fitz.open("document.pdf")
text = " ".join(page.get_text() for page in doc)
doc.close()

summary = summarize_llm(text)
print(summary)
```

A low temperature such as 0.3 makes the output more deterministic and keeps the summary closer to the source text, although it does not eliminate hallucination. Higher temperatures produce more creative but less reliable summaries.

### Hybrid Approach: Extract + Abstract

Combining extractive and abstractive summarization often gives a good balance of speed and readability:

```python
def hybrid_summarize(text):
    # First extract key sentences
    extracted = extractive_summarize(text, num_sentences=10)

    # Then abstract with LLM
    summary = summarize_llm(extracted)
    return summary
```

This approach reduces the input length for the LLM (faster, and cheaper in compute) while preserving key information.
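One detail the hybrid approach leaves open is how many sentences to extract: `summarize_llm` truncates its input at 4,000 characters, so ten long sentences could be cut off silently. A minimal sketch of sizing the extract to a character budget follows. The helper `select_within_budget` and its parameters are hypothetical, and it assumes you already have one score per sentence (for example the summed TF-IDF values).

```python
def select_within_budget(sentences, scores, max_chars=4000):
    """Pick the highest-scoring sentences that fit in max_chars, in document order."""
    # Visit sentences from highest to lowest score.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen, used = [], 0
    for i in ranked:
        cost = len(sentences[i]) + 1  # +1 for the joining space
        if used + cost > max_chars:
            continue  # Too long for the remaining budget; a shorter one may still fit
        chosen.append(i)
        used += cost
    # Restore original document order so the extract reads naturally.
    return " ".join(sentences[i] for i in sorted(chosen))
```

Passing the result to the LLM instead of a fixed ten-sentence extract keeps the whole extract inside the prompt, at the cost of a variable number of sentences.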
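### Handling Long Documents

Documents longer than the LLM's context window require chunking:

```python
def chunk_summarize(text, chunk_size=2000, overlap=200):
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")

    chunks = []
    start = 0
    while start < len(text):
        end = start + chunk_size
        chunks.append(text[start:end])
        if end >= len(text):
            break  # Last chunk reached; avoid an extra overlap-only chunk
        start = end - overlap  # Overlap for continuity

    # Summarize each chunk
    chunk_summaries = [summarize_llm(chunk) for chunk in chunks]

    # Final summary of summaries
    combined = " ".join(chunk_summaries)
    return summarize_llm(combined)


long_text = "..."  # Your full document
summary = chunk_summarize(long_text)
```

Overlap helps preserve context across chunk boundaries: text cut off at the end of one chunk is repeated at the start of the next. Note that if the combined chunk summaries exceed the 4,000-character limit in `summarize_llm`, the final pass truncates them.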
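For very long documents, the combined chunk summaries can themselves be too long for one final pass. One possible way to handle this, sketched below, is to repeat the chunk-and-summarize step until the text fits. The names `split_chunks` and `reduce_summaries`, the `max_chars` and `max_rounds` parameters, and the idea of passing the summarizer in as a function are assumptions made for illustration; in practice `summarize` would be something like `summarize_llm`.

```python
def split_chunks(text, chunk_size=2000, overlap=200):
    """Split text into fixed-size character chunks sharing `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # This chunk already reaches the end of the text
    return chunks


def reduce_summaries(text, summarize, max_chars=4000,
                     chunk_size=2000, overlap=200, max_rounds=5):
    """Summarize chunk by chunk, repeating until the text fits in max_chars."""
    for _ in range(max_rounds):
        if len(text) <= max_chars:
            break
        # Replace the text with the joined summaries of its chunks.
        text = " ".join(
            summarize(chunk) for chunk in split_chunks(text, chunk_size, overlap)
        )
    # Final pass over text that is now short enough (or after max_rounds).
    return summarize(text)
```

Each round loses some detail, so `max_rounds` caps the number of passes; if the text is still too long after that, the final call falls back on whatever truncation the summarizer applies.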

EXERCISE

Take a document of 10 or more pages (research paper, report, article). Generate summaries using: (1) TF-IDF extractive, (2) LexRank extractive, (3) local LLM abstractive, (4) the hybrid approach. Evaluate each on coherence (does it read naturally?), coverage (does it capture the main points?), and length (is it short enough to skim?). Identify which approach works best for your use case.
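Coverage and length can be roughly quantified to support the comparison; coherence still needs a human read. The sketch below is one crude way to do it, not a standard metric: `compression_ratio` and `keyword_coverage` are hypothetical helpers, and treating the document's most frequent words of more than three letters as its "main points" is a simplifying assumption.

```python
import re
from collections import Counter


def words(text):
    """Lowercased word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def compression_ratio(document, summary):
    """Summary length as a fraction of document length, measured in words."""
    doc_len = len(words(document))
    return len(words(summary)) / doc_len if doc_len else 0.0


def keyword_coverage(document, summary, top_k=20):
    """Fraction of the document's top_k most frequent words found in the summary."""
    # Ignore very short words as a cheap stand-in for stop-word removal.
    counts = Counter(w for w in words(document) if len(w) > 3)
    keywords = [w for w, _ in counts.most_common(top_k)]
    if not keywords:
        return 0.0
    summary_words = set(words(summary))
    return sum(w in summary_words for w in keywords) / len(keywords)
```

Running both functions on the four summaries gives comparable numbers: a lower compression ratio means a shorter summary, and a higher keyword coverage suggests more of the frequent terms survived.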
