RUNLOCALAI · v38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo

© 2026 runlocalai.co · Independently operated
Document Processing with Local AI

06. Image Preprocessing

Chapter 6 of 18 · 25 min
KEY INSIGHT

OCR accuracy depends more on preprocessing quality than on the OCR engine itself. Invest time in proper image preparation and you can often get by with a less capable (faster) OCR model.

### Why Preprocessing Matters

Raw scanned documents contain noise, skew, uneven lighting, and artifacts. Preprocessing transforms these into clean inputs that OCR engines can handle reliably. The improvement from good preprocessing typically exceeds the improvement from switching OCR engines.

### Binarization: Converting to Black and White

Binarization converts grayscale or color images to pure black and white. This strips background shading and standardizes the input for OCR.

```python
import cv2
import numpy as np

def binarize(image_path, method='otsu'):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if img is None:
        # cv2.imread returns None instead of raising on a bad path
        raise FileNotFoundError(f"Could not read image: {image_path}")

    if method == 'otsu':
        # Otsu's method picks a single global threshold automatically
        _, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    elif method == 'adaptive':
        # Adaptive threshold handles uneven lighting: each pixel is compared
        # with the Gaussian-weighted mean of its 11x11 neighborhood, minus 2
        binary = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                       cv2.THRESH_BINARY, 11, 2)
    else:
        raise ValueError(f"Unknown method: {method!r}")

    cv2.imwrite('binary.png', binary)
    return binary
```

Adaptive thresholding excels on documents with uneven lighting, which is common in photographs of documents.

### Deskewing: Correcting Rotation

Skewed documents reduce OCR accuracy.
Detect and correct skew:

```python
import cv2
import numpy as np

def deskew(image_path):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if img is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")

    # Apply Gaussian blur to reduce noise
    blurred = cv2.GaussianBlur(img, (5, 5), 0)
    # Edge detection
    edges = cv2.Canny(blurred, 50, 150, apertureSize=3)
    # Hough line transform to find lines
    lines = cv2.HoughLines(edges, 1, np.pi / 180, 200)
    if lines is None:
        # No lines found: return the grayscale image unchanged
        return img

    # theta is the angle of each line's normal, so a horizontal
    # text line has theta = 90 degrees and a skew angle of 0
    angles = []
    for line in lines[:20]:  # Use at most the first 20 lines
        rho, theta = line[0]
        angle = np.degrees(theta) - 90
        angles.append(angle)
    # The median is less sensitive to stray lines than the mean
    median_angle = np.median(angles)

    # Rotate image around its center
    h, w = img.shape
    center = (w // 2, h // 2)
    rotation_matrix = cv2.getRotationMatrix2D(center, median_angle, 1.0)
    rotated = cv2.warpAffine(img, rotation_matrix, (w, h),
                             flags=cv2.INTER_CUBIC,
                             borderMode=cv2.BORDER_REPLICATE)

    cv2.imwrite('deskewed.png', rotated)
    return rotated
```

Typical deskew angles range from -5° to +5°. Angles outside this range often indicate scanning errors rather than document skew.

### Noise Reduction

Scanner noise and compression artifacts can be misread as text.
Apply morphological operations:

```python
import cv2
import numpy as np

def denoise(image_path):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Morphological opening (remove small white noise)
    kernel = np.ones((2, 2), np.uint8)
    opened = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)
    # Morphological closing (remove small black noise)
    closed = cv2.morphologyEx(opened, cv2.MORPH_CLOSE, kernel)

    cv2.imwrite('denoised.png', closed)
    return closed
```

For heavy noise, apply bilateral filtering, which preserves edges while smoothing:

```python
# Arguments: neighborhood diameter 9, sigmaColor 75, sigmaSpace 75
denoised = cv2.bilateralFilter(img, 9, 75, 75)
```

### Contrast Enhancement

Enhance text visibility through histogram equalization:

```python
import cv2

def enhance_contrast(image_path):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # CLAHE (Contrast Limited Adaptive Histogram Equalization)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(img)

    cv2.imwrite('enhanced.png', enhanced)
    return enhanced
```

CLAHE handles documents with both bright and dark regions better than global histogram equalization.

### Complete Preprocessing Pipeline

```python
import cv2
import numpy as np

def preprocess_document(input_path, output_path):
    # Load image
    img = cv2.imread(input_path)
    if img is None:
        raise FileNotFoundError(f"Could not read image: {input_path}")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Step 1: Denoise
    denoised = cv2.bilateralFilter(gray, 9, 75, 75)

    # Step 2: Binarize with adaptive threshold
    binary = cv2.adaptiveThreshold(denoised, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                   cv2.THRESH_BINARY, 11, 2)

    # Step 3: Morphological cleanup
    # (2x2 kernel; a 1x1 kernel would leave the image unchanged)
    kernel = np.ones((2, 2), np.uint8)
    cleaned = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)

    # Step 4: Deskew
    # ... (deskew function from above) ...
    # deskew_from_image is not defined in this chapter; it stands for a
    # variant of deskew() that takes an image array instead of a file path.
    deskewed = deskew_from_image(cleaned)

    cv2.imwrite(output_path, deskewed)
    return output_path
```

EXERCISE

Take a poorly scanned document (a phone photo of a printed receipt works well). Create a preprocessing script that applies each transformation incrementally, running OCR after each step. Measure character accuracy at each stage. Identify which transformation has the greatest impact on your specific document type. This tells you which preprocessing steps to prioritize in production.
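
The exercise does not say how to measure character accuracy. One common approach is to compare the OCR output against a hand-typed ground truth using edit distance. The sketch below assumes that approach. The helper names `levenshtein` and `char_accuracy` are hypothetical, and the score is defined here as 1 minus the edit distance divided by the ground-truth length.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def char_accuracy(ocr_text: str, truth: str) -> float:
    """Score in [0, 1]: 1 minus edit distance over ground-truth length."""
    if not truth:
        return 1.0 if not ocr_text else 0.0
    distance = levenshtein(ocr_text, truth)
    return max(0.0, 1.0 - distance / len(truth))


# Two misread characters out of eleven
score = char_accuracy("T0TAL 12.5O", "TOTAL 12.50")
print(f"{score:.3f}")  # prints 0.818
```

Run the same comparison after each preprocessing step and log the score, so the step with the largest gain for your document type stands out.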
