Clinical Note Processing — Healthcare AI with Local Models (Chapter 5)

Clinical note documentation is a high-volume, repetitive task where local AI delivers immediate value. Physicians spend significant time documenting patient encounters, so even modest productivity improvements in documentation translate to substantial time savings across an organization.

The key distinction is between structuring existing notes and generating new content. Structuring means extracting key information from free-text notes: diagnoses, medications, allergies, and care plans. Generation means drafting new content from encounter data. Both benefit from local deployment: no PHI leaves the premises, and the model can be customized without vendor constraints.

# clinical_note_processor.py
from dataclasses import dataclass
from typing import Optional, List
from datetime import datetime
import json

@dataclass
class ExtractedClinicalData:
    diagnoses: List[str]
    medications: List[dict]  # name, dosage, frequency
    allergies: List[str]
    procedures: List[str]
    vitals: Optional[dict] = None
    follow_up: Optional[str] = None
    confidence_scores: Optional[dict] = None

class ClinicalNoteProcessor:
    """Extract structured data from clinical notes using local LLM."""
    
    EXTRACTION_PROMPT = """Extract structured clinical information from this note.
    Return valid JSON with these fields:
    - diagnoses: list of identified diagnoses
    - medications: list of objects with name, dosage, frequency
    - allergies: list of identified allergies
    - procedures: list of mentioned procedures
    - vitals: object with any vital signs found
    - follow_up: any follow-up instructions
    
    If a field is not mentioned, use an empty list or null.
    Only extract explicit information; do not infer or guess.
    
    Clinical Note:
    {note_text}
    
    JSON Output:"""
    
    def __init__(self, ollama_client):
        self.ollama = ollama_client
        
    def extract_structured_data(self, note_text: str) -> ExtractedClinicalData:
        """Parse unstructured clinical note into structured format."""
        prompt = self.EXTRACTION_PROMPT.format(note_text=note_text)
        
        response = self.ollama.generate(prompt)
        
        try:
            data = json.loads(response)
            return ExtractedClinicalData(
                diagnoses=data.get("diagnoses", []),
                medications=data.get("medications", []),
                allergies=data.get("allergies", []),
                procedures=data.get("procedures", []),
                vitals=data.get("vitals"),
                follow_up=data.get("follow_up")
            )
        except (json.JSONDecodeError, AttributeError):
            # Invalid JSON, or valid JSON that is not an object (e.g. a bare list):
            # retry once with more explicit formatting instructions
            return self._fallback_extraction(note_text)
    
    def _fallback_extraction(self, note_text: str) -> ExtractedClinicalData:
        """Handle LLM output that isn't valid JSON."""
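        # Retry with stricter formatting requirements and the same field names.
        # The note is truncated to 2000 characters to keep the retry prompt short.
        retry_prompt = f"""Parse this clinical note into strict JSON format.
        Use exactly these keys: diagnoses, medications, allergies, procedures, vitals, follow_up.
        The output must be parseable by json.loads().
        Escape any special characters.
        
        Note: {note_text[:2000]}"""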
        
        response = self.ollama.generate(retry_prompt)
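        try:
            data = json.loads(response)
        except json.JSONDecodeError:
            data = None
        if not isinstance(data, dict):
            # Both attempts failed: return an empty record rather than guessing
            data = {}
        # Always return the dataclass, never the raw dict
        return ExtractedClinicalData(
            diagnoses=data.get("diagnoses", []),
            medications=data.get("medications", []),
            allergies=data.get("allergies", []),
            procedures=data.get("procedures", []),
            vitals=data.get("vitals"),
            follow_up=data.get("follow_up")
        )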
    
    def batch_process(self, notes: List[str], 
                      max_concurrent: int = 4) -> List[ExtractedClinicalData]:
        """Process multiple notes concurrently."""
        from concurrent.futures import ThreadPoolExecutor
        
        results = []
        with ThreadPoolExecutor(max_workers=max_concurrent) as executor:
            futures = [
                executor.submit(self.extract_structured_data, note)
                for note in notes
            ]
            for future in futures:
                results.append(future.result())
        return results
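
The processor only assumes that `ollama_client` exposes a `generate(prompt)` method returning the model's text. One way such a client might look is sketched below, as a thin wrapper over Ollama's local HTTP endpoint `/api/generate`. The class name `OllamaHTTPClient`, the helper `build_payload`, the model name `llama3`, and the timeout are illustrative assumptions, not part of the listing above.

```python
import json
import urllib.request
from typing import Any, Dict


class OllamaHTTPClient:
    """Hypothetical adapter providing generate(prompt) -> str."""

    def __init__(self, model: str = "llama3",          # assumed model name
                 base_url: str = "http://localhost:11434",
                 timeout: float = 120.0):
        self.model = model
        self.base_url = base_url.rstrip("/")
        self.timeout = timeout

    def build_payload(self, prompt: str) -> Dict[str, Any]:
        """Request body for a single, non-streamed completion."""
        return {
            "model": self.model,
            "prompt": prompt,
            "stream": False,                 # one JSON reply instead of chunks
            "format": "json",                # ask the server for JSON output
            "options": {"temperature": 0},   # reduce run-to-run variation
        }

    def generate(self, prompt: str) -> str:
        """POST the prompt to the local server and return the generated text."""
        body = json.dumps(self.build_payload(prompt)).encode("utf-8")
        request = urllib.request.Request(
            f"{self.base_url}/api/generate",
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=self.timeout) as reply:
            return json.loads(reply.read().decode("utf-8"))["response"]


# Building the payload needs no running server; generate() does.
client = OllamaHTTPClient()
payload = client.build_payload("Extract the diagnoses.")
```

With a local Ollama server running, `ClinicalNoteProcessor(OllamaHTTPClient())` would then send every request to localhost only. Requesting JSON output makes a parse failure less likely but does not rule it out, so the fallback path remains useful.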

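The extraction prompt's instruction to extract only explicit information targets a common failure mode: hallucination in structured data extraction. When the note contains no explicit information for a field, the model may still generate plausible but incorrect data. Confidence scores can help flag such output, but the processor above leaves `confidence_scores` unpopulated, and any scoring scheme must be validated against a curated dataset before production use.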
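
Two simple checks can make this concrete. The sketch below is illustrative only: the function names `ungrounded_items` and `precision_recall`, the sample note, and the gold label are all hypothetical. The first check flags extracted strings that do not appear verbatim in the note; the second scores one field against a hand-labeled reference.

```python
from typing import Dict, List, Set


def ungrounded_items(items: List[str], note_text: str) -> List[str]:
    """Return extracted strings that do not appear verbatim in the note.

    Case-insensitive substring match only: it misses paraphrases and
    abbreviations, so a hit means "needs review", not "wrong".
    """
    haystack = note_text.lower()
    return [item for item in items if item.strip().lower() not in haystack]


def precision_recall(predicted: List[str], gold: List[str]) -> Dict[str, float]:
    """Set-based precision and recall for one field of one note."""
    pred: Set[str] = {p.strip().lower() for p in predicted}
    ref: Set[str] = {g.strip().lower() for g in gold}
    true_pos = len(pred & ref)
    # Convention assumed here: an empty set scores 1.0 rather than dividing by zero
    precision = true_pos / len(pred) if pred else 1.0
    recall = true_pos / len(ref) if ref else 1.0
    return {"precision": precision, "recall": recall}


# Made-up example: the second diagnosis is not in the note
note = "Pt with type 2 diabetes. Continue metformin 500 mg BID. NKDA."
extracted_diagnoses = ["Type 2 diabetes", "hypertension"]

flagged = ungrounded_items(extracted_diagnoses, note)
scores = precision_recall(extracted_diagnoses, gold=["type 2 diabetes"])
print(flagged)
print(scores)
```

The verbatim check is cheap enough to run on every extraction. The precision and recall figures only become meaningful when aggregated over a curated, clinician-labeled set of notes.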