LangChain Callbacks — LangChain for Local AI (Chapter 17)

Callbacks intercept events during chain execution, which makes them useful for logging, monitoring, timing, and debugging. LangChain's callback system fires events at predefined points: chain start/end, LLM start/end, retrieval events, and errors.
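
To make the event model concrete, here is a plain-Python sketch of the dispatch pattern. It is a conceptual illustration, not LangChain's actual implementation: the names RecordingHandler, MiniDispatcher, and run_chain are hypothetical, and only the on_* method names mirror LangChain's handler hooks.

```python
# Conceptual sketch only: a tiny dispatcher that mimics how a framework
# notifies handlers at fixed points. This is NOT LangChain's implementation.

class RecordingHandler:
    """Hypothetical handler that records the name of every event it sees."""

    def __init__(self):
        self.events = []

    def on_chain_start(self, inputs):
        self.events.append("chain_start")

    def on_llm_start(self, prompt):
        self.events.append("llm_start")

    def on_llm_end(self, text):
        self.events.append("llm_end")

    def on_chain_end(self, outputs):
        self.events.append("chain_end")

    def on_chain_error(self, error):
        self.events.append("chain_error")


class MiniDispatcher:
    """Hypothetical dispatcher: forwards an event to each handler that defines it."""

    def __init__(self, handlers):
        self.handlers = handlers

    def fire(self, event, *args):
        for handler in self.handlers:
            hook = getattr(handler, event, None)
            if hook is not None:
                hook(*args)


def run_chain(question, llm, dispatcher):
    """Run a one-step 'chain', firing an event at each predefined point."""
    dispatcher.fire("on_chain_start", {"query": question})
    try:
        dispatcher.fire("on_llm_start", question)
        answer = llm(question)
        dispatcher.fire("on_llm_end", answer)
    except Exception as error:
        dispatcher.fire("on_chain_error", error)
        raise
    dispatcher.fire("on_chain_end", {"result": answer})
    return answer


handler = RecordingHandler()
# The "LLM" here is a stand-in function that upper-cases its input.
answer = run_chain("ping", lambda q: q.upper(), MiniDispatcher([handler]))
print(handler.events)
```

The handler never calls the chain; the chain calls the handler. That inversion is what lets logging or timing be added without touching the chain's own code.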

Create a custom callback handler by subclassing BaseCallbackHandler and overriding the event methods you need.

import time
from datetime import datetime

from langchain_core.callbacks import BaseCallbackHandler


class TimingCallback(BaseCallbackHandler):
    """Print timing and basic shape information for LLM and chain events."""

    def __init__(self):
        self.start_time = None

    def on_llm_start(self, serialized, prompts, **kwargs):
        # Record the start so on_llm_end can compute the elapsed time.
        self.start_time = time.time()
        print(f"LLM started at {datetime.now()}")

    def on_llm_end(self, response, **kwargs):
        # start_time is still None if on_llm_start never fired for this handler.
        if self.start_time is None:
            return
        elapsed = time.time() - self.start_time
        print(f"LLM finished in {elapsed:.2f}s")

    def on_chain_start(self, serialized, inputs, **kwargs):
        print(f"Chain started with {len(inputs)} inputs")

    def on_chain_end(self, outputs, **kwargs):
        # Some chains return a plain value (for example a string) instead of a dict.
        keys = list(outputs.keys()) if isinstance(outputs, dict) else type(outputs).__name__
        print(f"Chain output keys: {keys}")

callback = TimingCallback()
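
A single start_time attribute is overwritten when two LLM calls overlap. LangChain passes a run_id keyword argument to handler methods, so one option is to key the start times by it. The sketch below keeps that logic in a plain class (the name RunTimer is hypothetical) and drives it by hand with uuid4 values so that it runs without LangChain installed; in real use the same two methods would sit on a BaseCallbackHandler subclass.

```python
import time
import uuid


class RunTimer:
    """Hypothetical helper: tracks elapsed time separately for each run_id."""

    def __init__(self):
        self._starts = {}    # run_id -> start timestamp
        self.durations = {}  # run_id -> elapsed seconds

    def on_llm_start(self, serialized, prompts, *, run_id, **kwargs):
        self._starts[run_id] = time.perf_counter()

    def on_llm_end(self, response, *, run_id, **kwargs):
        started = self._starts.pop(run_id, None)
        if started is not None:  # ignore an end event with no matching start
            self.durations[run_id] = time.perf_counter() - started


# Simulate two overlapping LLM calls that finish in reverse order.
timer = RunTimer()
first, second = uuid.uuid4(), uuid.uuid4()
timer.on_llm_start({}, ["prompt A"], run_id=first)
timer.on_llm_start({}, ["prompt B"], run_id=second)
time.sleep(0.01)
timer.on_llm_end(None, run_id=second)
timer.on_llm_end(None, run_id=first)

for run_id, seconds in timer.durations.items():
    print(f"{run_id}: {seconds:.3f}s")
```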

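Attach callbacks at chain creation or at invocation time. Handlers passed at creation are scoped to that chain object. Handlers passed to invoke apply to that call only and are passed down to the components the call runs.

from langchain.chains import RetrievalQA

# llm and retriever are assumed to be defined elsewhere.

# At chain creation (scoped to this chain object)
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    callbacks=[callback]  # Attach here
)

# At invocation (applies to this call and its child components)
result = qa_chain.invoke({"query": "..."}, callbacks=[callback])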

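For token counting, inspect the LLM output in on_llm_end. The contents of llm_output are provider-specific, so token_usage may be missing or empty for some models.

class TokenCounterCallback(BaseCallbackHandler):
    def on_llm_end(self, response, **kwargs):
        # llm_output may be None when the provider reports no metadata.
        if hasattr(response, "llm_output") and response.llm_output:
            token_usage = response.llm_output.get("token_usage", {})
            print(f"Tokens used: {token_usage}")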
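
To keep a running total instead of printing per call, the same hook can accumulate counts. The sketch below is plain Python with a stand-in response object (SimpleNamespace), so it runs without a model. The class name TokenTotals is hypothetical, and the key names prompt_tokens, completion_tokens, and total_tokens are assumptions that follow the OpenAI-style token_usage layout; other providers may report different keys or nothing at all.

```python
from collections import Counter
from types import SimpleNamespace


class TokenTotals:
    """Hypothetical accumulator with the same on_llm_end shape as a handler."""

    def __init__(self):
        self.totals = Counter()

    def on_llm_end(self, response, **kwargs):
        llm_output = getattr(response, "llm_output", None) or {}
        usage = llm_output.get("token_usage") or {}
        for key, value in usage.items():
            if isinstance(value, int):  # skip nested or non-numeric entries
                self.totals[key] += value


counter = TokenTotals()
# Stand-in responses; the key names are assumed, not guaranteed by every provider.
counter.on_llm_end(SimpleNamespace(llm_output={"token_usage": {
    "prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42}}))
counter.on_llm_end(SimpleNamespace(llm_output=None))  # provider reported nothing
counter.on_llm_end(SimpleNamespace(llm_output={"token_usage": {
    "prompt_tokens": 8, "completion_tokens": 5, "total_tokens": 13}}))

print(dict(counter.totals))
```

If the totals stay empty with a local model, that usually means the backend does not fill in token_usage, not that the handler failed to fire.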

LangChain also provides built-in handlers: StdOutCallbackHandler for verbose output, FileCallbackHandler for file logging, and LangChainTracer for LangSmith integration.

from langchain_core.callbacks import StdOutCallbackHandler

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    callbacks=[StdOutCallbackHandler()]  # Verbose output
)

Local verification checkpoint

Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.
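
One way to capture the details listed above is a small script that writes them out as JSON. This is a sketch using only the standard library; the function names, the package list, and the ./data path are placeholders to adapt, not values taken from this chapter.

```python
import json
import platform
from importlib import metadata


def package_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def environment_record(packages, data_path, notes=""):
    """Collect the runtime details worth noting beside an exercise."""
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": {name: package_version(name) for name in packages},
        "data_path": str(data_path),
        "notes": notes,
    }


record = environment_record(
    packages=["langchain", "langchain-core"],  # adjust to the packages you use
    data_path="./data",                        # placeholder path
    notes="",  # e.g. model size, CPU/GPU backend, available memory
)
print(json.dumps(record, indent=2))
```

Paste the printed record next to the observed output so a later run can be compared against the same versions and hardware.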