05. LLMChain Basics

Chapter 5 of 18 · 20 min

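LLMChain combines a prompt template with an LLM into a single runnable unit. It is the workhorse of classic LangChain: most simple chains are LLMChain instances. You call it with invoke(), as you would the underlying model, but you pass a dict of template variables instead of a finished prompt; the chain renders the template and sends the result to the model. Recent LangChain releases mark LLMChain as deprecated in favor of composing prompt | llm directly, so you may see a deprecation warning when you run these examples.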

from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain_ollama import ChatOllama

llm = ChatOllama(model="llama3.2:3b", temperature=0.9)

template = PromptTemplate.from_template(
    "Give me a {adjective} one-sentence explanation of {concept}."
)

chain = LLMChain(prompt=template, llm=llm)

result = chain.invoke({"adjective": "witty", "concept": "recursion"})
print(result)
# {'adjective': 'witty', 'concept': 'recursion', 'text': 'Recursion is ..., ...'}
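
To make the data flow concrete, here is a minimal conceptual sketch of the three steps in plain Python: render the template, call the model, and return the inputs together with the output key. It is not LangChain's implementation; run_chain_sketch and fake_model are hypothetical names, and the model is a stub so the sketch runs without Ollama.

```python
from typing import Callable, Dict


def run_chain_sketch(
    template: str,
    model: Callable[[str], str],
    inputs: Dict[str, str],
    output_key: str = "text",
) -> Dict[str, str]:
    """Conceptual stand-in for LLMChain.invoke(); not LangChain's real code."""
    prompt = template.format(**inputs)         # 1. render the template
    completion = model(prompt)                 # 2. send the rendered prompt to the model
    return {**inputs, output_key: completion}  # 3. echo the inputs, add the output


def fake_model(prompt: str) -> str:
    """Hypothetical model stub that just echoes the prompt it received."""
    return f"(model reply to: {prompt})"


result = run_chain_sketch(
    "Give me a {adjective} one-sentence explanation of {concept}.",
    fake_model,
    {"adjective": "witty", "concept": "recursion"},
)
print(result["text"])
# (model reply to: Give me a witty one-sentence explanation of recursion.)
```

The returned dict has the same shape as the LLMChain result above: both input variables plus the "text" key.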

The result dict contains the template's input variables plus one output key, which defaults to "text". To rename the output key, pass output_key to the constructor:

chain = LLMChain(
    prompt=template,
    llm=llm,
    output_key="explanation",  # result['explanation'] instead of result['text']
    verbose=True,              # prints the rendered prompt as the chain runs
)

The verbose=True flag is invaluable during development. It prints the chain's start and finish markers and the prompt after template rendering to standard output (not stderr), so you can see exactly what was sent to the model. To inspect what came back, print the result dict.

LLMChain also has an apply() method that takes a list of input dicts and returns a list of result dicts, which saves writing the for-loop yourself:

adjectives = ["concise", "technical", "humorous"]
inputs = [{"adjective": adj, "concept": "deadlock"} for adj in adjectives]

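results = chain.apply(inputs)  # one result dict per input, in input order
for r in results:
    # Index by chain.output_key: "text" by default, "explanation" if you renamed it
    print(r[chain.output_key])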

apply() does not parallelize: the inputs are processed one after another, so total time grows with the number of inputs. What it saves is boilerplate.
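
If you do need several inputs in flight at once, one option is a thread pool around a per-input call. The following is a rough sketch using only the standard library; run_concurrently and fake_invoke are hypothetical names, and fake_invoke stands in for chain.invoke so the sketch runs offline. Whether this actually speeds things up depends on how your Ollama server handles concurrent requests.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_concurrently(
    run_one: Callable[[Dict[str, str]], Dict[str, str]],
    inputs: List[Dict[str, str]],
    max_workers: int = 3,
) -> List[Dict[str, str]]:
    """Call run_one on every input in a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, inputs))


def fake_invoke(variables: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical stand-in for chain.invoke: echoes inputs and adds a 'text' key."""
    text = f"a {variables['adjective']} take on {variables['concept']}"
    return {**variables, "text": text}


adjectives = ["concise", "technical", "humorous"]
inputs = [{"adjective": adj, "concept": "deadlock"} for adj in adjectives]

for r in run_concurrently(fake_invoke, inputs):
    print(r["text"])
```

With a real chain you would pass chain.invoke in place of fake_invoke.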

One important behavior: LLMChain passes through input variables from the dict to the result dict. If your prompt has variables ["adjective", "concept"] and you invoke with {"adjective": "witty", "concept": "recursion"}, the result dict includes those keys. This is predictable but can clutter the output if you want only the model's response. Extract result["text"] explicitly in application code.
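
When you want a result without the echoed inputs, a small helper can filter them out. This is a minimal sketch in plain Python; only_outputs is a hypothetical helper, not part of LangChain, and the sample dict uses placeholder text in place of a real model reply.

```python
from typing import Any, Dict, Iterable


def only_outputs(result: Dict[str, Any], input_keys: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of a chain result without the echoed input variables."""
    skip = set(input_keys)
    return {key: value for key, value in result.items() if key not in skip}


# Sample dict shaped like the result shown earlier; the text is a placeholder.
sample = {"adjective": "witty", "concept": "recursion", "text": "placeholder reply"}
print(only_outputs(sample, ["adjective", "concept"]))
# {'text': 'placeholder reply'}
```

With the chain from this chapter, template.input_variables supplies the list of keys to drop.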

With a chat model, the chain is built the same way from a ChatPromptTemplate. Note that LLMChain's default output parser reduces the model's AIMessage to its text, so the value under the output key is a plain string, not a message object:

from langchain.prompts import ChatPromptTemplate

# Chat model path: the prompt is a list of messages, the output is still a string
chat_template = ChatPromptTemplate.from_messages([
    ("human", "What is the capital of {country}?")
])

chat_chain = LLMChain(
    prompt=chat_template,
    llm=llm,
    output_key="answer",
    verbose=True,
)

result = chat_chain.invoke({"country": "Brazil"})
print(result["answer"])  # plain string; there is no .content attribute here
# Example output (model-dependent): Brasilia

Failure mode: LLMChain does not catch exceptions from the model. If the call fails (for example a timeout, or Ollama restarting mid-request), the exception propagates to your code. Wrap invocations in try/except, and if your Ollama instance is unreliable, consider adding retry logic with tenacity or with the with_retry() method that LangChain runnables provide.
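
As a rough illustration of the retry idea, here is a standard-library sketch with exponential backoff. call_with_retry and flaky are hypothetical names, and the default retry_on tuple is an assumption: check which exception types your Ollama client actually raises before relying on it.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on the listed exceptions with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** (attempt - 1))  # waits 1s, 2s, 4s, ...
    raise ValueError("attempts must be at least 1")


calls = {"count": 0}


def flaky() -> str:
    """Hypothetical stand-in for chain.invoke: fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated Ollama restart")
    return "ok"


# Pass a no-op sleep so the demo does not actually wait.
print(call_with_retry(flaky, sleep=lambda seconds: None))
# ok
```

With a real chain, the call would look like call_with_retry(lambda: chain.invoke({"adjective": "witty", "concept": "recursion"})). Exceptions outside retry_on are not retried and propagate immediately.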

EXERCISE

Build an LLMChain with a ChatPromptTemplate that asks for a code example. Include verbose=True, invoke it, and capture the output showing what verbose prints during execution. Extract only the model's response text (not the template inputs) from the result dict.