How to implement web search for AI agents
Prerequisites: a search API key (Tavily or SerpAPI) and an agent framework
What this does
Web search gives an AI agent access to real-time information from the internet. When the agent encounters a query about current events, recent data, or topics beyond its training cutoff, it invokes a search tool that queries a search API, retrieves relevant results, extracts key content, and returns it as context. The agent can then answer from up-to-date sources rather than from its training data alone.
Steps
Register for a search API key and set it as an environment variable: export TAVILY_API_KEY="tvly-xxxxxxxx".
Install the client library and create the client. For Tavily, install the tavily-python package, then: import os; from tavily import TavilyClient; tavily = TavilyClient(api_key=os.environ["TAVILY_API_KEY"]).
Define the tool function. web_search(query: str, max_results: int = 5) -> str calls tavily.search(query=query, max_results=max_results, include_raw_content=False), formats each r in response["results"] as f"Title: {r['title']}\nURL: {r['url']}\nContent: {r['content']}\n", and returns the formatted entries joined with "\n---\n".
Wrap it as a LangChain-compatible tool. Import the decorator with from langchain.tools import tool, then apply @tool to a function web_search_tool(query: str) -> str that returns web_search(query). Give the function a clear docstring, because @tool uses it as the tool description and the model relies on that description to decide when to invoke the tool: "Search the web for current information. Use for recent events, news, or facts you are unsure about."
Add the tool to the agent's tool list: tools = [web_search_tool].
Describe the tool in the agent's system prompt: "Use the web_search_tool tool when the query requires current information or facts beyond your knowledge cutoff."
For advanced usage, add a content extraction step that fetches and processes the full page content from each search result URL, then summarizes each page before returning it to the agent.
Implement result caching with a simple dictionary keyed by query to avoid redundant API calls: start with cache = {}, return cache[query] if query is already in cache, and otherwise store the result under cache[query] before returning it.
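The tool function, the cache, and the LangChain wrapper described in the steps above could be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the client is passed in as a parameter so the functions can be exercised with a stub, the cache key includes max_results in addition to the query, and the helper names format_results, cached_web_search, make_tavily_client, and build_langchain_tool are hypothetical names introduced for illustration.

```python
import os
from typing import Any


def format_results(response: dict[str, Any]) -> str:
    """Render each search hit as a Title/URL/Content block, separated by ---."""
    blocks = [
        f"Title: {r['title']}\nURL: {r['url']}\nContent: {r['content']}\n"
        for r in response["results"]
    ]
    return "\n---\n".join(blocks)


def web_search(client: Any, query: str, max_results: int = 5) -> str:
    """Query the search API through `client` and return the formatted results."""
    response = client.search(
        query=query, max_results=max_results, include_raw_content=False
    )
    return format_results(response)


# In-memory cache; it lives only as long as the process.
# Assumption: keyed by (query, max_results) so different limits do not collide.
_cache: dict[tuple[str, int], str] = {}


def cached_web_search(client: Any, query: str, max_results: int = 5) -> str:
    """Return the stored result for a repeated query instead of calling the API."""
    key = (query, max_results)
    if key not in _cache:
        _cache[key] = web_search(client, query, max_results)
    return _cache[key]


def make_tavily_client() -> Any:
    """Build the real client; imported lazily so the sketch loads without the package."""
    from tavily import TavilyClient  # provided by the tavily-python package

    return TavilyClient(api_key=os.environ["TAVILY_API_KEY"])


def build_langchain_tool(client: Any) -> Any:
    """Wrap the cached search as a LangChain tool (hypothetical helper)."""
    from langchain.tools import tool  # imported lazily for the same reason

    @tool
    def web_search_tool(query: str) -> str:
        """Search the web for current information. Use for recent events, news, or facts you are unsure about."""
        return cached_web_search(client, query)

    return web_search_tool
```

With a real key set, tools = [build_langchain_tool(make_tavily_client())] would produce the tool list from the steps. Passing the client in explicitly is a design choice made here so the search logic can be tested without network access.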
Confirm the local starting state. Print the active binary, package version, model name, or configuration path before changing the workflow.
Run the smallest complete path. Execute the minimum command or script that proves the guide works end to end on the local machine.
Compare against expected output. Check the final line, status code, generated artifact, or model response against the verification section before expanding the setup.
Record the local run evidence. Save the exact command, runtime or package version, model name if applicable, and observed output so the result can be reproduced later.
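Recording the interpreter and package versions could be done with a short script like the one below. It is a sketch only: the function names package_version and environment_snapshot are hypothetical, and the package names passed in (tavily-python, langchain) are assumptions about what is installed locally.

```python
import platform
import sys
from importlib import metadata


def package_version(name: str) -> str:
    """Return the installed version of a distribution, or a marker if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_snapshot(packages: list[str]) -> dict[str, str]:
    """Collect the active interpreter path, Python version, and package versions."""
    snapshot = {
        "python_executable": sys.executable,
        "python_version": platform.python_version(),
    }
    for name in packages:
        snapshot[name] = package_version(name)
    return snapshot


if __name__ == "__main__":
    # Print one "key: value" line per entry so the output can be saved with the run.
    for key, value in environment_snapshot(["tavily-python", "langchain"]).items():
        print(f"{key}: {value}")
```

Saving this output next to the command that was run and the observed result gives a record that can be compared against later runs.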
Verification
- Ask the agent: "What were the top technology news headlines today?" Verify in the logs that it calls the search tool and that the answer contains current, real information.
- Ask a question the model should answer without search: "What is 2+2?" Verify that no search call is made.
- Test rate limiting: send 5 search queries in rapid succession and confirm the tool completes without API errors.
- Check each search result for a title, URL, and content; all three fields should be populated.
- Test caching: send the same query twice and verify that the second call is served from the cache (no API call).
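The field check on search results could be automated with a small helper like the one below. It is a sketch that assumes the raw response has the shape used by the tool function in the Steps section: a "results" list whose items carry title, url, and content keys. The name incomplete_results is hypothetical.

```python
from typing import Any

# The three fields the tool function reads from each result.
REQUIRED_FIELDS = ("title", "url", "content")


def incomplete_results(response: dict[str, Any]) -> list[tuple[int, list[str]]]:
    """Return (index, missing_fields) for every result with an absent or blank field."""
    problems = []
    for index, result in enumerate(response.get("results", [])):
        missing = [
            field
            for field in REQUIRED_FIELDS
            if not str(result.get(field) or "").strip()
        ]
        if missing:
            problems.append((index, missing))
    return problems
```

An empty list means every result is fully populated; anything else identifies which result and which fields to inspect.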
Common failures
- API key invalid or expired - Check the key on the provider dashboard and verify that the environment variable is set correctly with echo $TAVILY_API_KEY.
- Search returns irrelevant results - Use more specific query parameters: for Tavily, set search_depth="advanced" and pass include_domains to restrict results to trusted sources.
- Rate limit exceeded - Implement exponential backoff in the tool function with time.sleep(retry_delay), doubling the delay after each failed attempt, and allow a maximum of 3 retries.
- Tool not being invoked by the LLM - Strengthen the system prompt instruction and ensure the tool description mentions specific trigger keywords.
- Results too long for context window - Trim content to 300 characters per result and limit max_results to 3 for models with small context windows.
- Version mismatch - The installed package or runtime differs from the command shown; check the version first and rerun the smallest verification command.
- Local environment drift - Another service, virtual environment, model, or path is being used; print the active binary path and configuration before changing the guide steps.
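The backoff and trimming fixes listed above could look roughly like this. It is a sketch under assumptions: with_backoff and trim_content are hypothetical helper names, the 1-second base delay is an arbitrary starting value, and the retry loop catches any Exception for brevity, whereas a real tool would catch only the provider's rate-limit error.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_backoff(
    call: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); after each failure wait base_delay * 2**attempt, then retry.

    Allows up to max_retries retries (max_retries + 1 attempts in total) and
    re-raises the last error once the retries are used up.
    """
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception:  # assumption: narrow this to the rate-limit error in practice
            if attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)
    # Only reached if max_retries is negative, so no attempt was made.
    raise RuntimeError("with_backoff needs max_retries >= 0")


def trim_content(text: str, limit: int = 300) -> str:
    """Cut a result's content to at most `limit` characters."""
    return text[:limit]
```

A call such as with_backoff(lambda: tavily.search(query=query, max_results=3)) would wrap the search request, and trim_content can be applied to each result's content before formatting. The sleep parameter exists so the delay can be replaced in tests.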
Related guides
- setup-agent-tool-use-function-calling
- build-langgraph-agent-scratch
- setup-agent-memory-vector-databases