Web Search
AI Gateway proxies native web search tools from supported providers so models can answer questions about events after their training cutoff. Search runs on the upstream provider; AI Gateway applies its standard features — logging, caching, rate limiting, and guardrails — to the request.
How you enable web search depends on the provider. Activation is either a tool entry on a tools array or a top-level flag on the request body. The table below points you to the right section.
| Provider | Endpoint | Activation |
|---|---|---|
| Anthropic | POST /ai/v1/messages | tools: [{ "type": "web_search_20250305", "name": "web_search", "max_uses": N }] |
| OpenAI | POST /ai/v1/responses | tools: [{ "type": "web_search_preview" }] |
| xAI | POST /ai/v1/responses | tools: [{ "type": "web_search" }] |
| Alibaba | POST /ai/v1/chat/completions | top-level "enable_search": true |
For providers whose product is search itself — Perplexity and Parallel — refer to Search-first providers.
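The activation rules in the table can be captured in a small lookup. The sketch below is illustrative only: `withWebSearch`, `addTool`, and `WEB_SEARCH_ACTIVATION` are hypothetical names, not part of AI Gateway, and the default `max_uses` of 3 is an assumption taken from the examples on this page.

```javascript
// Activation rules copied from the table above. All helper names here are
// hypothetical; they are not part of any AI Gateway SDK.
const addTool = (tool) => (body) => ({
  ...body,
  tools: [...(body.tools ?? []), tool],
});

const WEB_SEARCH_ACTIVATION = new Map([
  ["anthropic", {
    endpoint: "/ai/v1/messages",
    // Assumed default of 3 searches per request, matching the examples.
    apply: (body, maxUses = 3) =>
      addTool({ type: "web_search_20250305", name: "web_search", max_uses: maxUses })(body),
  }],
  ["openai", { endpoint: "/ai/v1/responses", apply: addTool({ type: "web_search_preview" }) }],
  ["xai", { endpoint: "/ai/v1/responses", apply: addTool({ type: "web_search" }) }],
  // Alibaba uses a top-level flag instead of a tools entry.
  ["alibaba", { endpoint: "/ai/v1/chat/completions", apply: (body) => ({ ...body, enable_search: true }) }],
]);

// Returns the endpoint path and a copy of the body with web search enabled.
function withWebSearch(model, body, maxUses) {
  const provider = model.split("/")[0];
  const rule = WEB_SEARCH_ACTIVATION.get(provider);
  if (!rule) {
    throw new Error(`No web search activation rule for provider "${provider}"`);
  }
  return { endpoint: rule.endpoint, body: rule.apply({ ...body, model }, maxUses) };
}
```

The helper copies the request body rather than mutating it, so the same base body can be reused across providers.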
Anthropic models expose web search through their native web_search_20250305 tool. Add it to the tools array on a POST /ai/v1/messages request.
Supported models — anthropic/claude-haiku-4.5, anthropic/claude-opus-4.5, anthropic/claude-opus-4.6, anthropic/claude-opus-4.7, anthropic/claude-opus-4.8, anthropic/claude-sonnet-4.5, anthropic/claude-sonnet-4.6.
    # Run `wrangler whoami` to get your account ID to replace $CLOUDFLARE_ACCOUNT_ID,
    # and `wrangler auth token` to get an auth token to replace $CLOUDFLARE_API_TOKEN.
    curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/messages" \
      --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
      --header "Content-Type: application/json" \
      --data '{
        "model": "anthropic/claude-haiku-4.5",
        "max_tokens": 4096,
        "messages": [
          {
            "role": "user",
            "content": "What were the top news stories about Cloudflare this week? Summarize in three bullets."
          }
        ],
        "tools": [
          {
            "type": "web_search_20250305",
            "name": "web_search",
            "max_uses": 3
          }
        ]
      }'

Equivalent call from a Worker using the AI binding:

    const resp = await env.AI.run(
      "anthropic/claude-haiku-4.5",
      {
        max_tokens: 4096,
        messages: [
          {
            role: "user",
            content:
              "What were the top news stories about Cloudflare this week? Summarize in three bullets.",
          },
        ],
        tools: [{ type: "web_search_20250305", name: "web_search", max_uses: 3 }],
      },
      {
        gateway: {
          id: "default", // or use a specific gateway name
        },
      },
    );

Search invocations and results appear in the response as server_tool_use and web_search_tool_result content blocks. Configurable parameters include max_uses, allowed_domains, blocked_domains, and user_location; refer to Anthropic's web search tool documentation for the full list.
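To illustrate how the server_tool_use and web_search_tool_result blocks might be read, the sketch below walks a response's content array and collects queries, result links, and error codes. It assumes the response follows Anthropic's Messages format (a `content` array of typed blocks) and that the binding returns that body unchanged; `summarizeAnthropicSearch` is a hypothetical helper, and the field names should be checked against a real response.

```javascript
// Hypothetical helper: summarize web search activity in an Anthropic
// Messages-style response. Assumes `response.content` is an array of blocks.
function summarizeAnthropicSearch(response) {
  const queries = [];
  const results = [];
  const errors = [];

  for (const block of response.content ?? []) {
    if (block.type === "server_tool_use" && block.name === "web_search") {
      // The model's search query is carried in the tool input.
      queries.push(block.input?.query);
    } else if (block.type === "web_search_tool_result") {
      if (Array.isArray(block.content)) {
        // Successful search: a list of result entries.
        for (const hit of block.content) {
          results.push({ title: hit.title, url: hit.url });
        }
      } else if (block.content?.error_code) {
        // Failed search: a single error object instead of a list.
        errors.push(block.content.error_code);
      }
    }
  }
  return { queries, results, errors };
}
```

In a Worker, this could be called as `summarizeAnthropicSearch(resp)` on the value returned by `env.AI.run`, provided the assumption about the response shape holds.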
OpenAI models expose web search through the web_search_preview tool on the Responses API. Use the POST /ai/v1/responses endpoint and add the tool to the tools array.
Supported models — openai/gpt-4.1, openai/gpt-4.1-mini, openai/gpt-4o, openai/gpt-4o-mini, openai/gpt-5, openai/gpt-5-mini, openai/gpt-5-nano, openai/gpt-5.1, openai/gpt-5.4, openai/gpt-5.4-mini, openai/gpt-5.4-nano, openai/gpt-5.4-pro, openai/gpt-5.5, openai/gpt-5.5-pro, openai/o3, openai/o4-mini.
    # Run `wrangler whoami` to get your account ID to replace $CLOUDFLARE_ACCOUNT_ID,
    # and `wrangler auth token` to get an auth token to replace $CLOUDFLARE_API_TOKEN.
    curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/responses" \
      --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
      --header "Content-Type: application/json" \
      --data '{
        "model": "openai/gpt-4o-mini",
        "input": "What were the top news stories about Cloudflare this week? Summarize in three bullets.",
        "max_output_tokens": 4096,
        "tools": [
          {
            "type": "web_search_preview"
          }
        ]
      }'

Equivalent call from a Worker using the AI binding:

    const resp = await env.AI.run(
      "openai/gpt-4o-mini",
      {
        input:
          "What were the top news stories about Cloudflare this week? Summarize in three bullets.",
        max_output_tokens: 4096,
        tools: [{ type: "web_search_preview" }],
      },
      {
        gateway: {
          id: "default", // or use a specific gateway name
        },
      },
    );

OpenAI web search is available only on the Responses API endpoint (POST /ai/v1/responses). The /ai/v1/chat/completions endpoint does not accept the web_search_preview tool.
Both { "type": "web_search_preview" } and { "type": "web_search" } are accepted on the Responses API. The examples here use web_search_preview.
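Once a Responses API call with the web search tool returns, the cited sources can be pulled out of the output. The sketch below assumes the response follows the OpenAI Responses API shape, where `output` holds `message` items whose `output_text` parts carry `url_citation` annotations, and that AI Gateway passes that body through unchanged. `collectUrlCitations` is a hypothetical helper; verify the field names against a real response.

```javascript
// Hypothetical helper: collect unique URL citations from a Responses API
// result. Assumes `response.output` is an array of output items.
function collectUrlCitations(response) {
  const seen = new Set();
  const citations = [];

  for (const item of response.output ?? []) {
    if (item.type !== "message") continue; // skip web_search_call and other items
    for (const part of item.content ?? []) {
      if (part.type !== "output_text") continue;
      for (const annotation of part.annotations ?? []) {
        if (annotation.type === "url_citation" && !seen.has(annotation.url)) {
          seen.add(annotation.url);
          citations.push({ title: annotation.title, url: annotation.url });
        }
      }
    }
  }
  return citations;
}
```

Deduplicating by URL keeps the list short when the model cites the same page in several places.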
xAI's multi-agent Grok model exposes web search through the web_search tool on the Responses API. Add { "type": "web_search" } to the tools array on a POST /ai/v1/responses request.
Supported models — xai/grok-4.20-multi-agent-0309.
    # Run `wrangler whoami` to get your account ID to replace $CLOUDFLARE_ACCOUNT_ID,
    # and `wrangler auth token` to get an auth token to replace $CLOUDFLARE_API_TOKEN.
    curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/responses" \
      --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
      --header "Content-Type: application/json" \
      --data '{
        "model": "xai/grok-4.20-multi-agent-0309",
        "input": "What were the top news stories about Cloudflare this week? Summarize in three bullets.",
        "max_turns": 4,
        "tools": [
          {
            "type": "web_search"
          }
        ]
      }'

Equivalent call from a Worker using the AI binding:

    const resp = await env.AI.run(
      "xai/grok-4.20-multi-agent-0309",
      {
        input:
          "What were the top news stories about Cloudflare this week? Summarize in three bullets.",
        max_turns: 4,
        tools: [{ type: "web_search" }],
      },
      {
        gateway: {
          id: "default", // or use a specific gateway name
        },
      },
    );

xai/grok-4.20-multi-agent-0309 is the only xAI model that accepts web search through AI Gateway. For other Grok models, refer to Models without web search support.
Alibaba DashScope Qwen models enable web search through a top-level enable_search flag on a chat completions request. Unlike Anthropic, OpenAI, and xAI, there is no tools entry; web search is activated by the flag alone.
Supported models — alibaba/qwen3-max, alibaba/qwen3.5-397b-a17b.
    # Run `wrangler whoami` to get your account ID to replace $CLOUDFLARE_ACCOUNT_ID,
    # and `wrangler auth token` to get an auth token to replace $CLOUDFLARE_API_TOKEN.
    curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \
      --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
      --header "Content-Type: application/json" \
      --data '{
        "model": "alibaba/qwen3-max",
        "enable_search": true,
        "max_tokens": 4096,
        "messages": [
          {
            "role": "user",
            "content": "What were the top news stories about Cloudflare this week? Summarize in three bullets."
          }
        ]
      }'

Equivalent call from a Worker using the AI binding:

    const resp = await env.AI.run(
      "alibaba/qwen3-max",
      {
        enable_search: true,
        max_tokens: 4096,
        messages: [
          {
            role: "user",
            content:
              "What were the top news stories about Cloudflare this week? Summarize in three bullets.",
          },
        ],
      },
      {
        gateway: {
          id: "default", // or use a specific gateway name
        },
      },
    );

DashScope does not return search-grounded context as separate tool-call response blocks. It folds the fetched context into the prompt as additional input tokens, so expect prompt_tokens to increase substantially on a successful search-grounded response.
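Because the search context shows up only as extra input tokens, one way to gauge its cost is to compare the usage of a search-grounded call against the same prompt sent without enable_search. The sketch below assumes both responses carry an OpenAI-style `usage.prompt_tokens` field; `searchPromptOverhead` is a hypothetical helper and the numbers in the usage note are invented for illustration.

```javascript
// Hypothetical helper: estimate how many prompt tokens web search added.
// Both arguments are `usage` objects from chat completions responses.
function searchPromptOverhead(baselineUsage, groundedUsage) {
  const base = baselineUsage.prompt_tokens;
  const grounded = groundedUsage.prompt_tokens;
  return {
    addedPromptTokens: grounded - base,
    // Growth factor is undefined for an empty baseline, so report null.
    growthFactor: base > 0 ? grounded / base : null,
  };
}
```

For example, with made-up usage values of 25 prompt tokens without search and 1025 with search, the helper reports 1000 added tokens and a growth factor of 41.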
For some providers, the primary API is a search endpoint rather than a chat endpoint with a web search tool. AI Gateway exposes them through their existing provider proxy endpoints at gateway.ai.cloudflare.com.
AI Gateway does not provide a provider-agnostic web search abstraction. Call the provider proxy directly using the patterns below.
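The provider proxy URLs in this section share one layout: the gateway host, then the account ID, gateway ID, provider slug, and the provider's own path. A minimal sketch of that layout follows; `providerProxyUrl` is a hypothetical helper, and the provider slugs are the ones that appear in the curl examples on this page.

```javascript
// Hypothetical helper: build a provider proxy URL for AI Gateway.
// Layout: https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}/{path}
function providerProxyUrl(accountId, gatewayId, provider, path) {
  const base = "https://gateway.ai.cloudflare.com/v1";
  const cleanPath = path.replace(/^\/+/, ""); // tolerate a leading slash
  return [
    base,
    encodeURIComponent(accountId),
    encodeURIComponent(gatewayId),
    provider,
    cleanPath,
  ].join("/");
}

// Usage with placeholder IDs:
const perplexityUrl = providerProxyUrl("my-account", "my-gateway", "perplexity-ai", "chat/completions");
const parallelUrl = providerProxyUrl("my-account", "my-gateway", "parallel", "/v1beta/search");
```

Authentication still uses the provider's own credentials, as in the curl examples: a bearer token for Perplexity and an x-api-key header for Parallel.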
Call any Perplexity Sonar model through the Perplexity provider proxy.

    curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/perplexity-ai/chat/completions \
      --header "Authorization: Bearer $PERPLEXITY_API_TOKEN" \
      --header "Content-Type: application/json" \
      --data '{
        "model": "sonar",
        "messages": [
          {
            "role": "user",
            "content": "What were the top news stories about Cloudflare this week?"
          }
        ]
      }'

Call Parallel's Search API through the Parallel provider proxy. Refer to Parallel's Search API documentation for the full request schema.

    curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/parallel/v1beta/search \
      --header "x-api-key: $PARALLEL_API_TOKEN" \
      --header "Content-Type: application/json" \
      --data '{
        "objective": "Top news stories about Cloudflare this week.",
        "processor": "base",
        "max_results": 10
      }'

The following models do not accept web search through AI Gateway:
- Google Gemini: not available through the unified web_search tool, because Vertex's OpenAI-compatible surface does not translate it into Gemini's native googleSearch tool. To use Gemini grounding, pass the native google_search tool to the provider-specific Vertex endpoint.
- Grok chat-completions models: xai/grok-4.20-0309-non-reasoning, xai/grok-4.20-0309-reasoning, and xai/grok-4.3 use the chat-completions endpoint, which does not accept the web_search tool. For Grok web search, refer to xAI web search.
- DeepSeek deepseek-v4-flash, deepseek-v4-pro: these models accept function tools only.
- MiniMax m2.7, m3: these models accept { "type": "function" } tools only.
- OpenAI gpt-4.1-nano, o1-pro, o3-mini: the upstream returns invalid_request_error for web_search_preview on these models.
- OpenAI gpt-4o-search-preview, gpt-4o-mini-search-preview: these preview models are deprecated upstream.
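A pre-flight check can reject unsupported models before a request is sent. The sketch below hard-codes the "Supported models" lists from this page, so it will drift as the lists change; `WEB_SEARCH_MODELS` and `assertWebSearchSupported` are hypothetical names.

```javascript
// Model IDs copied from the "Supported models" lines on this page.
// This is a snapshot and needs updating whenever those lists change.
const prefixed = (provider, names) => names.map((name) => `${provider}/${name}`);

const WEB_SEARCH_MODELS = new Set([
  ...prefixed("anthropic", [
    "claude-haiku-4.5", "claude-opus-4.5", "claude-opus-4.6", "claude-opus-4.7",
    "claude-opus-4.8", "claude-sonnet-4.5", "claude-sonnet-4.6",
  ]),
  ...prefixed("openai", [
    "gpt-4.1", "gpt-4.1-mini", "gpt-4o", "gpt-4o-mini", "gpt-5", "gpt-5-mini",
    "gpt-5-nano", "gpt-5.1", "gpt-5.4", "gpt-5.4-mini", "gpt-5.4-nano",
    "gpt-5.4-pro", "gpt-5.5", "gpt-5.5-pro", "o3", "o4-mini",
  ]),
  ...prefixed("xai", ["grok-4.20-multi-agent-0309"]),
  ...prefixed("alibaba", ["qwen3-max", "qwen3.5-397b-a17b"]),
]);

// Hypothetical guard: throws if the model is not in the snapshot above.
function assertWebSearchSupported(model) {
  if (!WEB_SEARCH_MODELS.has(model)) {
    throw new Error(`Model "${model}" is not listed as supporting web search through AI Gateway`);
  }
  return model;
}
```

Failing early this way avoids a round trip that would end in an upstream error such as invalid_request_error.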
Web search requests are billed at the upstream provider's web-search rates and flow through Unified Billing along with the rest of the model call. AI Gateway does not charge a separate web-search fee.
Web search tool calls and their results are visible in AI Gateway logs alongside the rest of the request and response.
- REST API: the four endpoints these examples target
- Workers Bindings: env.AI.run reference
- Anthropic provider
- OpenAI provider
- Grok (xAI) provider
- Perplexity provider
- Parallel provider
- Unified Billing