---
title: AI Changelog
image: https://developers.cloudflare.com/cf-twitter-card.png
---

> Documentation Index  
> Fetch the complete documentation index at: https://developers.cloudflare.com/changelog/llms.txt  
> Use this file to discover all available pages before exploring further. 

# Changelog

New updates and improvements at Cloudflare.

[ Subscribe to RSS ](https://developers.cloudflare.com/changelog/rss/index.xml) [ View RSS feeds ](https://developers.cloudflare.com/fundamentals/new-features/available-rss-feeds/) 

AI

Jun 19, 2025
1. ### [View custom metadata in responses and guide AI-search with context in AutoRAG](https://developers.cloudflare.com/changelog/post/2025-06-19-autorag-custom-metadata-and-context/)  
[ AI Search ](https://developers.cloudflare.com/ai-search/)  
In [AutoRAG](https://developers.cloudflare.com/ai-search/), you can now view your object's custom metadata in the response from [/search](https://developers.cloudflare.com/ai-search/api/search/workers-binding/) and [/ai-search](https://developers.cloudflare.com/ai-search/api/search/workers-binding/), and optionally add a `context` field in the custom metadata of an object to provide additional guidance for AI-generated answers.  
You can add [custom metadata](https://developers.cloudflare.com/r2/api/workers/workers-api-reference/#r2putoptions) to an object when uploading it to your R2 bucket.  
#### Object's custom metadata in search responses  
When you run a search, AutoRAG now returns any custom metadata associated with the object. This metadata appears in the response under `attributes.file` and can be used for downstream processing.  
For example, the `attributes` section of your search response may look like:  
```  
{
  "attributes": {
    "timestamp": 1750001460000,
    "folder": "docs/",
    "filename": "launch-checklist.md",
    "file": {
      "url": "https://wiki.company.com/docs/launch-checklist",
      "context": "A checklist for internal launch readiness, including legal, engineering, and marketing steps."
    }
  }
}
```  
#### Add a `context` field to guide LLM answers  
When you include a custom metadata field named `context`, AutoRAG attaches that value to each chunk of the file. When you run an `/ai-search` query, this `context` is passed to the LLM and can be used as additional input when generating an answer.  
We recommend using the `context` field to describe supplemental information you want the LLM to consider, such as a summary of the document or a source URL. If you have several different metadata attributes, you can join them together however you choose within the `context` string.  
For example:  
```  
{
  "context": "summary: 'Checklist for internal product launch readiness, including legal, engineering, and marketing steps.'; url: 'https://wiki.company.com/docs/launch-checklist'"
}
```  
This gives you more control over how your content is interpreted, without requiring you to modify the original contents of the file.  
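As a rough sketch of the approach above, the helper below joins several metadata attributes into a single `context` string and attaches it as custom metadata on upload. The names `buildContext`, `uploadWithContext`, and `BucketLike` are hypothetical; `BucketLike` models only the R2 `put` call with its `customMetadata` option, and the `key: 'value'; ...` layout is just one possible convention.  

```typescript
// Join several metadata attributes into one `context` string, using the
// "key: 'value'; key: 'value'" layout from the example above.
function buildContext(fields: Record<string, string>): string {
  return Object.keys(fields)
    .map((key) => `${key}: '${fields[key]}'`)
    .join("; ");
}

// Assumption: a minimal view of the R2 bucket binding, covering only `put`
// with the `customMetadata` option.
interface BucketLike {
  put(
    key: string,
    value: string,
    options: { customMetadata: Record<string, string> },
  ): Promise<unknown>;
}

// Upload an object and store the combined attributes under `context`.
function uploadWithContext(
  bucket: BucketLike,
  key: string,
  body: string,
  fields: Record<string, string>,
): Promise<unknown> {
  return bucket.put(key, body, {
    customMetadata: { context: buildContext(fields) },
  });
}
```

Inside a Worker, this could be called as `await uploadWithContext(env.MY_BUCKET, "docs/launch-checklist.md", text, { summary: "...", url: "..." })`, where `MY_BUCKET` is a placeholder binding name.  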
Learn more in AutoRAG's [metadata filtering documentation](https://developers.cloudflare.com/ai-search/configuration/indexing/metadata/).

Jun 19, 2025
1. ### [Filter your AutoRAG search by file name](https://developers.cloudflare.com/changelog/post/2025-06-19-autorag-filename-filter/)  
[ AI Search ](https://developers.cloudflare.com/ai-search/)  
In [AutoRAG](https://developers.cloudflare.com/ai-search/), you can now [filter](https://developers.cloudflare.com/ai-search/configuration/indexing/metadata/) by an object's file name using the `filename` attribute, giving you more control over which files are searched for a given query.  
This is useful when your application has already determined which files should be searched. For example, you might query a PostgreSQL database to get a list of files a user has access to based on their permissions, and then use that list to limit what AutoRAG retrieves.  
For example, your search query may look like:  
JavaScript  
```  
const response = await env.AI.autorag("my-autorag").search({
  query: "what is the project deadline?",
  filters: {
    type: "eq",
    key: "filename",
    value: "project-alpha-roadmap.md",
  },
});
```  
This allows you to connect your application logic with AutoRAG's retrieval process, making it easy to control what gets searched without needing to reindex or modify your data.  
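A minimal sketch of that flow, assuming your application already holds the list of files a user may read (for example, from a database lookup): the hypothetical helper `filterForAllowedFile` only produces a `filename` filter when the requested file is on that list. The filter shape mirrors the query above; everything else is illustrative.  

```typescript
// Filter shape taken from the search example above.
type FilenameFilter = { type: "eq"; key: "filename"; value: string };

// Return a filter for `requested` only if the caller is allowed to read it.
// Returns null otherwise, so the caller can skip the search entirely.
function filterForAllowedFile(
  allowedFiles: readonly string[],
  requested: string,
): FilenameFilter | null {
  if (allowedFiles.indexOf(requested) === -1) {
    return null;
  }
  return { type: "eq", key: "filename", value: requested };
}
```

The returned object can be passed as `filters` to `search()`. A `null` result is a signal to reject the request before any retrieval happens.  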
Learn more in AutoRAG's [metadata filtering documentation](https://developers.cloudflare.com/ai-search/configuration/indexing/metadata/).

Jun 03, 2025
1. ### [AI Gateway adds OpenAI compatible endpoint](https://developers.cloudflare.com/changelog/post/2025-06-03-aig-openai-compatible-endpoint/)  
[ AI Gateway ](https://developers.cloudflare.com/ai-gateway/)  
Users can now use an [OpenAI Compatible endpoint](https://developers.cloudflare.com/ai-gateway/usage/chat-completion/) in AI Gateway to easily switch between providers, while keeping the exact same request and response formats. We're launching now with the chat completions endpoint, with the embeddings endpoint coming up next.  
To get started, use the OpenAI compatible chat completions endpoint URL with your own account id and gateway id and switch between providers by changing the `model` and `apiKey` parameters.  
OpenAI SDK Example  
```  
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_PROVIDER_API_KEY", // Provider API key
  baseURL:
    "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/compat",
});

const response = await client.chat.completions.create({
  model: "google-ai-studio/gemini-2.0-flash",
  messages: [{ role: "user", content: "What is Cloudflare?" }],
});

console.log(response.choices[0].message.content);
```  
Additionally, the [OpenAI Compatible endpoint](https://developers.cloudflare.com/ai-gateway/usage/chat-completion/) can be combined with our [Universal Endpoint](https://developers.cloudflare.com/ai-gateway/usage/universal/) to add fallbacks across multiple providers. That means AI Gateway will return every response in the same standardized format, no extra parsing logic required!  
Learn more in the [OpenAI Compatibility](https://developers.cloudflare.com/ai-gateway/usage/chat-completion/) documentation.
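To illustrate how switching providers comes down to changing `model` and `apiKey`, here is a small sketch that builds the same request without the SDK. The helper names `compatBaseUrl` and `chatRequest` and the `ProviderChoice` type are hypothetical; the `/chat/completions` path is the one the OpenAI SDK appends to `baseURL`.  

```typescript
// Build the OpenAI compatible base URL for a given account and gateway.
function compatBaseUrl(accountId: string, gatewayId: string): string {
  return (
    "https://gateway.ai.cloudflare.com/v1/" +
    encodeURIComponent(accountId) +
    "/" +
    encodeURIComponent(gatewayId) +
    "/compat"
  );
}

// The only two values that change when you switch providers.
interface ProviderChoice {
  model: string; // for example "google-ai-studio/gemini-2.0-flash"
  apiKey: string; // the API key for that provider
}

// Describe a chat completions request; pass `url` and `init` to fetch().
function chatRequest(base: string, choice: ProviderChoice, prompt: string) {
  return {
    url: base + "/chat/completions",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: "Bearer " + choice.apiKey,
      },
      body: JSON.stringify({
        model: choice.model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}
```

Because the request body keeps one shape, moving to another provider means passing a different `ProviderChoice` and nothing else.  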

May 28, 2025
1. ### [Playwright MCP server is now compatible with Browser Rendering](https://developers.cloudflare.com/changelog/post/2025-05-28-playwright-mcp/)  
[ Browser Run ](https://developers.cloudflare.com/browser-run/)  
We're excited to share that you can now use the [Playwright MCP ↗](https://github.com/cloudflare/playwright-mcp) server with Browser Rendering.  
Once you [deploy the server](https://developers.cloudflare.com/browser-run/playwright/playwright-mcp/#deploying), you can use any MCP client with it to interact with Browser Rendering. This allows you to run AI models that can automate browser tasks, such as taking screenshots, filling out forms, or scraping data.  
![Access Analytics](https://developers.cloudflare.com/_astro/playground-ai-screenshot.v44jFMBu_Z1xgc6e.webp)  
Playwright MCP is available as an npm package at [@cloudflare/playwright-mcp ↗](https://www.npmjs.com/package/@cloudflare/playwright-mcp). To install it, type:  
npm  
```  
npm i -D @cloudflare/playwright-mcp  
```  
yarn  
```  
yarn add -D @cloudflare/playwright-mcp  
```  
pnpm  
```  
pnpm add -D @cloudflare/playwright-mcp  
```  
bun  
```  
bun add -d @cloudflare/playwright-mcp  
```  
Deploying the server is then as easy as:  
TypeScript  
```  
import { env } from "cloudflare:workers";
import { createMcpAgent } from "@cloudflare/playwright-mcp";

export const PlaywrightMCP = createMcpAgent(env.BROWSER);
export default PlaywrightMCP.mount("/sse");
```  
Check out the full code at [GitHub ↗](https://github.com/cloudflare/playwright-mcp).  
Learn more about Playwright MCP in our [documentation](https://developers.cloudflare.com/browser-run/playwright/playwright-mcp/).

Apr 23, 2025
1. ### [Metadata filtering and multitenancy support in AutoRAG](https://developers.cloudflare.com/changelog/post/2025-04-23-autorag-metadata-filtering/)  
[ AI Search ](https://developers.cloudflare.com/ai-search/)  
You can now filter [AutoRAG](https://developers.cloudflare.com/ai-search/) search results by `folder` and `timestamp` using [metadata filtering](https://developers.cloudflare.com/ai-search/configuration/indexing/metadata/) to narrow down the scope of your query.  
This makes it easy to build [multitenant experiences](https://developers.cloudflare.com/ai-search/how-to/per-tenant-search/) where each user can only access their own data. By organizing your content into per-tenant folders and applying a `folder` filter at query time, you ensure that each tenant retrieves only their own documents.

**Example folder structure:**  
Terminal window  
```  
customer-a/logs/
customer-a/contracts/
customer-b/contracts/
```

**Example query:**  
JavaScript  
```  
const response = await env.AI.autorag("my-autorag").search({
  query: "When did I sign my agreement contract?",
  filters: {
    type: "eq",
    key: "folder",
    value: "customer-a/contracts/",
  },
});
```  
You can use metadata filtering by creating a new AutoRAG or reindexing existing data. To reindex all content in an existing AutoRAG, update any chunking setting and select **Sync index**. Metadata filtering is available for all data indexed on or after **April 21, 2025**.  
If you are new to AutoRAG, start with the [AutoRAG get started guide](https://developers.cloudflare.com/ai-search/get-started/).
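
For a multitenant setup, the `folder` filter is best derived from the authenticated tenant, not from user input. The sketch below is one way to do that; `tenantFolderFilter` is a hypothetical helper, and the `<tenant>/<subfolder>/` layout follows the example folder structure above.  

```typescript
// Filter shape taken from the example query above.
type FolderFilter = { type: "eq"; key: "folder"; value: string };

// Build a folder filter scoped to one tenant. Both parts must be plain path
// segments, so a value like "../customer-b" cannot escape the tenant prefix.
function tenantFolderFilter(tenantId: string, subfolder: string): FolderFilter {
  const plainSegment = /^[A-Za-z0-9_-]+$/;
  if (!plainSegment.test(tenantId) || !plainSegment.test(subfolder)) {
    throw new Error("tenantId and subfolder must be plain path segments");
  }
  return { type: "eq", key: "folder", value: tenantId + "/" + subfolder + "/" };
}
```

Calling `tenantFolderFilter("customer-a", "contracts")` yields the same filter as the example query, while malformed identifiers are rejected before any search runs.  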

Apr 11, 2025
1. ### [Workers AI for Developer Week - faster inference, new models, async batch API, expanded LoRA support](https://developers.cloudflare.com/changelog/post/2025-04-11-new-models-faster-inference/)  
[ Workers AI ](https://developers.cloudflare.com/workers-ai/)  
Happy Developer Week 2025! Workers AI is excited to announce a couple of new features and improvements available today. Check out our [blog ↗](https://blog.cloudflare.com/workers-ai-improvements) for all the announcement details.  
#### Faster inference + New models  
We’re rolling out some in-place improvements to our models that can help speed up inference by 2-4x! Users of the models below will enjoy an automatic speed boost starting today:

  * [@cf/meta/llama-3.3-70b-instruct-fp8-fast](https://developers.cloudflare.com/workers-ai/models/llama-3.3-70b-instruct-fp8-fast/) gets a speed boost of 2-4x, leveraging techniques like speculative decoding, prefix caching, and an updated inference backend.
  * [@cf/baai/bge-small-en-v1.5](https://developers.cloudflare.com/workers-ai/models/bge-small-en-v1.5/), [@cf/baai/bge-base-en-v1.5](https://developers.cloudflare.com/workers-ai/models/bge-base-en-v1.5/), [@cf/baai/bge-large-en-v1.5](https://developers.cloudflare.com/workers-ai/models/bge-large-en-v1.5/) get an updated backend, which should make inference about 2x faster.  
    * With the `bge` models, we’re also announcing a new parameter called `pooling`, which accepts `cls` or `mean`. We highly recommend using `pooling: cls`, which helps generate more accurate embeddings. However, embeddings generated with cls pooling are not backwards compatible with embeddings generated with mean pooling. To avoid a breaking change, the default remains mean pooling. Specify `pooling: cls` to get more accurate embeddings going forward.  
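
A minimal sketch of how the `pooling` option might be set, assuming the embedding input takes a `text` array alongside `pooling`. The helpers `embeddingInput` and `assertSamePooling` are hypothetical; the second one simply encodes the rule that vectors from the two pooling modes should not be compared with each other.  

```typescript
type Pooling = "cls" | "mean";

// Assumption: the input takes `text` plus the optional `pooling` parameter.
interface EmbeddingInput {
  text: string[];
  pooling?: Pooling;
}

// Build the input object. Leaving `pooling` out keeps the default (mean).
function embeddingInput(texts: string[], pooling?: Pooling): EmbeddingInput {
  const input: EmbeddingInput = { text: texts };
  if (pooling !== undefined) {
    input.pooling = pooling;
  }
  return input;
}

// Guard against mixing vectors produced with different pooling modes.
function assertSamePooling(stored: Pooling, query: Pooling): void {
  if (stored !== query) {
    throw new Error(
      "Index was built with " + stored + " pooling but the query uses " + query,
    );
  }
}

// Usage inside a Worker (not executed here):
//   await env.AI.run("@cf/baai/bge-base-en-v1.5", embeddingInput(["hello"], "cls"));
```

If an existing index was built with mean pooling, re-embedding its documents before switching queries to `cls` keeps both sides comparable.  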
We’re also excited to launch a few new models in our catalog to help round out your experience with Workers AI. We’ll be deprecating some older models in the future, so stay tuned for a deprecation announcement. Today’s new models include:

  * [@cf/mistralai/mistral-small-3.1-24b-instruct](https://developers.cloudflare.com/workers-ai/models/mistral-small-3.1-24b-instruct/): a 24B parameter model achieving state-of-the-art capabilities comparable to larger models, with support for vision and tool calling.
  * [@cf/google/gemma-3-12b-it](https://developers.cloudflare.com/workers-ai/models/gemma-3-12b-it/): well-suited for a variety of text generation and image understanding tasks, including question answering, summarization and reasoning, with a 128K context window, and multilingual support in over 140 languages.
  * [@cf/qwen/qwq-32b](https://developers.cloudflare.com/workers-ai/models/qwq-32b/): a medium-sized reasoning model, which is capable of achieving competitive performance against state-of-the-art reasoning models, e.g., DeepSeek-R1, o1-mini.
  * [@cf/qwen/qwen2.5-coder-32b-instruct](https://developers.cloudflare.com/workers-ai/models/qwen2.5-coder-32b-instruct/): the current state-of-the-art open-source code LLM, with its coding abilities matching those of GPT-4o.  
#### Batch Inference  
Introducing a new batch inference feature that allows you to send us an array of requests, which we fulfill as fast as possible and return as an array. This is really helpful for large workloads such as summarization and embeddings, where you don’t have a human in the loop. Using the batch API guarantees that your requests are fulfilled eventually, rather than erroring out if we don’t have enough capacity at a given time.  
Check out the [tutorial](https://developers.cloudflare.com/workers-ai/features/batch-api/) to get started! Models that support batch inference today include:

  * [@cf/meta/llama-3.3-70b-instruct-fp8-fast](https://developers.cloudflare.com/workers-ai/models/llama-3.3-70b-instruct-fp8-fast/)
  * [@cf/baai/bge-small-en-v1.5](https://developers.cloudflare.com/workers-ai/models/bge-small-en-v1.5/)
  * [@cf/baai/bge-base-en-v1.5](https://developers.cloudflare.com/workers-ai/models/bge-base-en-v1.5/)
  * [@cf/baai/bge-large-en-v1.5](https://developers.cloudflare.com/workers-ai/models/bge-large-en-v1.5/)
  * [@cf/baai/bge-m3](https://developers.cloudflare.com/workers-ai/models/bge-m3/)
  * [@cf/meta/m2m100-1.2b](https://developers.cloudflare.com/workers-ai/models/m2m100-1.2b/)  
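
One way to prepare a large workload for the batch API is to split it into arrays of requests before submitting them. The sketch below only does the splitting; `toBatches` and the batch size are hypothetical, and the exact request payload should be taken from the batch API tutorial, not from this example.  

```typescript
// Split a list of inputs into consecutive arrays of at most `size` items.
function toBatches<T>(items: T[], size: number): T[][] {
  if (size < 1 || Math.floor(size) !== size) {
    throw new Error("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Each inner array can then be submitted as one batch, and the results matched back to the inputs by position.  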
#### Expanded LoRA support  
We’ve upgraded our LoRA experience to include 8 newer models, and can support ranks of up to 32 with a 300MB safetensors file limit (previously limited to a rank of 8 and 100MB safetensors). Check out our [LoRAs page](https://developers.cloudflare.com/workers-ai/features/fine-tunes/loras/) to get started. Models that support LoRAs now include:

  * [@cf/meta/llama-3.2-11b-vision-instruct](https://developers.cloudflare.com/workers-ai/models/llama-3.2-11b-vision-instruct/)
  * [@cf/meta/llama-3.3-70b-instruct-fp8-fast](https://developers.cloudflare.com/workers-ai/models/llama-3.3-70b-instruct-fp8-fast/)
  * [@cf/meta/llama-guard-3-8b](https://developers.cloudflare.com/workers-ai/models/llama-guard-3-8b/)
  * [@cf/meta/llama-3.1-8b-instruct-fast](https://developers.cloudflare.com/workers-ai/models/llama-3.1-8b-instruct-fast/) (coming soon)
  * [@cf/deepseek-ai/deepseek-r1-distill-qwen-32b](https://developers.cloudflare.com/workers-ai/models/deepseek-r1-distill-qwen-32b/) (coming soon)
  * [@cf/qwen/qwen2.5-coder-32b-instruct](https://developers.cloudflare.com/workers-ai/models/qwen2.5-coder-32b-instruct/)
  * [@cf/qwen/qwq-32b](https://developers.cloudflare.com/workers-ai/models/qwq-32b/)
  * [@cf/mistralai/mistral-small-3.1-24b-instruct](https://developers.cloudflare.com/workers-ai/models/mistral-small-3.1-24b-instruct/)
  * [@cf/google/gemma-3-12b-it](https://developers.cloudflare.com/workers-ai/models/gemma-3-12b-it/)
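
Before uploading an adapter, the limits above can be checked locally. This is a sketch with hypothetical names (`checkLoraLimits` and both constants), and it assumes "300MB" means 300,000,000 bytes; the LoRAs page is the authority on how the limit is measured.  

```typescript
const MAX_LORA_RANK = 32;
// Assumption: "300MB" is read as decimal megabytes.
const MAX_SAFETENSORS_BYTES = 300 * 1000 * 1000;

// Return a list of problems; an empty list means the adapter fits the limits.
function checkLoraLimits(rank: number, safetensorsBytes: number): string[] {
  const problems: string[] = [];
  if (rank > MAX_LORA_RANK) {
    problems.push("rank " + rank + " exceeds the maximum of " + MAX_LORA_RANK);
  }
  if (safetensorsBytes > MAX_SAFETENSORS_BYTES) {
    problems.push("safetensors file is larger than 300MB");
  }
  return problems;
}
```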

Apr 07, 2025
1. ### [Build MCP servers with the Agents SDK](https://developers.cloudflare.com/changelog/post/2025-04-07-mcp-servers-agents-sdk-updates/)  
[ Agents ](https://developers.cloudflare.com/agents/)[ Workers ](https://developers.cloudflare.com/workers/)  
The Agents SDK now includes built-in support for building remote MCP (Model Context Protocol) servers directly as part of your Agent. This allows you to easily create and manage MCP servers, without the need for additional infrastructure or configuration.  
The SDK includes a new `McpAgent` class that extends the `Agent` class and allows you to expose resources and tools over MCP, along with authorization and authentication support for remote MCP servers.

JavaScript  
```  
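import { McpAgent } from "agents/mcp";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

export class MyMCP extends McpAgent {
  server = new McpServer({
    name: "Demo",
    version: "1.0.0",
  });

  async init() {
    // Expose a resource at a custom MCP URI.
    this.server.resource(`counter`, `mcp://resource/counter`, (uri) => {
      // ...
    });

    // Expose a tool with a zod schema for its arguments.
    this.server.tool(
      "add",
      "Add two numbers together",
      { a: z.number(), b: z.number() },
      async ({ a, b }) => {
        // ...
      },
    );
  }
}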
```  
TypeScript  
```  
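import { McpAgent } from "agents/mcp";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

export class MyMCP extends McpAgent<Env> {
  server = new McpServer({
    name: "Demo",
    version: "1.0.0",
  });

  async init() {
    // Expose a resource at a custom MCP URI.
    this.server.resource(`counter`, `mcp://resource/counter`, (uri) => {
      // ...
    });

    // Expose a tool with a zod schema for its arguments.
    this.server.tool(
      "add",
      "Add two numbers together",
      { a: z.number(), b: z.number() },
      async ({ a, b }) => {
        // ...
      },
    );
  }
}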
```  
See [the example ↗](https://github.com/cloudflare/agents/tree/main/examples/mcp) for the full code and as the basis for building your own MCP servers, and the [client example ↗](https://github.com/cloudflare/agents/tree/main/examples/mcp-client) for how to build an Agent that acts as an MCP client.  
To learn more, review the [announcement blog ↗](https://blog.cloudflare.com/building-ai-agents-with-mcp-authn-authz-and-durable-objects) as part of Developer Week 2025.  
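The tool handler bodies in the snippets above are elided. As a sketch of what the `add` handler could return, MCP tool results carry a `content` array, here with a single text item. The names `addTool`, `ToolTextItem`, and `ToolResult` are hypothetical and written without the SDK so the snippet stands alone.  

```typescript
// Minimal local types for an MCP tool result containing text.
interface ToolTextItem {
  type: "text";
  text: string;
}

interface ToolResult {
  content: ToolTextItem[];
}

// Compute the sum and wrap it in the result shape a tool handler returns.
function addTool(args: { a: number; b: number }): ToolResult {
  return { content: [{ type: "text", text: String(args.a + args.b) }] };
}
```

Inside `init()`, this could be wired in as `async ({ a, b }) => addTool({ a, b })`.  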
#### Agents SDK updates  
We've made a number of improvements to the [Agents SDK](https://developers.cloudflare.com/agents/), including:

  * Support for building MCP servers with the new `McpAgent` class.
  * The ability to export the current agent, request and WebSocket connection context using `import { context } from "agents"`, allowing you to minimize or avoid direct dependency injection when calling tools.
  * Fixed a bug that prevented query parameters from being sent to the Agent server from the `useAgent` React hook.
  * Automatically converting the `agent` name in `useAgent` or `useAgentChat` to kebab-case to ensure it matches the naming convention expected by [routeAgentRequest](https://developers.cloudflare.com/agents/runtime/communication/routing/).  
To install or update the Agents SDK, run `npm i agents@latest` in an existing project, or explore the `agents-starter` project:  
Terminal window  
```  
npm create cloudflare@latest -- --template cloudflare/agents-starter  
```  
See the full release notes and changelog [on the Agents SDK repository ↗](https://github.com/cloudflare/agents/blob/main/packages/agents/CHANGELOG.md).
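
The kebab-case conversion mentioned in the updates list can be pictured with a small helper. This is an illustrative sketch, not the SDK's own implementation, and `toKebabCase` is a hypothetical name.  

```typescript
// Convert names such as "MyAgent" or "chat_agent" to kebab-case.
function toKebabCase(name: string): string {
  return name
    .replace(/([a-z0-9])([A-Z])/g, "$1-$2") // split lower-to-upper boundaries
    .replace(/[\s_]+/g, "-") // spaces and underscores become hyphens
    .toLowerCase();
}
```

With a conversion like this, a class-style name passed to `useAgent` and a kebab-case route refer to the same agent.  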

Apr 07, 2025
1. ### [Create fully-managed RAG pipelines for your AI applications with AutoRAG](https://developers.cloudflare.com/changelog/post/2025-04-07-autorag-open-beta/)  
[ AI Search ](https://developers.cloudflare.com/ai-search/)[ Vectorize ](https://developers.cloudflare.com/vectorize/)  
[AutoRAG](https://developers.cloudflare.com/ai-search/) is now in open beta, making it easy for you to build fully-managed retrieval-augmented generation (RAG) pipelines without managing infrastructure. Just upload your docs to [R2](https://developers.cloudflare.com/r2/get-started/), and AutoRAG handles the rest: embeddings, indexing, retrieval, and response generation via API.  
With AutoRAG, you can:

  * **Customize your pipeline:** Choose from [Workers AI](https://developers.cloudflare.com/workers-ai) models, configure chunking strategies, edit system prompts, and more.
  * **Instant setup:** AutoRAG provisions everything you need, from [Vectorize](https://developers.cloudflare.com/vectorize) and [AI Gateway](https://developers.cloudflare.com/ai-gateway) to the pipeline logic, so you can go from zero to a working RAG pipeline in seconds.
  * **Keep your index fresh:** AutoRAG continuously syncs your index with your data source to ensure responses stay accurate and up to date.
  * **Ask questions:** Query your data and receive grounded responses via a [Workers binding](https://developers.cloudflare.com/ai-search/api/search/workers-binding/) or [API](https://developers.cloudflare.com/ai-search/api/search/rest-api/).  
Whether you're building internal tools, AI-powered search, or a support assistant, AutoRAG gets you from idea to deployment in minutes.  
Get started in the [Cloudflare dashboard ↗](https://dash.cloudflare.com/?to=/:account/ai/autorag) or check out the [guide](https://developers.cloudflare.com/ai-search/get-started/) for instructions on how to build your RAG pipeline today.
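
As a sketch of asking a question through the Workers binding, the snippet below assumes the binding exposes an `aiSearch({ query })` method whose result has a `response` string. `AiLike`, `AutoRagLike`, and `ask` are hypothetical names that model only that much, so the example stays self-contained.  

```typescript
// Assumption: the parts of the AutoRAG binding this sketch relies on.
interface AutoRagLike {
  aiSearch(params: { query: string }): Promise<{ response: string }>;
}

interface AiLike {
  autorag(name: string): AutoRagLike;
}

// Ask a question against a named AutoRAG instance and return the answer text.
function ask(ai: AiLike, ragName: string, question: string): Promise<string> {
  return ai
    .autorag(ragName)
    .aiSearch({ query: question })
    .then((result) => result.response);
}
```

In a Worker, this might be called as `await ask(env.AI, "my-autorag", "What is the launch date?")`.  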

Apr 07, 2025
1. ### [Browser Rendering REST API is Generally Available, with new endpoints and a free tier](https://developers.cloudflare.com/changelog/post/2025-04-07-br-free-ga-playwright/)  
[ Browser Run ](https://developers.cloudflare.com/browser-run/)  
We’re excited to announce Browser Rendering is now available on the [Workers Free plan ↗](https://www.cloudflare.com/plans/developer-platform/), making it even easier to prototype and experiment with web search and headless browser use-cases when building applications on Workers.  
The Browser Rendering **[REST API](https://developers.cloudflare.com/browser-run/quick-actions/) is now Generally Available**, allowing you to control browser instances from outside of Workers applications. We've added three new endpoints to help automate more browser tasks:

  * **Extract structured data** – Use `/json` to retrieve structured data from a webpage.
  * **Retrieve links** – Use `/links` to pull all links from a webpage.
  * **Convert to Markdown** – Use `/markdown` to convert webpage content into Markdown format.  
For example, to fetch the Markdown representation of a webpage:  
Markdown example  
```  
curl -X 'POST' 'https://api.cloudflare.com/client/v4/accounts/<accountId>/browser-rendering/markdown' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <apiToken>' \
  -d '{
    "url": "https://example.com"
  }'
```  
For the full list of endpoints, check out our [REST API documentation](https://developers.cloudflare.com/browser-run/quick-actions/). You can also interact with Browser Rendering via the [Cloudflare TypeScript SDK ↗](https://github.com/cloudflare/cloudflare-typescript).  
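The same `/markdown` call can be made from TypeScript with `fetch`. This is a minimal sketch that mirrors the curl command above; `markdownRequest` is a hypothetical helper that only assembles the request, so the account ID and API token stay in your own configuration.  

```typescript
// Describe the POST request for the /markdown endpoint.
function markdownRequest(accountId: string, apiToken: string, pageUrl: string) {
  return {
    url:
      "https://api.cloudflare.com/client/v4/accounts/" +
      encodeURIComponent(accountId) +
      "/browser-rendering/markdown",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: "Bearer " + apiToken,
      },
      body: JSON.stringify({ url: pageUrl }),
    },
  };
}

// Usage (not executed here):
//   const { url, init } = markdownRequest(accountId, apiToken, "https://example.com");
//   const res = await fetch(url, init);
```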
We also recently landed support for [Playwright](https://developers.cloudflare.com/browser-run/playwright/) in Browser Rendering for browser automation from Cloudflare [Workers](https://developers.cloudflare.com/workers/), in addition to [Puppeteer](https://developers.cloudflare.com/browser-run/puppeteer/), giving you more flexibility to test across different browser environments.  
Visit the [Browser Rendering docs](https://developers.cloudflare.com/browser-run/) to learn more about how to use headless browsers in your applications.

Apr 04, 2025
1. ### [Playwright for Browser Rendering now available](https://developers.cloudflare.com/changelog/post/2025-04-04-playwright-beta/)  
[ Browser Run ](https://developers.cloudflare.com/browser-run/)  
We're excited to share that you can now use Playwright's browser automation [capabilities ↗](https://playwright.dev/docs/api/class-playwright) from Cloudflare [Workers](https://developers.cloudflare.com/workers/).  
[Playwright ↗](https://playwright.dev/) is an open-source package developed by Microsoft that can do browser automation tasks; it's commonly used to write software tests, debug applications, create screenshots, and crawl pages. Like [Puppeteer](https://developers.cloudflare.com/browser-run/puppeteer/), we [forked ↗](https://github.com/cloudflare/playwright) Playwright and modified it to be compatible with Cloudflare Workers and [Browser Rendering](https://developers.cloudflare.com/browser-run/).  
Below is an example of how to use Playwright with Browser Rendering to test a TODO application using assertions:  
Assertion example  
```  
import { launch, type BrowserWorker } from "@cloudflare/playwright";
import { expect } from "@cloudflare/playwright/test";

interface Env {
  MYBROWSER: BrowserWorker;
}

export default {
  async fetch(request: Request, env: Env) {
    const browser = await launch(env.MYBROWSER);
    const page = await browser.newPage();

    await page.goto("https://demo.playwright.dev/todomvc");

    const TODO_ITEMS = [
      "buy some cheese",
      "feed the cat",
      "book a doctors appointment",
    ];

    // Add each item through the input field.
    const newTodo = page.getByPlaceholder("What needs to be done?");
    for (const item of TODO_ITEMS) {
      await newTodo.fill(item);
      await newTodo.press("Enter");
    }

    // Assert that every item was added, in order.
    await expect(page.getByTestId("todo-title")).toHaveCount(TODO_ITEMS.length);

    await Promise.all(
      TODO_ITEMS.map((value, index) =>
        expect(page.getByTestId("todo-title").nth(index)).toHaveText(value),
      ),
    );

    // Close the browser and return a response so the handler completes.
    await browser.close();
    return new Response("All assertions passed");
  },
};
```  
Playwright is available as an npm package at [@cloudflare/playwright ↗](https://www.npmjs.com/package/@cloudflare/playwright) and the code is at [GitHub ↗](https://github.com/cloudflare/playwright).  
Learn more in our [documentation](https://developers.cloudflare.com/browser-run/playwright/).

Mar 21, 2025
1. ### [AI Gateway launches Realtime WebSockets API](https://developers.cloudflare.com/changelog/post/2025-03-20-websockets/)  
[ AI Gateway ](https://developers.cloudflare.com/ai-gateway/)  
We are excited to announce that [AI Gateway](https://developers.cloudflare.com/ai-gateway/) now supports real-time AI interactions with the new [Realtime WebSockets API](https://developers.cloudflare.com/ai-gateway/usage/websockets-api/realtime-api/).  
This new capability allows developers to establish persistent, low-latency connections between their applications and AI models, enabling natural, real-time conversational AI experiences, including speech-to-speech interactions.  
The Realtime WebSockets API works with the [OpenAI Realtime API ↗](https://platform.openai.com/docs/guides/realtime#connect-with-websockets), [Google Gemini Live API ↗](https://ai.google.dev/gemini-api/docs/multimodal-live), and supports real-time text and speech interactions with models from [Cartesia ↗](https://docs.cartesia.ai/api-reference/tts/tts), and [ElevenLabs ↗](https://elevenlabs.io/docs/conversational-ai/api-reference/conversational-ai/websocket).  
Here's how you can connect AI Gateway to [OpenAI's Realtime API ↗](https://platform.openai.com/docs/guides/realtime#connect-with-websockets) using WebSockets:  
OpenAI Realtime API example  
```  
import WebSocket from "ws";  
const url =
  "wss://gateway.ai.cloudflare.com/v1/<account_id>/<gateway>/openai?model=gpt-4o-realtime-preview-2024-12-17";
const ws = new WebSocket(url, {
  headers: {
    "cf-aig-authorization": process.env.CLOUDFLARE_API_KEY,
    Authorization: "Bearer " + process.env.OPENAI_API_KEY,
    "OpenAI-Beta": "realtime=v1",
  },
});

ws.on("open", () => {
  console.log("Connected to server.");
  // Send only after the connection is open; calling send() earlier throws.
  ws.send(
    JSON.stringify({
      type: "response.create",
      response: { modalities: ["text"], instructions: "Tell me a joke" },
    }),
  );
});
ws.on("message", (message) => console.log(JSON.parse(message.toString())));
```  
Get started by checking out the [Realtime WebSockets API](https://developers.cloudflare.com/ai-gateway/usage/websockets-api/realtime-api/) documentation.

Mar 20, 2025
1. ### [Markdown conversion in Workers AI](https://developers.cloudflare.com/changelog/post/2025-03-20-markdown-conversion/)  
[ Workers AI ](https://developers.cloudflare.com/workers-ai/)  
Document conversion plays an important role when designing and developing AI applications and agents. Workers AI now provides the `toMarkdown` utility method, which developers can use to quickly convert and summarize documents in multiple formats into Markdown.  
You can call this new tool through a binding with `env.AI.toMarkdown()` or through the [REST API](https://developers.cloudflare.com/api/resources/ai/) endpoint.  
In this example, we fetch a PDF document and an image from R2 and feed them both to `env.AI.toMarkdown()`. The result is a list of converted documents. Workers AI models are used automatically to detect and summarize the image.  
TypeScript  
```  
import { Env } from "./env";  
export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext) {
    // https://pub-979cb28270cc461d94bc8a169d8f389d.r2.dev/somatosensory.pdf
    const pdf = await env.R2.get("somatosensory.pdf");

    // https://pub-979cb28270cc461d94bc8a169d8f389d.r2.dev/cat.jpeg
    const cat = await env.R2.get("cat.jpeg");

    return Response.json(
      await env.AI.toMarkdown([
        {
          name: "somatosensory.pdf",
          blob: new Blob([await pdf.arrayBuffer()], {
            type: "application/octet-stream",
          }),
        },
        {
          name: "cat.jpeg",
          blob: new Blob([await cat.arrayBuffer()], {
            type: "application/octet-stream",
          }),
        },
      ]),
    );
  },
};
```  
This is the result:  
```  
[
  {
    "name": "somatosensory.pdf",
    "mimeType": "application/pdf",
    "format": "markdown",
    "tokens": 0,
    "data": "# somatosensory.pdf\n## Metadata\n- PDFFormatVersion=1.4\n- IsLinearized=false\n- IsAcroFormPresent=false\n- IsXFAPresent=false\n- IsCollectionPresent=false\n- IsSignaturesPresent=false\n- Producer=Prince 20150210 (www.princexml.com)\n- Title=Anatomy of the Somatosensory System\n\n## Contents\n### Page 1\nThis is a sample document to showcase..."
  },
  {
    "name": "cat.jpeg",
    "mimeType": "image/jpeg",
    "format": "markdown",
    "tokens": 0,
    "data": "The image is a close-up photograph of Grumpy Cat, a cat with a distinctive grumpy expression and piercing blue eyes. The cat has a brown face with a white stripe down its nose, and its ears are pointed upright. Its fur is light brown and darker around the face, with a pink nose and mouth. The cat's eyes are blue and slanted downward, giving it a perpetually grumpy appearance. The background is blurred, but it appears to be a dark brown color. Overall, the image is a humorous and iconic representation of the popular internet meme character, Grumpy Cat. The cat's facial expression and posture convey a sense of displeasure or annoyance, making it a relatable and entertaining image for many people."
  }
]
```  
See [Markdown Conversion](https://developers.cloudflare.com/workers-ai/features/markdown-conversion/) for more information on supported formats, REST API and pricing.
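
Since the result is a list with one entry per input document, a common next step is to merge the entries into a single Markdown string, for example before passing it to a model. The sketch below assumes the result fields shown above (`name`, `mimeType`, `format`, `tokens`, `data`); `combineMarkdown` and `ConversionResult` are hypothetical names.  

```typescript
// Shape of one entry in the result shown above.
interface ConversionResult {
  name: string;
  mimeType: string;
  format: string;
  tokens: number;
  data: string;
}

// Merge converted documents into one Markdown string, labelling each source
// with an HTML comment so the origin of every section stays visible.
function combineMarkdown(results: ConversionResult[]): string {
  return results
    .map(
      (r) =>
        "<!-- source: " + r.name + " (" + r.mimeType + ") -->\n" + r.data.trim(),
    )
    .join("\n\n");
}
```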

Mar 18, 2025
1. ### [npm i agents](https://developers.cloudflare.com/changelog/post/2025-03-18-npm-i-agents/)  
[ Agents ](https://developers.cloudflare.com/agents/)[ Workers ](https://developers.cloudflare.com/workers/)  
![npm i agents](https://developers.cloudflare.com/_astro/npm-i-agents.CXCpJ1-7.apng)  
#### `agents-sdk` \-> `agents`  
📝 **We've renamed the Agents package to `agents`**!  
If you've already been building with the Agents SDK, you can update your dependencies to use the new package name, and replace references to `agents-sdk` with `agents`:  
Terminal window  
```  
# Install the new package
npm i agents
```  
Terminal window  
```  
# Remove the old (deprecated) package
npm uninstall agents-sdk

# Find instances of the old package name in your codebase
grep -r 'agents-sdk' .
# Replace instances of the old package name with the new one
# (or use find-replace in your editor)
sed -i 's/agents-sdk/agents/g' $(grep -rl 'agents-sdk' .)
```  
All future updates will be pushed to the new `agents` package, and the older package has been marked as deprecated.  
#### Agents SDK updates  
We've added a number of big new features to the Agents SDK over the past few weeks, including:

  * You can now set `cors: true` when using `routeAgentRequest` to add permissive default CORS headers to Agent responses.
  * The regular client now syncs state on the agent (just like the React version).
  * `useAgentChat` bug fixes for passing headers/credentials, including properly clearing cache on unmount.
  * Experimental `/schedule` module with a prompt/schema for adding scheduling to your app (with evals!).
  * Changed the internal `zod` schema to be compatible with the limitations of Google's Gemini models by removing the discriminated union, allowing you to use Gemini models with the scheduling API.  
We've also fixed a number of bugs with state synchronization and the React hooks.

JavaScript  
```  
// via https://github.com/cloudflare/agents/tree/main/examples/cross-domain
import { routeAgentRequest } from "agents";

export default {
  async fetch(request, env) {
    return (
      // Set { cors: true } to enable CORS headers.
      (await routeAgentRequest(request, env, { cors: true })) ||
      new Response("Not found", { status: 404 })
    );
  },
};
```  
TypeScript  
```  
// via https://github.com/cloudflare/agents/tree/main/examples/cross-domain
import { routeAgentRequest } from "agents";

export default {
  async fetch(request: Request, env: Env) {
    return (
      // Set { cors: true } to enable CORS headers.
      (await routeAgentRequest(request, env, { cors: true })) ||
      new Response("Not found", { status: 404 })
    );
  },
} satisfies ExportedHandler<Env>;
```  
#### Call Agent methods from your client code New  
We've added a new [@unstable\_callable()](https://developers.cloudflare.com/agents/runtime/agents-api/) decorator for defining methods that can be called directly from clients. This allows you to call methods (with arguments) from within your client code and get native JavaScript objects back.

JavaScript  
```  
// server.ts
import { unstable_callable, Agent } from "agents";

export class Rpc extends Agent {
  // Use the decorator to define a callable method
  @unstable_callable({
    description: "rpc test",
  })
  async getHistory() {
    return this.sql`SELECT * FROM history ORDER BY created_at DESC LIMIT 10`;
  }
}
```  
TypeScript  
```  
// server.ts
import { unstable_callable, Agent, type StreamingResponse } from "agents";
import type { Env } from "../server";

export class Rpc extends Agent<Env> {
  // Use the decorator to define a callable method
  @unstable_callable({
    description: "rpc test",
  })
  async getHistory() {
    return this.sql`SELECT * FROM history ORDER BY created_at DESC LIMIT 10`;
  }
}
```  
#### agents-starter Updated  
We've fixed a number of small bugs in the [agents-starter ↗](https://github.com/cloudflare/agents-starter) project — a real-time, chat-based example application with tool-calling & human-in-the-loop built using the Agents SDK. The starter has also been upgraded to use the latest [wrangler v4](https://developers.cloudflare.com/changelog/2025-03-13-wrangler-v4/) release.  
If you're new to Agents, you can install and run the `agents-starter` project in two commands:  
Terminal window  
```  
# Install it
$ npm create cloudflare@latest agents-starter -- --template="cloudflare/agents-starter"
# Run it
$ npm run start
```  
You can use the starter as a template for your own Agents projects: open up `src/server.ts` and `src/client.tsx` to see how the Agents SDK is used.  
#### More documentation Updated  
We've heard your feedback on the Agents SDK documentation, and we're shipping more API reference material and usage examples, including:

  * Expanded [API reference documentation](https://developers.cloudflare.com/agents/runtime/), covering the methods and properties exposed by the Agents SDK, as well as more usage examples.
  * More [Client API](https://developers.cloudflare.com/agents/runtime/agents-api/#client-api) documentation that documents `useAgent`, `useAgentChat` and the new `@unstable_callable` RPC decorator exposed by the SDK.
  * New documentation on how to [route requests to agents](https://developers.cloudflare.com/agents/runtime/communication/routing/) and (optionally) authenticate clients before they connect to your Agents.  
Note that the Agents SDK is continually growing: the type definitions included in the SDK will always include the latest APIs exposed by the `agents` package.  
If you're still wondering what Agents are, [read our blog on building AI Agents on Cloudflare ↗](https://blog.cloudflare.com/build-ai-agents-on-cloudflare/) and/or visit the [Agents documentation](https://developers.cloudflare.com/agents/) to learn more.

Mar 17, 2025
1. ### [New models in Workers AI](https://developers.cloudflare.com/changelog/post/2025-03-17-new-workers-ai-models/)  
[ Workers AI ](https://developers.cloudflare.com/workers-ai/)  
Workers AI is excited to add 4 new models to the catalog, including 2 brand-new classes of models: text-to-speech and reranker. Introducing:

  * [@cf/baai/bge-m3](https://developers.cloudflare.com/workers-ai/models/bge-m3/) \- a multi-lingual embeddings model that supports over 100 languages. It can also simultaneously perform dense retrieval, multi-vector retrieval, and sparse retrieval, with the ability to process inputs of different granularities.
  * [@cf/baai/bge-reranker-base](https://developers.cloudflare.com/workers-ai/models/bge-reranker-base/) \- our first reranker model! Rerankers are a type of text classification model that takes a query and context, and outputs a similarity score between the two. When used in RAG systems, you can use a reranker after the initial vector search to find the most relevant documents to return to a user by reranking the outputs.
  * [@cf/openai/whisper-large-v3-turbo](https://developers.cloudflare.com/workers-ai/models/whisper-large-v3-turbo/) \- a faster, more accurate speech-to-text model. This model was added earlier but is graduating out of beta with pricing included today.
  * [@cf/myshell-ai/melotts](https://developers.cloudflare.com/workers-ai/models/melotts/) \- our first text-to-speech model that allows users to generate an MP3 with voice audio from input text.  
Pricing is available for each of these models on the [Workers AI pricing page](https://developers.cloudflare.com/workers-ai/platform/pricing/).  
This docs update includes a few minor bug fixes to the model schema for llama-guard and llama-3.2-1b, which you can review on the [product changelog](https://developers.cloudflare.com/workers-ai/changelog/).  
Try it out and let us know what you think! Stay tuned for more models in the coming days.
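As a rough illustration of the reranking step described above, the sketch below reorders candidate documents by the scores a reranker returns. The input shape (`query`, `contexts`) and output shape (`response` entries with `id` and `score`) are assumptions about the model schema, and `AiLike` is a hypothetical stand-in for the Workers AI binding; check the model page for the actual schema.

```typescript
// Hypothetical minimal shape of the Workers AI binding; only `run` is used here.
interface AiLike {
  run(model: string, input: unknown): Promise<unknown>;
}

// Assumed output shape: `id` is the index of the document in `contexts`.
interface RerankOutput {
  response: { id: number; score: number }[];
}

// Rerank candidate documents (for example, the results of a vector search)
// and return them ordered from most to least relevant.
async function rerank(
  ai: AiLike,
  query: string,
  docs: string[],
): Promise<{ text: string; score: number }[]> {
  const output = (await ai.run("@cf/baai/bge-reranker-base", {
    query,
    contexts: docs.map((text) => ({ text })),
  })) as RerankOutput;

  return output.response
    .map(({ id, score }) => ({ text: docs[id], score }))
    .sort((a, b) => b.score - a.score);
}
```

In a Worker, `ai` would typically be the AI binding (for example `env.AI`, assuming that binding name), and `docs` the text of the matches returned by your vector search.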

Feb 27, 2025
1. ### [New REST API is in open beta!](https://developers.cloudflare.com/changelog/post/2025-02-27-br-rest-api-beta/)  
[ Browser Run ](https://developers.cloudflare.com/browser-run/)  
We've released a new REST API for [Browser Rendering](https://developers.cloudflare.com/browser-run/) in open beta, making interacting with browsers easier than ever. This new API provides endpoints for common browser actions, with more to be added in the future.  
With the **REST API** you can:

  * **Capture screenshots** – Use `/screenshot` to take a screenshot of a webpage from a provided URL or HTML.
  * **Generate PDFs** – Use `/pdf` to convert web pages into PDFs.
  * **Extract HTML content** – Use `/content` to retrieve the full HTML from a page.
  * **Snapshot (HTML + Screenshot)** – Use `/snapshot` to capture both the page's HTML and a screenshot in one request.
  * **Scrape Web Elements** – Use `/scrape` to extract specific elements from a page.  
For example, to capture a screenshot:  
Screenshot example  
```  
curl -X POST 'https://api.cloudflare.com/client/v4/accounts/<accountId>/browser-rendering/screenshot' \
  -H 'Authorization: Bearer <apiToken>' \
  -H 'Content-Type: application/json' \
  -d '{
    "html": "Hello World!",
    "screenshotOptions": {
      "type": "webp",
      "omitBackground": true
    }
  }' \
  --output "screenshot.webp"
```  
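The same request can be assembled in code. This is a minimal sketch that mirrors the curl example; `buildScreenshotRequest` and `ScreenshotRequest` are hypothetical names, and the function only builds the arguments for `fetch` rather than sending anything.

```typescript
// The pieces needed to call fetch(url, init).
interface ScreenshotRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build the same request as the curl example: render an HTML string
// and ask for a WebP screenshot with the background omitted.
function buildScreenshotRequest(
  accountId: string,
  apiToken: string,
  html: string,
): ScreenshotRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/browser-rendering/screenshot`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        html,
        screenshotOptions: { type: "webp", omitBackground: true },
      }),
    },
  };
}

// Usage (performs a network call, so it is left commented out):
// const { url, init } = buildScreenshotRequest("<accountId>", "<apiToken>", "Hello World!");
// const image = await (await fetch(url, init)).arrayBuffer();
```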
Learn more in our [documentation](https://developers.cloudflare.com/browser-run/quick-actions/).

Feb 26, 2025
1. ### [Introducing Guardrails in AI Gateway](https://developers.cloudflare.com/changelog/post/2025-02-26-guardrails/)  
[ AI Gateway ](https://developers.cloudflare.com/ai-gateway/)  
[AI Gateway](https://developers.cloudflare.com/ai-gateway/) now includes [Guardrails](https://developers.cloudflare.com/ai-gateway/features/guardrails/), to help you monitor your AI apps for harmful or inappropriate content and deploy safely.  
Within the AI Gateway settings, you can configure:

  * **Guardrails**: Enable or disable content moderation as needed.
  * **Evaluation scope**: Select whether to moderate user prompts, model responses, or both.
  * **Hazard categories**: Specify which categories to monitor and determine whether detected inappropriate content should be blocked or flagged.  
![Guardrails in AI Gateway](https://developers.cloudflare.com/_astro/Guardrails.BTNc0qeC_Z1HC20z.webp)  
Learn more in the [blog ↗](https://blog.cloudflare.com/guardrails-in-ai-gateway/) or our [documentation](https://developers.cloudflare.com/ai-gateway/features/guardrails/).

Feb 25, 2025
1. ### [Introducing the Agents SDK](https://developers.cloudflare.com/changelog/post/2025-02-25-agents-sdk/)  
[ Agents ](https://developers.cloudflare.com/agents/)[ Workers ](https://developers.cloudflare.com/workers/)  
We've released the [Agents SDK ↗](http://blog.cloudflare.com/build-ai-agents-on-cloudflare/), a package and set of tools that help you build and ship AI Agents.  
You can get up and running with a [chat-based AI Agent ↗](https://github.com/cloudflare/agents-starter) (and deploy it to Workers) that uses the Agents SDK, tool calling, and state syncing with a React-based front-end by running the following command:  
Terminal window  
```  
npm create cloudflare@latest agents-starter -- --template="cloudflare/agents-starter"
# open up README.md and follow the instructions
```  
You can also add an Agent to any existing Workers application by installing the `agents` package directly:  
Terminal window  
```  
npm i agents  
```  
... and then define your first Agent:  
TypeScript  
```  
import { Agent } from "agents";

export class YourAgent extends Agent<Env> {
  // Build it out
  // Access state on this.state or query the Agent's database via this.sql
  // Handle WebSocket events with onConnect and onMessage
  // Run tasks on a schedule with this.schedule
  // Call AI models
  // ... and/or call other Agents.
}
```  
Head over to the [Agents documentation](https://developers.cloudflare.com/agents/) to learn more about the Agents SDK, the SDK APIs, as well as how to test and deploy agents to production.

Feb 25, 2025
1. ### [Workers AI now supports structured JSON outputs.](https://developers.cloudflare.com/changelog/post/2025-02-25-json-mode/)  
[ Workers AI ](https://developers.cloudflare.com/workers-ai/)  
Workers AI now supports structured JSON outputs with [JSON mode](https://developers.cloudflare.com/workers-ai/features/json-mode/), which allows you to request a structured output response when interacting with AI models.  
This makes it much easier to retrieve structured data from your AI models, and avoids the (error-prone!) need to parse large unstructured text responses to extract your data.  
JSON mode in Workers AI is compatible with the OpenAI SDK's [structured outputs ↗](https://platform.openai.com/docs/guides/structured-outputs) `response_format` API, which can be used directly in a Worker:

JavaScript  
```  
import { OpenAI } from "openai";

// Define your JSON schema for a calendar event
const CalendarEventSchema = {
  type: "object",
  properties: {
    name: { type: "string" },
    date: { type: "string" },
    participants: { type: "array", items: { type: "string" } },
  },
  required: ["name", "date", "participants"],
};

export default {
  async fetch(request, env) {
    const client = new OpenAI({
      apiKey: env.OPENAI_API_KEY,
      // Optional: use AI Gateway to bring logs, evals & caching to your AI requests
      // https://developers.cloudflare.com/ai-gateway/usage/providers/openai/
      // baseURL: "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai"
    });

    const response = await client.chat.completions.create({
      model: "gpt-4o-2024-08-06",
      messages: [
        { role: "system", content: "Extract the event information." },
        {
          role: "user",
          content: "Alice and Bob are going to a science fair on Friday.",
        },
      ],
      // Use the `response_format` option to request a structured JSON output
      response_format: {
        // Set json_schema and provide a schema, or json_object and parse it yourself
        type: "json_schema",
        json_schema: {
          name: "calendar_event",
          schema: CalendarEventSchema, // provide a schema
        },
      },
    });

    // The structured output is returned as a JSON string in `content`
    const event = JSON.parse(response.choices[0].message.content);

    return Response.json({
      calendar_event: event,
    });
  },
};
```  
TypeScript  
```  
import { OpenAI } from "openai";

interface Env {
  OPENAI_API_KEY: string;
}

// Define your JSON schema for a calendar event
const CalendarEventSchema = {
  type: "object",
  properties: {
    name: { type: "string" },
    date: { type: "string" },
    participants: { type: "array", items: { type: "string" } },
  },
  required: ["name", "date", "participants"],
};

export default {
  async fetch(request: Request, env: Env) {
    const client = new OpenAI({
      apiKey: env.OPENAI_API_KEY,
      // Optional: use AI Gateway to bring logs, evals & caching to your AI requests
      // https://developers.cloudflare.com/ai-gateway/usage/providers/openai/
      // baseURL: "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai"
    });

    const response = await client.chat.completions.create({
      model: "gpt-4o-2024-08-06",
      messages: [
        { role: "system", content: "Extract the event information." },
        {
          role: "user",
          content: "Alice and Bob are going to a science fair on Friday.",
        },
      ],
      // Use the `response_format` option to request a structured JSON output
      response_format: {
        // Set json_schema and provide a schema, or json_object and parse it yourself
        type: "json_schema",
        json_schema: {
          name: "calendar_event",
          schema: CalendarEventSchema, // provide a schema
        },
      },
    });

    // The structured output is returned as a JSON string in `content`
    const event = JSON.parse(response.choices[0].message.content ?? "{}");

    return Response.json({
      calendar_event: event,
    });
  },
};
```  
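Even with a schema in place, it can be worth checking the returned JSON before using it. The following is a minimal sketch, assuming the model's answer arrives as a JSON string; `CalendarEvent` and `parseCalendarEvent` are hypothetical names that mirror the fields of `CalendarEventSchema`.

```typescript
interface CalendarEvent {
  name: string;
  date: string;
  participants: string[];
}

// Parse a JSON string and confirm it has the three required fields
// with the expected types. Returns null instead of throwing on bad input.
function parseCalendarEvent(text: string): CalendarEvent | null {
  let value: unknown;
  try {
    value = JSON.parse(text);
  } catch {
    return null; // not valid JSON
  }
  if (typeof value !== "object" || value === null) return null;

  const { name, date, participants } = value as Record<string, unknown>;
  if (
    typeof name === "string" &&
    typeof date === "string" &&
    Array.isArray(participants) &&
    participants.every((p) => typeof p === "string")
  ) {
    return { name, date, participants };
  }
  return null; // a field is missing or has the wrong type
}
```

A Worker could then return an error response when the result is `null` rather than passing malformed data downstream.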
To learn more about JSON mode and structured outputs, visit the [Workers AI documentation](https://developers.cloudflare.com/workers-ai/features/json-mode/).

Feb 24, 2025
1. ### [Workers AI larger context windows](https://developers.cloudflare.com/changelog/post/2025-02-24-context-windows/)  
[ Workers AI ](https://developers.cloudflare.com/workers-ai/)  
We've updated the Workers AI text generation models to include context window and limit definitions, and changed our APIs to estimate and validate the number of tokens in the input prompt rather than the number of characters.  
This update allows developers to use larger context windows when interacting with Workers AI models, which can lead to better and more accurate results.  
Our [catalog page](https://developers.cloudflare.com/workers-ai/models/) provides more information about each model's supported context window.

Feb 20, 2025
1. ### [Workers AI updated pricing](https://developers.cloudflare.com/changelog/post/2025-02-20-updated-pricing-docs/)  
[ Workers AI ](https://developers.cloudflare.com/workers-ai/)  
We've updated the Workers AI [pricing](https://developers.cloudflare.com/workers-ai/platform/pricing/) to include the latest models and how model usage maps to Neurons.

  * Each model's core input format(s) (tokens, audio seconds, images, etc.) now include mappings to Neurons, making it easier to understand how your included Neuron volume is consumed and how you are charged at scale.
  * Per-model pricing, instead of the previous bucket approach, allows us to be more flexible on how models are charged based on their size, performance and capabilities. As we optimize each model, we can then pass on savings for that model.
  * You will still only pay for what you consume: Workers AI inference is serverless, and not billed by the hour.  
Going forward, models will be launched with their associated Neuron costs, and we'll be updating the Workers AI dashboard and API to reflect consumption in both raw units and Neurons. Visit the [Workers AI pricing](https://developers.cloudflare.com/workers-ai/platform/pricing/) page to learn more about Workers AI pricing.
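To make the mapping concrete, here is a small worked sketch of how token usage might translate into Neurons and then into cost. Every number below (the per-token Neuron rates, the included volume, and the price per 1,000 Neurons) is a placeholder assumption for illustration, and the function names are hypothetical; take real values from the pricing page.

```typescript
// All rates are hypothetical placeholders, not actual Workers AI pricing.
interface ModelRates {
  neuronsPerMillionInputTokens: number;
  neuronsPerMillionOutputTokens: number;
}

// Convert token counts into Neurons using a model's per-million-token rates.
function tokensToNeurons(
  rates: ModelRates,
  inputTokens: number,
  outputTokens: number,
): number {
  return (
    (inputTokens / 1_000_000) * rates.neuronsPerMillionInputTokens +
    (outputTokens / 1_000_000) * rates.neuronsPerMillionOutputTokens
  );
}

// Charge only for Neurons beyond the included volume.
function billableUsd(
  neurons: number,
  includedNeurons: number,
  usdPerThousandNeurons: number,
): number {
  const billable = Math.max(0, neurons - includedNeurons);
  return (billable / 1000) * usdPerThousandNeurons;
}

const exampleRates: ModelRates = {
  neuronsPerMillionInputTokens: 20_000,
  neuronsPerMillionOutputTokens: 40_000,
};

// 2M input tokens and 0.5M output tokens:
// 2 * 20,000 + 0.5 * 40,000 = 60,000 Neurons.
const usedNeurons = tokensToNeurons(exampleRates, 2_000_000, 500_000);

// With 10,000 included Neurons and a placeholder price of $0.01 per 1,000:
// (60,000 - 10,000) / 1,000 * 0.01 = $0.50.
const exampleCost = billableUsd(usedNeurons, 10_000, 0.01);
```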

Feb 14, 2025
1. ### [Build AI Agents with Example Prompts](https://developers.cloudflare.com/changelog/post/2025-02-14-example-ai-prompts/)  
[ Agents ](https://developers.cloudflare.com/agents/)[ Workers ](https://developers.cloudflare.com/workers/)[ Workflows ](https://developers.cloudflare.com/workflows/)  
We've added an [example prompt](https://developers.cloudflare.com/workers/get-started/prompting/) to help you get started with building AI agents and applications on Cloudflare [Workers](https://developers.cloudflare.com/workers/), including [Workflows](https://developers.cloudflare.com/workflows/), [Durable Objects](https://developers.cloudflare.com/durable-objects/), and [Workers KV](https://developers.cloudflare.com/kv/).  
You can use this prompt with your favorite AI model, including Claude 3.5 Sonnet, OpenAI's o3-mini, Gemini 2.0 Flash, or Llama 3.3 on Workers AI. Models with large context windows will allow you to paste the prompt directly: provide your own prompt within the `<user_prompt></user_prompt>` tags.  
Terminal window  
```  
{paste_prompt_here}
<user_prompt>
user: Build an AI agent using Cloudflare Workflows. The Workflow should run when a new GitHub issue is opened on a specific project with the label 'help' or 'bug', and attempt to help the user troubleshoot the issue by calling the OpenAI API with the issue title and description, and a clear, structured prompt that asks the model to suggest 1-3 possible solutions to the issue. Any code snippets should be formatted in Markdown code blocks. Documentation and sources should be referenced at the bottom of the response. The agent should then post the response to the GitHub issue. The agent should run as the provided GitHub bot account.
</user_prompt>
```  
This prompt is still experimental, but we encourage you to try it out and [provide feedback ↗](https://github.com/cloudflare/cloudflare-docs/issues/new?template=content.edit.yml).

Feb 06, 2025
1. ### [Request timeouts and retries with AI Gateway](https://developers.cloudflare.com/changelog/post/2025-02-05-aig-request-handling/)  
[ AI Gateway ](https://developers.cloudflare.com/ai-gateway/)  
AI Gateway adds two more ways to handle requests, [Request Timeouts](https://developers.cloudflare.com/ai-gateway/configuration/request-handling/#request-timeouts) and [Request Retries](https://developers.cloudflare.com/ai-gateway/configuration/request-handling/#request-retries), making it easier to keep your applications responsive and reliable.  
Timeouts and retries can be used with the [Universal Endpoint](https://developers.cloudflare.com/ai-gateway/usage/universal/) or on requests sent directly to a [supported provider](https://developers.cloudflare.com/ai-gateway/usage/providers/).

**Request timeouts**  
A [request timeout](https://developers.cloudflare.com/ai-gateway/configuration/request-handling/#request-timeouts) allows you to trigger [fallbacks](https://developers.cloudflare.com/ai-gateway/configuration/fallbacks/) or a retry if a provider takes too long to respond.  
To set a timeout on a request sent directly to a provider, add a `cf-aig-request-timeout` header.  
Provider-specific endpoint example  
```  
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/@cf/meta/llama-3.1-8b-instruct \
  --header 'Authorization: Bearer {cf_api_token}' \
  --header 'Content-Type: application/json' \
  --header 'cf-aig-request-timeout: 5000' \
  --data '{"prompt": "What is Cloudflare?"}'
```

**Request retries**  
A [request retry](https://developers.cloudflare.com/ai-gateway/configuration/request-handling/#request-retries) automatically retries failed requests, so you can recover from temporary issues without intervening.  
To set up retries for requests sent directly to a provider, add the following headers:

  * `cf-aig-max-attempts` (number)
  * `cf-aig-retry-delay` (number)
  * `cf-aig-backoff` ("constant" | "linear" | "exponential")
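
The timeout and retry headers above can be assembled with a small helper. This is a minimal sketch: `requestHandlingHeaders` and `RequestHandlingOptions` are hypothetical names, and treating the timeout and retry delay as milliseconds is an assumption to confirm against the request handling documentation.

```typescript
type Backoff = "constant" | "linear" | "exponential";

interface RequestHandlingOptions {
  timeoutMs?: number; // sent as cf-aig-request-timeout (assumed milliseconds)
  maxAttempts?: number; // sent as cf-aig-max-attempts
  retryDelayMs?: number; // sent as cf-aig-retry-delay (assumed milliseconds)
  backoff?: Backoff; // sent as cf-aig-backoff
}

// Translate options into AI Gateway request-handling headers.
// Only options that are set produce a header.
function requestHandlingHeaders(
  options: RequestHandlingOptions,
): Record<string, string> {
  const headers: Record<string, string> = {};
  if (options.timeoutMs !== undefined) {
    headers["cf-aig-request-timeout"] = String(options.timeoutMs);
  }
  if (options.maxAttempts !== undefined) {
    headers["cf-aig-max-attempts"] = String(options.maxAttempts);
  }
  if (options.retryDelayMs !== undefined) {
    headers["cf-aig-retry-delay"] = String(options.retryDelayMs);
  }
  if (options.backoff !== undefined) {
    headers["cf-aig-backoff"] = options.backoff;
  }
  return headers;
}

// Example: time out after 5000, and retry up to 3 times with exponential backoff.
const exampleHeaders = requestHandlingHeaders({
  timeoutMs: 5000,
  maxAttempts: 3,
  retryDelayMs: 1000,
  backoff: "exponential",
});
```

The resulting object can be spread into the `headers` of a `fetch` call alongside your `Authorization` and `Content-Type` headers.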

Feb 05, 2025
1. ### [AI Gateway adds Cerebras, ElevenLabs, and Cartesia as new providers](https://developers.cloudflare.com/changelog/post/2025-02-04-aig-provider-cartesia-eleven-cerebras/)  
[ AI Gateway ](https://developers.cloudflare.com/ai-gateway/)  
[AI Gateway](https://developers.cloudflare.com/ai-gateway/) has added three new providers: [Cartesia](https://developers.cloudflare.com/ai-gateway/usage/providers/cartesia/), [Cerebras](https://developers.cloudflare.com/ai-gateway/usage/providers/cerebras/), and [ElevenLabs](https://developers.cloudflare.com/ai-gateway/usage/providers/elevenlabs/), giving you even more options for providers you can use through AI Gateway. Here's a brief overview of each:

  * [Cartesia](https://developers.cloudflare.com/ai-gateway/usage/providers/cartesia/) provides text-to-speech models that produce natural-sounding speech with low latency.
  * [Cerebras](https://developers.cloudflare.com/ai-gateway/usage/providers/cerebras/) delivers low-latency AI inference for Meta's Llama 3.1 8B and Llama 3.3 70B models.
  * [ElevenLabs](https://developers.cloudflare.com/ai-gateway/usage/providers/elevenlabs/) offers text-to-speech models with human-like voices in 32 languages.  
![Example of Cerebras log in AI Gateway](https://developers.cloudflare.com/_astro/cerebras2.qHYP0ZnF_XMtnx.webp)  
To get started with AI Gateway, just update the base URL. Here's how you can send a request to [Cerebras](https://developers.cloudflare.com/ai-gateway/usage/providers/cerebras/) using cURL:  
Example cURL request  
```  
curl -X POST https://gateway.ai.cloudflare.com/v1/ACCOUNT_TAG/GATEWAY/cerebras/chat/completions \
  --header 'content-type: application/json' \
  --header 'Authorization: Bearer CEREBRAS_TOKEN' \
  --data '{
    "model": "llama-3.3-70b",
    "messages": [
        {
            "role": "user",
            "content": "What is Cloudflare?"
        }
    ]
}'
```
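
Since switching to AI Gateway mostly comes down to changing the base URL, a small helper can keep that in one place. This is a sketch only: `gatewayUrl` is a hypothetical name, and the URL layout is taken from the cURL example above.

```typescript
// Build a provider-specific AI Gateway URL of the form
// https://gateway.ai.cloudflare.com/v1/{account}/{gateway}/{provider}/{path}
function gatewayUrl(
  accountTag: string,
  gatewayId: string,
  provider: string,
  path: string,
): string {
  const base = "https://gateway.ai.cloudflare.com/v1";
  const cleanPath = path.replace(/^\/+/, ""); // tolerate a leading slash
  return `${base}/${accountTag}/${gatewayId}/${provider}/${cleanPath}`;
}

// Same endpoint as the cURL example:
const cerebrasChatUrl = gatewayUrl(
  "ACCOUNT_TAG",
  "GATEWAY",
  "cerebras",
  "/chat/completions",
);
```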

Jan 30, 2025
1. ### [AI Gateway Introduces New Worker Binding Methods](https://developers.cloudflare.com/changelog/post/2025-01-26-worker-binding-methods/)  
[ AI Gateway ](https://developers.cloudflare.com/ai-gateway/)  
We have released new [Workers bindings API methods](https://developers.cloudflare.com/ai-gateway/usage/worker-binding-methods/), allowing you to connect Workers applications to AI Gateway directly. These methods simplify how Workers calls AI services behind your AI Gateway configurations, removing the need to use the REST API and manually authenticate.  
To add an AI binding to your Worker, include the following in your [Wrangler configuration file](https://developers.cloudflare.com/workers/wrangler/configuration/):  
![Add an AI binding to your Worker.](https://developers.cloudflare.com/_astro/add-binding.BoYTiyon_ZjdDNx.webp)  
With the new AI Gateway binding methods, you can now:

  * Send feedback and update metadata with `patchLog`.
  * Retrieve detailed log information using `getLog`.
  * Execute [universal requests](https://developers.cloudflare.com/ai-gateway/usage/universal/) to any AI Gateway provider with `run`.  
For example, to send feedback and update metadata using `patchLog`:  
![Send feedback and update metadata using patchLog:](https://developers.cloudflare.com/_astro/send-feedback.BGRzKmd9_NDVos.webp)
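
In text form, a `patchLog` call might look like the sketch below. The `feedback` and `metadata` field names, the `env.AI.gateway(...)` accessor shown in the usage comment, and the `GatewayLike` interface are all assumptions made for illustration; refer to the binding methods documentation for the actual types.

```typescript
// Hypothetical minimal view of the gateway binding; the real type comes from
// the Workers runtime. Only the method used in this sketch is modeled.
interface GatewayLike {
  patchLog(
    logId: string,
    data: {
      feedback?: number;
      score?: number;
      metadata?: Record<string, string>;
    },
  ): Promise<void>;
}

// Record positive or negative feedback for a logged request and tag it
// with the user who gave it. Field names are assumptions.
async function sendFeedback(
  gateway: GatewayLike,
  logId: string,
  positive: boolean,
  userId: string,
): Promise<void> {
  await gateway.patchLog(logId, {
    feedback: positive ? 1 : -1,
    metadata: { user: userId },
  });
}

// In a Worker this might be called as (assuming an AI binding named `AI`):
// await sendFeedback(env.AI.gateway("my-gateway"), "my-log-id", true, "user-123");
```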

Jan 30, 2025
1. ### [Increased Browser Rendering limits!](https://developers.cloudflare.com/changelog/post/2025-01-30-browser-rendering-more-instances/)  
[ Workers ](https://developers.cloudflare.com/workers/)[ Browser Run ](https://developers.cloudflare.com/browser-run/)  
[Browser Rendering](https://developers.cloudflare.com/browser-run/) now supports 10 concurrent browser instances per account _and_ 10 new instances per minute, up from the previous limits of 2.  
This allows you to launch more browser tasks from [Cloudflare Workers](https://developers.cloudflare.com/workers).  
To manage concurrent browser sessions, you can use [Queues](https://developers.cloudflare.com/queues/) or [Workflows](https://developers.cloudflare.com/workflows/):

index.js  
```  
import puppeteer from "@cloudflare/puppeteer";

export default {
  async queue(batch, env) {
    for (const message of batch.messages) {
      // The payload sent to the queue is available on message.body
      const { url, waitUntil } = message.body;
      const browser = await puppeteer.launch(env.BROWSER);
      const page = await browser.newPage();

      try {
        await page.goto(url, { waitUntil });
        // Process page...
      } finally {
        await browser.close();
      }
    }
  },
};
```  
index.ts  
```  
import puppeteer from "@cloudflare/puppeteer";

interface QueueMessage {
  url: string;
  // A Puppeteer lifecycle event name
  waitUntil: "load" | "domcontentloaded" | "networkidle0" | "networkidle2";
}

export interface Env {
  BROWSER_QUEUE: Queue<QueueMessage>;
  BROWSER: Fetcher;
}

export default {
  async queue(batch: MessageBatch<QueueMessage>, env: Env): Promise<void> {
    for (const message of batch.messages) {
      // The payload sent to the queue is available on message.body
      const { url, waitUntil } = message.body;
      const browser = await puppeteer.launch(env.BROWSER);
      const page = await browser.newPage();

      try {
        await page.goto(url, { waitUntil });
        // Process page...
      } finally {
        await browser.close();
      }
    }
  },
};
```
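
The consumer above needs something to feed it. Below is a minimal sketch of the producer side, assuming each message carries a URL and a Puppeteer lifecycle event name such as `load`; `PageJob`, `QueueLike`, and `enqueuePages` are hypothetical names, with `QueueLike` standing in for the queue binding's `send` method.

```typescript
// Hypothetical message format: the page to visit and when to consider it loaded.
interface PageJob {
  url: string;
  waitUntil: "load" | "domcontentloaded" | "networkidle0" | "networkidle2";
}

// Minimal stand-in for a queue producer binding; only `send` is modeled.
interface QueueLike<T> {
  send(body: T): Promise<void>;
}

// Enqueue one job per URL so the consumer can work through them
// without exceeding the browser instance limits.
async function enqueuePages(
  queue: QueueLike<PageJob>,
  urls: string[],
): Promise<number> {
  for (const url of urls) {
    await queue.send({ url, waitUntil: "load" });
  }
  return urls.length;
}

// In a Worker this might be called as (assuming a binding named BROWSER_QUEUE):
// await enqueuePages(env.BROWSER_QUEUE, ["https://example.com/"]);
```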

