The Cloudflare Blog

Dynamic, identity-aware, and secure Sandbox auth

Mike Nomitch — Mon, 13 Apr 2026 13:00:00 GMT

As AI Large Language Models and harnesses like OpenCode and Claude Code become increasingly capable, we see more users kicking off sandboxed agents in response to chat messages, Kanban updates, vibe coding UIs, terminal sessions, GitHub comments, and more.

The sandbox is an important step beyond simple containers, because it gives you a few things:

Security: Any untrusted end user (or a rogue LLM) can run in the sandbox and not compromise the host machine or other sandboxes running alongside it. This is traditionally (but not always) accomplished with a microVM.
Speed: An end user should be able to pick up a new sandbox quickly and restore the state from a previously used one quickly.
Control: The trusted platform needs to be able to take actions within the untrusted domain of the sandbox. This might mean mounting files in the sandbox, or controlling which requests access it, or executing specific commands.

Today, we’re excited to add another key component of control to our Sandboxes and all Containers: outbound Workers. These are programmatic egress proxies that allow users running sandboxes to easily connect to different services, add observability, and, importantly for agents, add flexible and safe authentication.

How it works

Here’s a quick look at adding a secret key to a header using an outbound Worker:

class OpenCodeInABox extends Sandbox {
  static outboundByHost = {
    "github.com": (request, env, ctx) => {
      const headersWithAuth = new Headers(request.headers);
      headersWithAuth.set("x-auth-token", env.SECRET);
      return fetch(request, { headers: headersWithAuth });
    }
  }
}

Any time code running in the sandbox makes a request to “github.com”, the request is proxied through the handler. This allows you to do anything you want on each request, including logging, modifying, or cancelling it. In this case, we’re safely injecting a secret (more on this later). The proxy runs on the same machine as any sandbox, has access to distributed state, and can be easily modified with simple JavaScript.

We’re excited about all the possibilities this adds to Sandboxes, especially around authentication for agents. Before going into details, let’s back up and take a quick tour of traditional forms of auth, and why we think there’s something better.

Common auth for agentic workloads

The core issue with agentic auth is that we can’t fully trust the workload. While our LLMs aren’t nefarious (at least not yet), we still need to be able to apply protections to ensure they don’t use data inappropriately or take actions they shouldn’t.

A few common methods exist to provide auth to agents, and each has downsides:

Standard API tokens are the most basic method of authentication, typically injected into applications via environment variables or in mounted secrets files. This is the arguably simplest method, but least secure. You have to trust that the sandbox won’t somehow be compromised or accidentally exfiltrate the token while making a request. Since you can’t fully trust the agent, you’ll need to set up token expiry and rotation, which can be a hassle.

Workload identity tokens, such as OIDC tokens, can solve some of this pain. Rather than granting the agent a token with general permissions, you can grant it a token that attests its identity. Now, rather than the agent having direct access to some service with a token, it can exchange an identity token for a very short-lived access token. The OIDC token can be invalidated after a specific agent’s workflow completes, and expiry is easier to manage. One of the biggest downsides of workload identity tokens is the potential inflexibility of integrations. Many services don’t have first-class support for OIDC, so in order to get working integrations with upstream services, platforms will need to roll their own token-exchanging services. This makes adoption difficult.

Custom proxies provide maximum flexibility, and can be paired with workload identity tokens. If you can pass some or all of your sandbox egress through a trusted piece of code, you can insert whatever rules you need. Maybe the upstream service your agent is communicating with has a bad RBAC story, and it can’t provide granular permissions. No problem, just write the controls and permissions yourself! This is a great option for agents that you need to lock down with granular controls. However, how do you intercept all of a sandbox’s traffic? How do you set up a proxy that is dynamic and easily programmable? How do you proxy traffic efficiently? These aren’t easy problems to solve.

With those imperfect methods in mind, what does an ideal auth mechanism look like?

Ideally, it is:

Zero trust. No token is ever granted to an untrusted user for any amount of time.
Simple. Easy to author. Doesn’t involve a complex system of minting, rotating, and decrypting tokens.
Flexible. We don’t rely on the upstream system to provide the granular access we need. We can apply whatever rules we want.
Identity-aware. We can identify the sandbox making the call and apply specific rules for it.
Observable. We can easily gather information about what calls are being made.
Performant. We aren’t round-tripping to a centralized or slow source of truth.
Transparent. The sandboxed workload doesn’t have to know about it. Things just work.
Dynamic. We can change rules on the fly.

We believe outbound Workers for Sandboxes fit the bill on all of these. Let’s see how.

Outbound Workers in practice

Basics: restriction and observability

First, we’ll look at a very basic example: logging requests and denying specific actions.

In this case, we’ll use the outbound function, which intercepts all outgoing HTTP requests from the sandbox. With a few lines of JavaScript, it’s easy to ensure only GETs are made and log then deny any disallowed methods.

class MySandboxedApp extends Sandbox {
  static outbound = (req, env, ctx) => {
    // Deny any non-GET action and log
    if (req.method !== 'GET') {
      console.log(`Container making ${req.method} request to: ${req.url}`);
      return new Response('Not Allowed', { status: 405, statusText: 'Method Not Allowed'});
    }

    // Proceed if it is a GET request
    return fetch(req);
  };
}

This proxy runs on Workers and runs on the same machine as the sandboxed VM. Workers were built for quick response times, often sitting in front of cached CDN traffic, so additional latency is extremely minimal.

Because this is running on Workers, we get observability out of the box. You can view logs and outbound requests in the Workers dashboard or export them to your application performance monitoring tool of choice.

Zero trust credential injection

How would we use this to enforce a zero trust environment for our agent? Let’s imagine we want to make a request to a private GitHub instance, but we never want our LLM to access a private token.

We can use outboundByHost to define functions for specific domains or IPs. In this case, we’ll inject a protected credential if the domain is “my-internal-vcs.dev”. The sandboxed agent never has access to these credentials.

class OpenCodeInABox extends Sandbox {
  static outboundByHost = {
    "my-internal-vcs.dev": (request, env, ctx) => {
      const headersWithAuth = new Headers(request.headers);
      headersWithAuth.set("x-auth-token", env.SECRET);
      return fetch(request, { headers: headersWithAuth });
    }
  }
}

It is also easy to conditionalize the response based on the identity of the container. You don’t have to inject the same tokens for every sandbox instance.

 static outboundByHost = {
  "my-internal-vcs.dev": (request, env, ctx) => {
    // note: KV is encrypted at rest and in transit
    const authKey = await env.KEYS.get(ctx.containerId);

    const requestWithAuth = new Request(request);
    requestWithAuth.headers.set("x-auth-token", authKey);
    return fetch(requestWithAuth);
  }
}

Using the Cloudflare Developer Platform

As you may have noticed in the last example, another major advantage of using outbound Workers is that it makes integration into the Workers ecosystem easier. Previously, if a user wanted to access R2, they would have to inject an R2 credential, then make a call from their container to the public R2 API. Same for KV, Agents, other Containers, other Worker services, etc.

Now, you just call any binding from your outbound Workers.

class MySandboxedApp extends Sandbox {
  static outboundByHost = {
    "my.kv": async (req, env, ctx) => {
      const key = keyFromReq(req);
      const myResult = await env.KV.get(key);
      return new Response(myResult);
    },
    "objects.cf": async (req, env, ctx) => {
      const prefix = ctx.containerId
      const path = pathFromRequest(req);
      const object = await env.R2.get(`${prefix}/${path}`);
      const myResult = await env.KV.get(key);
      return new Response(myResult);
    },
  };
}

Rather than parsing tokens and setting up policies, we can easily conditionalize access with code and whatever logic we want. In the R2 example, we also were able to use the sandbox’s ID to further scope access with ease.

Making controls dynamic

Networking control should also be dynamic. On many platforms, config for Container and VM networking is static, looking something like this:

{
  defaultEgress: "block",
  allowedDomains: ["github.com", "npmjs.org"]
}

This is better than nothing, but we can do better. For many sandboxes, we might want to apply a policy on start, but then override it with another once specific operations have been performed.

For instance, we can boot a sandbox, grab our dependencies via NPM and Github, and then lock down egress after that. This ensures that we open up the network for as little time as possible.

To achieve this, we can use outboundHandlers, which allows us to define arbitrary outbound handlers that can be applied programmatically using the setOutboundHandler method. Each of these also takes params, allowing you to customize behavior from code. In this case, we will allow some hostnames with the custom “allowHosts” policy, then turn off HTTP.

class MySandboxedApp extends Sandbox {
  static outboundHandlers = {
    async allowHosts(req, env, { params }) {
     const url = new URL(request.url);
     const allowedHostname = params.allowedHostnames.includes(url.hostname);

      if (allowedHostname) {
        return await fetch(newRequest);
      } else {
        return new Response(null, { status: 403, statusText: "Forbidden" });
      }
    }
    
    async noHttp(req) {
      return new Response(null, { status: 403, statusText: "Forbidden" });
    }
  }
}

async setUpSandboxes(req, env) {
  const sandbox = await env.SANDBOX.getByName(userId);
  await sandbox.setOutboundHandler("allowHosts", {
    allowedHostnames: ["github.com", "npmjs.org"]
  });
  await sandbox.gitClone(userRepoURL)
  await sandbox.exec("npm install")
  await sandbox.setOutboundHandler("noHttp");
}

This could be extended even further. Your agent might ask the end user a question like “Do you want to allow POST requests to cloudflare.com?” based on whatever tools it needs at that time. With dynamic outbound Workers, you can easily modify the sandbox rules on the fly to provide this level of control.

TLS support with MITM Proxying

To do anything useful with requests beyond allowing or denying them, you need to have access to the content. This means that if you’re making HTTPS requests, they need to be decrypted by the Workers proxy.

To achieve this, a unique ephemeral certificate authority (CA) and private key are created for each Sandbox instance, and the CA is placed into the sandbox. By default, sandbox instances will trust this CA, while standard container instances can opt into trusting it, for instance by calling sudo update-ca-certificates.

export class MyContainer extends Container {
  interceptHttps = true;
}

MyContainer.outbound = (req, env, ctx) => {
  // All HTTP(S) requests will trigger this hook.
  return fetch(req);
};

TLS traffic is proxied by a Cloudflare isolated network process by performing a TLS handshake. It creates a leaf CA from an ephemeral and unique private key and uses the SNI extracted in the ClientHello. It will then invoke in the same machine the configured Worker to handle the HTTPS request.

Our ephemeral private key and CA will never leave our container runtime sidecar process, and is never shared across other container sidecar processes.

With this in place, outbound Workers act as a truly transparent proxy. The sandbox doesn't need any awareness of specific protocols or domains — all HTTP and HTTPS traffic flows through the outbound handler for filtering or modification.

Under the hood

To enable the functionality shown above in both Container and Sandbox, we added new methods to the ctx.container object: interceptOutboundHttp and interceptOutboundHttps, which intercept outgoing requests on specific hostnames (with basic glob matching), IP ranges, and it can be used to intercept all outbound requests. These methods are called with a WorkerEntrypoint, which gets set up as the front door to the outbound Worker.

export class MyWorker extends WorkerEntrypoint {
 fetch() {
   return new Response(this.ctx.props.message);
 }
}

// ... inside your container DurableObject ...
this.ctx.container.start({ enableInternet: false });
const outboundWorker = this.ctx.exports.MyWorker({ props: { message: 'hello' } });
await this.ctx.container.interceptOutboundHttp('15.0.0.1:80', outboundWorker);

// From now on, all HTTP requests to 15.0.0.1:80 return "hello"
await this.waitForContainerToBeHealthy();

// You can decide to return another message now...
const secondOutboundWorker = this.ctx.exports.MyWorker({ props: { message: 'switcheroo' } });
await this.ctx.container.interceptOutboundHttp('15.0.0.1:80', secondOutboundWorker);
// all HTTP requests to 15.0.0.1 now show "switcheroo", even on connections that were
// open before this interceptOutboundHttp

// You can even set hostnames, CIDRs, for both IPv4 and IPv6
await this.ctx.container.interceptOutboundHttp('example.com', secondOutboundWorker);
await this.ctx.container.interceptOutboundHttp('*.example.com', secondOutboundWorker);
await this.ctx.container.interceptOutboundHttp('123.123.123.123/23', secoundOutboundWorker);

All proxying to Workers happens locally on the same machine that runs the sandbox VM. Even though communication between container and Worker is “authless”, it is secure.

These methods can be called at any time, before or after starting the container, even while connections are still open. Connections that send multiple HTTP requests will automatically pick up a new entrypoint, so updating outbound Workers will not break existing TCP connections or interrupt HTTP requests.

Local development with wrangler dev also has support for egress interception. To make it possible, we automatically spawn a sidecar process inside the local container’s network namespace. We called this sidecar component proxy-everything. Once proxy-everything is attached, it applies the appropriate TPROXY nftable rules, routing matching traffic from the local Container to workerd, Cloudflare’s open source JavaScript runtime, which runs the outbound Worker. This allows the local development experience to mirror what happens in prod, so testing and development remain simple.

Giving outbound Workers a try

If you haven’t tried Cloudflare Sandboxes, check out the Getting Started guide. If you are a current user of Containers or Sandboxes, start using outbound Workers now by reading the documentation and upgrading to @cloudflare/containers@0.3.0 or @cloudflare/sandbox@0.8.9.

Watch on Cloudflare TV

Containers are available in public beta for simple, global, and programmable compute

Gabi Villalonga Simón — Tue, 24 Jun 2025 16:00:22 GMT

We’re excited to announce that Cloudflare Containers are now available in beta for all users on paid plans.

You can now run new kinds of applications alongside your Workers. From media and data processing at the edge, to backend services in any language, to CLI tools in batch workloads — Containers open up a world of possibilities.

Containers are tightly integrated with Workers and the rest of the developer platform, which means that:

Your workflow stays simple: just define a Container in a few lines of code, and run wrangler deploy, just like you would with a Worker.
Containers are global: as with Workers, you just deploy to Region:Earth. No need to manage configs across 5 different regions for a global app.
You can use the right tool for the job: routing requests between Workers and Containers is easy. Use a Worker when you need to be ultra light-weight and scalable. Use a Container when you need more power and flexibility.
Containers are programmable: container instances are spun up on-demand and controlled by Workers code. If you need custom logic, just write some JavaScript instead of spending time chaining together API calls or writing Kubernetes operators.

Want to try it today? Deploy your first Container-enabled Worker:

A tour of Containers

Let’s take a deeper look at Containers, using an example use case: code sandboxing.

Let’s imagine that you want to run user-generated (or AI-generated) code as part of a platform you’re building. To do this, you want to spin up containers on demand. Each user needs their own isolated container, the users are distributed globally, and you need to start each container quickly so the users aren’t waiting.

You can set this up easily on Cloudflare Containers.

Configuring a Container

In your Worker, use the Container class and wrangler.jsonc to declare some basic configuration, such as your Container’s default port, a sleep timeout, and which image to use, then route to it via the Worker.

For each unique ID passed to the Container’s binding, Cloudflare will spin up a new Container instance and route requests to it. When a new instance is requested, Cloudflare picks the best location across our global network where we’ve pre-provisioned a ready-to-go container. This means that you can deploy a container close to an end user no matter where they are. And the initial container start takes just a few seconds. You don’t have to worry about routing, provisioning, or scaling.

This example Worker will route requests to a unique container instance for each sandbox ID given at the path /sandbox/ID and will be handled by standard Worker JavaScript otherwise:

export class MyContainer extends Container {
  defaultPort = 8080; // The default port for the container to listen on
  sleepAfter = '5m'; // Sleep the container if no requests are made in this timeframe
}

export default {
  async fetch(request, env) {
    const pathname = new URL(request.url).pathname;

    // handle request with an on-demand container instance
    if (pathname.startsWith('/sandbox/')) {
      const sessionId = pathname.split("/")[2]
      const containerInstance = getContainer(env.CONTAINER_SANDBOX, sessionId)
      return await containerInstance.fetch(request);
    }

    // handle request with my Worker code otherwise
    return myWorkerRequestHandler(request);
  },
};

Familiar and easy development workflow with `wrangler dev`

To configure which container image to use, you can provide an image URL in wrangler config or a path to a local Dockerfile.

This config tells wrangler to use a locally defined image:

"containers": [
  {
    "class_name": "ContainerSandbox",
    "image": "./Dockerfile",
    "max_instances": 80,
    "instance_type": "basic"
  }
]

When developing your application, you just run wrangler dev and the container image will be automatically built and routable via your local Worker. This makes it easy to iterate on container code while making changes to your Worker at the same time. When you want to rebuild your image, just press “R” on your keyboard from your terminal running wrangler dev, and the Container is rebuilt and restarted.

Shipping your Container-enabled Worker to production with `wrangler deploy`

When it’s time to deploy, just run wrangler deploy. Wrangler will push your image to your account, then it will be provisioned in various locations across Cloudflare’s global network.

You don’t have to worry about “artifact management”, or distribution, or auth, or jump through hoops to integrate your container with Workers. You just write your code and deploy it.

Observability is built-in

Once your Container is in production, you have the visibility you need into how things are going, with end-to-end observability.

From the Cloudflare dashboard, you can easily track status and resource usage across Container instances with built-in metrics:

And if you need to dive deeper, you can dig into logs, which will be retained in the Cloudflare UI for seven days or pushed to an external sink of your choice:

Try it yourself

Want to give it a shot? Check out this example Worker for running sandboxed code in a Container, and deploy it with one click.

Or better yet, if you have an image sitting around that you’ve been dying to deploy to Cloudflare, you can get started with our docs here.

A world of possibilities

We’re excited about all the new types of applications that are now possible to build on Workers. We’ve heard many of you tell us over the years that you would love to run your entire application on Cloudflare, if only you could deploy this one piece that needs to run in a container.

Today, you can run libraries that you couldn't run in Workers before. For instance, try this Worker that uses FFmpeg to convert video to a GIF.

Or you can run a container as part of a cron job. Or deploy a static frontend with a containerized backend. Or even run a Cloudflare Agent that uses a Container to run Claude Code on your behalf.

The integration with the rest of the Developer Platform makes Containers even more powerful: use Durable Objects for state management, Workflows, Queues, and Agents to compose complex behaviors, R2 to store Container data or media, and more.

Pricing and packaging

As with the rest of our Cloudflare developer products, we wanted to apply the same principles to our developer platform with transparent pricing that scales up and down with your usage.

Today, you can select from the following instances at launch (and yes, we plan to add larger instances over time):

Name	Memory	CPU	Disk
dev	256 MiB	1/16 vCPU	2 GB
basic	1 GiB	1/4 vCPU	4 GB
standard	4 GiB	1/2 vCPU	4 GB

You only pay for what you use — charges start when a request is sent to the container or when it is manually started. Charges stop after the container instance goes to sleep, which can happen automatically after a timeout. This makes it easy to scale to zero, and allows you to get high utilization even with bursty traffic.

Containers are billed for every 10ms that they are actively running at the following rates, with monthly amounts included in Workers Standard:

Memory: $0.0000025 per GiB-second, with 25 GiB-hours included
CPU: $0.000020 per vCPU-second, with 375 vCPU-minutes included
Disk $0.00000007 per GB-second, with 200 GB-hours included

Egress from Containers is priced at the following rates, with monthly amounts included in Workers Standard:

North America and Europe: $0.025 per GB with 1 TB included
Australia, New Zealand, Taiwan, and Korea: $0.050 per GB with 500 GB included
Everywhere else: $0.040 per GB with 500 GB included

See our previous blog post for a more in-depth look into pricing with an example app.

What’s coming next?

With today’s release, we’ve only just begun to scratch the surface of what Containers will do on Workers. This is the first step of many towards our vision of a simple, global, and highly programmable Container platform.

We’re already thinking about what’s next, and wanted to give you a preview:

Higher limits and larger instances – We currently limit your concurrent instances to 40 total GiB of memory and 40 total vCPU. This is enough for some workloads, but many users will want to go higher — a lot higher. Select customers are already scaling well into the thousands of concurrent containers, and we want to bring this ability to more users. We will be raising our limits over the coming months to allow for more total containers and larger instance sizes.
Global autoscaling and latency-aware routing – Currently, containers are addressed by ID and started on-demand. For many use cases, users want to route to one of many stateless container instances deployed across the globe, then autoscale live instances automatically. Autoscaling will be activated with a single line of code, and will enable easy routing to the nearest ready instance.

class MyBackend extends Container {
  defaultPort = 8080;
  autoscale = true; // global autoscaling on - new instances spin up when memory or CPU utilization is high
}

// routes requests to the nearest ready container and load balance globally
async fetch(request, env) {
  return getContainer(env.MY_BACKEND).fetch(request);
}

More ways to communicate between Containers and Workers – We will be adding more ways for your Worker to communicate with your container and vice versa. We will add an exec command to run shell commands in your instance and handlers for HTTP requests from the container to Workers. This will allow you to more easily extend your containers with functionality from the entire developer platform, reach out to other containers, and programmatically set up each container instance.

class MyContainer extends Container {
  // sets up container-to-worker communication with handlers
  handlers = {
    "example.cf": "handleRequestFromContainer"
  };

  handleRequestFromContainer(req) {
    return new Response("You are responding from Workers to a Container request to a specific hostname")
  }

  // use exec to run commands in your container instance
  async cloneRepo(repoUrl) {
    let command = this.exec(`git clone ${repoUrl}`)
    await command.print()
  }  
}

Further integrations with the Developer Platform – We will continue to integrate with the developer platform with first-party APIs for our various services. We want it to be dead simple to mount R2 buckets, reach Hyperdrive, access KV, and more.

And we are just getting started. Stay tuned for more updates this summer and over the course of the entire year.

Try Containers today

The first step is to deploy your own container. Run npm create cloudflare@latest -- --template=cloudflare/templates/containers-template or click the button below to deploy your first Container to Workers.

We’re excited to see all the ways you will use Containers. From new languages and tools, to simplified Cloudflare-only architectures, to advanced programmatic control over container creation, you now have the ability to do even more on the Developer Platform. It is just a wrangler deploy away.

Simple, scalable, and global: Containers are coming to Cloudflare Workers in June 2025

Mike Nomitch — Fri, 11 Apr 2025 14:00:00 GMT

It is almost the end of Developer Week and we haven’t talked about containers: until now. As some of you may know, we’ve been working on a container platform behind the scenes for some time.

In late June, we plan to release Containers in open beta, and today we’ll give you a sneak peek at what makes it unique.

Workers are the simplest way to ship software around the world with little overhead. But sometimes you need to do more. You might want to:

Run user-generated code in any language
Execute a CLI tool that needs a full Linux environment
Use several gigabytes of memory or multiple CPU cores
Port an existing application from AWS, GCP, or Azure without a major rewrite

Cloudflare Containers let you do all of that while being simple, scalable, and global.

Through a deep integration with Workers and an architecture built on Durable Objects, Workers can be your:

API Gateway: Letting you control routing, authentication, caching, and rate-limiting before requests reach a container
Service Mesh: Creating private connections between containers with a programmable routing layer
Orchestrator: Allowing you to write custom scheduling, scaling, and health checking logic for your containers

Instead of having to deploy new services, write custom Kubernetes operators, or wade through control plane configuration to extend the platform, you just write code.

Let’s see what it looks like.

Deploying different application types

A stateful workload: executing AI-generated code

First, let’s take a look at a stateful example.

Imagine you are building a platform where end-users can run code generated by an LLM. This code is untrusted, so each user needs their own secure sandbox. Additionally, you want users to be able to run multiple requests in sequence, potentially writing to local files or saving in-memory state.

To do this, you need to create a container on-demand for each user session, then route subsequent requests to that container. Here’s how you can accomplish this:

First, you write some basic Wrangler config, then you route requests to containers via your Worker:

import { Container } from "cloudflare:workers";

export default {
  async fetch(request, env) {
    const url = new URL(request.url);

    if (url.pathname.startsWith("/execute-code")) {
      const { sessionId, messages } = await request.json();
      // pass in prompt to get the code from Llama 4
      const codeToExecute = await env.AI.run("@cf/meta/llama-4-scout-17b-16e-instruct", { messages });

      // get a different container for each user session
      const id = env.CODE_EXECUTOR.idFromName(sessionId);
      const sandbox = env.CODE_EXECUTOR.get(id);

      // execute a request on the container
      return sandbox.fetch("/execute-code", { method: "POST", body: codeToExecute });
    }

    // ... rest of Worker ...
  },
};

// define your container using the Container class from cloudflare:workers
export class CodeExecutor extends Container {
  defaultPort = 8080;
  sleepAfter = "1m";
}

Then, deploy your code with a single command: wrangler deploy. This builds your container image, pushes it to Cloudflare’s registry, readies containers to boot quickly across the globe, and deploys your Worker.

$ wrangler deploy

That’s it.

How does it work?

Your Worker creates and starts up containers on-demand. Each time you call env.CODE_EXECUTOR.get(id) with a unique ID, it sends requests to a unique container instance. The container will automatically boot on the first fetch, then put itself to sleep after a configurable timeout, in this case 1 minute. You only pay for the time that the container is actively running.

When you request a new container, we boot one in a Cloudflare location near the incoming request. This means that low-latency workloads are well-served no matter the region. Cloudflare takes care of all the pre-warming and caching so you don’t have to think about it.

This allows each user to run code in their own secure environment.

Stateless and global: FFmpeg everywhere

Stateless and autoscaling applications work equally well on Cloudflare Containers.

Imagine you want to run a container that takes a video file and turns it into an animated GIF using FFmpeg. Unlike the previous example, any container can serve any request, but you still don’t want to send bytes across an ocean and back unnecessarily. So, ideally the app can be deployed everywhere.

To do this, you declare a container in Wrangler config and turn on autoscaling. This specific configuration ensures that one instance is always running and if CPU usage increases beyond 75% of capacity, additional instances are added:

"containers": [
  {
    "class_name": "GifMaker",
    "image": "./Dockerfile", // container source code can be alongside Worker code
    "instance_type": "basic",
    "autoscaling": {
      "minimum_instances": 1,
      "cpu_target": 75,
    }
  }
],
// ...rest of wrangler.jsonc...

To route requests, you just call env.GIF_MAKER.fetch and requests are automatically sent to the closest container:

import { Container } from "cloudflare:workers";

export class GifMaker extends Container {
  defaultPort: 1337,
}

export default {
  async fetch(request, env) {
    const url = new URL(request.url);

    if (url.pathname === "/make-gif") {
      return env.GIF_MAKER.fetch(request)
    }

    // ... rest of Worker ...
  },
};

Going beyond the basics

From the examples above, you can see that getting a basic container service running on Workers just takes a few lines of config and a little Workers code. There’s no need to worry about capacity, artifact registries, regions, or scaling.

For more advanced use, we’ve designed Cloudflare Containers to run on top of Durable Objects and work in tandem with Workers. Let’s take a look at the underlying architecture and see some of the advanced use cases it enables.

Durable Objects as programmable sidecars

Routing to containers is enabled using Durable Objects under the hood. In the examples above, the Container class from cloudflare:workers just wraps a container-enabled Durable Object and provides helper methods for common patterns. In the rest of this post, we’ll look at examples using Durable Objects directly, as this should shed light on the platform’s underlying design.

Each Durable Object acts as a programmable sidecar that can proxy requests to the container and manages its lifecycle. This allows you to control and extend your containers in ways that are hard on other platforms.

You can manually start, stop, and execute commands on a specific container by calling RPC methods on its Durable Object, which now has a new object at this.ctx.container:

class MyContainer extends DurableObject {
  // these RPC methods are callable from a Worker
  async customBoot(entrypoint, envVars) {
    this.ctx.container.start({ entrypoint, env: envVars });
  }

  async stopContainer() {
    const SIGTERM = 15;
    this.ctx.container.signal(SIGTERM);
  }

  async startBackupScript() {
    await this.ctx.container.exec(["./backup"]);
  }
}

You can also monitor your container and run hooks in response to Container status changes.

For instance, say you have a CI job that runs builds in a Container. You want to post a message to a Queue based on the exit status. You can easily program this behavior:

class BuilderContainer extends DurableObject {
  constructor(ctx, env) {
    super(ctx, env)
    async function onContainerExit() {
      await this.env.QUEUE.send({ status: "success", message: "Build Complete" });
    }

    async function onContainerError(err) {
      await this.env.QUEUE.send({ status: "error", message: err});
    }

    this.ctx.container.start();
    this.ctx.container.monitor().then(onContainerExit).catch(onContainerError); 
  }

  async isRunning() { return this.ctx.container.running; }
}

And lastly, if you have state that needs to be loaded into a container each time it runs, you can use status hooks to persist state from the container before it sleeps and to reload state into the container after it starts:

import { startAndWaitForPort } from "./helpers"

class MyContainer extends DurableObject {
  constructor(ctx, env) {
    super(ctx, env)
    this.ctx.blockConcurrencyWhile(async () => {
      this.ctx.storage.sql.exec('CREATE TABLE IF NOT EXISTS state (value TEXT)');
      this.ctx.storage.sql.exec("INSERT INTO state (value) SELECT '' WHERE NOT EXISTS 
(SELECT * FROM state)");
      await startAndWaitForPort(this.ctx.container, 8080);
      await this.setupContainer();
      this.ctx.container.monitor().then(this.onContainerExit); 
    });
  }

  async setupContainer() {
    const initialState = this.ctx.storage.sql.exec('SELECT * FROM state LIMIT 1').one().value;
    return this.ctx.container
      .getTcpPort(8080)
      .fetch("http://container/state", { body: initialState, method: 'POST' });
  }

  async onContainerExit() {
    const response = await this.ctx.container
      .getTcpPort(8080)
      .fetch('http://container/state');
    const newState = await response.text();
    this.ctx.storage.sql.exec('UPDATE state SET value = ?', newState);
  }
}

Building around your Containers with Workers

Not only do Durable Objects allow you to have fine-grained control over the Container lifecycle, the whole Workers platform allows you to extend routing and scheduling behavior as you see fit.

Using Workers as an API gateway

Workers provide programmable ingress logic from over 300 locations around the world. In this sense, they provide similar functionality to an API gateway.

For instance, let’s say you want to route requests to a different version of a container based on information in a header. This is accomplished in a few lines of code:

export default {
  async fetch(request, env) {
    const isExperimental = request.headers.get("x-version") === "experimental";
    
    if (isExperimental) {
      return env.MY_SERVICE_EXPERIMENTAL.fetch(request);
    } else {
      return env.MY_SERVICE_STANDARD.fetch(request);
    }
  },
};

Or you want to rate limit and authenticate requests to the container:

async fetch(request, env) {
  const url = new URL(request.url);

  if (url.pathname.startsWith('/api/')) {
    const token = request.headers.get("token");

    const isAuthenticated = await authenticateRequest(token);
    if (!isAuthenticated) {
      return new Response("Not authenticated", { status: 401 });
    }

    const { withinRateLimit } = await env.MY_RATE_LIMITER.limit({ key: token });
    if (!withinRateLimit) {
      return new Response("Rate limit exceeded for token", { status: 429 });
    }

    return env.MY_APP.fetch(request);
  }
  // ...
}

Using Workers as a service mesh

By default, Containers are private and can only be accessed via Workers, which can connect to one of many container ports. From within the container, you can expose a plain HTTP port, but requests will still be encrypted from the end user to the moment we send the data to the container’s TCP port in the host. Due to the communication being relayed through the Cloudflare network, the container does not need to set up TLS certificates to have secure connections in its open ports.

You can connect to the container through a WebSocket from the client too. See this repository for a full example of using Websockets.

Just as the Durable Object can act as proxy to the container, it can act as a proxy from the container as well. When setting up a container, you can toggle Internet access off and ensure that outgoing requests pass through Workers.

// ... when starting the container...
this.ctx.container.start({ 
  workersAddress: '10.0.0.2:8080',
  enableInternet: false, // 'enableInternet' is false by default
});

// ... container requests to '10.0.0.2:8080' securely route to a different service...
override async onContainerRequest(request: Request) {
  const containerId = this.env.SUB_SERVICE.idFromName(request.headers['X-Account-Id']);
  return this.env.SUB_SERVICE.get(containerId).fetch(request);
}

You can ensure all traffic in and out of your container is secured and encrypted end to end without having to deal with networking yourself.

This allows you to protect and connect containers within Cloudflare’s network… or even when connecting to external private networks.

Using Workers as an orchestrator

You might require custom scheduling and scaling logic that goes beyond what Cloudflare provides out of the box.

We don’t want you having to manage complex chains of API calls or writing an operator to get the logic you need. Just write some Worker code.

For instance, imagine your containers have a long startup period that involves loading data from an external source. You need to pre-warm containers manually, and need control over the specific region to prewarm. Additionally, you need to set up manual health checks that are accessible via Workers. You’re able to achieve this fairly simply with Workers and Durable Objects.

import { Container, DurableObject } from "cloudflare:workers";

// A singleton Durable Object to manage and scale containers

class ContainerManager extends DurableObject {
  scale(region, instanceCount) {
    for (let i = 0; i < instanceCount; i++) {
      const containerId = env.CONTAINER.idFromName(`instance-${region}-${i}`);
      // spawns a new container with a location hint
      await env.CONTAINER.get(containerId, { locationHint: region }).start();
    }
  }

  async setHealthy(containerId, isHealthy) {
    await this.ctx.storage.put(containerId, isHealthy);
  }
}

// A Container class for the underlying compute

class MyContainer extends Container {
  defaultPort = 8080;

  async onContainerStart() {
    // run healthcheck every 500ms
    await this.scheduleEvery(0.5, 'healthcheck');
  }

  async healthcheck() {
    const manager = this.env.MANAGER.get(
      this.env.MANAGER.idFromName("manager")
    );
    const id = this.ctx.id.toString();

    await this.container.fetch("/_health")
      .then(() => manager.setHealthy(id, true))
      .catch(() => manager.setHealthy(id, false));
  }
}

The ContainerManager Durable Object exposes the scale RPC call, which you can call as needed with a region and instanceCount which scales up the number of active Container instances in a given region using a location hint. The this.schedule code executes a manually defined healthcheck method on the Container and tracks its state in the Manager for use by other logic in your system.

These building blocks let users handle complex scheduling logic themselves. For a more detailed example using standard Durable Objects, see this repository.

We are excited to see the patterns you come up with when orchestrating complex applications built with containers, and trust that between Workers and Durable Objects, you’ll have the tools you need.

Integrating with more of Cloudflare’s Developer Platform

Since it is Developer Week 2025, we would be remiss to not talk about Workflows, which just went GA, and Agents, which just got even better.

Let’s finish up by taking a quick look at how you can integrate Containers with these two tools.

Running a short-lived job with Workflows & R2

You need to download a large file from R2, compress it, and upload it. You want to ensure that this succeeds, but don’t want to write retry logic and error handling yourself. Additionally, you don’t want to deal with rotating R2 API tokens or worry about network connections — it should be secure by default.

This is a perfect opportunity for a Workflow using Containers. The container can do the heavy lifting of compressing files, Workers can stream the data to and from R2, and the Workflow can ensure durable execution.

export class EncoderWorkflow extends WorkflowEntrypoint {
  async run(event: WorkflowEvent, step: WorkflowStep) {
    const id = this.env.ENCODER.idFromName(event.instanceId);
    const container = this.env.ENCODER.get(id);

    await step.do('init container', async () => {
      await container.init();
    });

    await step.do('compress the object with zstd', async () => {
      await container.ensureHealthy();
      const object = await this.env.ARTIFACTS.get(event.payload.r2Path);
      const result = await container.fetch('http://encoder/zstd', {
        method: 'POST', body: object.body 
      });
      await this.env.ARTIFACTS.put(`results${event.payload.r2Path}`, result.body);
    });

    await step.do('cleanup container', async () => {
      await container.destroy();
    });
  }
}

Calling a Container from an Agent

Lastly, imagine you have an AI agent that needs to spin up cloud infrastructure (you like to live dangerously). To do this, you want to use Terraform, but since it’s run from the command line, you can’t run it on Workers.

By defining a tool, you can enable your Agent to run the shell commands from a container:

// Make tools that call to a container from an agent

const createExternalResources = tool({
  description: "runs Terraform in a container to create resources",
  parameters: z.object({ sessionId: z.number(), config: z.string() }),
  execute: async ({ sessionId, config }) => {
    return this.env.TERRAFORM_RUNNER.get(sessionId).applyConfig(config);
  },
});

// Expose RPC Methods that call to the container

class TerraformRunner extends DurableObject {
  async applyConfig(config) {
    await this.ctx.container.getTcpPort(8080).fetch(APPLY_URL, {
      method: 'POST',
      body: JSON.stringify({ config }),
    });
  }

  // ...rest of DO...
}

Containers are so much more powerful when combined with other tools. Workers make it easy to do so in a secure and simple way.

Pay for what you use and use the right tool

The deep integration between Workers and Containers also makes it easy to pick the right tool for the job with regards to cost.

With Cloudflare Containers, you only pay for what you use. Charges start when a request is sent to the container or it is manually started. Charges stop after the container goes to sleep, which can happen automatically after a configurable timeout. This makes it easy to scale to zero, and allows you to get high utilization even with highly-variable traffic.

Containers are billed for every 10ms that they are actively running at the following rates:

Memory: $0.0000025 per GB-second
CPU: $0.000020 per vCPU-second
Disk $0.00000007 per GB-second

After 1 TB of free data transfer per month, egress from a Container will be priced per-region. We'll be working out the details between now and the beta, and will be launching with clear, transparent pricing across all dimensions so you know where you stand.

Workers are lighter weight than containers and save you money by not charging when waiting on I/O. This means that if you can, running on a Worker helps you save on cost. Luckily, on Cloudflare it is easy to route requests to the right tool.

Cost comparison

Comparing containers and functions services on paper is always going to be an apples to oranges exercise, and results can vary so much depending on use case. But to share a real example of our own, a year ago when Cloudflare acquired Baselime, Baselime was a heavy user of AWS Lambda. By moving to Cloudflare, they lowered their cloud compute bill by 80%.

Below we wanted to share one representative example that compares costs for an application that uses both containers and serverless functions together. It’d be easy for us to come up with a contrived example that uses containers sub-optimally on another platform, for the wrong types of workloads. We won’t do that here. We know that navigating cloud costs can be challenging, and that cost is a critical part of deciding what type of compute to use for which pieces of your application.

In the example below, we’ll compare Cloudflare Containers + Workers against Google Cloud Run, a very well-regarded container platform that we’ve been impressed by.

Example application

Imagine that you run an application that serves 50 million requests per month, and each request consumes an average 500 ms of wall-time. Requests to this application are not all the same though — half the requests require a container, and the other half can be served just using serverless functions.

Requests per month	Wall-time (duration)	Compute required	Cloudflare	Google Cloud
25 million	500ms	Container + serverless functions	Containers + Workers	Google Cloud Run + Google Cloud Run Functions
25 million	500ms	Serverless functions	Workers	Google Cloud Run Functions

Container pricing

On both Cloud Run and Cloudflare Containers, a container can serve multiple requests. On some platforms, such as AWS Lambda, each container instance is limited to a single request, pushing cost up significantly as request count grows. In this scenario, 50 requests can run simultaneously on a container with 4 GB memory and half of a vCPU. This means that to serve 25 million requests of 500ms each, we need 625,000 seconds worth of compute

In this example, traffic is bursty and we want to avoid paying for idle-time, so we’ll use Cloud Run’s request-based pricing.

	Price per vCPU second	Price per GB-second of memory	Price per 1m requests	Monthly Price for Compute + Requests
Cloudflare Containers	$0.000020	$0.0000025	$0.30	$20.00
Google Cloud Run	$0.000024	$0.0000025	$0.40	$23.75

^{* Comparison does not include free tiers for either provider and uses a single Tier 1 GCP region}

Compute pricing for both platforms are comparable. But as we showed earlier in this post, Containers on Cloudflare run anywhere, on-demand, without configuring and managing regions. Each container has a programmable sidecar with its own database, backed by Durable Objects. It’s the depth of integration with the rest of the platform that makes containers on Cloudflare uniquely programmable.

Function pricing

The other requests can be served with less compute, and code written in JavaScript, TypeScript, Python or Rust, so we’ll use Workers and Cloud Run Functions.

These 25 million requests also run for 500 ms each, and each request spends 480 ms waiting on I/O. This means that Workers will only charge for 20 ms of “CPU-time”, the time that the Worker actually spends using compute. This ratio of low CPU time to high wall time is extremely common when building AI apps that make inference requests, or even when just building REST APIs and other business logic. Most time is spent waiting on I/O. Based on our data, we typically see Workers use less than 5 ms of CPU time per request vs seconds of wall time (waiting on APIs or I/O).

The Cloud Run Function will use an instance with 0.083 vCPU and 128 MB memory and charge on both CPU-s and GiB-s for the full 500 ms of wall-time.

	Total Price for “wall-time”	Total Price for “CPU-time”	Total Price for Compute + Requests
Cloudflare Workers	N/A	$0.83	$8.33
Google Cloud Run Functions	$1.44	N/A	$11.44

^{* Comparison does not include free tiers and uses a single Tier 1 GCP region.}

This comparison assumes you have configured Google Cloud Run Functions with a max of 20 concurrent requests per instance. On Google Cloud Run Functions, the maximum number of concurrent requests an instance can handle varies based on the efficiency of your function, and your own tolerance for tail latency that can be introduced by traffic spikes.

Workers automatically scale horizontally, don’t require you to configure concurrency settings (and hope to get it right), and can run in over 300 locations.

A holistic view of costs

The most important cost metric is the total cost of developing and running an application. And the only way to get the best results is to use the right compute for the job. So the question boils down to friction and integration. How easily can you integrate the ideal building blocks together?

As more and more software makes use of generative AI, and makes inference requests to LLMs, modern applications must communicate and integrate with a myriad of services. Most systems are increasingly real-time and chatty, often holding open long-lived connections, performing tasks in parallel. Running an instance of an application in a VM or container and calling it a day might have worked 10 years ago, but when we talk to developers in 2025, they are most often bringing many forms of compute to the table for particular use cases.

This shows the importance of picking a platform where you can seamlessly shift traffic from one source of compute to another. If you want to rate-limit, serve server-side rendered pages, API responses and static assets, handle authentication and authorization, make inference requests to AI models, run core business logic via Workflows, or ingest streaming data, just handle the request in Workers. Save the heavier compute only for where it is actually the only option. With Cloudflare Workers and Containers, this is as simple as an if-else statement in your Worker. This makes it easy to pick the right tool for the job.

Coming June 2025

We are collecting feedback and putting the finishing touches on our APIs now, and will release the open beta to the public in late June 2025.

From day one of building Cloudflare Workers, it’s been our goal to build an integrated platform, where Cloudflare products work together as a system, rather than just as a collection of separate products. We’ve taken this same approach with Containers, and aim to make Cloudflare not only the best place to deploy containers across the globe, but the best place to deploy the types of complete applications that developers are building, that use containers in tandem with serverless functions, Workflows, Agents, and much more.

We’re excited to get this into your hands soon. Stay on the lookout this summer.