Cloudflare has announced a major expansion of its Workers AI platform: a model marketplace where developers can browse, deploy, and pay for inference across more than 50 models spanning the language, vision, and audio categories.
Edge Inference with Zero Egress Fees
Models run on Cloudflare's global GPU fleet, with inference served from the data center closest to the user. There are no egress charges; customers pay only per-token or per-inference costs, which Cloudflare claims are 40-60% lower than those of comparable hyperscalers.
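As a rough illustration of that billing model, the sketch below computes the total cost of a request from per-token pricing alone, with no egress component. All dollar figures here are hypothetical placeholders, not published rates; the 50% gap simply falls inside the claimed 40-60% range.

```python
def inference_cost(tokens: int, price_per_million: float) -> float:
    """Total request cost under pure per-token billing (no egress fee).

    price_per_million is USD per 1M tokens -- a hypothetical figure here;
    real rates live in the marketplace catalog.
    """
    return tokens / 1_000_000 * price_per_million

# Hypothetical comparison: an edge rate 50% below a hyperscaler's rate.
hyperscaler = inference_cost(2_000_000, 0.50)  # 2M tokens at $0.50/M -> $1.00
edge = inference_cost(2_000_000, 0.25)         # 2M tokens at $0.25/M -> $0.50
print(f"hyperscaler=${hyperscaler:.2f} edge=${edge:.2f} "
      f"saving={1 - edge / hyperscaler:.0%}")
```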
Featured Launch Partners
Mistral AI, Stability AI, Black Forest Labs (FLUX), and Nous Research are among the launch partners. Models are exposed via the same OpenAI-compatible REST API Workers developers already use.
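Calling a marketplace model should therefore look like any other OpenAI-style chat-completions request. The sketch below builds such a request with only the standard library; the base URL path, the account-ID placeholder, and the model slug are assumptions for illustration, while the message payload follows the standard chat-completions shape.

```python
import json

# Assumed endpoint layout -- check Cloudflare's docs for the real path.
BASE_URL = "https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/v1"

def build_chat_request(account_id: str, api_token: str,
                       model: str, prompt: str) -> tuple[str, dict, bytes]:
    """Return (url, headers, body) for an OpenAI-compatible chat completion."""
    url = BASE_URL.format(account_id=account_id) + "/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,  # hypothetical marketplace slug
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return url, headers, body

url, headers, body = build_chat_request(
    "acct123", "API_TOKEN", "mistral-7b-instruct", "Hello")
# Send with any HTTP client, e.g.
# urllib.request.urlopen(urllib.request.Request(url, body, headers)).
```

Because the API is OpenAI-compatible, existing OpenAI client libraries should also work by pointing their base URL at the Workers AI endpoint.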
BYO Model Support
Enterprises can upload their own fine-tuned GGUF or SafeTensors checkpoints and serve them across the Cloudflare edge with one-click deployment, paying only for GPU time consumed.
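A BYO deployment descriptor might look roughly like the following. Cloudflare has not published this API: the field names, the `billing` key, and the overall JSON shape are guesses for illustration; only the accepted checkpoint formats (GGUF, SafeTensors) and the GPU-time billing model come from the announcement.

```python
import json

def build_deploy_payload(name: str, checkpoint_path: str) -> dict:
    """Assemble a hypothetical deployment descriptor for a custom checkpoint.

    Per the announcement only GGUF and SafeTensors checkpoints are accepted;
    every field name below is an illustrative guess, not a documented schema.
    """
    if not checkpoint_path.endswith((".gguf", ".safetensors")):
        raise ValueError("expected a .gguf or .safetensors checkpoint")
    fmt = "gguf" if checkpoint_path.endswith(".gguf") else "safetensors"
    return {
        "name": name,
        "checkpoint": checkpoint_path,
        "format": fmt,
        "billing": "gpu-time",  # pay only for GPU seconds consumed
    }

payload = build_deploy_payload("my-finetune", "models/chat-ft.safetensors")
print(json.dumps(payload, indent=2))
```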
Pricing and model catalog are live at developers.cloudflare.com/workers-ai/marketplace.