Run inference on region: Earth

  • Build and deploy ambitious AI applications to Cloudflare's global network

Full-stack AI Building Blocks

Serverless AI on GPUs

Run generative AI tasks on our global network of NVIDIA GPUs with no extra setup.

Models Included

Choose from a variety of popular models in our catalog including Llama-2, Whisper, and ResNet50.

Available everywhere

Run AI models from Workers, Pages, or anywhere via our REST API.

Supercharge with Vectorize

Generate and store embeddings in a globally distributed vector database.

AI Gateway

Improve reliability and scalability with caching, rate limiting, and analytics.

Train with R2

Build multi-cloud training architectures with free egress.

Zero to production in minutes

Less boilerplate. More fun.

Choose a template from our curated catalog of off-the-shelf models for tasks such as image classification, sentiment analysis, speech recognition, text generation, and translation.

Add a vector database without breaking the bank

Speed up and scale your AI workflows with Vectorize. Generate and store new or existing embeddings so you can search your own data and reuse the results with machine learning models.
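To make "search on top of your own data" concrete, here is a minimal in-memory sketch of what a vector lookup does: it ranks stored embeddings by cosine similarity to a query embedding. This is an illustration only, not the Vectorize API; the `Embedding` type and the `cosineSimilarity` and `topK` helpers are hypothetical names, and a real vector database would use an index rather than scanning every entry.

```typescript
// Illustrative only: a brute-force nearest-neighbour search over embeddings.
// None of these names come from Vectorize; they are assumptions for the sketch.
type Embedding = { id: string; values: number[] };

// Cosine similarity between two vectors of equal length (range -1 to 1).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as "not similar"
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k stored embeddings most similar to the query, best match first.
function topK(
  query: number[],
  stored: Embedding[],
  k: number,
): { id: string; score: number }[] {
  return stored
    .map((e) => ({ id: e.id, score: cosineSimilarity(query, e.values) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

In practice the embeddings would come from an embedding model and have hundreds of dimensions; the ranking idea stays the same.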

Grab your model and go

With Workers AI and Vectorize, a few lines of code are enough to run an AI inference task on Pages with your favorite framework, on Workers, or from any stack via an API. Pick your model and go.
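As a rough sketch of the "any stack via an API" path, the snippet below assembles an HTTP request for the Workers AI REST endpoint without sending it. The helper name `buildRunRequest` and the `RunRequest` type are hypothetical, `ACCOUNT_ID` and `API_TOKEN` are placeholders, and the endpoint shape and model name reflect Cloudflare's public documentation as we understand it, so check the current API reference before relying on them.

```typescript
// Hypothetical helper: describes a request to the Workers AI REST API.
// It only builds the request; hand the result to fetch() or any HTTP client to send it.
interface RunRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildRunRequest(
  accountId: string, // placeholder: your Cloudflare account ID
  apiToken: string, // placeholder: an API token allowed to use Workers AI
  model: string, // catalog model name, e.g. "@cf/meta/llama-2-7b-chat-int8"
  input: Record<string, unknown>, // model-specific input, e.g. { prompt: "..." }
): RunRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/${model}`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(input),
  };
}

// Example: a text-generation request for a Llama-2 chat model.
const req = buildRunRequest("ACCOUNT_ID", "API_TOKEN", "@cf/meta/llama-2-7b-chat-int8", {
  prompt: "Where did the phrase 'Hello, World' come from?",
});
```

Sending it is then a single call in most runtimes, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`.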

Cloudflare powers millions of Internet properties

Enhance and protect your AI applications

Build reliable, secure, cost-effective AI architectures

No more surprise bills from your AI vendors

The AI Gateway adds a layer of control and protection to LLM applications:
• Apply rate-limits and caching to protect back-end infrastructure and avoid surprise bills.
• Gain visibility into how many people are using the service.
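The two protections above can be pictured with a small conceptual sketch: a wrapper that answers repeated prompts from a cache and refuses calls once a fixed-window limit is reached. This is not how AI Gateway is implemented or configured; `createGateway`, its parameters, and the choice that cache hits do not count against the limit are all assumptions made for illustration.

```typescript
// Conceptual sketch only: caching plus a fixed-window rate limit in front of a backend.
// All names here are hypothetical and unrelated to the AI Gateway product API.
type GatewayResult = { status: "hit" | "miss" | "limited"; body?: string };

function createGateway(
  backend: (prompt: string) => string, // stand-in for a billable call to an AI vendor
  maxPerWindow: number, // backend calls allowed per window
  windowMs: number, // window length in milliseconds
  now: () => number, // injectable clock, e.g. () => Date.now()
): (prompt: string) => GatewayResult {
  const cache = new Map<string, string>();
  let windowStart = now();
  let used = 0;

  return function handle(prompt: string): GatewayResult {
    // Cached answers are served without touching the backend (assumption: they are free).
    const cached = cache.get(prompt);
    if (cached !== undefined) return { status: "hit", body: cached };

    // Start a fresh window once the current one has elapsed.
    const t = now();
    if (t - windowStart >= windowMs) {
      windowStart = t;
      used = 0;
    }

    // Refuse the call rather than let it reach the backend.
    if (used >= maxPerWindow) return { status: "limited" };

    used += 1;
    const body = backend(prompt);
    cache.set(prompt, body);
    return { status: "miss", body };
  };
}
```

Counting the "hit", "miss", and "limited" results over time is also the simplest form of the usage visibility mentioned above.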

Train where it's cheapest with egress-free data

Cost-effective storage for training models and AI-generated assets with R2
• Egress-free storage makes multi-cloud architectures for training LLMs affordable.
• Limitless storage for the ever-growing assets generated by users.

Get started with a template

Workers AI + Vectorize Tutorial

Build a retrieval-augmented generation (RAG) app with Workers AI and Vectorize. View GitHub Resources >

View Tutorial  
Workers + ChatGPT

Build a ChatGPT search plugin with Notion and Pinecone. View GitHub Resources >

View Tutorial  
Workers + LangChain

Build an LLM search app powered by Workers and LangChain. View GitHub Resources >

View Tutorial  

SiteGPT

"We use Cloudflare for everything – storage, cache, queues, and most importantly for training data and deploying the app on the edge, so I can ensure the product is reliable and fast. It's also been the most affordable option, with competitors costing more for a single day's worth of requests than Cloudflare costs in a month."

- Bhanu Teja Pachipulusu
Founder

Get started with Cloudflare AI today