Building a Cloudflare AI Gateway integration for LlamaIndex
Relying on a single LLM provider is fragile. OpenAI outage? Your app dies. Anthropic rate limits? Users wait. It's annoying.
I wanted multi-provider orchestration without the complexity. Cloudflare AI Gateway looked perfect—automatic fallback, caching, load balancing. But no LlamaIndex integration.
So I built one.
The Problem
Cloudflare AI Gateway is smart. It handles:
- Automatic provider fallback
- Built-in caching and rate limiting
- Load balancing across providers
- Unified API interface
But LlamaIndex LLMs make direct HTTP requests to each provider's API, while the gateway expects requests in its own format.
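To make the mismatch concrete: Cloudflare's Universal Endpoint takes one POST carrying an ordered list of provider steps and tries each in turn, which is where the fallback comes from. A rough sketch (shape per Cloudflare's docs; the specific keys and values here are illustrative):

```python
import httpx

# Rough sketch of Cloudflare's Universal Endpoint: one POST, an ordered
# list of provider steps, tried in order until one succeeds.
account_id, gateway = "your-account-id", "your-gateway-name"
httpx.post(
    f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway}",
    # Only needed if the gateway has authentication enabled.
    headers={"cf-aig-authorization": "Bearer <cloudflare-api-key>"},
    json=[
        {
            "provider": "openai",
            "endpoint": "chat/completions",
            "headers": {"Authorization": "Bearer <openai-key>"},
            "query": {
                "model": "gpt-4o-mini",
                "messages": [{"role": "user", "content": "What is 2+2?"}],
            },
        },
        {
            "provider": "anthropic",
            "endpoint": "v1/messages",
            "headers": {
                "x-api-key": "<anthropic-key>",
                "anthropic-version": "2023-06-01",
            },
            "query": {
                "model": "claude-3-5-sonnet",
                "max_tokens": 64,
                "messages": [{"role": "user", "content": "What is 2+2?"}],
            },
        },
    ],
)
```

No LlamaIndex LLM class speaks this format out of the box. Hence the wrapper.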
Cloudflare provides an OpenAI-compatible API, but I wanted something more flexible. Compatible often means limited. Cloudflare also provides a Vercel ai-sdk integration, but I live in Python.
The Solution
A wrapper that sits between LlamaIndex LLMs and their HTTP clients. Intercepts requests, transforms them for Cloudflare, handles responses.
```python
from llama_index.llms.cloudflare_ai_gateway import CloudflareAIGateway
from llama_index.llms.openai import OpenAI
from llama_index.llms.anthropic import Anthropic
from llama_index.core.llms import ChatMessage

# Create regular LlamaIndex LLMs
openai_llm = OpenAI(model="gpt-4o-mini", api_key="your-key")
anthropic_llm = Anthropic(model="claude-3-5-sonnet", api_key="your-key")

# Wrap with Cloudflare AI Gateway
llm = CloudflareAIGateway(
    llms=[openai_llm, anthropic_llm],  # Try OpenAI first, then Anthropic
    account_id="your-cloudflare-account-id",
    gateway="your-gateway-name",
    api_key="your-cloudflare-api-key",
)

# Use exactly like any LlamaIndex LLM
messages = [ChatMessage(role="user", content="What is 2+2?")]
response = llm.chat(messages)
```
Drop-in replacement. Zero code changes.
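If you're curious how the interception works, here's a minimal sketch of the idea, not the actual implementation: a custom httpx transport that re-points one provider's requests at the gateway. `GatewayTransport` is hypothetical and simplified; the real wrapper also handles fallback across providers and response handling.

```python
import httpx


class GatewayTransport(httpx.BaseTransport):
    """Hypothetical sketch: reroute one provider's HTTP calls through the gateway."""

    def __init__(self, account_id: str, gateway: str, provider: str):
        # Cloudflare's provider-specific endpoints hang off this base URL.
        self._base = f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway}/{provider}"
        self._inner = httpx.HTTPTransport()

    def handle_request(self, request: httpx.Request) -> httpx.Response:
        # Swap the provider's host for the gateway, keeping method, body,
        # and auth headers. (OpenAI's /v1 prefix is folded into the gateway
        # path, so strip it; non-streaming request bodies only.)
        headers = dict(request.headers)
        headers.pop("host", None)  # let httpx derive Host from the new URL
        rerouted = httpx.Request(
            request.method,
            self._base + request.url.path.removeprefix("/v1"),
            headers=headers,
            content=request.content,
        )
        return self._inner.handle_request(rerouted)
```

LlamaIndex's OpenAI class accepts an http_client, so wiring this in would look something like `OpenAI(model="gpt-4o-mini", api_key="your-key", http_client=httpx.Client(transport=GatewayTransport("acct", "gw", "openai")))`. A single transport can't fail over between providers, though, which is why the wrapper takes a list of LLMs instead.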
What It Does
Core features:
- Automatic provider detection and configuration
- Built-in fallback when providers fail
- Streaming support (chat and completion)
- Async/await compatible (both sketched below)
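Streaming and async follow the standard LlamaIndex LLM interface, which is what drop-in means in practice. A quick sketch reusing the `llm` wrapper from the example above (the method names are the stock LlamaIndex ones; I'm assuming the wrapper forwards them unchanged):

```python
import asyncio

from llama_index.core.llms import ChatMessage

messages = [ChatMessage(role="user", content="Tell me a short joke.")]

# Streaming: iterate over deltas as they arrive.
for chunk in llm.stream_chat(messages):
    print(chunk.delta, end="", flush=True)

# Async: the same call via the standard `a`-prefixed method.
async def main():
    response = await llm.achat(messages)
    print(response)

asyncio.run(main())
```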
Tested providers: OpenAI and Anthropic (the ones in the example above). Other providers that Cloudflare AI Gateway supports should pass through as well, but I haven't tested them.
Try It
Still a pending PR (#19395), but functional:
```bash
git clone <repository-url>
cd llama-index-llms-cloudflare-ai-gateway
pip install -e .
```
```bash
export OPENAI_API_KEY="your-key"
export ANTHROPIC_API_KEY="your-key"
export CLOUDFLARE_ACCOUNT_ID="your-id"
export CLOUDFLARE_API_KEY="your-key"
export CLOUDFLARE_GATEWAY="your-gateway"

uv run pytest tests/
```
May not be production-ready, but good enough to experiment with. Check out the LlamaIndex integrations repository for other LLM providers.