Building a Cloudflare AI Gateway integration for LlamaIndex
Relying on a single LLM provider is fragile. OpenAI has an outage? Your app dies. Anthropic rate-limits you? Your users wait. It's annoying.
I wanted multi-provider orchestration without building that plumbing myself. Cloudflare AI Gateway looked perfect: automatic fallback, caching, load balancing. But there was no LlamaIndex integration.
So I built one.
The Problem
Cloudflare AI Gateway is smart. It handles:
- Automatic provider fallback
- Built-in caching and rate limiting
- Load balancing across providers
- Unified API interface
But LlamaIndex LLMs make their HTTP requests straight to each provider's API, while the gateway expects those requests at its own per-provider URLs.
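To make the mismatch concrete, here is the URL shape involved (following Cloudflare's documented gateway format; the account and gateway names are placeholders):

# Where a LlamaIndex OpenAI LLM normally sends a chat request:
direct_url = "https://api.openai.com/v1/chat/completions"

# Where Cloudflare AI Gateway expects the same request to arrive:
ACCOUNT_ID = "your-account-id"    # placeholder
GATEWAY = "your-gateway-name"     # placeholder
gateway_url = f"https://gateway.ai.cloudflare.com/v1/{ACCOUNT_ID}/{GATEWAY}/openai/chat/completions"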
Cloudflare does provide an OpenAI-compatible API, but I wanted something more flexible; "compatible" often means "limited". There's also a Vercel ai-sdk integration, but I live in Python.
The Solution
A wrapper that sits between LlamaIndex LLMs and their HTTP clients. Intercepts requests, transforms them for Cloudflare, handles responses.
from llama_index.llms.openai import OpenAI
from llama_index.llms.anthropic import Anthropic
from llama_index.llms.cloudflare_ai_gateway import CloudflareAIGateway  # import path per the planned PR

# Create regular LlamaIndex LLMs
openai_llm = OpenAI(model="gpt-4o-mini")
anthropic_llm = Anthropic(model="claude-3-5-sonnet-latest")

# Wrap with Cloudflare AI Gateway (parameter names may differ in the merged PR)
llm = CloudflareAIGateway(
    llms=[openai_llm, anthropic_llm],  # tried in order, with automatic fallback
    account_id="...", gateway="...", api_key="...",
)
# Use exactly like any LlamaIndex LLM
response = llm.complete("Hello!")
print(response)
Drop-in replacement. Zero code changes.
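For the curious, the interception can happen at the httpx layer. This is a minimal sketch of the idea rather than the actual PR code: it assumes a custom transport that rewrites each outgoing OpenAI request onto the gateway URL, and relies on the fact that the LlamaIndex OpenAI LLM accepts a custom http_client.

import httpx

ACCOUNT_ID = "your-account-id"    # placeholder
GATEWAY = "your-gateway-name"     # placeholder
GATEWAY_BASE = f"https://gateway.ai.cloudflare.com/v1/{ACCOUNT_ID}/{GATEWAY}"

class GatewayTransport(httpx.BaseTransport):
    """Reroutes OpenAI requests through Cloudflare AI Gateway (illustrative only)."""

    def __init__(self):
        self._inner = httpx.HTTPTransport()

    def handle_request(self, request: httpx.Request) -> httpx.Response:
        # api.openai.com/v1/chat/completions -> .../openai/chat/completions
        path = request.url.path.removeprefix("/v1")
        request.url = httpx.URL(f"{GATEWAY_BASE}/openai{path}")
        request.headers["host"] = request.url.host
        return self._inner.handle_request(request)

# The LlamaIndex OpenAI LLM takes a custom http_client, so the reroute is invisible to it:
# OpenAI(model="gpt-4o-mini", http_client=httpx.Client(transport=GatewayTransport()))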
What It Does
Core features:
- Automatic provider detection and configuration
- Built-in fallback when providers fail
- Streaming support (chat and completion)
- Async/await compatible (both sketched below)
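Because the wrapper exposes the standard LlamaIndex LLM interface, streaming and async calls look the same as with any other LlamaIndex LLM. A quick sketch, reusing the llm object from the snippet above:

import asyncio
from llama_index.core.llms import ChatMessage

messages = [ChatMessage(role="user", content="Explain AI gateways in one sentence.")]

# Streaming chat: chunks arrive as they are generated; .delta holds the new text
for chunk in llm.stream_chat(messages):
    print(chunk.delta, end="", flush=True)

# Async calls use the standard a-prefixed methods
async def main():
    response = await llm.achat(messages)
    print(response.message.content)

asyncio.run(main())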
Tested providers: OpenAI and Anthropic, the two wrapped in the example above.
Also supported: other providers that Cloudflare AI Gateway can route to, though I haven't exercised those paths.
Try It
Still a planned PR (#19395), but functional.
May not be production-ready, but good enough to experiment with. Check out the LlamaIndex integrations repository for other LLM providers.