
Relying on a single LLM provider is fragile. OpenAI outage? Your app dies. Anthropic rate limits? Your users wait. It’s annoying.

I wanted multi-provider orchestration without the complexity. Cloudflare AI Gateway looked perfect—automatic fallback, caching, load balancing. But no LlamaIndex integration.

So I built one.

The Problem

Cloudflare AI Gateway is smart. It handles:

  • Automatic provider fallback
  • Built-in caching and rate limiting
  • Load balancing across providers
  • Unified API interface

But LlamaIndex LLMs make direct HTTP requests to each provider’s API, while Cloudflare expects traffic at its own gateway endpoints, in its own URL format.
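
The mismatch is easiest to see in the URLs. A direct OpenAI call versus the same call routed through the gateway (this is Cloudflare’s documented URL scheme; account_id and gateway are placeholders):

POST https://api.openai.com/v1/chat/completions
POST https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway}/openai/chat/completions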

Cloudflare does provide an OpenAI-compatible API, but I wanted something more flexible; compatible often means limited. It also ships a Vercel ai-sdk integration, but I live in Python.

The Solution

A wrapper that sits between LlamaIndex LLMs and their HTTP clients. It intercepts outgoing requests, transforms them for Cloudflare, and handles the responses.

from llama_index.llms.cloudflare_ai_gateway import CloudflareAIGateway
from llama_index.llms.openai import OpenAI
from llama_index.llms.anthropic import Anthropic
from llama_index.core.llms import ChatMessage

# Create regular LlamaIndex LLMs
openai_llm = OpenAI(model="gpt-4o-mini", api_key="your-key")
anthropic_llm = Anthropic(model="claude-3-5-sonnet", api_key="your-key")

# Wrap with Cloudflare AI Gateway
llm = CloudflareAIGateway(
    llms=[openai_llm, anthropic_llm],  # Try OpenAI first, then Anthropic
    account_id="your-cloudflare-account-id",
    gateway="your-gateway-name",
    api_key="your-cloudflare-api-key",
)

# Use exactly like any LlamaIndex LLM
messages = [ChatMessage(role="user", content="What is 2+2?")]
response = llm.chat(messages)

Drop-in replacement. Zero code changes.
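
Under the hood, the trick is interception at the HTTP layer. Here’s a minimal sketch of the idea, assuming a custom httpx transport; the URL scheme and the cf-aig-authorization header follow Cloudflare’s documented gateway setup, but the GatewayTransport name, the host mapping, and the path tweak are illustrative, and the real wrapper’s internals may differ:

import httpx

# Cloudflare's documented per-gateway URL scheme:
#   https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway}/{provider}/...
GATEWAY_BASE = "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway}"

# Illustrative subset: map provider API hosts to gateway path segments
PROVIDER_HOSTS = {
    "api.openai.com": "openai",
    "api.anthropic.com": "anthropic",
}

class GatewayTransport(httpx.BaseTransport):
    """Rewrite direct provider calls so they travel through the AI Gateway."""

    def __init__(self, account_id: str, gateway: str, api_key: str):
        self._base = GATEWAY_BASE.format(account_id=account_id, gateway=gateway)
        self._api_key = api_key
        self._inner = httpx.HTTPTransport()  # the real transport still does the I/O

    def handle_request(self, request: httpx.Request) -> httpx.Response:
        provider = PROVIDER_HOSTS.get(request.url.host)
        if provider is None:
            return self._inner.handle_request(request)  # unknown host: pass through

        path = request.url.path
        if provider == "openai" and path.startswith("/v1/"):
            path = path[3:]  # the gateway's openai route omits the /v1 prefix

        headers = dict(request.headers)
        headers.pop("host", None)  # let httpx recompute Host for the gateway
        headers["cf-aig-authorization"] = f"Bearer {self._api_key}"

        proxied = httpx.Request(
            request.method,
            f"{self._base}/{provider}{path}",
            headers=headers,
            content=request.content,  # assumes a buffered (non-streaming) body
        )
        return self._inner.handle_request(proxied)

An httpx.Client(transport=GatewayTransport(...)) can then be handed to whatever SDK the LLM class uses underneath (the openai Python SDK, for instance, accepts an http_client argument). The fallback order is simply the order of the llms list in the example above.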

What It Does

Core features:

  • Automatic provider detection and configuration
  • Built-in fallback when providers fail
  • Streaming support (chat and completion)
  • Async/await compatible (both shown in the sketch after this list)
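
Streaming and async go through the standard LlamaIndex LLM interface (stream_chat, achat, and astream_chat are regular LlamaIndex methods), so continuing the example above should look like this:

import asyncio

# Streaming: tokens arrive through the gateway as they are generated
for chunk in llm.stream_chat(messages):
    print(chunk.delta, end="", flush=True)

# Async: same interface, awaited
async def main():
    response = await llm.achat(messages)
    print(response.message.content)

asyncio.run(main())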

Tested providers: OpenAI and Anthropic, the two the test suite below exercises.

Other providers that Cloudflare AI Gateway can route to should work through the same wrapper, but I haven’t verified them all.

Try It

Still a pending PR (#19395), but functional:

git clone <repository-url>
cd llama-index-llms-cloudflare-ai-gateway
pip install -e .
export OPENAI_API_KEY="your-key"
export ANTHROPIC_API_KEY="your-key"
export CLOUDFLARE_ACCOUNT_ID="your-id"
export CLOUDFLARE_API_KEY="your-key"
export CLOUDFLARE_GATEWAY="your-gateway"
uv run pytest tests/

It may not be production-ready, but it’s good enough to experiment with. Check out the LlamaIndex integrations repository for other LLM providers.

