LiteLLM Proxy (LLM Gateway)

Property details

- Description: LiteLLM Proxy is an OpenAI-compatible gateway that lets you interact with multiple LLM providers through a unified API. Just prefix the model name with litellm_proxy/ to route your requests through the proxy.
- Provider route on LiteLLM: litellm_proxy/ (add this prefix to the model name to route any request to litellm_proxy - e.g. litellm_proxy/your-model-name)
- Setup: LiteLLM Gateway ↗
- Supported endpoints: /chat/completions, /completions, /embeddings, /audio/speech, /audio/transcriptions, /images, /rerank

Required Variables

os.environ["LITELLM_PROXY_API_KEY"] = "" # "sk-1234" your litellm proxy api key 
os.environ["LITELLM_PROXY_API_BASE"] = "" # "http://localhost:4000" your litellm proxy api base

Usage (Non-Streaming)

import os 
import litellm
from litellm import completion

os.environ["LITELLM_PROXY_API_KEY"] = ""

# set a custom api base pointing at your proxy
# either set the env var or litellm.api_base
# os.environ["LITELLM_PROXY_API_BASE"] = ""
litellm.api_base = "your-litellm-proxy-url"


messages = [{ "content": "Hello, how are you?","role": "user"}]

# litellm proxy call
response = completion(model="litellm_proxy/your-model-name", messages=messages)
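
An async variant is available as well; as a minimal sketch (reusing the same placeholder proxy URL and key), litellm.acompletion mirrors completion:

import asyncio
import litellm

async def main():
    # acompletion is the async counterpart of completion
    response = await litellm.acompletion(
        model="litellm_proxy/your-model-name",
        messages=[{"role": "user", "content": "Hello, how are you?"}],
        api_base="your-litellm-proxy-url",
        api_key="your-litellm-proxy-api-key",
    )
    print(response.choices[0].message.content)

asyncio.run(main())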

Usage - Passing api_base, api_key per request

If you need to set the api_base dynamically, just pass it in the call - completion(..., api_base="your-proxy-api-base")

import os 
import litellm
from litellm import completion

os.environ["LITELLM_PROXY_API_KEY"] = ""

messages = [{ "content": "Hello, how are you?","role": "user"}]

# litellm proxy call
response = completion(
    model="litellm_proxy/your-model-name",
    messages=messages,
    api_base="your-litellm-proxy-url",
    api_key="your-litellm-proxy-api-key"
)

Usage - Streaming

import os 
import litellm
from litellm import completion

os.environ["LITELLM_PROXY_API_KEY"] = ""

messages = [{ "content": "Hello, how are you?","role": "user"}]

# litellm proxy call (streaming)
response = completion(
    model="litellm_proxy/your-model-name",
    messages=messages,
    api_base="your-litellm-proxy-url",
    stream=True
)

for chunk in response:
    print(chunk)
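
Each chunk follows the OpenAI streaming format, so as a sketch you can rebuild the full reply from the deltas (the content field is None on the final chunk):

full_reply = ""
for chunk in response:
    delta = chunk.choices[0].delta.content
    if delta is not None:  # the final chunk carries no content
        full_reply += delta
print(full_reply)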

Embeddings

import litellm

response = litellm.embedding(
    model="litellm_proxy/your-embedding-model",
    input="Hello world",
    api_base="your-litellm-proxy-url",
    api_key="your-litellm-proxy-api-key"
)
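
The response follows the OpenAI embeddings schema; assuming that shape, the vector can be read like this:

vector = response.data[0]["embedding"]  # the embedding vector itself
print(len(vector))  # its dimensionality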

Image Generation

import litellm

response = litellm.image_generation(
    model="litellm_proxy/dall-e-3",
    prompt="A beautiful sunset over mountains",
    api_base="your-litellm-proxy-url",
    api_key="your-litellm-proxy-api-key"
)
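
Assuming the OpenAI images response shape, the generated image comes back as a URL (or base64 data, depending on the model and proxy configuration):

image_url = response.data[0].url  # URL of the generated image
print(image_url)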

Audio Transcription

import litellm

response = litellm.transcription(
    model="litellm_proxy/whisper-1",
    file="your-audio-file",
    api_base="your-litellm-proxy-url",
    api_key="your-litellm-proxy-api-key"
)
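
file is expected to be a binary file object, as in the OpenAI SDK; a minimal sketch (the filename is a placeholder):

with open("your-audio-file.mp3", "rb") as audio_file:
    response = litellm.transcription(
        model="litellm_proxy/whisper-1",
        file=audio_file,
        api_base="your-litellm-proxy-url",
        api_key="your-litellm-proxy-api-key",
    )
print(response.text)  # the transcribed text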

Text to Speech

import litellm

response = litellm.speech(
    model="litellm_proxy/tts-1",
    input="Hello world",
    api_base="your-litellm-proxy-url",
    api_key="your-litellm-proxy-api-key"
)
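
The call returns binary audio content; it can be written to disk with the response object's stream_to_file helper (mirroring the OpenAI SDK):

response.stream_to_file("speech.mp3")  # save the synthesized audio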

Rerank

import litellm

response = litellm.rerank(
    model="litellm_proxy/rerank-english-v2.0",
    query="What is machine learning?",
    documents=[
        "Machine learning is a field of study in artificial intelligence",
        "Biology is the study of living organisms"
    ],
    api_base="your-litellm-proxy-url",
    api_key="your-litellm-proxy-api-key"
)
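
The response ranks the documents by relevance; assuming the Cohere-style schema that rerank endpoints return, the results can be inspected like this:

for result in response.results:
    # each result carries the document's original index and a relevance score
    print(result["index"], result["relevance_score"])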

Integration with Other Libraries

LiteLLM Proxy works seamlessly with Langchain, LlamaIndex, OpenAI JS, Anthropic SDK, Instructor, and more.

Learn how to use LiteLLM Proxy with these libraries →

Send All SDK Requests to LiteLLM Proxy

Use this when calling LiteLLM Proxy from any library or codebase that already uses the LiteLLM SDK.

These flags route all requests through your LiteLLM proxy, regardless of which model is specified.

When enabled, requests authenticate using LITELLM_PROXY_API_BASE and LITELLM_PROXY_API_KEY.

Option 1: Set Globally in Code

import litellm

# Set the flag globally so every request is routed through the proxy
litellm.use_litellm_proxy = True

response = litellm.completion(
    model="vertex_ai/gemini-2.0-flash-001",
    messages=[{"role": "user", "content": "Hello, how are you?"}]
)

Option 2: Control via Environment Variable

import os
import litellm

# Control proxy usage through an environment variable
os.environ["USE_LITELLM_PROXY"] = "True"

response = litellm.completion(
    model="vertex_ai/gemini-2.0-flash-001",
    messages=[{"role": "user", "content": "Hello, how are you?"}]
)

Option 3: Set Per Request

import litellm

# Enable the proxy for specific requests only
response = litellm.completion(
    model="vertex_ai/gemini-2.0-flash-001",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    use_litellm_proxy=True
)