# Tag Based Routing

Route requests based on tags. This is useful for:

- Implementing free/paid tiers for users
- Controlling model access per team, e.g. Team A can access gpt-4 deployment A, Team B can access gpt-4 deployment B (LLM access control for teams)
## Quick Start

### 1. Define tags on config.yaml

- Requests with `tags=["free"]` are routed to `openai/fake`
- Requests with `tags=["paid"]` are routed to `openai/gpt-4o`
```yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
      tags: ["free"] # 👈 Key Change
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
      tags: ["paid"] # 👈 Key Change
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
      tags: ["default"] # OPTIONAL - All untagged requests will get routed to this

router_settings:
  enable_tag_filtering: True # 👈 Key Change

general_settings:
  master_key: sk-1234
```
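With `enable_tag_filtering: True`, the router only considers deployments whose tags overlap with the tags on the incoming request. The core selection step can be sketched roughly as follows — an illustrative Python sketch, not LiteLLM's actual implementation; the deployment structure here is an assumption:

```python
# Illustrative sketch of tag-based deployment filtering.
# NOT LiteLLM's actual code - names and data shapes are assumptions.

deployments = [
    {"model": "openai/fake", "tags": ["free"]},
    {"model": "openai/gpt-4o", "tags": ["paid"]},
    {"model": "openai/gpt-4o", "tags": ["default"]},
]

def filter_by_tags(deployments, request_tags):
    """Keep only deployments whose tags overlap with the request's tags."""
    return [
        d for d in deployments
        if set(d["tags"]) & set(request_tags)
    ]

# A request tagged "free" only sees the openai/fake deployment.
print(filter_by_tags(deployments, ["free"]))
```

The idea is that a deployment matches when any of its tags appears in the request's tag list, so a deployment can carry multiple tags and serve several tiers at once.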
### 2. Make a request with tags=["free"]

This request includes `"tags": ["free"]`, so it is routed to `openai/fake`:
```shell
curl -i http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "Hello, Claude gm!"}
    ],
    "tags": ["free"]
  }'
```
**Expected Response**

When this works, expect to see the following response header:

```
x-litellm-model-api-base: https://exampleopenaiendpoint-production.up.railway.app/
```
Response:

```json
{
  "id": "chatcmpl-33c534e3d70148218e2d62496b81270b",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "\n\nHello there, how may I assist you today?",
        "role": "assistant",
        "tool_calls": null,
        "function_call": null
      }
    }
  ],
  "created": 1677652288,
  "model": "gpt-3.5-turbo-0125",
  "object": "chat.completion",
  "system_fingerprint": "fp_44709d6fcb",
  "usage": {
    "completion_tokens": 12,
    "prompt_tokens": 9,
    "total_tokens": 21
  }
}
```
### 3. Make a request with tags=["paid"]

This request includes `"tags": ["paid"]`, so it is routed to `openai/gpt-4o`:
```shell
curl -i http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "Hello, Claude gm!"}
    ],
    "tags": ["paid"]
  }'
```
**Expected Response**

When this works, expect to see the following response header:

```
x-litellm-model-api-base: https://api.openai.com
```
响应
{
"id": "chatcmpl-9maCcqQYTqdJrtvfakIawMOIUbEZx",
"choices": [
{
"finish_reason": "stop",
"index": 0,
"message": {
"content": "Good morning! How can I assist you today?",
"role": "assistant",
"tool_calls": null,
"function_call": null
}
}
],
"created": 1721365934,
"model": "gpt-4o-2024-05-13",
"object": "chat.completion",
"system_fingerprint": "fp_c4e5b6fa31",
"usage": {
"completion_tokens": 10,
"prompt_tokens": 12,
"total_tokens": 22
}
}
## Calling via Request Header

You can also pass tags via the `x-litellm-tags` request header:
```shell
curl -L -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'x-litellm-tags: free,my-custom-tag' \
  -d '{
    "model": "gpt-4",
    "messages": [
      {
        "role": "user",
        "content": "Hey, how'\''s it going 123456?"
      }
    ]
  }'
```
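Note that the header carries multiple tags as a single comma-separated string, which must be split into a tag list before filtering. A minimal sketch of that parsing step (illustrative only, not LiteLLM's actual code):

```python
# Illustrative sketch: turning an "x-litellm-tags" header value into a tag list.
# NOT LiteLLM's actual implementation.

def parse_tags_header(header_value: str) -> list[str]:
    """Split a comma-separated header value into a clean list of tags."""
    return [t.strip() for t in header_value.split(",") if t.strip()]

print(parse_tags_header("free,my-custom-tag"))  # ['free', 'my-custom-tag']
```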
## Setting Default Tags

Use this if you want all untagged requests to be routed to specific deployments.

1. Set a default tag in your yaml:
```yaml
model_list:
  - model_name: fake-openai-endpoint
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
      tags: ["default"] # 👈 Key Change - All untagged requests will get routed to this
    model_info:
      id: "default-model" # used for identifying model in response headers
```
2. Start the proxy:

```shell
$ litellm --config /path/to/config.yaml
```
3. Make a request with no tags:

```shell
curl -i http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "fake-openai-endpoint",
    "messages": [
      {"role": "user", "content": "Hello, Claude gm!"}
    ]
  }'
```
When this works, expect to see the following response header:

```
x-litellm-model-id: default-model
```
## ✨ Team Based Tag Routing (Enterprise)

LiteLLM Proxy supports team-based tag routing, allowing you to associate specific tags with teams and route requests accordingly. Example: Team A can access gpt-4 deployment A, Team B can access gpt-4 deployment B (LLM access control for teams).

This is an enterprise feature; contact us for a free trial.

Here's how to set up and use team-based tag routing with curl commands:
### 1. Enable tag filtering in your proxy configuration

In your `proxy_config.yaml`, make sure you have the following settings:

```yaml
model_list:
  - model_name: fake-openai-endpoint
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
      tags: ["teamA"] # 👈 Key Change
    model_info:
      id: "team-a-model" # used for identifying model in response headers
  - model_name: fake-openai-endpoint
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
      tags: ["teamB"] # 👈 Key Change
    model_info:
      id: "team-b-model" # used for identifying model in response headers
  - model_name: fake-openai-endpoint
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
      tags: ["default"] # OPTIONAL - All untagged requests will get routed to this

router_settings:
  enable_tag_filtering: True # 👈 Key Change

general_settings:
  master_key: sk-1234
```
### 2. Create teams with tags

Use the `/team/new` endpoint to create teams with specific tags:

```shell
# Create Team A
curl -X POST http://0.0.0.0:4000/team/new \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{"tags": ["teamA"]}'

# Create Team B
curl -X POST http://0.0.0.0:4000/team/new \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{"tags": ["teamB"]}'
```

These commands return JSON responses containing the `team_id` of each team.
### 3. Generate keys for team members

Use the `/key/generate` endpoint to create keys associated with specific teams:

```shell
# Generate key for Team A
curl -X POST http://0.0.0.0:4000/key/generate \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{"team_id": "team_a_id_here"}'

# Generate key for Team B
curl -X POST http://0.0.0.0:4000/key/generate \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{"team_id": "team_b_id_here"}'
```

Replace `team_a_id_here` and `team_b_id_here` with the actual team IDs obtained in step 2.
### 4. Verify routing

Check the `x-litellm-model-id` header in the response to confirm that the request was routed to the correct model based on the team's tags. Use curl's `-i` flag to include response headers.

Request with Team A's key (headers included):

```shell
curl -i -X POST http://0.0.0.0:4000/chat/completions \
  -H "Authorization: Bearer team_a_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fake-openai-endpoint",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```

In the response headers, you should see:

```
x-litellm-model-id: team-a-model
```

Similarly, when using Team B's key, you should see:

```
x-litellm-model-id: team-b-model
```

By following these steps, you can set up and test team-based tag routing in your LiteLLM Proxy deployment, ensuring that each team's requests are routed to the appropriate models or deployments based on its assigned tags.
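Putting the pieces together, the team-based flow resolves an API key to its team, the team to its tags, and the tags to a deployment. An end-to-end illustrative sketch — not LiteLLM's actual implementation; all names and data shapes here are assumptions:

```python
# Illustrative end-to-end sketch of team-based tag routing.
# NOT LiteLLM's actual code - all names and structures are assumptions.

teams = {"team_a_id": ["teamA"], "team_b_id": ["teamB"]}       # team_id -> tags
keys = {"team_a_key": "team_a_id", "team_b_key": "team_b_id"}  # api key -> team_id
deployments = [
    {"id": "team-a-model", "tags": ["teamA"]},
    {"id": "team-b-model", "tags": ["teamB"]},
    {"id": "default-model", "tags": ["default"]},
]

def route_for_key(api_key: str) -> str:
    """Resolve a key to its team's tags, then pick a matching deployment id."""
    team_tags = teams[keys[api_key]]
    matches = [d for d in deployments if set(d["tags"]) & set(team_tags)]
    return matches[0]["id"] if matches else "default-model"

print(route_for_key("team_a_key"))  # team-a-model
print(route_for_key("team_b_key"))  # team-b-model
```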