Virtual Keys
Track spend and control model access via the proxy's virtual keys.
Setup
Requirements
- You need a postgres database (e.g. Supabase, Neon, etc.)
- Set DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<dbname> in your environment variables
- Set a master key - this is your Proxy Admin key, and you can use it to create other keys (🚨 it must start with sk-)
  - Set it in config.yaml: put your master key under general_settings:master_key, example below
  - Or set it as an environment variable: set LITELLM_MASTER_KEY
(The proxy Dockerfile checks whether DATABASE_URL is set and then initializes the database connection.)
export DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<dbname>
You can then generate keys by hitting the /key/generate endpoint.
Quick Start - Generate a Key
Step 1: Save your postgres database URL in config.yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: ollama/llama2
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: ollama/llama2

general_settings:
  master_key: sk-1234
  database_url: "postgresql://<user>:<password>@<host>:<port>/<dbname>" # 👈 KEY CHANGE
Step 2: Start litellm
litellm --config /path/to/config.yaml
Step 3: Generate a key
curl 'http://0.0.0.0:4000/key/generate' \
--header 'Authorization: Bearer <your-master-key>' \
--header 'Content-Type: application/json' \
--data-raw '{"models": ["gpt-3.5-turbo", "gpt-4"], "metadata": {"user": "ishaan@berri.ai"}}'
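If you prefer to do this from a script, here is a minimal Python sketch of the same request using the requests library - it assumes the proxy runs at http://0.0.0.0:4000 and that sk-1234 is your master key.

import requests

# Generate a virtual key via the proxy's /key/generate endpoint
resp = requests.post(
    "http://0.0.0.0:4000/key/generate",
    headers={"Authorization": "Bearer sk-1234"},  # your master key
    json={
        "models": ["gpt-3.5-turbo", "gpt-4"],
        "metadata": {"user": "ishaan@berri.ai"},
    },
)
resp.raise_for_status()
print(resp.json()["key"])  # the new virtual key, e.g. "sk-..."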
Spend Tracking
Get spend by:
- Key - via /key/info (Swagger)
- User - via /user/info (Swagger)
- Team - via /team/info (Swagger)
- ⏳ End-users - via /end_user/info - comment on this issue for end-user cost tracking
How is it calculated?
The cost per model is stored here and calculated by the completion_cost function.
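As a rough illustration, the sketch below calls litellm's completion_cost directly - the proxy does the equivalent of this for every tracked request. The keyword arguments shown (model, prompt, completion) are an assumption about the helper's signature; check your litellm version.

from litellm import completion_cost

# Estimate the USD cost of one request/response pair
cost = completion_cost(
    model="gpt-3.5-turbo",
    prompt="What is the capital of France?",
    completion="The capital of France is Paris.",
)
print(f"estimated cost: ${cost:.6f}")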
How is it tracked?
Spend is automatically tracked for the key in the "LiteLLM_VerificationTokenTable". If the key has an attached 'user_id' or 'team_id', the spend for that user is tracked in the "LiteLLM_UserTable", and the spend for that team in the "LiteLLM_TeamTable".
Key Spend
You can get the spend for a key by using the /key/info endpoint.
curl 'http://0.0.0.0:4000/key/info?key=<user-key>' \
-X GET \
-H 'Authorization: Bearer <your-master-key>'
Spend (in USD) is automatically updated when calls to /completions, /chat/completions, /embeddings are made, using litellm's completion_cost() function. See code.
Example response
{
  "key": "sk-tXL0wt5-lOOVK9sfY2UacA",
  "info": {
    "token": "sk-tXL0wt5-lOOVK9sfY2UacA",
    "spend": 0.0001065, # 👈 SPEND
    "expires": "2023-11-24T23:19:11.131000Z",
    "models": [
      "gpt-3.5-turbo",
      "gpt-4",
      "claude-2"
    ],
    "aliases": {
      "mistral-7b": "gpt-3.5-turbo"
    },
    "config": {}
  }
}
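The same lookup as a short Python sketch, assuming a local proxy and the requests library - it just reads the spend field out of the /key/info response shown above.

import requests

# Look up the spend recorded for a specific virtual key
resp = requests.get(
    "http://0.0.0.0:4000/key/info",
    params={"key": "sk-tXL0wt5-lOOVK9sfY2UacA"},  # the virtual key to inspect
    headers={"Authorization": "Bearer sk-1234"},  # your master key
)
resp.raise_for_status()
print(resp.json()["info"]["spend"])  # e.g. 0.0001065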
User Spend
1. Create a user
curl --location 'http://localhost:4000/user/new' \
--header 'Authorization: Bearer <your-master-key>' \
--header 'Content-Type: application/json' \
--data-raw '{"user_email": "krrish@berri.ai"}'
Expected Response
{
  ...
  "expires": "2023-12-22T09:53:13.861000Z",
  "user_id": "my-unique-id", # 👈 unique id
  "max_budget": 0.0
}
2. Create a key for that user
curl 'http://0.0.0.0:4000/key/generate' \
--header 'Authorization: Bearer <your-master-key>' \
--header 'Content-Type: application/json' \
--data-raw '{"models": ["gpt-3.5-turbo", "gpt-4"], "user_id": "my-unique-id"}'
Returns a key - sk-...
3. See the user's spend
curl 'http://0.0.0.0:4000/user/info?user_id=my-unique-id' \
-X GET \
-H 'Authorization: Bearer <your-master-key>'
Expected Response
{
  ...
  "spend": 0 # 👈 SPEND
}
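Putting the three steps together, here is a minimal Python sketch (same assumptions as above: local proxy, requests library, sk-1234 as the master key). It creates a user, issues two keys tied to that user_id, and then reads the user's aggregated spend - spend from any key carrying that user_id rolls up into "LiteLLM_UserTable".

import requests

BASE = "http://0.0.0.0:4000"
HEADERS = {"Authorization": "Bearer sk-1234"}  # master key

# 1. Create the user
user = requests.post(f"{BASE}/user/new", headers=HEADERS,
                     json={"user_email": "krrish@berri.ai"}).json()
user_id = user["user_id"]

# 2. Create two keys owned by that user
for _ in range(2):
    requests.post(f"{BASE}/key/generate", headers=HEADERS,
                  json={"models": ["gpt-3.5-turbo", "gpt-4"], "user_id": user_id})

# 3. Spend made with either key is aggregated on the user
info = requests.get(f"{BASE}/user/info", headers=HEADERS,
                    params={"user_id": user_id}).json()
print(info["spend"])  # per the expected response above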
Team Spend
If you want a key to be owned by multiple people (e.g. a key for a production app), use teams.
1. Create a team
curl --location 'http://localhost:4000/team/new' \
--header 'Authorization: Bearer <your-master-key>' \
--header 'Content-Type: application/json' \
--data-raw '{"team_alias": "my-awesome-team"}'
Expected Response
{
  ...
  "expires": "2023-12-22T09:53:13.861000Z",
  "team_id": "my-unique-id", # 👈 unique id
  "max_budget": 0.0
}
2. Create a key for that team
curl 'http://0.0.0.0:4000/key/generate' \
--header 'Authorization: Bearer <your-master-key>' \
--header 'Content-Type: application/json' \
--data-raw '{"models": ["gpt-3.5-turbo", "gpt-4"], "team_id": "my-unique-id"}'
Returns a key - sk-...
3. See the team's spend
curl 'http://0.0.0.0:4000/team/info?team_id=my-unique-id' \
-X GET \
-H 'Authorization: Bearer <your-master-key>'
Expected Response
{
  ...
  "spend": 0 # 👈 SPEND
}
Model Aliases
If a user is expected to use a given model (i.e. gpt-3.5) and you want to:
- try to upgrade the request (i.e. to GPT-4)
- or downgrade it (i.e. to Mistral)
here's how you can do that:
Step 1: Create a model group in config.yaml (save the model name, api keys, etc.)
model_list:
  - model_name: my-free-tier
    litellm_params:
      model: huggingface/HuggingFaceH4/zephyr-7b-beta
      api_base: http://0.0.0.0:8001
  - model_name: my-free-tier
    litellm_params:
      model: huggingface/HuggingFaceH4/zephyr-7b-beta
      api_base: http://0.0.0.0:8002
  - model_name: my-free-tier
    litellm_params:
      model: huggingface/HuggingFaceH4/zephyr-7b-beta
      api_base: http://0.0.0.0:8003
  - model_name: my-paid-tier
    litellm_params:
      model: gpt-4
      api_key: my-api-key
Step 2: Generate a key
curl -X POST "http://0.0.0.0:4000/key/generate" \
-H "Authorization: Bearer <your-master-key>" \
-H "Content-Type: application/json" \
-d '{
"models": ["my-free-tier"],
"aliases": {"gpt-3.5-turbo": "my-free-tier"}, # 👈 KEY CHANGE
"duration": "30min"
}'
- How do you upgrade / downgrade requests? Change the alias mapping.
Step 3: Test the key
curl -X POST "http://0.0.0.0:4000/chat/completions" \
-H "Authorization: Bearer <user-key>" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "this is a test request, write a short poem"
}
]
}'
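To test the key from Python rather than curl, here is a minimal sketch with the OpenAI SDK (assumes openai>=1.x and a local proxy): point the client at the proxy and pass the virtual key as the api_key. Because of the key's alias mapping, a request for gpt-3.5-turbo is actually served by my-free-tier.

import openai

client = openai.OpenAI(
    api_key="sk-...",                # the virtual key generated in step 2
    base_url="http://0.0.0.0:4000",  # the LiteLLM proxy
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # resolved to "my-free-tier" via the key's alias mapping
    messages=[{"role": "user", "content": "this is a test request, write a short poem"}],
)
print(response.choices[0].message.content)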
Advanced
Pass LiteLLM Key in a custom header
Use this to make the LiteLLM Proxy look for the virtual key in a custom header instead of the default "Authorization" header.
Step 1. Define the litellm_key_header_name in the litellm config.yaml
model_list:
  - model_name: fake-openai-endpoint
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/

general_settings:
  master_key: sk-1234
  litellm_key_header_name: "X-Litellm-Key" # 👈 Key Change
Step 2. Test it
In this request, litellm will use the virtual key passed in the X-Litellm-Key header.
Example with curl:
curl http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-Litellm-Key: Bearer sk-1234" \
-H "Authorization: Bearer bad-key" \
-d '{
"model": "fake-openai-endpoint",
"messages": [
{"role": "user", "content": "Hello, Claude gm!"}
]
}'
Expected Response
Expect a successful response from the litellm proxy, since the key passed in the X-Litellm-Key header is valid.
{"id":"chatcmpl-f9b2b79a7c30477ab93cd0e717d1773e","choices":[{"finish_reason":"stop","index":0,"message":{"content":"\n\nHello there, how may I assist you today?","role":"assistant","tool_calls":null,"function_call":null}}],"created":1677652288,"model":"gpt-3.5-turbo-0125","object":"chat.completion","system_fingerprint":"fp_44709d6fcb","usage":{"completion_tokens":12,"prompt_tokens":9,"total_tokens":21}}
Example with the OpenAI Python SDK:
import openai

API_GATEWAY_TOKEN = "<your-api-gateway-token>"  # placeholder for your own gateway token

client = openai.OpenAI(
    api_key="not-used",
    base_url="https://api-gateway-url.com/llmservc/api/litellmp",
    default_headers={
        "Authorization": f"Bearer {API_GATEWAY_TOKEN}",  # (optional) For your API Gateway
        "X-Litellm-Key": "Bearer sk-1234",               # For LiteLLM Proxy
    },
)
Enable/Disable Virtual Keys
Disable a key
curl -L -X POST 'http://0.0.0.0:4000/key/block' \
-H 'Authorization: Bearer LITELLM_MASTER_KEY' \
-H 'Content-Type: application/json' \
-d '{"key": "KEY-TO-BLOCK"}'
Expected Response
{
  ...
  "blocked": true
}
Enable a key
curl -L -X POST 'http://0.0.0.0:4000/key/unblock' \
-H 'Authorization: Bearer LITELLM_MASTER_KEY' \
-H 'Content-Type: application/json' \
-d '{"key": "KEY-TO-UNBLOCK"}'
{
  ...
  "blocked": false
}
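If you manage this from a script, here is a small Python helper for the two endpoints above - a sketch assuming a local proxy, the requests library, and sk-1234 as the master key.

import requests

BASE = "http://0.0.0.0:4000"
HEADERS = {"Authorization": "Bearer sk-1234"}  # master key

def set_key_blocked(key: str, blocked: bool) -> bool:
    """Block or unblock a virtual key; returns the key's new 'blocked' state."""
    endpoint = "/key/block" if blocked else "/key/unblock"
    resp = requests.post(f"{BASE}{endpoint}", headers=HEADERS, json={"key": key})
    resp.raise_for_status()
    return resp.json()["blocked"]

print(set_key_blocked("KEY-TO-BLOCK", blocked=True))     # True
print(set_key_blocked("KEY-TO-UNBLOCK", blocked=False))  # False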
Custom /key/generate
Use this if you need to add custom logic before generating a Proxy API key (e.g. validating the team_id).
1. Write a custom custom_generate_key_fn
The input to your custom_generate_key_fn is a single parameter: data (Type: GenerateKeyRequest).
The output of your custom_generate_key_fn should be a dictionary with the following structure:
{
  "decision": False,
  "message": "This violates LiteLLM Proxy Rules. No team id provided.",
}
decision (Type: bool): A boolean value indicating whether key generation is allowed (True) or not (False).
message (Type: str, Optional): An optional message providing additional information about the decision. Include this field when decision is False.
from litellm.proxy._types import GenerateKeyRequest  # assumed import path for the request type

async def custom_generate_key_fn(data: GenerateKeyRequest) -> dict:
    """
    Asynchronous function for generating a key based on the input data.

    Args:
        data (GenerateKeyRequest): The input data for key generation.

    Returns:
        dict: A dictionary containing the decision and an optional message.
        {
            "decision": False,
            "message": "This violates LiteLLM Proxy Rules. No team id provided.",
        }
    """
    # decide if a key should be generated or not
    print("using custom auth function!")
    data_json = data.dict()  # pydantic model -> plain dict (use .model_dump() on pydantic v2)

    # Unpacking variables
    team_id = data_json.get("team_id")
    duration = data_json.get("duration")
    models = data_json.get("models")
    aliases = data_json.get("aliases")
    config = data_json.get("config")
    spend = data_json.get("spend")
    user_id = data_json.get("user_id")
    max_parallel_requests = data_json.get("max_parallel_requests")
    metadata = data_json.get("metadata")
    tpm_limit = data_json.get("tpm_limit")
    rpm_limit = data_json.get("rpm_limit")

    if team_id is not None and team_id == "litellm-core-infra@gmail.com":
        # only team_id="litellm-core-infra@gmail.com" can make keys
        return {
            "decision": True,
        }
    else:
        print("Failed custom auth")
        return {
            "decision": False,
            "message": "This violates LiteLLM Proxy Rules. No team id provided.",
        }
2. Pass the filepath (relative to the config.yaml)
Pass the filepath in the config.yaml. E.g. if both files are in the same directory - ./config.yaml and ./custom_auth.py - it looks like this:
model_list:
  - model_name: "openai-model"
    litellm_params:
      model: "gpt-3.5-turbo"

litellm_settings:
  drop_params: True
  set_verbose: True

general_settings:
  custom_key_generate: custom_auth.custom_generate_key_fn
Upperbound /key/generate params
Use this if you need to set default upper bounds for max_budget, budget_duration, or any other key/generate param, per key.
Set litellm_settings:upperbound_key_generate_params:
litellm_settings:
  upperbound_key_generate_params:
    max_budget: 100 # (Optional[float], optional): upperbound of $100, for all /key/generate requests
    budget_duration: "10d" # (Optional[str], optional): upperbound of 10 days for budget_duration values
    duration: "30d" # (Optional[str], optional): upperbound of 30 days for all /key/generate requests
    max_parallel_requests: 1000 # (Optional[int], optional): Max number of requests that can be made in parallel. Defaults to None.
    tpm_limit: 1000 # (Optional[int], optional): Tpm limit. Defaults to None.
    rpm_limit: 1000 # (Optional[int], optional): Rpm limit. Defaults to None.
Expected Behavior
- Send a /key/generate request with max_budget=200
- The key is created with max_budget=100, since 100 is the upper bound (see the sketch below)
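One way to see the clamping is to request a key above the bound and then inspect it via /key/info - a minimal sketch assuming the upperbound config above, a local proxy, the requests library, and sk-1234 as the master key (the exact response fields may vary by litellm version).

import requests

BASE = "http://0.0.0.0:4000"
HEADERS = {"Authorization": "Bearer sk-1234"}  # master key

# Ask for a budget above the configured upper bound of 100
new_key = requests.post(f"{BASE}/key/generate", headers=HEADERS,
                        json={"models": ["gpt-3.5-turbo"], "max_budget": 200}).json()["key"]

# Inspect the stored key - the proxy should have clamped the budget to 100
info = requests.get(f"{BASE}/key/info", headers=HEADERS, params={"key": new_key}).json()
print(info["info"].get("max_budget"))  # expect 100, not 200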
Default /key/generate params
Use this if you need to control the default max_budget or any other key/generate param, per key.
When a /key/generate request does not specify max_budget, the max_budget specified in default_key_generate_params is used.
Set litellm_settings:default_key_generate_params:
litellm_settings:
  default_key_generate_params:
    max_budget: 1.5000
    models: ["azure-gpt-3.5"]
    duration: # blank means `null`
    metadata: {"setting": "default"}
    team_id: "core-infra"
✨ Key Rotations
Rotate an existing API key, while optionally updating its parameters.
curl 'http://localhost:4000/key/sk-1234/regenerate' \
-X POST \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
"max_budget": 100,
"metadata": {
"team": "core-infra"
},
"models": [
"gpt-4",
"gpt-3.5-turbo"
]
}'
Read more
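The same rotation from Python, as a minimal sketch (assumes a local proxy, the requests library, and sk-1234 as the master key; the key being rotated is embedded in the URL path, as in the curl above).

import requests

old_key = "sk-1234"  # the virtual key to rotate

resp = requests.post(
    f"http://0.0.0.0:4000/key/{old_key}/regenerate",
    headers={"Authorization": "Bearer sk-1234"},  # master key
    json={
        "max_budget": 100,
        "metadata": {"team": "core-infra"},
        "models": ["gpt-4", "gpt-3.5-turbo"],
    },
)
resp.raise_for_status()
print(resp.json().get("key"))  # the regenerated key value - the old value stops working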
Temporary Budget Increase
Use the /key/update endpoint to increase the budget of an existing key.
curl -L -X POST 'http://localhost:4000/key/update' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{"key": "sk-b3Z3Lqdb_detHXSUp4ol4Q", "temp_budget_increase": 100, "temp_budget_expiry": "10d"}'
Restricting Key Generation
Use this to control who can generate keys. Useful when letting others create keys on the Admin UI.
litellm_settings:
  key_generation_settings:
    team_key_generation:
      allowed_team_member_roles: ["admin"]
      required_params: ["tags"] # require team admins to set tags for cost-tracking when generating a team key
    personal_key_generation: # maps to 'Default Team' on UI
      allowed_user_roles: ["proxy_admin"]
Spec
key_generation_settings: Optional[StandardKeyGenerationConfig] = None
Types
from __future__ import annotations  # allow the forward references used below

import enum
from typing import List, TypedDict

class StandardKeyGenerationConfig(TypedDict, total=False):
    team_key_generation: TeamUIKeyGenerationConfig
    personal_key_generation: PersonalUIKeyGenerationConfig

class TeamUIKeyGenerationConfig(TypedDict):
    allowed_team_member_roles: List[str]  # either 'user' or 'admin'
    required_params: List[str]  # require params on `/key/generate` to be set if a team key (team_id in request) is being generated

class PersonalUIKeyGenerationConfig(TypedDict):
    allowed_user_roles: List[LitellmUserRoles]
    required_params: List[str]  # require params on `/key/generate` to be set if a personal key (no team_id in request) is being generated

class LitellmUserRoles(str, enum.Enum):
    """
    Admin Roles:
    PROXY_ADMIN: admin over the platform
    PROXY_ADMIN_VIEW_ONLY: can login, view all own keys, view all spend

    ORG_ADMIN: admin over a specific organization, can create teams, users only within their organization

    Internal User Roles:
    INTERNAL_USER: can login, view/create/delete their own keys, view their spend
    INTERNAL_USER_VIEW_ONLY: can login, view their own keys, view their own spend

    Team Roles:
    TEAM: used for JWT auth

    Customer Roles:
    CUSTOMER: External users -> these are customers
    """

    # Admin Roles
    PROXY_ADMIN = "proxy_admin"
    PROXY_ADMIN_VIEW_ONLY = "proxy_admin_viewer"

    # Organization admins
    ORG_ADMIN = "org_admin"

    # Internal User Roles
    INTERNAL_USER = "internal_user"
    INTERNAL_USER_VIEW_ONLY = "internal_user_viewer"

    # Team Roles
    TEAM = "team"

    # Customer Roles - External users of proxy
    CUSTOMER = "customer"
Next Steps - Set Budgets, Rate Limits per Virtual Key
Follow this doc to set budgets and rate limits per virtual key with LiteLLM.