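Virtual Keys

Track spend and control model access via virtual keys for the proxy.

Setup

Requirements

  • You need a postgres database (e.g. Supabase, Neon, etc.)
  • Set DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<dbname> in your environment
  • Set a master key. This is your proxy admin key, which you can use to create other keys (🚨 it must start with sk-).
    • Set it in config.yaml under general_settings:master_key, as in the example below
    • Or set it through the LITELLM_MASTER_KEY environment variable

(The proxy Dockerfile checks whether DATABASE_URL is set and then initializes the DB connection.)

export DATABASE_URL="postgresql://<user>:<password>@<host>:<port>/<dbname>"

You can then generate keys by hitting the /key/generate endpoint.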

Quick Start - Generate a Key

Step 1: Save the postgres database URL

model_list:
  - model_name: gpt-4
    litellm_params:
      model: ollama/llama2
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: ollama/llama2

general_settings:
  master_key: sk-1234
  database_url: "postgresql://<user>:<password>@<host>:<port>/<dbname>" # 👈 KEY CHANGE

Step 2: Start litellm

litellm --config /path/to/config.yaml

Step 3: Generate a key

curl 'http://0.0.0.0:4000/key/generate' \
--header 'Authorization: Bearer <your-master-key>' \
--header 'Content-Type: application/json' \
--data-raw '{"models": ["gpt-3.5-turbo", "gpt-4"], "metadata": {"user": "ishaan@berri.ai"}}'
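The same request can be prepared from Python. The sketch below uses only the standard library and builds the request without sending it; the helper name build_key_generate_request is made up for this example, and the URL, master key, and metadata are the placeholder values from the curl call above.

```python
import json
import urllib.request


def build_key_generate_request(base_url, master_key, models, metadata=None):
    """Build (but do not send) a POST request for the /key/generate endpoint."""
    payload = {"models": models, "metadata": metadata or {}}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/key/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_key_generate_request(
        "http://0.0.0.0:4000",
        "sk-1234",  # placeholder master key from the config above
        ["gpt-3.5-turbo", "gpt-4"],
        {"user": "ishaan@berri.ai"},
    )
    print(req.get_method(), req.full_url)
    # Sending needs a running proxy, so it is left commented out:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```

Keeping request construction separate from sending makes the payload easy to inspect before it reaches the proxy.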

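Spend Tracking

Get spend per...

How is it calculated?

The cost per model is stored in LiteLLM's model cost map and is calculated by the completion_cost function.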
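As a rough illustration of the arithmetic, the sketch below multiplies token counts by per-token prices. It is not the completion_cost implementation: the model name and prices are invented for the example, and the field names input_cost_per_token and output_cost_per_token are assumed to match the ones used in the cost map.

```python
# Hypothetical per-token prices in USD; real values come from LiteLLM's cost map.
EXAMPLE_PRICES = {
    "example-model": {
        "input_cost_per_token": 0.0000015,
        "output_cost_per_token": 0.000002,
    },
}


def estimate_cost(model, prompt_tokens, completion_tokens, prices=EXAMPLE_PRICES):
    """Return an estimated cost in USD for one request."""
    entry = prices[model]
    return (
        prompt_tokens * entry["input_cost_per_token"]
        + completion_tokens * entry["output_cost_per_token"]
    )


if __name__ == "__main__":
    # 1000 prompt tokens and 500 completion tokens on the made-up model
    print(f"{estimate_cost('example-model', 1000, 500):.6f}")
```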

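How is it tracked?

Spend is tracked automatically for the key in "LiteLLM_VerificationTokenTable". If the key has a 'user_id' or 'team_id' attached, the user's spend is tracked in "LiteLLM_UserTable" and the team's spend in "LiteLLM_TeamTable".

You can get the spend for a key with the /key/info endpoint.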

curl 'http://0.0.0.0:4000/key/info?key=<user-key>' \
-X GET \
-H 'Authorization: Bearer <your-master-key>'

Spend (in USD) is updated automatically when /completions, /chat/completions, or /embeddings is called; the amount is computed with litellm's completion_cost() function.

Sample response

{
    "key": "sk-tXL0wt5-lOOVK9sfY2UacA",
    "info": {
        "token": "sk-tXL0wt5-lOOVK9sfY2UacA",
        "spend": 0.0001065, # 👈 SPEND
        "expires": "2023-11-24T23:19:11.131000Z",
        "models": [
            "gpt-3.5-turbo",
            "gpt-4",
            "claude-2"
        ],
        "aliases": {
            "mistral-7b": "gpt-3.5-turbo"
        },
        "config": {}
    }
}
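To read the spend programmatically, a client only needs to build the /key/info URL and pick info.spend out of the JSON body. The sketch below assumes the response shape shown in the sample; the helper names are hypothetical and nothing is sent over the network.

```python
import json
from urllib.parse import urlencode


def key_info_url(base_url, key):
    """Build the GET URL for /key/info with the key as a query parameter."""
    query = urlencode({"key": key})
    return f"{base_url.rstrip('/')}/key/info?{query}"


def spend_from_key_info(body):
    """Extract the spend (USD) from a /key/info JSON response body."""
    return float(json.loads(body)["info"]["spend"])


if __name__ == "__main__":
    # Trimmed copy of the sample response, without the inline annotation
    sample = '{"key": "sk-tXL0wt5-lOOVK9sfY2UacA", "info": {"spend": 0.0001065}}'
    print(key_info_url("http://0.0.0.0:4000", "sk-tXL0wt5-lOOVK9sfY2UacA"))
    print(spend_from_key_info(sample))
```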

Model Aliases

If a user is expected to use a given model (e.g. gpt-3.5), and you want to:

  • try to upgrade the request (e.g. to GPT-4)
  • or downgrade it (e.g. to Mistral)

here is how you can do that:

Step 1: Create a model group in config.yaml (save the model name, API keys, etc.)

model_list:
  - model_name: my-free-tier
    litellm_params:
      model: huggingface/HuggingFaceH4/zephyr-7b-beta
      api_base: http://0.0.0.0:8001
  - model_name: my-free-tier
    litellm_params:
      model: huggingface/HuggingFaceH4/zephyr-7b-beta
      api_base: http://0.0.0.0:8002
  - model_name: my-free-tier
    litellm_params:
      model: huggingface/HuggingFaceH4/zephyr-7b-beta
      api_base: http://0.0.0.0:8003
  - model_name: my-paid-tier
    litellm_params:
      model: gpt-4
      api_key: my-api-key

Step 2: Generate a key

curl -X POST "http://0.0.0.0:4000/key/generate" \
-H "Authorization: Bearer <your-master-key>" \
-H "Content-Type: application/json" \
-d '{
    "models": ["my-free-tier"],
    "aliases": {"gpt-3.5-turbo": "my-free-tier"},
    "duration": "30min"
}'

The "aliases" field is the key change: it maps the model name the user requests to the model group that serves it.

  • How do you upgrade or downgrade requests? Change the alias mapping.

Step 3: Test the key

curl -X POST "http://0.0.0.0:4000/chat/completions" \
-H "Authorization: Bearer <user-key>" \
-H "Content-Type: application/json" \
-d '{
    "model": "gpt-3.5-turbo",
    "messages": [
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ]
}'
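Conceptually, an alias is a per-key lookup applied to the requested model name before routing. The sketch below illustrates that idea only; resolve_model is a hypothetical helper, and checking the allowed-model list after the alias lookup is an assumption, not a description of LiteLLM's internals.

```python
def resolve_model(requested, aliases, allowed_models):
    """Map the requested model through the key's aliases, then check access."""
    target = aliases.get(requested, requested)
    # Assumption: an empty allow-list means the key is not restricted.
    if allowed_models and target not in allowed_models:
        raise PermissionError(f"key is not allowed to call {target!r}")
    return target


if __name__ == "__main__":
    # Downgrade: requests for gpt-3.5-turbo are served by the free tier.
    print(resolve_model("gpt-3.5-turbo", {"gpt-3.5-turbo": "my-free-tier"}, ["my-free-tier"]))
    # Upgrade: the same request is pointed at the paid tier by changing the mapping.
    print(resolve_model("gpt-3.5-turbo", {"gpt-3.5-turbo": "my-paid-tier"}, ["my-paid-tier"]))
```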

Advanced

Pass the LiteLLM key in a custom header

Use this to make the LiteLLM proxy look for the virtual key in a custom header instead of the default "Authorization" header.

Step 1: Define litellm_key_header_name in the litellm config.yaml

model_list:
  - model_name: fake-openai-endpoint
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/

general_settings:
  master_key: sk-1234
  litellm_key_header_name: "X-Litellm-Key" # 👈 Key Change

Step 2: Test it

In this request, litellm will use the virtual key in the X-Litellm-Key header.

curl http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-Litellm-Key: Bearer sk-1234" \
-H "Authorization: Bearer bad-key" \
-d '{
"model": "fake-openai-endpoint",
"messages": [
{"role": "user", "content": "Hello, Claude gm!"}
]
}'

Expected response

Since the key passed in X-Litellm-Key is valid, expect a successful response from the litellm proxy.

{"id":"chatcmpl-f9b2b79a7c30477ab93cd0e717d1773e","choices":[{"finish_reason":"stop","index":0,"message":{"content":"\n\nHello there, how may I assist you today?","role":"assistant","tool_calls":null,"function_call":null}}],"created":1677652288,"model":"gpt-3.5-turbo-0125","object":"chat.completion","system_fingerprint":"fp_44709d6fcb","usage":{"completion_tokens":12,"prompt_tokens":9,"total_tokens":21}}
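The lookup can be pictured as reading one configured header and ignoring the rest. The sketch below is an illustration under that assumption; extract_virtual_key is a hypothetical helper, not the proxy's actual code, and stripping an optional "Bearer " prefix is likewise assumed.

```python
def extract_virtual_key(headers, key_header_name="Authorization"):
    """Return the virtual key found in the configured header, or None."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get(key_header_name.lower())
    if raw is None:
        return None
    raw = raw.strip()
    if raw.lower().startswith("bearer "):
        raw = raw[len("bearer "):].strip()
    return raw or None


if __name__ == "__main__":
    request_headers = {
        "X-Litellm-Key": "Bearer sk-1234",
        "Authorization": "Bearer bad-key",
    }
    # With the custom header configured, the Authorization value is not consulted.
    print(extract_virtual_key(request_headers, "X-Litellm-Key"))
```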

Enable/Disable Virtual Keys

Disable keys

curl -L -X POST 'http://0.0.0.0:4000/key/block' \
-H 'Authorization: Bearer LITELLM_MASTER_KEY' \
-H 'Content-Type: application/json' \
-d '{"key": "KEY-TO-BLOCK"}'

Expected response

{
...
"blocked": true
}

Enable keys

curl -L -X POST 'http://0.0.0.0:4000/key/unblock' \
-H 'Authorization: Bearer LITELLM_MASTER_KEY' \
-H 'Content-Type: application/json' \
-d '{"key": "KEY-TO-UNBLOCK"}'

Expected response

{
...
"blocked": false
}
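Both calls differ only in the endpoint, so a script can toggle a key with one helper. The sketch below builds the request with the standard library and does not send it; build_block_request is a hypothetical name, and the master key and target key are placeholders.

```python
import json
import urllib.request


def build_block_request(base_url, master_key, key, block=True):
    """Build (but do not send) a POST to /key/block or /key/unblock."""
    endpoint = "block" if block else "unblock"
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/key/{endpoint}",
        data=json.dumps({"key": key}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_block_request("http://0.0.0.0:4000", "LITELLM_MASTER_KEY", "KEY-TO-BLOCK")
    print(req.get_method(), req.full_url)
```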

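Custom /key/generate

Use this if you need to add custom logic before generating a proxy API key (e.g. validating team_id).

1. Write a custom custom_generate_key_fn

The custom_generate_key_fn function takes a single parameter: data (type: GenerateKeyRequest).

The output of your custom_generate_key_fn should be a dictionary with the following structure:

{
    "decision": False,
    "message": "This violates LiteLLM Proxy Rules. No team id provided.",
}

  • decision (type: bool): A boolean indicating whether key generation is allowed (True) or not (False).

  • message (type: str, optional): An optional message with additional information about the decision. Include this field when decision is False.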

from litellm.proxy._types import GenerateKeyRequest


async def custom_generate_key_fn(data: GenerateKeyRequest) -> dict:
    """
    Asynchronous function for generating a key based on the input data.

    Args:
        data (GenerateKeyRequest): The input data for key generation.

    Returns:
        dict: A dictionary containing the decision and an optional message.
        {
            "decision": False,
            "message": "This violates LiteLLM Proxy Rules. No team id provided.",
        }
    """

    # decide if a key should be generated or not
    print("using custom auth function!")
    data_json = data.json()  # type: ignore

    # Unpacking variables
    team_id = data_json.get("team_id")
    duration = data_json.get("duration")
    models = data_json.get("models")
    aliases = data_json.get("aliases")
    config = data_json.get("config")
    spend = data_json.get("spend")
    user_id = data_json.get("user_id")
    max_parallel_requests = data_json.get("max_parallel_requests")
    metadata = data_json.get("metadata")
    tpm_limit = data_json.get("tpm_limit")
    rpm_limit = data_json.get("rpm_limit")

    if team_id is not None and team_id == "litellm-core-infra@gmail.com":
        # only team_id="litellm-core-infra@gmail.com" can make keys
        return {
            "decision": True,
        }
    else:
        print("Failed custom auth")
        return {
            "decision": False,
            "message": "This violates LiteLLM Proxy Rules. No team id provided.",
        }
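Before wiring a rule like this into the proxy, it can be exercised locally. The sketch below is a reduced copy of the team_id check from the function above, run against a stand-in request object; StubGenerateKeyRequest and team_only_rule are hypothetical names, and the stub only mimics a json() method that returns a dict.

```python
import asyncio


class StubGenerateKeyRequest:
    """Hypothetical stand-in for GenerateKeyRequest; only .json() is mimicked."""

    def __init__(self, **fields):
        self._fields = fields

    def json(self):
        return dict(self._fields)


async def team_only_rule(data):
    """Allow key generation only for one specific team_id."""
    team_id = data.json().get("team_id")
    if team_id == "litellm-core-infra@gmail.com":
        return {"decision": True}
    return {
        "decision": False,
        "message": "This violates LiteLLM Proxy Rules. No team id provided.",
    }


allowed = asyncio.run(
    team_only_rule(StubGenerateKeyRequest(team_id="litellm-core-infra@gmail.com"))
)
denied = asyncio.run(team_only_rule(StubGenerateKeyRequest(user_id="some-user")))
print(allowed)
print(denied)
```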

2. Pass the file path (relative to config.yaml)

Pass the file path of the function in config.yaml.

For example, if both files are in the same directory (./config.yaml and ./custom_auth.py), it looks like this:

model_list:
  - model_name: "openai-model"
    litellm_params:
      model: "gpt-3.5-turbo"

litellm_settings:
  drop_params: True
  set_verbose: True

general_settings:
  custom_key_generate: custom_auth.custom_generate_key_fn

Upper bounds for /key/generate params

Use this if you need to set default upper bounds for max_budget, budget_duration, or any other /key/generate param per key.

Set litellm_settings:upperbound_key_generate_params

litellm_settings:
  upperbound_key_generate_params:
    max_budget: 100 # (Optional[float], optional): upper bound of $100 for all /key/generate requests
    budget_duration: "10d" # (Optional[str], optional): upper bound of 10 days for budget_duration values
    duration: "30d" # (Optional[str], optional): upper bound of 30 days for all /key/generate requests
    max_parallel_requests: 1000 # (Optional[int], optional): max number of requests that can be made in parallel. Defaults to None.
    tpm_limit: 1000 # (Optional[int], optional): TPM limit. Defaults to None.
    rpm_limit: 1000 # (Optional[int], optional): RPM limit. Defaults to None.

Expected behavior

  • Send a /key/generate request with max_budget=200
  • The key is created with max_budget=100, since 100 is the upper bound
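The expected behavior amounts to capping each requested value at its configured bound. The sketch below shows that for numeric params only; apply_upper_bounds is a hypothetical helper, duration strings such as "10d" are left untouched, and leaving unspecified params alone is an assumption rather than documented behavior.

```python
def apply_upper_bounds(params, upper_bounds):
    """Return a copy of params with numeric values capped at their upper bound."""
    capped = dict(params)
    for name, bound in upper_bounds.items():
        value = capped.get(name)
        # Only numeric params are compared; string durations are skipped here.
        if isinstance(bound, (int, float)) and isinstance(value, (int, float)):
            capped[name] = min(value, bound)
    return capped


if __name__ == "__main__":
    bounds = {"max_budget": 100, "tpm_limit": 1000, "rpm_limit": 1000}
    print(apply_upper_bounds({"max_budget": 200}, bounds))
```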

Default /key/generate params

Use this if you need to control the default max_budget or any other /key/generate param per key.

When a /key/generate request does not specify max_budget, the max_budget set in default_key_generate_params is used.

Set litellm_settings:default_key_generate_params

litellm_settings:
  default_key_generate_params:
    max_budget: 1.5000
    models: ["azure-gpt-3.5"]
    duration: # blank means `null`
    metadata: {"setting":"default"}
    team_id: "core-infra"
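In other words, defaults only fill in what the request leaves out. A minimal sketch of that merge follows; apply_defaults is a hypothetical helper, and treating an explicit null the same as a missing param is an assumption.

```python
def apply_defaults(params, defaults):
    """Return a copy of params where missing or None values take the default."""
    merged = dict(params)
    for name, default in defaults.items():
        if merged.get(name) is None:
            merged[name] = default
    return merged


if __name__ == "__main__":
    defaults = {
        "max_budget": 1.5,
        "models": ["azure-gpt-3.5"],
        "duration": None,  # blank in the YAML above
        "metadata": {"setting": "default"},
        "team_id": "core-infra",
    }
    # The request sets models explicitly, so only the other params are filled in.
    print(apply_defaults({"models": ["gpt-4"]}, defaults))
```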

✨ Key Rotations

Info

Rotate an existing API key, while optionally updating its parameters.

curl 'http://localhost:4000/key/sk-1234/regenerate' \
-X POST \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
"max_budget": 100,
"metadata": {
"team": "core-infra"
},
"models": [
"gpt-4",
"gpt-3.5-turbo"
]
}'

Read more

👉 API reference docs

Temporary budget increase

Use the /key/update endpoint to increase the budget of an existing key.

curl -L -X POST 'http://localhost:4000/key/update' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{"key": "sk-b3Z3Lqdb_detHXSUp4ol4Q", "temp_budget_increase": 100, "temp_budget_expiry": "10d"}'

API reference

Restricting key generation

Use this to control who can generate keys. Useful when letting others create keys on the UI.

litellm_settings:
  key_generation_settings:
    team_key_generation:
      allowed_team_member_roles: ["admin"]
      required_params: ["tags"] # require team admins to set tags for cost-tracking when generating a team key
    personal_key_generation: # maps to 'Default Team' on UI
      allowed_user_roles: ["proxy_admin"]

Spec

key_generation_settings: Optional[StandardKeyGenerationConfig] = None

Types

from __future__ import annotations  # lets the classes reference types defined further down

import enum
from typing import List, TypedDict


class StandardKeyGenerationConfig(TypedDict, total=False):
    team_key_generation: TeamUIKeyGenerationConfig
    personal_key_generation: PersonalUIKeyGenerationConfig


class TeamUIKeyGenerationConfig(TypedDict):
    allowed_team_member_roles: List[str]  # either 'user' or 'admin'
    required_params: List[str]  # require params on `/key/generate` to be set if a team key (team_id in request) is being generated


class PersonalUIKeyGenerationConfig(TypedDict):
    allowed_user_roles: List[LitellmUserRoles]
    required_params: List[str]  # require params on `/key/generate` to be set if a personal key (no team_id in request) is being generated


class LitellmUserRoles(str, enum.Enum):
    """
    Admin Roles:
    PROXY_ADMIN: admin over the platform
    PROXY_ADMIN_VIEW_ONLY: can login, view all own keys, view all spend
    ORG_ADMIN: admin over a specific organization, can create teams, users only within their organization

    Internal User Roles:
    INTERNAL_USER: can login, view/create/delete their own keys, view their spend
    INTERNAL_USER_VIEW_ONLY: can login, view their own keys, view their own spend

    Team Roles:
    TEAM: used for JWT auth

    Customer Roles:
    CUSTOMER: External users -> these are customers
    """

    # Admin Roles
    PROXY_ADMIN = "proxy_admin"
    PROXY_ADMIN_VIEW_ONLY = "proxy_admin_viewer"

    # Organization admins
    ORG_ADMIN = "org_admin"

    # Internal User Roles
    INTERNAL_USER = "internal_user"
    INTERNAL_USER_VIEW_ONLY = "internal_user_viewer"

    # Team Roles
    TEAM = "team"

    # Customer Roles - External users of proxy
    CUSTOMER = "customer"
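To make the team_key_generation settings concrete, the sketch below checks a request against them: the caller's team role must be allowed, and every required param must be present. check_team_key_generation is a hypothetical helper written for this illustration and does not reproduce the proxy's own enforcement code.

```python
def check_team_key_generation(member_role, request_params, team_settings):
    """Return (allowed, reason) for a team key request under the given settings."""
    allowed_roles = team_settings.get("allowed_team_member_roles")
    if allowed_roles is not None and member_role not in allowed_roles:
        return False, f"role {member_role!r} may not generate team keys"
    missing = [
        name
        for name in team_settings.get("required_params", [])
        if request_params.get(name) is None
    ]
    if missing:
        return False, "missing required params: " + ", ".join(missing)
    return True, ""


if __name__ == "__main__":
    settings = {"allowed_team_member_roles": ["admin"], "required_params": ["tags"]}
    print(check_team_key_generation("admin", {"team_id": "t1", "tags": ["prod"]}, settings))
    print(check_team_key_generation("user", {"team_id": "t1", "tags": ["prod"]}, settings))
    print(check_team_key_generation("admin", {"team_id": "t1"}, settings))
```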

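Next steps - Set budgets and rate limits per virtual key

Follow this doc to set budgets and rate limiters per virtual key with LiteLLM.

Endpoint Reference (Spec)

Keys

👉 API reference docs

Users

👉 API reference docs

Teams

👉 API reference docs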