Virtual Keys
Track spend and control model access via the proxy's virtual keys.
Setup
Requirements
- You need a postgres database (e.g. Supabase, Neon, etc.)
- Set DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<dbname> in your environment variables
- Set a master key - this is your Proxy Admin key, and you can use it to create other keys (🚨 it must start with sk-)
  - Set it in config.yaml: put your master key under general_settings:master_key, example below
  - Or set it as an environment variable: set LITELLM_MASTER_KEY
(The proxy Dockerfile checks whether DATABASE_URL is set and then initializes the database connection.)
export DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<dbname>
You can then generate keys by hitting the /key/generate endpoint.
Quick Start - Generate a Key
Step 1: Save your postgres database URL in config.yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: ollama/llama2
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: ollama/llama2

general_settings:
  master_key: sk-1234
  database_url: "postgresql://<user>:<password>@<host>:<port>/<dbname>" # 👈 KEY CHANGE
Step 2: Start litellm
litellm --config /path/to/config.yaml
Step 3: Generate a key
curl 'http://0.0.0.0:4000/key/generate' \
--header 'Authorization: Bearer <your-master-key>' \
--header 'Content-Type: application/json' \
--data-raw '{"models": ["gpt-3.5-turbo", "gpt-4"], "metadata": {"user": "ishaan@berri.ai"}}'
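If you prefer to do this from a script, here is a minimal Python sketch of the same request using the requests library - it assumes the proxy runs at http://0.0.0.0:4000 and that sk-1234 is your master key.

import requests

# Generate a virtual key via the proxy's /key/generate endpoint
resp = requests.post(
    "http://0.0.0.0:4000/key/generate",
    headers={"Authorization": "Bearer sk-1234"},  # your master key
    json={
        "models": ["gpt-3.5-turbo", "gpt-4"],
        "metadata": {"user": "ishaan@berri.ai"},
    },
)
resp.raise_for_status()
print(resp.json()["key"])  # the new virtual key, e.g. "sk-..."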
Spend Tracking
Get spend by:
- Key - via /key/info (Swagger)
- User - via /user/info (Swagger)
- Team - via /team/info (Swagger)
- ⏳ End-users - via /end_user/info - comment on this issue for end-user cost tracking
How is it calculated?
The cost per model is stored here and calculated by the completion_cost function.
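As a rough illustration, the sketch below calls litellm's completion_cost directly - the proxy does the equivalent of this for every tracked request. The keyword arguments shown (model, prompt, completion) are an assumption about the helper's signature; check your litellm version.

from litellm import completion_cost

# Estimate the USD cost of one request/response pair
cost = completion_cost(
    model="gpt-3.5-turbo",
    prompt="What is the capital of France?",
    completion="The capital of France is Paris.",
)
print(f"estimated cost: ${cost:.6f}")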
How is it tracked?
Spend is automatically tracked for the key in the "LiteLLM_VerificationTokenTable". If the key has an attached 'user_id' or 'team_id', the spend for that user is tracked in the "LiteLLM_UserTable", and the spend for that team in the "LiteLLM_TeamTable".
Key Spend
You can get the spend for a key by using the /key/info endpoint.
curl 'http://0.0.0.0:4000/key/info?key=<user-key>' \
-X GET \
-H 'Authorization: Bearer <your-master-key>'
Spend (in USD) is automatically updated when calls to /completions, /chat/completions, /embeddings are made, using litellm's completion_cost() function. See code.
Example response
{
  "key": "sk-tXL0wt5-lOOVK9sfY2UacA",
  "info": {
    "token": "sk-tXL0wt5-lOOVK9sfY2UacA",
    "spend": 0.0001065, # 👈 SPEND
    "expires": "2023-11-24T23:19:11.131000Z",
    "models": [
      "gpt-3.5-turbo",
      "gpt-4",
      "claude-2"
    ],
    "aliases": {
      "mistral-7b": "gpt-3.5-turbo"
    },
    "config": {}
  }
}
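The same lookup as a short Python sketch, assuming a local proxy and the requests library - it just reads the spend field out of the /key/info response shown above.

import requests

# Look up the spend recorded for a specific virtual key
resp = requests.get(
    "http://0.0.0.0:4000/key/info",
    params={"key": "sk-tXL0wt5-lOOVK9sfY2UacA"},  # the virtual key to inspect
    headers={"Authorization": "Bearer sk-1234"},  # your master key
)
resp.raise_for_status()
print(resp.json()["info"]["spend"])  # e.g. 0.0001065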
User Spend
1. Create a user
curl --location 'http://localhost:4000/user/new' \
--header 'Authorization: Bearer <your-master-key>' \
--header 'Content-Type: application/json' \
--data-raw '{"user_email": "krrish@berri.ai"}'
Expected Response
{
  ...
  "expires": "2023-12-22T09:53:13.861000Z",
  "user_id": "my-unique-id", # 👈 unique id
  "max_budget": 0.0
}
2. Create a key for that user
curl 'http://0.0.0.0:4000/key/generate' \
--header 'Authorization: Bearer <your-master-key>' \
--header 'Content-Type: application/json' \
--data-raw '{"models": ["gpt-3.5-turbo", "gpt-4"], "user_id": "my-unique-id"}'
Returns a key - sk-...
3. See the user's spend
curl 'http://0.0.0.0:4000/user/info?user_id=my-unique-id' \
-X GET \
-H 'Authorization: Bearer <your-master-key>'
Expected Response
{
  ...
  "spend": 0 # 👈 SPEND
}
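Putting the three steps together, here is a minimal Python sketch (same assumptions as above: local proxy, requests library, sk-1234 as the master key). It creates a user, issues two keys tied to that user_id, and then reads the user's aggregated spend - spend from any key carrying that user_id rolls up into "LiteLLM_UserTable".

import requests

BASE = "http://0.0.0.0:4000"
HEADERS = {"Authorization": "Bearer sk-1234"}  # master key

# 1. Create the user
user = requests.post(f"{BASE}/user/new", headers=HEADERS,
                     json={"user_email": "krrish@berri.ai"}).json()
user_id = user["user_id"]

# 2. Create two keys owned by that user
for _ in range(2):
    requests.post(f"{BASE}/key/generate", headers=HEADERS,
                  json={"models": ["gpt-3.5-turbo", "gpt-4"], "user_id": user_id})

# 3. Spend made with either key is aggregated on the user
info = requests.get(f"{BASE}/user/info", headers=HEADERS,
                    params={"user_id": user_id}).json()
print(info["spend"])  # per the expected response above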
Team Spend
If you want a key to be owned by multiple people (e.g. a key for a production app), use teams.
1. Create a team
curl --location 'http://localhost:4000/team/new' \
--header 'Authorization: Bearer <your-master-key>' \
--header 'Content-Type: application/json' \
--data-raw '{"team_alias": "my-awesome-team"}'
Expected Response
{
  ...
  "expires": "2023-12-22T09:53:13.861000Z",
  "team_id": "my-unique-id", # 👈 unique id
  "max_budget": 0.0
}
2. Create a key for that team
curl 'http://0.0.0.0:4000/key/generate' \
--header 'Authorization: Bearer <your-master-key>' \
--header 'Content-Type: application/json' \
--data-raw '{"models": ["gpt-3.5-turbo", "gpt-4"], "team_id": "my-unique-id"}'
Returns a key - sk-...
3. See the team's spend
curl 'http://0.0.0.0:4000/team/info?team_id=my-unique-id' \
-X GET \
-H 'Authorization: Bearer <your-master-key>'
Expected Response
{
  ...
  "spend": 0 # 👈 SPEND
}
Model Aliases
If a user is expected to use a given model (i.e. gpt-3.5) and you want to:
- try to upgrade the request (i.e. to GPT-4)
- or downgrade it (i.e. to Mistral)
here's how you can do that:
Step 1: Create a model group in config.yaml (save the model name, api keys, etc.)
model_list:
  - model_name: my-free-tier
    litellm_params:
      model: huggingface/HuggingFaceH4/zephyr-7b-beta
      api_base: http://0.0.0.0:8001
  - model_name: my-free-tier
    litellm_params:
      model: huggingface/HuggingFaceH4/zephyr-7b-beta
      api_base: http://0.0.0.0:8002
  - model_name: my-free-tier
    litellm_params:
      model: huggingface/HuggingFaceH4/zephyr-7b-beta
      api_base: http://0.0.0.0:8003
  - model_name: my-paid-tier
    litellm_params:
      model: gpt-4
      api_key: my-api-key
Step 2: Generate a key
curl -X POST "http://0.0.0.0:4000/key/generate" \
-H "Authorization: Bearer <your-master-key>" \
-H "Content-Type: application/json" \
-d '{
"models": ["my-free-tier"],
"aliases": {"gpt-3.5-turbo": "my-free-tier"}, # 👈 KEY CHANGE
"duration": "30min"
}'
- How do you upgrade / downgrade requests? Change the alias mapping.
Step 3: Test the key
curl -X POST "http://0.0.0.0:4000/chat/completions" \
-H "Authorization: Bearer <user-key>" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "this is a test request, write a short poem"
}
]
}'
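To test the key from Python rather than curl, here is a minimal sketch with the OpenAI SDK (assumes openai>=1.x and a local proxy): point the client at the proxy and pass the virtual key as the api_key. Because of the key's alias mapping, a request for gpt-3.5-turbo is actually served by my-free-tier.

import openai

client = openai.OpenAI(
    api_key="sk-...",                # the virtual key generated in step 2
    base_url="http://0.0.0.0:4000",  # the LiteLLM proxy
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # resolved to "my-free-tier" via the key's alias mapping
    messages=[{"role": "user", "content": "this is a test request, write a short poem"}],
)
print(response.choices[0].message.content)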
Advanced
Pass LiteLLM Key in a custom header
Use this to make the LiteLLM Proxy look for the virtual key in a custom header instead of the default "Authorization" header.
Step 1. Define the litellm_key_header_name in the litellm config.yaml
model_list:
  - model_name: fake-openai-endpoint
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/

general_settings:
  master_key: sk-1234
  litellm_key_header_name: "X-Litellm-Key" # 👈 Key Change
Step 2. Test it
In this request, litellm will use the virtual key passed in the X-Litellm-Key header.
Example with curl:
curl http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-Litellm-Key: Bearer sk-1234" \
-H "Authorization: Bearer bad-key" \
-d '{
"model": "fake-openai-endpoint",
"messages": [
{"role": "user", "content": "Hello, Claude gm!"}
]
}'
Expected Response
Expect a successful response from the litellm proxy, since the key passed in the X-Litellm-Key header is valid.
{"id":"chatcmpl-f9b2b79a7c30477ab93cd0e717d1773e","choices":[{"finish_reason":"stop","index":0,"message":{"content":"\n\nHello there, how may I assist you today?","role":"assistant","tool_calls":null,"function_call":null}}],"created":1677652288,"model":"gpt-3.5-turbo-0125","object":"chat.completion","system_fingerprint":"fp_44709d6fcb","usage":{"completion_tokens":12,"prompt_tokens":9,"total_tokens":21}}
Example with the OpenAI Python SDK:
import openai

API_GATEWAY_TOKEN = "<your-api-gateway-token>"  # placeholder for your own gateway token

client = openai.OpenAI(
    api_key="not-used",
    base_url="https://api-gateway-url.com/llmservc/api/litellmp",
    default_headers={
        "Authorization": f"Bearer {API_GATEWAY_TOKEN}",  # (optional) For your API Gateway
        "X-Litellm-Key": "Bearer sk-1234",               # For LiteLLM Proxy
    },
)
Enable/Disable Virtual Keys
Disable a key
curl -L -X POST 'http://0.0.0.0:4000/key/block' \
-H 'Authorization: Bearer LITELLM_MASTER_KEY' \
-H 'Content-Type: application/json' \
-d '{"key": "KEY-TO-BLOCK"}'
Expected Response
{
  ...
  "blocked": true
}
Enable a key
curl -L -X POST 'http://0.0.0.0:4000/key/unblock' \
-H 'Authorization: Bearer LITELLM_MASTER_KEY' \
-H 'Content-Type: application/json' \
-d '{"key": "KEY-TO-UNBLOCK"}'
{
  ...
  "blocked": false
}
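If you manage this from a script, here is a small Python helper for the two endpoints above - a sketch assuming a local proxy, the requests library, and sk-1234 as the master key.

import requests

BASE = "http://0.0.0.0:4000"
HEADERS = {"Authorization": "Bearer sk-1234"}  # master key

def set_key_blocked(key: str, blocked: bool) -> bool:
    """Block or unblock a virtual key; returns the key's new 'blocked' state."""
    endpoint = "/key/block" if blocked else "/key/unblock"
    resp = requests.post(f"{BASE}{endpoint}", headers=HEADERS, json={"key": key})
    resp.raise_for_status()
    return resp.json()["blocked"]

print(set_key_blocked("KEY-TO-BLOCK", blocked=True))     # True
print(set_key_blocked("KEY-TO-UNBLOCK", blocked=False))  # False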
Custom /key/generate
Use this if you need to add custom logic before generating a Proxy API key (e.g. validating the team_id).
1. Write a custom custom_generate_key_fn
The input to your custom_generate_key_fn is a single parameter: data (Type: GenerateKeyRequest).
The output of your custom_generate_key_fn should be a dictionary with the following structure:
{
  "decision": False,
  "message": "This violates LiteLLM Proxy Rules. No team id provided.",
}
decision (Type: bool): A boolean value indicating whether key generation is allowed (True) or not (False).
message (Type: str, Optional): An optional message providing additional information about the decision. Include this field when decision is False.
from litellm.proxy._types import GenerateKeyRequest  # assumed import path for the request type

async def custom_generate_key_fn(data: GenerateKeyRequest) -> dict:
    """
    Asynchronous function for generating a key based on the input data.

    Args:
        data (GenerateKeyRequest): The input data for key generation.

    Returns:
        dict: A dictionary containing the decision and an optional message.
        {
            "decision": False,
            "message": "This violates LiteLLM Proxy Rules. No team id provided.",
        }
    """
    # decide if a key should be generated or not
    print("using custom auth function!")
    data_json = data.dict()  # pydantic model -> plain dict (use .model_dump() on pydantic v2)

    # Unpacking variables
    team_id = data_json.get("team_id")
    duration = data_json.get("duration")
    models = data_json.get("models")
    aliases = data_json.get("aliases")
    config = data_json.get("config")
    spend = data_json.get("spend")
    user_id = data_json.get("user_id")
    max_parallel_requests = data_json.get("max_parallel_requests")
    metadata = data_json.get("metadata")
    tpm_limit = data_json.get("tpm_limit")
    rpm_limit = data_json.get("rpm_limit")

    if team_id is not None and team_id == "litellm-core-infra@gmail.com":
        # only team_id="litellm-core-infra@gmail.com" can make keys
        return {
            "decision": True,
        }
    else:
        print("Failed custom auth")
        return {
            "decision": False,
            "message": "This violates LiteLLM Proxy Rules. No team id provided.",
        }
2. Pass the filepath (relative to the config.yaml)
Pass the filepath in the config.yaml. E.g. if both files are in the same directory - ./config.yaml and ./custom_auth.py - it looks like this:
model_list:
  - model_name: "openai-model"
    litellm_params:
      model: "gpt-3.5-turbo"

litellm_settings:
  drop_params: True
  set_verbose: True

general_settings:
  custom_key_generate: custom_auth.custom_generate_key_fn
Upperbound /key/generate params
Use this if you need to set default upper bounds for max_budget, budget_duration, or any other key/generate param, per key.
Set litellm_settings:upperbound_key_generate_params:
litellm_settings:
  upperbound_key_generate_params:
    max_budget: 100 # (Optional[float], optional): upperbound of $100, for all /key/generate requests
    budget_duration: "10d" # (Optional[str], optional): upperbound of 10 days for budget_duration values
    duration: "30d" # (Optional[str], optional): upperbound of 30 days for all /key/generate requests
    max_parallel_requests: 1000 # (Optional[int], optional): Max number of requests that can be made in parallel. Defaults to None.
    tpm_limit: 1000 # (Optional[int], optional): Tpm limit. Defaults to None.
    rpm_limit: 1000 # (Optional[int], optional): Rpm limit. Defaults to None.
Expected Behavior
- Send a /key/generate request with max_budget=200
- The key is created with max_budget=100, since 100 is the upper bound (see the sketch below)
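One way to see the clamping is to request a key above the bound and then inspect it via /key/info - a minimal sketch assuming the upperbound config above, a local proxy, the requests library, and sk-1234 as the master key (the exact response fields may vary by litellm version).

import requests

BASE = "http://0.0.0.0:4000"
HEADERS = {"Authorization": "Bearer sk-1234"}  # master key

# Ask for a budget above the configured upper bound of 100
new_key = requests.post(f"{BASE}/key/generate", headers=HEADERS,
                        json={"models": ["gpt-3.5-turbo"], "max_budget": 200}).json()["key"]

# Inspect the stored key - the proxy should have clamped the budget to 100
info = requests.get(f"{BASE}/key/info", headers=HEADERS, params={"key": new_key}).json()
print(info["info"].get("max_budget"))  # expect 100, not 200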
Default /key/generate params
Use this if you need to control the default max_budget or any other key/generate param, per key.
When a /key/generate request does not specify max_budget, the max_budget specified in default_key_generate_params is used.
Set litellm_settings:default_key_generate_params:
litellm_settings:
  default_key_generate_params:
    max_budget: 1.5000
    models: ["azure-gpt-3.5"]
    duration: # blank means `null`
    metadata: {"setting": "default"}
    team_id: "core-infra"
✨ Key Rotations
Rotate an existing API key, while optionally updating its parameters.
curl 'http://localhost:4000/key/sk-1234/regenerate' \
-X POST \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
"max_budget": 100,
"metadata": {
"team": "core-infra"
},
"models": [
"gpt-4",
"gpt-3.5-turbo"
]
}'
Read more
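The same rotation from Python, as a minimal sketch (assumes a local proxy, the requests library, and sk-1234 as the master key; the key being rotated is embedded in the URL path, as in the curl above).

import requests

old_key = "sk-1234"  # the virtual key to rotate

resp = requests.post(
    f"http://0.0.0.0:4000/key/{old_key}/regenerate",
    headers={"Authorization": "Bearer sk-1234"},  # master key
    json={
        "max_budget": 100,
        "metadata": {"team": "core-infra"},
        "models": ["gpt-4", "gpt-3.5-turbo"],
    },
)
resp.raise_for_status()
print(resp.json().get("key"))  # the regenerated key value - the old value stops working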
Temporary Budget Increase
Use the /key/update endpoint to increase the budget of an existing key.
curl -L -X POST 'http://localhost:4000/key/update' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{"key": "sk-b3Z3Lqdb_detHXSUp4ol4Q", "temp_budget_increase": 100, "temp_budget_expiry": "10d"}'
Restricting Key Generation
Use this to control who can generate keys. Useful when letting others create keys on the Admin UI.
litellm_settings:
  key_generation_settings:
    team_key_generation:
      allowed_team_member_roles: ["admin"]
      required_params: ["tags"] # require team admins to set tags for cost-tracking when generating a team key
    personal_key_generation: # maps to 'Default Team' on UI
      allowed_user_roles: ["proxy_admin"]
Spec
key_generation_settings: Optional[StandardKeyGenerationConfig] = None
Types
from __future__ import annotations  # allow the forward references used below

import enum
from typing import List, TypedDict

class StandardKeyGenerationConfig(TypedDict, total=False):
    team_key_generation: TeamUIKeyGenerationConfig
    personal_key_generation: PersonalUIKeyGenerationConfig

class TeamUIKeyGenerationConfig(TypedDict):
    allowed_team_member_roles: List[str]  # either 'user' or 'admin'
    required_params: List[str]  # require params on `/key/generate` to be set if a team key (team_id in request) is being generated

class PersonalUIKeyGenerationConfig(TypedDict):
    allowed_user_roles: List[LitellmUserRoles]
    required_params: List[str]  # require params on `/key/generate` to be set if a personal key (no team_id in request) is being generated

class LitellmUserRoles(str, enum.Enum):
    """
    Admin Roles:
    PROXY_ADMIN: admin over the platform
    PROXY_ADMIN_VIEW_ONLY: can login, view all own keys, view all spend

    ORG_ADMIN: admin over a specific organization, can create teams, users only within their organization

    Internal User Roles:
    INTERNAL_USER: can login, view/create/delete their own keys, view their spend
    INTERNAL_USER_VIEW_ONLY: can login, view their own keys, view their own spend

    Team Roles:
    TEAM: used for JWT auth

    Customer Roles:
    CUSTOMER: External users -> these are customers
    """

    # Admin Roles
    PROXY_ADMIN = "proxy_admin"
    PROXY_ADMIN_VIEW_ONLY = "proxy_admin_viewer"

    # Organization admins
    ORG_ADMIN = "org_admin"

    # Internal User Roles
    INTERNAL_USER = "internal_user"
    INTERNAL_USER_VIEW_ONLY = "internal_user_viewer"

    # Team Roles
    TEAM = "team"

    # Customer Roles - External users of proxy
    CUSTOMER = "customer"
Next Steps - Set Budgets, Rate Limits per Virtual Key
Follow this doc to set budgets and rate limits per virtual key with LiteLLM.