
Logging

Log proxy input, output, and exceptions using:

  • Langfuse
  • OpenTelemetry
  • GCS, s3, Azure (Blob) Buckets
  • Lunary
  • MLflow
  • Custom Callbacks - Custom Code and API Endpoints
  • Langsmith
  • DataDog
  • DynamoDB

Getting the LiteLLM Call ID

LiteLLM generates a unique call_id for every request. This call_id can be used to track the request across your system, which makes it useful for finding the information for a specific request in a logging system (such as any of the systems mentioned on this page).

curl -i -sSL --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Authorization: Bearer sk-1234' \
--header 'Content-Type: application/json' \
--data '{
  "model": "gpt-3.5-turbo",
  "messages": [{"role": "user", "content": "what llm are you"}]
}' | grep 'x-litellm'

The output of this is:

x-litellm-call-id: b980db26-9512-45cc-b1da-c511a363b83f
x-litellm-model-id: cb41bc03f4c33d310019bae8c5afdb1af0a8f97b36a234405a9807614988457c
x-litellm-model-api-base: https://x-example-1234.openai.azure.com
x-litellm-version: 1.40.21
x-litellm-response-cost: 2.85e-05
x-litellm-key-tpm-limit: None
x-litellm-key-rpm-limit: None

Some of these headers can be useful for troubleshooting, but the x-litellm-call-id is the most useful one for tracking a request across components in your system, including in logging tools.
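
If you call the proxy through the OpenAI Python SDK, the same header can be read via the SDK's raw-response wrapper. A minimal sketch, assuming the proxy URL and key from the curl example above:

import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

# with_raw_response exposes the HTTP response, including LiteLLM's headers
raw = client.chat.completions.with_raw_response.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "what llm are you"}],
)
print(raw.headers.get("x-litellm-call-id"))  # e.g. b980db26-9512-45cc-b1da-c511a363b83f
completion = raw.parse()  # the usual ChatCompletion object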

Logging Features

Conditional Logging by Virtual Keys, Teams

Use this to:

  1. Conditionally enable logging for certain virtual keys/teams
  2. Set different logging providers for different virtual keys/teams

👉 Get Started - Team/Key Based Logging

Redact User API Key Info

Redact user API key info (hashed token, user_id, team_id, etc.) from logs.

Currently supported for Langfuse, OpenTelemetry, Logfire, and ArizeAI logging.

litellm_settings:
  callbacks: ["langfuse"]
  redact_user_api_key_info: true

Redact Messages, Response Content

Set litellm.turn_off_message_logging=True. This will prevent the messages and responses from being logged to your logging provider, but request metadata (e.g. spend) will still be tracked.

1. Setup config.yaml

model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
litellm_settings:
  success_callback: ["langfuse"]
  turn_off_message_logging: True # 👈 Key Change

2. Send request

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "what llm are you"
    }
  ]
}'

3. Check your logging tool + spend logs

Logging Tool

Spend Logs

Disable Message Redaction

If you have litellm.turn_off_message_logging turned on, you can override it for specific requests by setting the request header LiteLLM-Disable-Message-Redaction: true.

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'LiteLLM-Disable-Message-Redaction: true' \
--data '{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "what llm are you"
    }
  ]
}'
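
The same override can be sent from the OpenAI Python SDK via extra_headers. A minimal sketch, assuming the test setup from the examples above:

import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "what llm are you"}],
    # per-request override: skip message redaction for this call only
    extra_headers={"LiteLLM-Disable-Message-Redaction": "true"},
)
print(response)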

Turn off all tracking/logging

For some use cases, you may want to turn off all tracking/logging. You can do this by passing no-log=True in the request body.

Info

Disable this feature by setting global_disable_no_log_param: true in your config.yaml file.

litellm_settings:
  global_disable_no_log_param: True

curl -L -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <litellm-api-key>' \
-d '{
  "model": "openai/gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What'\''s in this image?"
        }
      ]
    }
  ],
  "max_tokens": 300,
  "no-log": true # 👈 Key Change
}'

Expected console log:

LiteLLM.Info: "no-log request, skipping logging"
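
From the OpenAI Python SDK, the same flag can be passed via extra_body, which merges extra fields into the JSON request body. A minimal sketch:

import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.chat.completions.create(
    model="openai/gpt-3.5-turbo",
    messages=[{"role": "user", "content": "what llm are you"}],
    extra_body={"no-log": True},  # 👈 skip all tracking/logging for this request
)
print(response)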

What gets logged?

Found under kwargs["standard_logging_object"]. This is a standard payload, logged for every response.

👉 Standard Logging Payload Specification
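
For example, a custom callback (see the Custom Callback Class section later on this page) can read this payload from kwargs. A minimal sketch; the fields accessed follow the Standard Logging Payload spec linked above:

from litellm.integrations.custom_logger import CustomLogger

class StandardPayloadLogger(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        # the standard payload, logged for every response
        payload = kwargs.get("standard_logging_object", {})
        print(payload.get("model"), payload.get("response_cost"))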

Langfuse

We will use the --config to set litellm.success_callback = ["langfuse"]; this will log all successful LLM calls to Langfuse. Make sure to set LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY in your environment.

Step 1: Install langfuse

pip install "langfuse>=2.0.0"

Step 2: Create a config.yaml file and set litellm_settings: success_callback

model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
litellm_settings:
  success_callback: ["langfuse"]

Step 3: Set required env variables for logging to Langfuse

export LANGFUSE_PUBLIC_KEY="pk_kk"
export LANGFUSE_SECRET_KEY="sk_ss"
# Optional, defaults to https://cloud.langfuse.com
export LANGFUSE_HOST="https://xxx.langfuse.com"

Step 4: Start the proxy, make a test request

Start the proxy:

litellm --config config.yaml --debug

Test request:

litellm --test

Expected output on Langfuse

Logging Metadata to Langfuse

Pass metadata as part of the request body:

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "what llm are you"
    }
  ],
  "metadata": {
    "generation_name": "ishaan-test-generation",
    "generation_id": "gen-id22",
    "trace_id": "trace-id22",
    "trace_user_id": "user-id2"
  }
}'
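
The equivalent call from the OpenAI Python SDK passes the same metadata via extra_body. A sketch mirroring the curl request above:

import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "what llm are you"}],
    extra_body={
        "metadata": {
            "generation_name": "ishaan-test-generation",
            "generation_id": "gen-id22",
            "trace_id": "trace-id22",
            "trace_user_id": "user-id2",
        }
    },
)
print(response)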

Custom Tags

Set tags as part of your request body:

import openai

client = openai.OpenAI(
    api_key="sk-1234",
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="llama3",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ],
    user="palantir",
    extra_body={
        "metadata": {
            "tags": ["jobID:214590dsff09fds", "taskName:run_page_classification"]
        }
    }
)

print(response)

LiteLLM Tags - cache_hit, cache_key

Use this if you want to control which LiteLLM-specific fields the LiteLLM proxy logs as tags. By default, the LiteLLM proxy logs no LiteLLM-specific fields.

| LiteLLM-specific field | Description | Example value |
|---|---|---|
| cache_hit | Indicates whether a cache hit occurred (True) or not (False) | true, false |
| cache_key | The cache key used for this request | d2b758c**** |
| proxy_base_url | The base URL of the proxy server, i.e. the value of the PROXY_BASE_URL environment variable on your server | https://proxy.example.com |
| user_api_key_alias | Alias of the LiteLLM virtual key | prod-app1 |
| user_api_key_user_id | Unique ID associated with the user API key | user_123, user_456 |
| user_api_key_user_email | Email associated with the user API key | user@example.com, admin@example.com |
| user_api_key_team_alias | Alias of the team associated with the API key | team_alpha, dev_team |

Usage

Specify langfuse_default_tags to control which LiteLLM fields get logged on Langfuse.

Example config.yaml:

model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/

litellm_settings:
  success_callback: ["langfuse"]

  # 👇 Key Change
  langfuse_default_tags: ["cache_hit", "cache_key", "proxy_base_url", "user_api_key_alias", "user_api_key_user_id", "user_api_key_user_email", "user_api_key_team_alias", "semantic-similarity"]

View POST sent from LiteLLM to provider

Use this when you want to view the raw curl request sent from LiteLLM to the LLM API.

Pass metadata as part of the request body:

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "what llm are you"
    }
  ],
  "metadata": {
    "log_raw_request": true
  }
}'

Expected output on Langfuse

You will see raw_request in your Langfuse metadata. This is the raw curl command that LiteLLM sends to your LLM API provider.

OpenTelemetry

Info

[Optional] Customize the OTEL service name and OTEL tracer name by setting the following variables in your environment:

OTEL_TRACER_NAME=<your-trace-name>    # default="litellm"
OTEL_SERVICE_NAME=<your-service-name> # default="litellm"

Step 1: Set callbacks and env vars

Add the following to your env:

OTEL_EXPORTER="console"

Add otel as a callback in your litellm_config.yaml:

litellm_settings:
  callbacks: ["otel"]

Step 2: Start the proxy, make a test request

Start the proxy:

litellm --config config.yaml --detailed_debug

Test request:

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "what llm are you"
    }
  ]
}'

Step 3: Expect to see the following logged on your server logs / console

This is the span from OTEL logging:

{
  "name": "litellm-acompletion",
  "context": {
    "trace_id": "0x8d354e2346060032703637a0843b20a3",
    "span_id": "0xd8d3476a2eb12724",
    "trace_state": "[]"
  },
  "kind": "SpanKind.INTERNAL",
  "parent_id": null,
  "start_time": "2024-06-04T19:46:56.415888Z",
  "end_time": "2024-06-04T19:46:56.790278Z",
  "status": {
    "status_code": "OK"
  },
  "attributes": {
    "model": "llama3-8b-8192"
  },
  "events": [],
  "links": [],
  "resource": {
    "attributes": {
      "service.name": "litellm"
    },
    "schema_url": ""
  }
}

🎉 Expect to see this trace logged in your OTEL collector.

Redact Messages, Response Content

Set message_logging=False for otel, and messages / responses will not be logged:

litellm_settings:
  callbacks: ["otel"]

## 👇 Key Change
callback_settings:
  otel:
    message_logging: False

Traceparent Header

Context propagation across services - Traceparent HTTP Header

❓ Use this when you want to pass information about the incoming request in a distributed tracing system.

✅ Key change: pass the traceparent header in your requests. Read more about the traceparent header here.

traceparent: 00-80e1afed08e019fc1110464cfa66635c-7a085853722dc6d2-01

Example Usage

  1. Make a request to the LiteLLM proxy with the traceparent header
import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")
example_traceparent = "00-80e1afed08e019fc1110464cfa66635c-02e80198930058d4-01"
extra_headers = {
    "traceparent": example_traceparent
}
_trace_id = example_traceparent.split("-")[1]

print("EXTRA HEADERS: ", extra_headers)
print("Trace ID: ", _trace_id)

response = client.chat.completions.create(
    model="llama3",
    messages=[
        {"role": "user", "content": "this is a test request, write a short poem"}
    ],
    extra_headers=extra_headers,
)

print(response)
# EXTRA HEADERS:  {'traceparent': '00-80e1afed08e019fc1110464cfa66635c-02e80198930058d4-01'}
# Trace ID: 80e1afed08e019fc1110464cfa66635c
  2. Look up the Trace ID on the OTEL logger

Search for Trace=80e1afed08e019fc1110464cfa66635c on your OTEL collector.

Forwarding the Traceparent HTTP Header to LLM APIs

Use this if you want to forward the traceparent header to your self-hosted LLMs (e.g. vLLM).

Set forward_traceparent_to_llm_provider: True in your config.yaml. This will forward the traceparent header to your LLM API.

Danger

Only use this for self-hosted LLMs; it can cause Bedrock and VertexAI calls to fail.

litellm_settings:
  forward_traceparent_to_llm_provider: True

Google Cloud Storage Buckets

Log LLM logs to Google Cloud Storage buckets.

Info

✨ This is an Enterprise-only feature. Get started with Enterprise here.

| Property | Details |
|---|---|
| Description | Log LLM Input/Output to cloud storage buckets |
| Load Test Benchmarks | Benchmarks |
| Google Docs on Cloud Storage | Google Cloud Storage |

Usage

  1. Add gcs_bucket to your LiteLLM config.yaml

model_list:
  - litellm_params:
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
      api_key: my-fake-key
      model: openai/my-fake-model
    model_name: fake-openai-endpoint

litellm_settings:
  callbacks: ["gcs_bucket"] # 👈 KEY CHANGE
  2. Set required env variables

GCS_BUCKET_NAME="<your-gcs-bucket-name>"
GCS_PATH_SERVICE_ACCOUNT="/Users/ishaanjaffer/Downloads/adroit-crow-413218-a956eef1a2a8.json" # Add path to service account.json

  3. Start the proxy

litellm --config /path/to/config.yaml

  4. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "fake-openai-endpoint",
  "messages": [
    {
      "role": "user",
      "content": "what llm are you"
    }
  ]
}'

Expected logs on GCS buckets

Fields logged on GCS buckets

The standard logging object is logged to the GCS bucket.

Getting service_account.json from the Google Cloud Console

  1. Go to the Google Cloud Console
  2. Search for "IAM & Admin"
  3. Click on "Service Accounts"
  4. Select a service account
  5. Click "Keys" -> Add Key -> Create new key -> JSON
  6. Save the JSON file and add its path to GCS_PATH_SERVICE_ACCOUNT (see the sketch after this list to sanity-check access)
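
A quick way to sanity-check the service account and bucket before starting the proxy - a sketch, assuming the google-cloud-storage package is installed and the bucket name/path are the values from your env variables above:

from google.cloud import storage  # pip install google-cloud-storage

# authenticate with the same service_account.json the proxy will use
client = storage.Client.from_service_account_json(
    "/path/to/service_account.json"  # same path as GCS_PATH_SERVICE_ACCOUNT
)
bucket = client.bucket("<your-gcs-bucket-name>")  # same as GCS_BUCKET_NAME

# list a few objects - after a test request, LiteLLM's log objects should appear here
for blob in client.list_blobs(bucket, max_results=5):
    print(blob.name)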

Google Cloud Storage - PubSub Topic

Log LLM logs / spend logs to a Google Cloud Storage PubSub topic.

Info

✨ This is an Enterprise-only feature. Get started with Enterprise here.

| Property | Details |
|---|---|
| Description | Log the LiteLLM SpendLogs table to a Google Cloud Storage PubSub topic |

When to use gcs_pubsub?

  • If your LiteLLM database has crossed 1M+ spend logs and you want to send the SpendLogs to a PubSub topic that can be consumed by GCS BigQuery

Usage

  1. Add gcs_pubsub to your LiteLLM config.yaml
model_list:
  - litellm_params:
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
      api_key: my-fake-key
      model: openai/my-fake-model
    model_name: fake-openai-endpoint

litellm_settings:
  callbacks: ["gcs_pubsub"] # 👈 KEY CHANGE

  2. Set required env variables

GCS_PUBSUB_TOPIC_ID="litellmDB"
GCS_PUBSUB_PROJECT_ID="reliableKeys"

  3. Start the proxy

litellm --config /path/to/config.yaml

  4. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "fake-openai-endpoint",
  "messages": [
    {
      "role": "user",
      "content": "what llm are you"
    }
  ]
}'

s3 Buckets

We will use the --config to set:

  • litellm.success_callback = ["s3"]

This will log all successful LLM calls to the s3 bucket.

Step 1: Set AWS credentials in .env

AWS_ACCESS_KEY_ID = ""
AWS_SECRET_ACCESS_KEY = ""
AWS_REGION_NAME = ""

Step 2: Create a config.yaml file and set litellm_settings: success_callback

model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
litellm_settings:
  success_callback: ["s3"]
  s3_callback_params:
    s3_bucket_name: logs-bucket-litellm # AWS bucket name for S3
    s3_region_name: us-west-2 # AWS region name for S3
    s3_aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID # use os.environ/<variable name> to pass environment variables. This is the AWS Access Key ID for S3
    s3_aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY # AWS Secret Access Key for S3
    s3_path: my-test-path # [OPTIONAL] set path in bucket you want to write logs to
    s3_endpoint_url: https://s3.amazonaws.com # [OPTIONAL] S3 endpoint URL, if you want to use Backblaze/Cloudflare s3 buckets

Step 3: Start the proxy, make a test request

Start the proxy:

litellm --config config.yaml --debug

Test request:

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "Azure OpenAI GPT-4 East",
  "messages": [
    {
      "role": "user",
      "content": "what llm are you"
    }
  ]
}'

Your logs should be available in the specified s3 bucket.
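
To verify, you can list the most recent objects under your configured s3_path with boto3 - a sketch, assuming the bucket, region, and path values from the config above:

import boto3  # pip install boto3

s3 = boto3.client("s3", region_name="us-west-2")

# list objects under the configured s3_path prefix
resp = s3.list_objects_v2(
    Bucket="logs-bucket-litellm",
    Prefix="my-test-path/",
    MaxKeys=5,
)
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["LastModified"])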

Team Alias Prefix in Object Key

This is a preview feature.

You can prefix the object key with the team alias by setting s3_use_team_prefix: true (together with enable_preview_features: true) in your config.yaml file.

litellm_settings:
  callbacks: ["s3"]
  enable_preview_features: true
  s3_callback_params:
    s3_bucket_name: logs-bucket-litellm
    s3_region_name: us-west-2
    s3_aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID
    s3_aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
    s3_path: my-test-path
    s3_endpoint_url: https://s3.amazonaws.com
    s3_use_team_prefix: true

In the s3 bucket, you will see object keys like my-test-path/my-team-alias/...

Azure Blob Storage

Log LLM logs to Azure Data Lake Storage.

Info

✨ This is an Enterprise-only feature. Get started with Enterprise here.

| Property | Details |
|---|---|
| Description | Log LLM Input/Output to Azure Blob Storage (buckets) |
| Azure Docs on Data Lake Storage | Azure Data Lake Storage |

Usage

  1. Add azure_storage to your LiteLLM config.yaml

model_list:
  - model_name: fake-openai-endpoint
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/

litellm_settings:
  callbacks: ["azure_storage"] # 👈 KEY CHANGE
  2. Set required env variables
# Required Environment Variables for Azure Storage
AZURE_STORAGE_ACCOUNT_NAME="litellm2" # The name of the Azure Storage Account to use for logging
AZURE_STORAGE_FILE_SYSTEM="litellm-logs" # The name of the Azure Storage File System to use for logging. (Typically the Container name)

# Authentication Variables
# Option 1: Use Storage Account Key
AZURE_STORAGE_ACCOUNT_KEY="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" # The Azure Storage Account Key to use for Authentication

# Option 2: Use Tenant ID + Client ID + Client Secret
AZURE_STORAGE_TENANT_ID="985efd7cxxxxxxxxxx" # The Application Tenant ID to use for Authentication
AZURE_STORAGE_CLIENT_ID="abe66585xxxxxxxxxx" # The Application Client ID to use for Authentication
AZURE_STORAGE_CLIENT_SECRET="uMS8Qxxxxxxxxxx" # The Application Client Secret to use for Authentication
  3. Start the proxy

litellm --config /path/to/config.yaml

  4. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "fake-openai-endpoint",
  "messages": [
    {
      "role": "user",
      "content": "what llm are you"
    }
  ]
}'

Expected logs on Azure Data Lake Storage

Fields logged on Azure Data Lake Storage

The standard logging object is logged to Azure Data Lake Storage.

DataDog

LiteLLM supports logging to the following Datadog integrations.

We will use the --config to set litellm.callbacks = ["datadog"]. This will log all successful LLM calls to DataDog.

Step 1: Create a config.yaml file and set litellm_settings: callbacks

model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
litellm_settings:
  callbacks: ["datadog"] # logs llm success + failure logs on datadog
  service_callback: ["datadog"] # logs redis, postgres failures on datadog

Step 2: Set required env variables for Datadog

DD_API_KEY="5f2d0f310***********" # your datadog API Key
DD_SITE="us5.datadoghq.com" # your datadog base url
DD_SOURCE="litellm_dev" # [OPTIONAL] your datadog source. use to differentiate dev vs. prod deployments

Step 3: Start the proxy, make a test request

Start the proxy:

litellm --config config.yaml --debug

Test request:

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "what llm are you"
    }
  ],
  "metadata": {
    "your-custom-metadata": "custom-field"
  }
}'

Expected output on Datadog

Datadog Tracing

Use ddtrace-run to enable Datadog tracing on the LiteLLM proxy.

Pass USE_DDTRACE=true to the docker run command. When USE_DDTRACE=true, the proxy will run ddtrace-run litellm as its ENTRYPOINT instead of just litellm.

docker run \
-v $(pwd)/litellm_config.yaml:/app/config.yaml \
-e USE_DDTRACE=true \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-latest \
--config /app/config.yaml --detailed_debug

Set DD variables (e.g. DD_SERVICE)

LiteLLM supports customizing the following Datadog environment variables:

| Environment Variable | Description | Default | Required |
|---|---|---|---|
| DD_API_KEY | Your Datadog API key for authentication | - | ✅ Yes |
| DD_SITE | Your Datadog site (e.g. "us5.datadoghq.com") | - | ✅ Yes |
| DD_ENV | Environment tag for your logs (e.g. "production", "staging") | "unknown" | ❌ No |
| DD_SERVICE | Service name for your logs | "litellm-server" | ❌ No |
| DD_SOURCE | Source name for your logs | "litellm" | ❌ No |
| DD_VERSION | Version tag for your logs | "unknown" | ❌ No |
| HOSTNAME | Hostname tag for your logs | "" | ❌ No |
| POD_NAME | Pod name tag (useful for Kubernetes deployments) | "unknown" | ❌ No |

Lunary

Step 1: Install dependencies and set your environment variables

Install the dependencies:

pip install litellm lunary

Get your Lunary public key from https://app.lunary.ai/settings

export LUNARY_PUBLIC_KEY="<your-public-key>"

Step 2: Create a config.yaml and set the lunary callback

model_list:
  - model_name: "*"
    litellm_params:
      model: "*"
litellm_settings:
  success_callback: ["lunary"]
  failure_callback: ["lunary"]

Step 3: Start the LiteLLM proxy

litellm --config config.yaml

Step 4: Make a request

curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful math tutor. Guide the user through the solution step by step."
    },
    {
      "role": "user",
      "content": "how can I solve 8x + 7 = -23"
    }
  ]
}'

MLflow

Step 1: Install dependencies

Install the dependencies.

pip install litellm mlflow

Step 2: Create a config.yaml with the mlflow callback

model_list:
  - model_name: "*"
    litellm_params:
      model: "*"
litellm_settings:
  success_callback: ["mlflow"]
  failure_callback: ["mlflow"]

Step 3: Start the LiteLLM proxy

litellm --config config.yaml

Step 4: Make a request

curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
  "model": "gpt-4o-mini",
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of France?"
    }
  ]
}'

Step 5: Review traces

Run the following command to start the MLflow UI and review the recorded traces.

mlflow ui

Custom Callback Class [Async]

Use this when you want to run custom callbacks in Python.

Step 1 - Create your custom litellm callback class

We use litellm.integrations.custom_logger for this; see more details on LiteLLM custom callbacks here.

Define your custom callback class in a Python file.

Here's an example custom logger for tracking key, user, model, prompt, response, tokens, and cost. We create a file called custom_callbacks.py and initialize proxy_handler_instance:

from litellm.integrations.custom_logger import CustomLogger
import litellm

# This file includes the custom callbacks for LiteLLM Proxy
# Once defined, these can be passed in proxy_config.yaml
class MyCustomHandler(CustomLogger):
    def log_pre_api_call(self, model, messages, kwargs):
        print("Pre-API Call")

    def log_post_api_call(self, kwargs, response_obj, start_time, end_time):
        print("Post-API Call")

    def log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("On Success")

    def log_failure_event(self, kwargs, response_obj, start_time, end_time):
        print("On Failure")

    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("On Async Success!")
        # log: key, user, model, prompt, response, tokens, cost
        # Access kwargs passed to litellm.completion()
        model = kwargs.get("model", None)
        messages = kwargs.get("messages", None)
        user = kwargs.get("user", None)

        # Access litellm_params passed to litellm.completion(), e.g. access `metadata`
        litellm_params = kwargs.get("litellm_params", {})
        metadata = litellm_params.get("metadata", {})  # headers passed to LiteLLM proxy can be found here

        # Calculate cost using litellm.completion_cost()
        cost = litellm.completion_cost(completion_response=response_obj)
        response = response_obj
        # tokens used in response
        usage = response_obj["usage"]

        print(
            f"""
                Model: {model},
                Messages: {messages},
                User: {user},
                Usage: {usage},
                Cost: {cost},
                Response: {response}
                Proxy Metadata: {metadata}
            """
        )
        return

    async def async_log_failure_event(self, kwargs, response_obj, start_time, end_time):
        try:
            print("On Async Failure!")
            print("\nkwargs", kwargs)
            # Access kwargs passed to litellm.completion()
            model = kwargs.get("model", None)
            messages = kwargs.get("messages", None)
            user = kwargs.get("user", None)

            # Access litellm_params passed to litellm.completion(), e.g. access `metadata`
            litellm_params = kwargs.get("litellm_params", {})
            metadata = litellm_params.get("metadata", {})  # headers passed to LiteLLM proxy can be found here

            # Access exceptions & traceback
            exception_event = kwargs.get("exception", None)
            traceback_event = kwargs.get("traceback_exception", None)

            # Calculate cost using litellm.completion_cost()
            cost = litellm.completion_cost(completion_response=response_obj)
            print("now checking response obj")

            print(
                f"""
                    Model: {model},
                    Messages: {messages},
                    User: {user},
                    Cost: {cost},
                    Response: {response_obj}
                    Proxy Metadata: {metadata}
                    Exception: {exception_event}
                    Traceback: {traceback_event}
                """
            )
        except Exception as e:
            print(f"Exception: {e}")

proxy_handler_instance = MyCustomHandler()

# Set litellm.callbacks = [proxy_handler_instance] on the proxy

Step 2 - Pass your custom callback class in config.yaml

We pass the custom callback class defined in Step 1 to the config.yaml. Set callbacks to python_filename.logger_instance_name.

In the config below, we pass:

  • python_filename: custom_callbacks.py
  • logger_instance_name: proxy_handler_instance (defined in Step 1)

callbacks: custom_callbacks.proxy_handler_instance

model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo

litellm_settings:
  callbacks: custom_callbacks.proxy_handler_instance # sets litellm.callbacks = [proxy_handler_instance]

Step 3 - Start proxy + test request

litellm --config proxy_config.yaml

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Authorization: Bearer sk-1234' \
--data '{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "good morning good sir"
    }
  ],
  "user": "ishaan-app",
  "temperature": 0.2
}'

Resulting log on the proxy:

On Success
Model: gpt-3.5-turbo,
Messages: [{'role': 'user', 'content': 'good morning good sir'}],
User: ishaan-app,
Usage: {'completion_tokens': 10, 'prompt_tokens': 11, 'total_tokens': 21},
Cost: 3.65e-05,
Response: {'id': 'chatcmpl-8S8avKJ1aVBg941y5xzGMSKrYCMvN', 'choices': [{'finish_reason': 'stop', 'index': 0, 'message': {'content': 'Good morning! How can I assist you today?', 'role': 'assistant'}}], 'created': 1701716913, 'model': 'gpt-3.5-turbo-0613', 'object': 'chat.completion', 'system_fingerprint': None, 'usage': {'completion_tokens': 10, 'prompt_tokens': 11, 'total_tokens': 21}}
Proxy Metadata: {'user_api_key': None, 'headers': Headers({'host': '0.0.0.0:4000', 'user-agent': 'curl/7.88.1', 'accept': '*/*', 'authorization': 'Bearer sk-1234', 'content-length': '199', 'content-type': 'application/x-www-form-urlencoded'}), 'model_group': 'gpt-3.5-turbo', 'deployment': 'gpt-3.5-turbo-ModelID-gpt-3.5-turbo'}

Logging Proxy Request Object, Header, URL

Here's how you can access the url, headers, and request body sent to the proxy for each request:

class MyCustomHandler(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("On Async Success!")

        litellm_params = kwargs.get("litellm_params", None)
        proxy_server_request = litellm_params.get("proxy_server_request")
        print(proxy_server_request)

Expected output:

{
  "url": "http://testserver/chat/completions",
  "method": "POST",
  "headers": {
    "host": "testserver",
    "accept": "*/*",
    "accept-encoding": "gzip, deflate",
    "connection": "keep-alive",
    "user-agent": "testclient",
    "authorization": "Bearer None",
    "content-length": "105",
    "content-type": "application/json"
  },
  "body": {
    "model": "Azure OpenAI GPT-4 Canada",
    "messages": [
      {
        "role": "user",
        "content": "hi"
      }
    ],
    "max_tokens": 10
  }
}

Logging model_info set in config.yaml

Here's how to log the model_info set in your proxy config.yaml. More info on setting model_info in config.yaml here.

class MyCustomHandler(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("On Async Success!")

        litellm_params = kwargs.get("litellm_params", None)
        model_info = litellm_params.get("model_info")
        print(model_info)

Expected output:

{'mode': 'embedding', 'input_cost_per_token': 0.002}

Logging responses from the proxy

Responses from both /chat/completions and /embeddings are available as response_obj.

Note: for /chat/completions, both stream=True and non-stream responses are available as response_obj.

class MyCustomHandler(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("On Async Success!")
        print(response_obj)

Expected output for /chat/completions [works for both streaming and non-streaming responses]:

ModelResponse(
    id='chatcmpl-8Tfu8GoMElwOZuj2JlHBhNHG01PPo',
    choices=[
        Choices(
            finish_reason='stop',
            index=0,
            message=Message(
                content='As an AI language model, I do not have a physical body and therefore do not possess any degree or educational qualifications. My knowledge and abilities come from the programming and algorithms that have been developed by my creators.',
                role='assistant'
            )
        )
    ],
    created=1702083284,
    model='chatgpt-v-2',
    object='chat.completion',
    system_fingerprint=None,
    usage=Usage(
        completion_tokens=42,
        prompt_tokens=5,
        total_tokens=47
    )
)

Expected output for /embeddings:

{
  'model': 'ada',
  'data': [
    {
      'embedding': [
        -0.035126980394124985, -0.020624293014407158, -0.015343423001468182,
        -0.03980357199907303, -0.02750781551003456, 0.02111034281551838,
        -0.022069307044148445, -0.019442008808255196, -0.00955679826438427,
        -0.013143060728907585, 0.029583381488919258, -0.004725852981209755,
        -0.015198921784758568, -0.014069183729588985, 0.00897879246622324,
        0.01521205808967352,
        # ... (truncated for brevity)
      ]
    }
  ]
}

Custom Callback APIs [Async]

Send LiteLLM logs to a custom API endpoint.

Info

This is an Enterprise-only feature. Get started with Enterprise here.

| Property | Details |
|---|---|
| Description | Log LLM Input/Output to a custom API endpoint |
| Logged Payload | List[StandardLoggingPayload] - LiteLLM logs a list of StandardLoggingPayload objects to your endpoint |

Use this if you want to:

  • Use a custom callback written in a language other than Python
  • Run your callback on a different microservice

Usage

  1. Set success_callback: ["generic_api"] in your LiteLLM config.yaml
litellm config.yaml:

model_list:
  - model_name: openai/gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

litellm_settings:
  success_callback: ["generic_api"]
  2. Set environment variables for the custom API endpoint

| Environment Variable | Details | Required |
|---|---|---|
| GENERIC_LOGGER_ENDPOINT | The endpoint + route to send callback logs to | Yes |
| GENERIC_LOGGER_HEADERS | Optional: headers to send to the custom API endpoint | No, this is optional |

.env:

GENERIC_LOGGER_ENDPOINT="https://webhook-test.com/30343bc33591bc5e6dc44217ceae3e0a"

# Optional: Set headers to be sent to the custom API endpoint
GENERIC_LOGGER_HEADERS="Authorization=Bearer <your-api-key>"
# if multiple headers, separate by commas
GENERIC_LOGGER_HEADERS="Authorization=Bearer <your-api-key>,X-Custom-Header=custom-header-value"

  3. Start the proxy

litellm --config /path/to/config.yaml

  4. Make a test request
curl -i --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-1234' \
--data '{
  "model": "openai/gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": "what llm are you"
    }
  ]
}'
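
On the receiving side, your endpoint gets a JSON list of StandardLoggingPayload objects. A minimal sketch of a hypothetical receiver in FastAPI - the /litellm-logs route is an assumption, and the fields accessed follow the StandardLoggingPayload spec linked above:

from fastapi import FastAPI, Request

app = FastAPI()

@app.post("/litellm-logs")  # point GENERIC_LOGGER_ENDPOINT at this route
async def receive_logs(request: Request):
    payloads = await request.json()  # List[StandardLoggingPayload]
    for p in payloads:
        print(p.get("id"), p.get("model"), p.get("response_cost"))
    return {"received": len(payloads)}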

Langsmith

  1. Set success_callback: ["langsmith"] in your LiteLLM config.yaml

If you're using a custom LangSmith instance, you can set the LANGSMITH_BASE_URL environment variable to point to your instance.

litellm_settings:
  success_callback: ["langsmith"]

environment_variables:
  LANGSMITH_API_KEY: "lsv2_pt_xxxxxxxx"
  LANGSMITH_PROJECT: "litellm-proxy"
  LANGSMITH_BASE_URL: "https://api.smith.langchain.com" # (Optional - only needed if you have a custom Langsmith instance)
  2. Start the proxy

litellm --config /path/to/config.yaml

  3. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "fake-openai-endpoint",
  "messages": [
    {
      "role": "user",
      "content": "Hello, Claude gm!"
    }
  ]
}'

Expect to see your logged responses on Langsmith.

Arize AI

  1. Set callbacks: ["arize"] in your LiteLLM config.yaml

model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/

litellm_settings:
  callbacks: ["arize"]

environment_variables:
  ARIZE_SPACE_KEY: "d0*****"
  ARIZE_API_KEY: "141a****"
  ARIZE_ENDPOINT: "https://otlp.arize.com/v1" # OPTIONAL - your custom arize GRPC api endpoint
  ARIZE_HTTP_ENDPOINT: "https://otlp.arize.com/v1" # OPTIONAL - your custom arize HTTP api endpoint. Set either this or ARIZE_ENDPOINT

  2. Start the proxy

litellm --config /path/to/config.yaml

  3. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "fake-openai-endpoint",
  "messages": [
    {
      "role": "user",
      "content": "Hello, Claude gm!"
    }
  ]
}'

Expect to see your logged responses on Arize.

Langtrace

  1. Set callbacks: ["langtrace"] in your LiteLLM config.yaml

model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/

litellm_settings:
  callbacks: ["langtrace"]

environment_variables:
  LANGTRACE_API_KEY: "141a****"

  2. Start the proxy

litellm --config /path/to/config.yaml

  3. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "fake-openai-endpoint",
  "messages": [
    {
      "role": "user",
      "content": "Hello, Claude gm!"
    }
  ]
}'

Galileo

[BETA]

Log LLM input/output on www.rungalileo.io.

Info

Beta integration

Required env variables:

export GALILEO_BASE_URL=""  # For most users, this is the same as their console URL except with the word 'console' replaced by 'api' (e.g. http://www.console.galileo.myenterprise.com -> http://www.api.galileo.myenterprise.com)
export GALILEO_PROJECT_ID=""
export GALILEO_USERNAME=""
export GALILEO_PASSWORD=""

Quick Start

  1. Add to your config.yaml
model_list:
  - litellm_params:
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
      api_key: my-fake-key
      model: openai/my-fake-model
    model_name: fake-openai-endpoint

litellm_settings:
  success_callback: ["galileo"] # 👈 KEY CHANGE

  2. Start the proxy

litellm --config /path/to/config.yaml

  3. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "fake-openai-endpoint",
  "messages": [
    {
      "role": "user",
      "content": "what llm are you"
    }
  ]
}'

🎉 That's it - expect to see your logs on your Galileo dashboard.

OpenMeter

Bill customers according to their LLM API usage with OpenMeter.

Required env variables:

# from https://openmeter.cloud
export OPENMETER_API_ENDPOINT="" # defaults to https://openmeter.cloud
export OPENMETER_API_KEY=""

Quick Start

  1. Add to your config.yaml

model_list:
  - litellm_params:
      api_base: https://openai-function-calling-workers.tasslexyz.workers.dev/
      api_key: my-fake-key
      model: openai/my-fake-model
    model_name: fake-openai-endpoint

litellm_settings:
  success_callback: ["openmeter"] # 👈 KEY CHANGE

  2. Start the proxy

litellm --config /path/to/config.yaml

  3. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "fake-openai-endpoint",
  "messages": [
    {
      "role": "user",
      "content": "what llm are you"
    }
  ]
}'

DynamoDB

We will use the --config to set:

  • litellm.success_callback = ["dynamodb"]
  • litellm.dynamodb_table_name = "your-table-name"

This will log all successful LLM calls to DynamoDB.

Step 1: Set AWS credentials in .env

AWS_ACCESS_KEY_ID = ""
AWS_SECRET_ACCESS_KEY = ""
AWS_REGION_NAME = ""

Step 2: Create a config.yaml file and set litellm_settings: success_callback

model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
litellm_settings:
  success_callback: ["dynamodb"]
  dynamodb_table_name: your-table-name

Step 3: Start the proxy, make a test request

Start the proxy:

litellm --config config.yaml --debug

Test request:

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "Azure OpenAI GPT-4 East",
  "messages": [
    {
      "role": "user",
      "content": "what llm are you"
    }
  ]
}'

Your logs should be available on DynamoDB.
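
To spot-check the logged items, you can scan the table with boto3 - a sketch, assuming the table name from the config above:

import boto3  # pip install boto3

dynamodb = boto3.resource("dynamodb", region_name="us-west-2")  # AWS_REGION_NAME from your .env
table = dynamodb.Table("your-table-name")  # dynamodb_table_name from config.yaml

# fetch a few logged items
resp = table.scan(Limit=5)
for item in resp.get("Items", []):
    print(item.get("id"), item.get("model"), item.get("startTime"))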

Data logged to DynamoDB for /chat/completions:

{
  "id": {
    "S": "chatcmpl-8W15J4480a3fAQ1yQaMgtsKJAicen"
  },
  "call_type": {
    "S": "acompletion"
  },
  "endTime": {
    "S": "2023-12-15 17:25:58.424118"
  },
  "messages": {
    "S": "[{'role': 'user', 'content': 'This is a test'}]"
  },
  "metadata": {
    "S": "{}"
  },
  "model": {
    "S": "gpt-3.5-turbo"
  },
  "modelParameters": {
    "S": "{'temperature': 0.7, 'max_tokens': 100, 'user': 'ishaan-2'}"
  },
  "response": {
    "S": "ModelResponse(id='chatcmpl-8W15J4480a3fAQ1yQaMgtsKJAicen', choices=[Choices(finish_reason='stop', index=0, message=Message(content='Great! What can I assist you with?', role='assistant'))], created=1702641357, model='gpt-3.5-turbo-0613', object='chat.completion', system_fingerprint=None, usage=Usage(completion_tokens=9, prompt_tokens=11, total_tokens=20))"
  },
  "startTime": {
    "S": "2023-12-15 17:25:56.047035"
  },
  "usage": {
    "S": "Usage(completion_tokens=9, prompt_tokens=11, total_tokens=20)"
  },
  "user": {
    "S": "ishaan-2"
  }
}

Data logged to DynamoDB for /embeddings:

{
  "id": {
    "S": "4dec8d4d-4817-472d-9fc6-c7a6153eb2ca"
  },
  "call_type": {
    "S": "aembedding"
  },
  "endTime": {
    "S": "2023-12-15 17:25:59.890261"
  },
  "messages": {
    "S": "['hi']"
  },
  "metadata": {
    "S": "{}"
  },
  "model": {
    "S": "text-embedding-ada-002"
  },
  "modelParameters": {
    "S": "{'user': 'ishaan-2'}"
  },
  "response": {
    "S": "EmbeddingResponse(model='text-embedding-ada-002-v2', data=[{'embedding': [-0.03503197431564331, -0.020601635798811913, -0.015375726856291294,
  }
}

Sentry

Log failed API calls (LLM / database) to Sentry.

Step 1: Install Sentry

pip install --upgrade sentry-sdk

Step 2: Save your SENTRY_DSN and add litellm_settings: failure_callback

export SENTRY_DSN="your-sentry-dsn"

model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
litellm_settings:
  # other settings
  failure_callback: ["sentry"]
general_settings:
  database_url: "my-bad-url" # set a fake url to trigger a sentry exception

Step 3: Start the proxy, make a test request

Start the proxy:

litellm --config config.yaml --debug

Test request:

litellm --test

Athina

Athina allows you to log LLM input/output for monitoring, analytics, and observability.

We will use the --config to set litellm.success_callback = ["athina"]. This will log all successful LLM calls to Athina.

Step 1: Set Athina API key

ATHINA_API_KEY = "your-athina-api-key"

Step 2: Create a config.yaml file and set litellm_settings: success_callback

model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
litellm_settings:
  success_callback: ["athina"]

Step 3: Start the proxy, make a test request

Start the proxy:

litellm --config config.yaml --debug

Test request:

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "which llm are you"
    }
  ]
}'