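Logging
Log Proxy input, output, and exceptions using:
- Langfuse
- OpenTelemetry
- GCS, S3, Azure (Blob) Buckets
- AWS SQS
- Lunary
- MLflow
- Deepeval
- Custom Callbacks - Custom code and API endpoints
- Langsmith
- DataDog
- Azure Sentinel
- DynamoDB
- etc.
Getting the LiteLLM Call ID
LiteLLM generates a unique call_id for each request. This call_id can be used to track the request across the system, which is useful for finding the information for a particular request in a logging system such as the ones mentioned on this page.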
curl -i -sSL --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Authorization: Bearer sk-1234' \
--header 'Content-Type: application/json' \
--data '{
"model": "gpt-3.5-turbo",
"messages": [{"role": "user", "content": "what llm are you"}]
}' | grep 'x-litellm'
The output of this is:
x-litellm-call-id: b980db26-9512-45cc-b1da-c511a363b83f
x-litellm-model-id: cb41bc03f4c33d310019bae8c5afdb1af0a8f97b36a234405a9807614988457c
x-litellm-model-api-base: https://x-example-1234.openai.azure.com
x-litellm-version: 1.40.21
x-litellm-response-cost: 2.85e-05
x-litellm-key-tpm-limit: None
x-litellm-key-rpm-limit: None
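Many of these headers are useful for troubleshooting, but x-litellm-call-id is the most useful one for tracking a request across the components of your system, including logging tools.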
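When the same lookup has to happen in a script rather than through grep, the headers can be collected programmatically. The following is a minimal sketch that works on a raw header dump such as the output of curl -i; the helper name extract_litellm_headers is hypothetical and not part of LiteLLM.

```python
def extract_litellm_headers(raw_response_headers: str) -> dict[str, str]:
    """Return the x-litellm-* headers found in a raw HTTP header dump."""
    found = {}
    for line in raw_response_headers.splitlines():
        # Split on the first colon only, so values containing ':' (URLs) stay intact.
        name, sep, value = line.partition(":")
        if sep and name.strip().lower().startswith("x-litellm-"):
            found[name.strip().lower()] = value.strip()
    return found


sample = """HTTP/1.1 200 OK
content-type: application/json
x-litellm-call-id: b980db26-9512-45cc-b1da-c511a363b83f
x-litellm-version: 1.40.21
"""
headers = extract_litellm_headers(sample)
print(headers["x-litellm-call-id"])  # prints b980db26-9512-45cc-b1da-c511a363b83f
```

The extracted call ID can then be used as the search key in whichever logging tool receives the request.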
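Logging Features
Redact Messages, Response Content
Set litellm.turn_off_message_logging=True. This prevents messages and responses from being logged to your logging provider, but request metadata (e.g. spend) is still tracked. This is useful for privacy/compliance when handling sensitive data.
- Global
- Per Request
1. Setup config.yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
litellm_settings:
  success_callback: ["langfuse"]
  turn_off_message_logging: True # 👈 Key Change
2. Send request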
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}'
Dynamic request message redaction is currently in BETA.
Pass in a request header to enable message redaction for that request.
x-litellm-enable-message-redaction: true
Example config.yaml
1. Setup config.yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
2. Setup per request header
curl -L -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-zV5HlSIm8ihj1F9C_ZbB1g' \
-H 'x-litellm-enable-message-redaction: true' \
-d '{
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "Hey, how'\''s it going 1234?"
}
]
}'
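3. Check Logging Tool + Spend Logs
Logging Tool
Spend Logs
Redacting UserAPIKeyInfo
Redact information about the user API key (hashed token, user_id, team ID, etc.) from logs.
Currently supported for Langfuse, OpenTelemetry, Logfire, and ArizeAI logging.
litellm_settings:
  callbacks: ["langfuse"]
  redact_user_api_key_info: true
Disable Message Redaction
If you have litellm.turn_off_message_logging turned on, you can override it for specific requests by setting the request header LiteLLM-Disable-Message-Redaction: true.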
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'LiteLLM-Disable-Message-Redaction: true' \
--data '{
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}'
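The interaction between the global setting and the two per-request headers can be summarised as a small decision function. This is only an illustrative sketch of the rules described in this section, not LiteLLM's implementation: the function name should_redact is hypothetical, and the choice to let the disable header win when both headers are present is an assumption.

```python
def should_redact(global_turn_off_message_logging: bool, request_headers: dict[str, str]) -> bool:
    """Decide whether message/response content is redacted for one request."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    headers = {name.lower(): value.strip().lower() for name, value in request_headers.items()}
    # Assumption: an explicit per-request opt-out takes precedence over everything else.
    if headers.get("litellm-disable-message-redaction") == "true":
        return False
    # Per-request opt-in (BETA) enables redaction even when the global flag is off.
    if headers.get("x-litellm-enable-message-redaction") == "true":
        return True
    # Otherwise fall back to the global litellm_settings value.
    return global_turn_off_message_logging


print(should_redact(True, {"LiteLLM-Disable-Message-Redaction": "true"}))  # prints False
print(should_redact(False, {"x-litellm-enable-message-redaction": "true"}))  # prints True
```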
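Turn off all tracking/logging
For some use cases, you may want to turn off all tracking/logging. You can do this by passing no-log=True in the request body.
To disable this feature, set global_disable_no_log_param: true in your config.yaml file.
litellm_settings:
  global_disable_no_log_param: True
- Curl Request
- OpenAI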
curl -L -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <litellm-api-key>' \
-d '{
"model": "openai/gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "What'\''s in this image?"
}
]
}
],
"max_tokens": 300,
"no-log": true
}'
import openai

client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000"
)

# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ],
    extra_body={
        "no-log": True  # 👈 Key Change
    }
)

print(response)
Expected Console Log
LiteLLM.Info: "no-log request, skipping logging"
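✨ Dynamically Disable Specific Callbacks
This is an Enterprise feature.
For some use cases, you may want to disable specific callbacks for a single request. You can do this by passing x-litellm-disable-callbacks: <callback_name> in the request headers.
Send the list of callbacks to disable in the request header x-litellm-disable-callbacks.
- Curl Request
- OpenAI Python SDK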
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-1234' \
--header 'x-litellm-disable-callbacks: langfuse' \
--data '{
"model": "claude-sonnet-4-20250514",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}'
import openai

client = openai.OpenAI(
    api_key="sk-1234",
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="claude-sonnet-4-20250514",
    messages=[
        {
            "role": "user",
            "content": "what llm are you"
        }
    ],
    extra_headers={
        "x-litellm-disable-callbacks": "langfuse"
    }
)

print(response)
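✨ Conditional Logging by Virtual Keys, Teams
Use this to:
- Conditionally enable logging for some virtual keys/teams
- Set different logging providers for different virtual keys/teams
What gets logged?
Found under kwargs["standard_logging_object"]. This is a standard payload, logged for every response.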
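Inside a callback, the payload can be read defensively from kwargs. The sketch below only shows the access pattern; the field names model, response_cost, and total_tokens are assumptions about the payload's contents, and summarize_standard_logging_object is a hypothetical helper.

```python
def summarize_standard_logging_object(kwargs: dict) -> dict:
    """Pull a few fields out of kwargs["standard_logging_object"], tolerating its absence."""
    payload = kwargs.get("standard_logging_object") or {}
    return {
        # Field names below are assumed for illustration.
        "model": payload.get("model"),
        "response_cost": payload.get("response_cost"),
        "total_tokens": payload.get("total_tokens"),
    }


example_kwargs = {
    "standard_logging_object": {
        "model": "gpt-3.5-turbo",
        "response_cost": 2.85e-05,
        "total_tokens": 21,
    }
}
print(summarize_standard_logging_object(example_kwargs))
```

Using .get() throughout keeps the callback from raising when a field, or the whole payload, is missing.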
Langfuse
We will use --config to set litellm.success_callback = ["langfuse"], which logs all successful LLM calls to Langfuse. Make sure LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY are set in your environment.
Step 1: Install langfuse
uv add "langfuse>=2.0.0"
Step 2: Create a config.yaml file and set litellm_settings: success_callback
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
litellm_settings:
  success_callback: ["langfuse"]
Step 3: Set the env variables required for logging to langfuse
export LANGFUSE_PUBLIC_KEY="pk_kk"
export LANGFUSE_SECRET_KEY="sk_ss"
# Optional, defaults to https://cloud.langfuse.com
export LANGFUSE_HOST="https://xxx.langfuse.com"
Step 4: Start the proxy, make a test request
Start proxy
litellm --config config.yaml --debug
Test Request
litellm --test
Expected output on Langfuse
Logging Metadata to Langfuse
- Curl Request
- OpenAI v1.0.0+
- Langchain
Pass metadata as part of the request body
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
],
"metadata": {
"generation_name": "ishaan-test-generation",
"generation_id": "gen-id22",
"trace_id": "trace-id22",
"trace_user_id": "user-id2"
}
}'
Set extra_body={"metadata": { }} to the metadata you want to pass
import openai

client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000"
)

# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ],
    extra_body={
        "metadata": {
            "generation_name": "ishaan-generation-openai-client",
            "generation_id": "openai-client-gen-id22",
            "trace_id": "openai-client-trace-id22",
            "trace_user_id": "openai-client-user-id2"
        }
    }
)

print(response)
from langchain.chat_models import ChatOpenAI
from langchain.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain.schema import HumanMessage, SystemMessage

chat = ChatOpenAI(
    openai_api_base="http://0.0.0.0:4000",
    model="gpt-3.5-turbo",
    temperature=0.1,
    extra_body={
        "metadata": {
            "generation_name": "ishaan-generation-langchain-client",
            "generation_id": "langchain-client-gen-id22",
            "trace_id": "langchain-client-trace-id22",
            "trace_user_id": "langchain-client-user-id2"
        }
    }
)

messages = [
    SystemMessage(
        content="You are a helpful assistant that im using to make a test request to."
    ),
    HumanMessage(
        content="test from litellm. tell me why it's amazing in 1 sentence"
    ),
]
response = chat(messages)

print(response)
Custom Tags
Set tags as part of your request body
- OpenAI Python v1.0.0+
- Curl Request
- Langchain
import openai

client = openai.OpenAI(
    api_key="sk-1234",
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="llama3",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ],
    user="palantir",
    extra_body={
        "metadata": {
            "tags": ["jobID:214590dsff09fds", "taskName:run_page_classification"]
        }
    }
)

print(response)
Pass metadata as part of the request body
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-1234' \
--data '{
"model": "llama3",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
],
"user": "palantir",
"metadata": {
"tags": ["jobID:214590dsff09fds", "taskName:run_page_classification"]
}
}'
from langchain.chat_models import ChatOpenAI
from langchain.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain.schema import HumanMessage, SystemMessage
import os

os.environ["OPENAI_API_KEY"] = "sk-1234"

chat = ChatOpenAI(
    openai_api_base="http://0.0.0.0:4000",
    model="llama3",
    user="palantir",
    extra_body={
        "metadata": {
            "tags": ["jobID:214590dsff09fds", "taskName:run_page_classification"]
        }
    }
)

messages = [
    SystemMessage(
        content="You are a helpful assistant that im using to make a test request to."
    ),
    HumanMessage(
        content="test from litellm. tell me why it's amazing in 1 sentence"
    ),
]
response = chat(messages)

print(response)
LiteLLM Tags - cache_hit, cache_key
Use this if you want to control which LiteLLM-specific fields the LiteLLM proxy logs as tags. By default, the LiteLLM proxy logs no LiteLLM-specific fields.
| LiteLLM-specific field | Description | Example value |
|---|---|---|
| cache_hit | Indicates whether a cache hit occurred (True) or not (False) | true, false |
| cache_key | The cache key used for this request | d2b758c**** |
| proxy_base_url | The base URL of the proxy server, i.e. the value of the PROXY_BASE_URL env var on your server | https://proxy.example.com |
| user_api_key_alias | An alias for the LiteLLM virtual key. | prod-app1 |
| user_api_key_user_id | The unique ID associated with a user's API key. | user_123, user_456 |
| user_api_key_user_email | The email associated with a user's API key. | user@example.com, admin@example.com |
| user_api_key_team_alias | An alias for the team associated with an API key. | team_alpha, dev_team |
Usage
Specify langfuse_default_tags to control which litellm fields are logged to Langfuse
Example config.yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
litellm_settings:
  success_callback: ["langfuse"]
  # 👇 Key Change
  langfuse_default_tags: ["cache_hit", "cache_key", "proxy_base_url", "user_api_key_alias", "user_api_key_user_id", "user_api_key_user_email", "user_api_key_team_alias", "semantic-similarity"]
View the POST sent from LiteLLM to the provider
Use this when you want to view the raw curl request that LiteLLM sends to the LLM API
- Curl Request
- OpenAI v1.0.0+
- Langchain
Pass metadata as part of the request body
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
],
"metadata": {
"log_raw_request": true
}
}'
Set extra_body={"metadata": {"log_raw_request": True}} to pass the flag in the request metadata
import openai

client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000"
)

# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ],
    extra_body={
        "metadata": {
            "log_raw_request": True
        }
    }
)

print(response)
from langchain.chat_models import ChatOpenAI
from langchain.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain.schema import HumanMessage, SystemMessage

chat = ChatOpenAI(
    openai_api_base="http://0.0.0.0:4000",
    model="gpt-3.5-turbo",
    temperature=0.1,
    extra_body={
        "metadata": {
            "log_raw_request": True
        }
    }
)

messages = [
    SystemMessage(
        content="You are a helpful assistant that im using to make a test request to."
    ),
    HumanMessage(
        content="test from litellm. tell me why it's amazing in 1 sentence"
    ),
]
response = chat(messages)

print(response)
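Expected output on Langfuse
You will see raw_request in your Langfuse metadata. This is the raw CURL command sent from LiteLLM to your LLM API provider
OpenTelemetry
The full OpenTelemetry reference (span hierarchy, every emitted span and attribute, metrics, semconv mode, and troubleshooting) lives at Observability → OpenTelemetry Integration. What follows is a proxy-oriented quick start.
[Optional] Customize the OTEL service name and OTEL tracer name by setting the following environment variables
OTEL_TRACER_NAME=<your-trace-name> # default="litellm"
OTEL_SERVICE_NAME=<your-service-name> # default="litellm"
- Log to console
- Log to Honeycomb
- Log to Traceloop Cloud
- Log to OTEL HTTP Collector
- Log to OTEL GRPC Collector
Step 1: Set callbacks and env vars
Add the following to your env
OTEL_EXPORTER="console"
Add otel as a callback in your litellm_config.yaml
litellm_settings:
  callbacks: ["otel"]
Step 2: Start the proxy, make a test request
Start proxy
litellm --config config.yaml --detailed_debug
Test Request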
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}'
Step 3: Expect to see the following logged in your server logs / console
This is the Span from OTEL Logging
{
"name": "litellm-acompletion",
"context": {
"trace_id": "0x8d354e2346060032703637a0843b20a3",
"span_id": "0xd8d3476a2eb12724",
"trace_state": "[]"
},
"kind": "SpanKind.INTERNAL",
"parent_id": null,
"start_time": "2024-06-04T19:46:56.415888Z",
"end_time": "2024-06-04T19:46:56.790278Z",
"status": {
"status_code": "OK"
},
"attributes": {
"model": "llama3-8b-8192"
},
"events": [],
"links": [],
"resource": {
"attributes": {
"service.name": "litellm"
},
"schema_url": ""
}
}
Quick Start - Log to Honeycomb
Step 1: Set callbacks and env vars
Add the following to your env
OTEL_EXPORTER="otlp_http"
OTEL_ENDPOINT="https://api.honeycomb.io/v1/traces"
OTEL_HEADERS="x-honeycomb-team=<your-api-key>"
Add otel as a callback in your litellm_config.yaml
litellm_settings:
  callbacks: ["otel"]
Step 2: Start the proxy, make a test request
Start proxy
litellm --config config.yaml --detailed_debug
Test Request
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}'
Quick Start - Log to Traceloop
Step 1: Add the following to your env
OTEL_EXPORTER="otlp_http"
OTEL_ENDPOINT="https://api.traceloop.com"
OTEL_HEADERS="Authorization=Bearer%20<your-api-key>"
Step 2: Add otel as a callback
litellm_settings:
  callbacks: ["otel"]
Step 3: Start the proxy, make a test request
Start proxy
litellm --config config.yaml --detailed_debug
Test Request
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}'
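The OTEL_HEADERS values in these quick starts use a key=value form, and the Traceloop value writes the space in "Bearer <key>" as %20. The sketch below shows how such a string decodes, assuming the conventional OpenTelemetry format of comma-separated, URL-encoded key=value pairs; parse_otel_headers is a hypothetical helper, and whether LiteLLM parses the variable in exactly this way is not stated here.

```python
from urllib.parse import unquote


def parse_otel_headers(raw: str) -> dict[str, str]:
    """Decode a 'k1=v1,k2=v2' header string with URL-encoded values into a dict."""
    headers = {}
    for pair in raw.split(","):
        if not pair.strip():
            continue  # tolerate trailing commas and empty input
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        headers[key.strip()] = unquote(value.strip())
    return headers


print(parse_otel_headers("Authorization=Bearer%20my-api-key"))
# prints {'Authorization': 'Bearer my-api-key'}
```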
Quick Start - Log to OTEL Collector
Step 1: Set callbacks and env vars
Add the following to your env
OTEL_EXPORTER="otlp_http"
OTEL_ENDPOINT="http://0.0.0.0:4317"
OTEL_HEADERS="x-honeycomb-team=<your-api-key>" # Optional
Add otel as a callback in your litellm_config.yaml
litellm_settings:
  callbacks: ["otel"]
Step 2: Start the proxy, make a test request
Start proxy
litellm --config config.yaml --detailed_debug
Test Request
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}'
Quick Start - Log to OTEL GRPC Collector
Step 1: Set callbacks and env vars
Add the following to your env
OTEL_EXPORTER="otlp_grpc"
OTEL_ENDPOINT="http://0.0.0.0:4317"
OTEL_HEADERS="x-honeycomb-team=<your-api-key>" # Optional
Note: OTLP gRPC requires grpcio. Install it via uv add "litellm[grpc]" (or grpcio).
Add otel as a callback in your litellm_config.yaml
litellm_settings:
  callbacks: ["otel"]
Step 2: Start the proxy, make a test request
Start proxy
litellm --config config.yaml --detailed_debug
Test Request
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}'
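🎉 Expect to see this trace logged in your OTEL collector
Redacting Messages, Response Content
Set message_logging=False for otel and no messages/responses will be logged
litellm_settings:
  callbacks: ["otel"]

## 👇 Key Change
callback_settings:
  otel:
    message_logging: False
Traceparent Header
Context propagation across services: the Traceparent HTTP header
❓ Use this when you want to pass information about the incoming request through a distributed tracing system
✅ Key change: Pass the traceparent header in your requests. Read more about traceparent headers here
traceparent: 00-80e1afed08e019fc1110464cfa66635c-7a085853722dc6d2-01
Example Usage
- Make a request to the LiteLLM proxy with a traceparent header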
import openai
import uuid

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")
example_traceparent = "00-80e1afed08e019fc1110464cfa66635c-02e80198930058d4-01"
extra_headers = {
    "traceparent": example_traceparent
}
_trace_id = example_traceparent.split("-")[1]

print("EXTRA HEADERS: ", extra_headers)
print("Trace ID: ", _trace_id)

response = client.chat.completions.create(
    model="llama3",
    messages=[
        {"role": "user", "content": "this is a test request, write a short poem"}
    ],
    extra_headers=extra_headers,
)

print(response)
# EXTRA HEADERS: {'traceparent': '00-80e1afed08e019fc1110464cfa66635c-02e80198930058d4-01'}
# Trace ID: 80e1afed08e019fc1110464cfa66635c
- Look up the Trace ID on your OTEL logger
Search for Trace=80e1afed08e019fc1110464cfa66635c on your OTEL collector
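The header value used above follows the four-field layout version-traceid-parentid-flags (2, 32, 16, and 2 lowercase hex characters). As a minimal sketch of that layout, the helpers below generate a fresh value and split an existing one; make_traceparent and parse_traceparent are hypothetical names, and this sketch does not check the rule that all-zero IDs are invalid.

```python
import re
import secrets

_TRACEPARENT_RE = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def make_traceparent(sampled: bool = True) -> str:
    """Build a version-00 traceparent value with random trace and parent IDs."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
    parent_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{parent_id}-{flags}"


def parse_traceparent(value: str) -> dict[str, str]:
    """Split a traceparent value into its four fields, raising on a malformed value."""
    match = _TRACEPARENT_RE.match(value.strip().lower())
    if match is None:
        raise ValueError(f"malformed traceparent: {value!r}")
    version, trace_id, parent_id, flags = match.groups()
    return {"version": version, "trace_id": trace_id, "parent_id": parent_id, "flags": flags}


parsed = parse_traceparent("00-80e1afed08e019fc1110464cfa66635c-7a085853722dc6d2-01")
print(parsed["trace_id"])  # prints 80e1afed08e019fc1110464cfa66635c
```

Validating with a regular expression instead of a bare split("-") catches truncated or mis-cased values before they reach the collector.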
Forwarding the Traceparent HTTP Header to the LLM API
Use this if you want to forward the traceparent header to your self-hosted LLM (e.g. vLLM)
Set forward_traceparent_to_llm_provider: True in your config.yaml. This forwards the traceparent header to your LLM API
Use this only for self-hosted LLMs; it can cause Bedrock or VertexAI calls to fail
litellm_settings:
  forward_traceparent_to_llm_provider: True
Google Cloud Storage Buckets
Log LLM logs to Google Cloud Storage buckets
✨ This is an Enterprise-only feature. Get started with Enterprise here
| Property | Details |
|---|---|
| Description | Log LLM input/output to cloud storage buckets |
| Load Test Benchmarks | Benchmarks |
| Google Docs on Cloud Storage | Google Cloud Storage |
Usage
- Add gcs_bucket to LiteLLM Config.yaml
model_list:
  - litellm_params:
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
      api_key: my-fake-key
      model: openai/my-fake-model
    model_name: fake-openai-endpoint
litellm_settings:
  callbacks: ["gcs_bucket"] # 👈 KEY CHANGE
- Set the required env variables
GCS_BUCKET_NAME="<your-gcs-bucket-name>"
GCS_PATH_SERVICE_ACCOUNT="/Users/ishaanjaffer/Downloads/adroit-crow-413218-a956eef1a2a8.json" # Add path to service account.json
- Start Proxy
litellm --config /path/to/config.yaml
- Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "fake-openai-endpoint",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}'
Expected logs on GCS buckets
Fields logged on GCS buckets
Getting service_account.json from the Google Cloud Console
- Go to the Google Cloud Console
- Search for IAM & Admin
- Click on Service Accounts
- Select a Service Account
- Click on 'Keys' -> Add Key -> Create New Key -> JSON
- Save the JSON file and add its path to GCS_PATH_SERVICE_ACCOUNT
Google Cloud Storage - PubSub Topic
Log LLM logs/spend logs to a Google Cloud Storage PubSub topic
✨ This is an Enterprise-only feature. Get started with Enterprise here
| Property | Details |
|---|---|
| Description | Log the LiteLLM SpendLogs table to a Google Cloud Storage PubSub topic |
When to use gcs_pubsub?
- If your LiteLLM database has crossed 1M+ spend logs and you want to send SpendLogs to a PubSub topic that can be consumed by GCS BigQuery
Usage
- Add gcs_pubsub to LiteLLM Config.yaml
model_list:
  - litellm_params:
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
      api_key: my-fake-key
      model: openai/my-fake-model
    model_name: fake-openai-endpoint
litellm_settings:
  callbacks: ["gcs_pubsub"] # 👈 KEY CHANGE
- Set the required env variables
GCS_PUBSUB_TOPIC_ID="litellmDB"
GCS_PUBSUB_PROJECT_ID="reliableKeys"
- Start Proxy
litellm --config /path/to/config.yaml
- Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "fake-openai-endpoint",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}'
Deepeval
LiteLLM supports logging on Confident AI (the Deepeval platform)
Usage:
- Add deepeval in the LiteLLM config.yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: gpt-4o
litellm_settings:
  success_callback: ["deepeval"]
  failure_callback: ["deepeval"]
- Set your environment variables in a .env file.
CONFIDENT_API_KEY=<your-api-key>
You can get your CONFIDENT_API_KEY by logging in to the Confident AI platform.
- Start your proxy server
litellm --config config.yaml --debug
- Make a request
curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-1234' \
-d '{
"model": "gpt-4o",
"messages": [
{
"role": "system",
"content": "You are a helpful math tutor. Guide the user through the solution step by step."
},
{
"role": "user",
"content": "how can I solve 8x + 7 = -23"
}
]
}'
- Check traces on the platform
S3 Buckets
We will use --config to set litellm.success_callback = ["s3_v2"]. This logs all successful LLM calls to an S3 bucket.
Step 1: Set AWS credentials in .env
AWS_ACCESS_KEY_ID = ""
AWS_SECRET_ACCESS_KEY = ""
AWS_REGION_NAME = ""
Step 2: Create a config.yaml file and set litellm_settings: success_callback
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
litellm_settings:
  success_callback: ["s3_v2"]
  s3_callback_params:
    s3_bucket_name: logs-bucket-litellm # AWS Bucket Name for S3
    s3_region_name: us-west-2 # AWS Region Name for S3
    s3_aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID # use os.environ/<variable name> to pass environment variables. This is AWS Access Key ID for S3
    s3_aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY # AWS Secret Access Key for S3
    s3_path: my-test-path # [OPTIONAL] set path in bucket you want to write logs to
    s3_endpoint_url: https://s3.amazonaws.com # [OPTIONAL] S3 endpoint URL, if you want to use Backblaze/cloudflare s3 buckets
    s3_use_virtual_hosted_style: false # [OPTIONAL] use virtual-hosted-style URLs (bucket.endpoint/key) instead of path-style (endpoint/bucket/key). Useful for S3-compatible services like MinIO
    s3_strip_base64_files: false # [OPTIONAL] remove base64 files before storing in s3
Step 3: Start the proxy, make a test request
Start proxy
litellm --config config.yaml --debug
Test Request
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}'
Your logs should be available in the specified S3 bucket
Team Alias Prefix in Object Key
You can add the team alias to the object key by setting s3_use_team_prefix in the config.yaml file. This prefixes the object key with the team alias.
litellm_settings:
  callbacks: ["s3_v2"]
  s3_callback_params:
    s3_bucket_name: logs-bucket-litellm
    s3_region_name: us-west-2
    s3_aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID
    s3_aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
    s3_path: my-test-path
    s3_endpoint_url: https://s3.amazonaws.com
    s3_use_team_prefix: true
On the S3 bucket, you will see the object key as my-test-path/my-team-alias/...
Key Alias Prefix in Object Key
You can add the user API key alias to the S3 object key by enabling s3_use_key_prefix.
litellm_settings:
  callbacks: ["s3_v2"]
  s3_callback_params:
    s3_bucket_name: logs-bucket-litellm
    s3_region_name: us-west-2
    s3_aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID
    s3_aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
    s3_path: my-test-path
    s3_endpoint_url: https://s3.amazonaws.com
    s3_use_key_prefix: true
On the S3 bucket, you will see the object key as my-test-path/my-key-alias/...
If both the team alias and the key alias are enabled, the path becomes my-test-path/my-team-alias/my-key-alias/...
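The prefix ordering described above (s3_path, then team alias, then key alias) can be illustrated with a short function. This is a sketch of the resulting key layout only, not LiteLLM's code; build_object_key_prefix and its parameter names are hypothetical.

```python
from typing import Optional


def build_object_key_prefix(
    s3_path: str,
    team_alias: Optional[str] = None,
    key_alias: Optional[str] = None,
    use_team_prefix: bool = False,
    use_key_prefix: bool = False,
) -> str:
    """Compose the object-key prefix: s3_path, then team alias, then key alias."""
    parts = [s3_path.strip("/")] if s3_path else []
    # A prefix is added only when its flag is on and an alias is actually present.
    if use_team_prefix and team_alias:
        parts.append(team_alias)
    if use_key_prefix and key_alias:
        parts.append(key_alias)
    return "/".join(parts)


print(build_object_key_prefix("my-test-path", "my-team-alias", "my-key-alias", True, True))
# prints my-test-path/my-team-alias/my-key-alias
```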
AWS SQS
| Property | Details |
|---|---|
| Description | Log LLM input/output to AWS SQS queues |
| AWS SQS docs | AWS SQS |
| Fields logged to SQS | The LiteLLM standard logging payload, logged for every LLM call |
Log LLM logs to AWS Simple Queue Service (SQS)
We will use litellm --config to set litellm.callbacks = ["aws_sqs"]. This logs all successful LLM calls to an AWS SQS queue.
Step 1: Set AWS credentials in .env
AWS_ACCESS_KEY_ID = ""
AWS_SECRET_ACCESS_KEY = ""
AWS_REGION_NAME = ""
Step 2: Create a config.yaml file and set litellm_settings: callbacks
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: gpt-4o
litellm_settings:
  callbacks: ["aws_sqs"]
  aws_sqs_callback_params:
    # --- 🧱 Required Parameters ---
    sqs_queue_url: https://sqs.us-west-2.amazonaws.com/123456789012/my-queue
    # The AWS SQS Queue URL to which LiteLLM will send log events.
    sqs_region_name: us-west-2
    # AWS Region for your SQS queue (e.g., us-east-1, eu-central-1, etc.)
    # --- Logging Controls ---
    sqs_strip_base64_files: false
    # If true, LiteLLM will remove or redact base64-encoded binary data (e.g., PDFs, images, audio)
    # from logged messages to avoid large payloads. SQS has a 1 MB payload size limit.
    s3_use_team_prefix: false
    # If true, Litellm will add the team alias prefix to s3 path
    s3_use_key_prefix: false
    # If true, Litellm will add the key alias prefix to s3 path
Step 3: Start the proxy, make a test request
Start proxy
litellm --config config.yaml --debug
Test Request
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "gpt-4o",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}'
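The sqs_strip_base64_files option exists because inline files can push a log event past the SQS payload limit. To make the idea concrete, here is a minimal sketch of stripping base64 data URLs from OpenAI-style message content before logging. It is not LiteLLM's implementation: strip_base64_files, the placeholder text, and the assumption that files arrive as image_url parts with data: URLs are all illustrative.

```python
import copy


def strip_base64_files(messages: list[dict], placeholder: str = "[base64 file removed]") -> list[dict]:
    """Return a copy of messages with inline base64 data URLs replaced by a placeholder."""
    cleaned = copy.deepcopy(messages)  # never mutate the request that is sent to the model
    for message in cleaned:
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-string content carries no file parts
        for part in content:
            image_url = part.get("image_url") if isinstance(part, dict) else None
            if isinstance(image_url, dict):
                url = image_url.get("url", "")
                if url.startswith("data:") and ";base64," in url:
                    image_url["url"] = placeholder
    return cleaned


original = [{
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,iVBORw0KGgo="}},
    ],
}]
print(strip_base64_files(original)[0]["content"][1]["image_url"]["url"])
# prints [base64 file removed]
```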
Azure Blob Storage
Log LLM logs to Azure Data Lake Storage
✨ This is an Enterprise-only feature. Get started with Enterprise here
| Property | Details |
|---|---|
| Description | Log LLM input/output to Azure Blob Storage (bucket) |
| Azure docs on Data Lake Storage | Azure Data Lake Storage |
Usage
- Add azure_storage to LiteLLM Config.yaml
model_list:
  - model_name: fake-openai-endpoint
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
litellm_settings:
  callbacks: ["azure_storage"] # 👈 KEY CHANGE
- Set the required env variables
# Required Environment Variables for Azure Storage
AZURE_STORAGE_ACCOUNT_NAME="litellm2" # The name of the Azure Storage Account to use for logging
AZURE_STORAGE_FILE_SYSTEM="litellm-logs" # The name of the Azure Storage File System to use for logging. (Typically the Container name)
# Authentication Variables
# Option 1: Use Storage Account Key
AZURE_STORAGE_ACCOUNT_KEY="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" # The Azure Storage Account Key to use for Authentication
# Option 2: Use Tenant ID + Client ID + Client Secret
AZURE_STORAGE_TENANT_ID="985efd7cxxxxxxxxxx" # The Application Tenant ID to use for Authentication
AZURE_STORAGE_CLIENT_ID="abe66585xxxxxxxxxx" # The Application Client ID to use for Authentication
AZURE_STORAGE_CLIENT_SECRET="uMS8Qxxxxxxxxxx" # The Application Client Secret to use for Authentication
- Start Proxy
litellm --config /path/to/config.yaml
- Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "fake-openai-endpoint",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}'
Expected logs on Azure Data Lake Storage
Fields logged on Azure Data Lake Storage
The standard logging object is logged on Azure Data Lake Storage
Datadog
👉 Go here to learn how to use Datadog LLM Observability with the LiteLLM proxy
Azure Sentinel
👉 Go here to learn how to use Azure Sentinel with the LiteLLM proxy
Lunary
Step 1: Install dependencies and set your environment variables
Install the dependencies
uv add litellm lunary
Get your Lunary public key from https://app.lunary.ai/settings
export LUNARY_PUBLIC_KEY="<your-public-key>"
Step 2: Create a config.yaml and set lunary callbacks
model_list:
  - model_name: "*"
    litellm_params:
      model: "*"
litellm_settings:
  success_callback: ["lunary"]
  failure_callback: ["lunary"]
Step 3: Start the LiteLLM proxy
litellm --config config.yaml
Step 4: Make a request
curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
"model": "gpt-4o",
"messages": [
{
"role": "system",
"content": "You are a helpful math tutor. Guide the user through the solution step by step."
},
{
"role": "user",
"content": "how can I solve 8x + 7 = -23"
}
]
}'
MLflow
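👉 Follow the tutorial here to get started with mlflow on the LiteLLM proxy server
Custom Callback Class [Async]
Use this when you want to run custom callbacks in python
Step 1 - Create your custom litellm callback class
We use litellm.integrations.custom_logger for this; more details about litellm custom callbacks here
Define your custom callback class in a python file.
Here is an example custom logger for tracking key, user, model, prompt, response, tokens, cost. We create a file called custom_callbacks.py and initialize proxy_handler_instance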
from litellm.integrations.custom_logger import CustomLogger
import litellm

# This file includes the custom callbacks for LiteLLM Proxy
# Once defined, these can be passed in proxy_config.yaml
class MyCustomHandler(CustomLogger):
    def log_pre_api_call(self, model, messages, kwargs):
        print(f"Pre-API Call")

    def log_post_api_call(self, kwargs, response_obj, start_time, end_time):
        print(f"Post-API Call")

    def log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("On Success")

    def log_failure_event(self, kwargs, response_obj, start_time, end_time):
        print(f"On Failure")

    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        print(f"On Async Success!")
        # log: key, user, model, prompt, response, tokens, cost
        # Access kwargs passed to litellm.completion()
        model = kwargs.get("model", None)
        messages = kwargs.get("messages", None)
        user = kwargs.get("user", None)

        # Access litellm_params passed to litellm.completion(), example access `metadata`
        litellm_params = kwargs.get("litellm_params", {})
        metadata = litellm_params.get("metadata", {})  # headers passed to LiteLLM proxy, can be found here

        # Calculate cost using litellm.completion_cost()
        cost = litellm.completion_cost(completion_response=response_obj)
        response = response_obj
        # tokens used in response
        usage = response_obj["usage"]

        print(
            f"""
                Model: {model},
                Messages: {messages},
                User: {user},
                Usage: {usage},
                Cost: {cost},
                Response: {response}
                Proxy Metadata: {metadata}
            """
        )
        return

    async def async_log_failure_event(self, kwargs, response_obj, start_time, end_time):
        try:
            print(f"On Async Failure !")
            print("\nkwargs", kwargs)
            # Access kwargs passed to litellm.completion()
            model = kwargs.get("model", None)
            messages = kwargs.get("messages", None)
            user = kwargs.get("user", None)

            # Access litellm_params passed to litellm.completion(), example access `metadata`
            litellm_params = kwargs.get("litellm_params", {})
            metadata = litellm_params.get("metadata", {})  # headers passed to LiteLLM proxy, can be found here

            # Access Exceptions & Traceback
            exception_event = kwargs.get("exception", None)
            traceback_event = kwargs.get("traceback_exception", None)

            # Calculate cost using litellm.completion_cost()
            cost = litellm.completion_cost(completion_response=response_obj)
            print("now checking response obj")

            print(
                f"""
                    Model: {model},
                    Messages: {messages},
                    User: {user},
                    Cost: {cost},
                    Response: {response_obj}
                    Proxy Metadata: {metadata}
                    Exception: {exception_event}
                    Traceback: {traceback_event}
                """
            )
        except Exception as e:
            print(f"Exception: {e}")

proxy_handler_instance = MyCustomHandler()

# Set litellm.callbacks = [proxy_handler_instance] on the proxy
Step 2 - Pass your custom callback class in config.yaml
We pass the custom callback class defined in Step 1 to the config.yaml. Set callbacks to python_filename.logger_instance_name
In the config below, we pass
- python_filename: custom_callbacks.py
- logger_instance_name: proxy_handler_instance. This is defined in Step 1
callbacks: custom_callbacks.proxy_handler_instance
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
litellm_settings:
  callbacks: custom_callbacks.proxy_handler_instance # sets litellm.callbacks = [proxy_handler_instance]
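Step 2b - Load custom callbacks from S3/GCS (alternative)
Instead of using a local Python file, you can load custom callbacks directly from an S3 or GCS bucket. This is useful for centralized callback management or when deploying in containerized environments.
URL format
- S3: s3://bucket-name/module_name.instance_name
- GCS: gcs://bucket-name/module_name.instance_name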
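To make the URL format concrete, the sketch below splits such a URL into its scheme, bucket, module name, and instance name. It only illustrates the documented format; parse_callback_url and CallbackLocation are hypothetical names, and LiteLLM's own parsing may differ.

```python
from typing import NamedTuple


class CallbackLocation(NamedTuple):
    scheme: str
    bucket: str
    module_name: str
    instance_name: str


def parse_callback_url(url: str) -> CallbackLocation:
    """Split '<scheme>://<bucket>/<module_name>.<instance_name>' into its parts."""
    scheme, sep, rest = url.partition("://")
    if not sep or scheme not in ("s3", "gcs"):
        raise ValueError(f"unsupported callback URL: {url!r}")
    bucket, slash, target = rest.partition("/")
    # The instance name is everything after the last dot of the object name.
    module_name, dot, instance_name = target.rpartition(".")
    if not (bucket and slash and module_name and dot and instance_name):
        raise ValueError(f"expected <bucket>/<module_name>.<instance_name> in {url!r}")
    return CallbackLocation(scheme, bucket, module_name, instance_name)


print(parse_callback_url("s3://litellm-proxy/custom_callbacks.custom_handler"))
```

Note that the module name carries no .py suffix in the URL, so the object fetched from the bucket would be module_name plus ".py".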
Example - Load from S3
Say you have a custom_callbacks.py file stored in the S3 bucket litellm-proxy with the following content
# custom_callbacks.py (stored in S3)
from litellm.integrations.custom_logger import CustomLogger
import litellm

class MyCustomHandler(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        print(f"Custom UI SSO callback executed!")
        # Your custom logic here

    async def async_log_failure_event(self, kwargs, response_obj, start_time, end_time):
        print(f"Custom UI SSO failure callback!")
        # Your failure handling logic

# Instance that will be loaded by LiteLLM
custom_handler = MyCustomHandler()
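Configuration
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
litellm_settings:
  callbacks: ["s3://litellm-proxy/custom_callbacks.custom_handler"]
Example - Load from GCS
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
litellm_settings:
  callbacks: ["gcs://my-gcs-bucket/custom_callbacks.custom_handler"]
How it works
- LiteLLM detects the S3/GCS URL prefix
- Downloads the Python file to a temporary location
- Loads the module and extracts the specified instance
- Cleans up the temporary file
- Uses the callback instance for logging
This approach allows you to
- Centrally manage callback files across multiple proxy instances
- Share callbacks across different environments
- Version-control callback files in cloud storage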
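The load-and-clean-up steps in "How it works" can be sketched with the standard library alone. The example below skips the download and starts from source text that is assumed to have been fetched already; load_instance_from_source is a hypothetical helper and not LiteLLM's loader, and the handler class is a plain stand-in so the snippet runs without litellm installed.

```python
import importlib.util
import os
import tempfile


def load_instance_from_source(source: str, module_name: str, instance_name: str):
    """Write source to a temp file, import it as a module, and return the named instance."""
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        # Write the (already downloaded) file to a temporary location.
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(source)
        # Load the module from that file and extract the specified instance.
        spec = importlib.util.spec_from_file_location(module_name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        return getattr(module, instance_name)
    finally:
        # Clean up the temporary file whether or not loading succeeded.
        os.remove(path)


downloaded_source = '''
class MyCustomHandler:
    def describe(self):
        return "custom handler loaded"

custom_handler = MyCustomHandler()
'''
handler = load_instance_from_source(downloaded_source, "custom_callbacks", "custom_handler")
print(handler.describe())  # prints custom handler loaded
```

The returned instance stays usable after the file is removed, because the module has already been executed in memory.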
Step 2c - Mount custom callbacks in Helm/Kubernetes (alternative)
When deploying with Helm or Kubernetes, you can mount your custom callback Python file alongside config.yaml, using subPath to avoid overwriting the config directory.
Problem: Mounting a volume onto a directory (e.g. /app/) typically hides every file that already exists in that directory, including your config.yaml.
Solution: Use subPath in volumeMounts to mount a single file without overwriting the whole directory.
Example - Helm values.yaml
# values.yaml
volumes:
  - name: callback-files
    configMap:
      name: litellm-callback-files
volumeMounts:
  - name: callback-files
    mountPath: /app/custom_callbacks.py # Mount to specific FILE path
    subPath: custom_callbacks.py # Required to avoid overwriting directory
Create a ConfigMap with your callback file
apiVersion: v1
kind: ConfigMap
metadata:
  name: litellm-callback-files
data:
  custom_callbacks.py: |
    from litellm.integrations.custom_logger import CustomLogger

    class MyCustomHandler(CustomLogger):
        async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
            print(f"Success! Model: {kwargs.get('model')}")

    proxy_handler_instance = MyCustomHandler()
Reference it in config.yaml
litellm_settings:
  callbacks: custom_callbacks.proxy_handler_instance
How it works
- The subPath parameter tells Kubernetes to mount only the specific file
- This places custom_callbacks.py in /app/ alongside your existing config.yaml
- LiteLLM automatically finds the callback file in the same directory as the config
- No files are overwritten or hidden
Note: You can mount multiple callback files by adding more volumeMounts entries, each with its own subPath.
Step 3 - Start the proxy + test request
litellm --config proxy_config.yaml
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Authorization: Bearer sk-1234' \
--data ' {
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "good morning good sir"
}
],
"user": "ishaan-app",
"temperature": 0.2
}'
Resulting log on the proxy
On Success
Model: gpt-3.5-turbo,
Messages: [{'role': 'user', 'content': 'good morning good sir'}],
User: ishaan-app,
Usage: {'completion_tokens': 10, 'prompt_tokens': 11, 'total_tokens': 21},
Cost: 3.65e-05,
Response: {'id': 'chatcmpl-8S8avKJ1aVBg941y5xzGMSKrYCMvN', 'choices': [{'finish_reason': 'stop', 'index': 0, 'message': {'content': 'Good morning! How can I assist you today?', 'role': 'assistant'}}], 'created': 1701716913, 'model': 'gpt-3.5-turbo-0613', 'object': 'chat.completion', 'system_fingerprint': None, 'usage': {'completion_tokens': 10, 'prompt_tokens': 11, 'total_tokens': 21}}
Proxy Metadata: {'user_api_key': None, 'headers': Headers({'host': '0.0.0.0:4000', 'user-agent': 'curl/7.88.1', 'accept': '*/*', 'authorization': 'Bearer sk-1234', 'content-length': '199', 'content-type': 'application/x-www-form-urlencoded'}), 'model_group': 'gpt-3.5-turbo', 'deployment': 'gpt-3.5-turbo-ModelID-gpt-3.5-turbo'}
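The same test request can be sent from Python. The sketch below uses only the standard library and assumes the proxy address and the sk-1234 key from the curl example; the function names are hypothetical. Unlike the curl command above, it sets Content-Type to application/json explicitly (curl defaults to application/x-www-form-urlencoded with --data, which is what the logged headers show).

```python
import json
import urllib.request


def build_chat_request(base_url: str = "http://0.0.0.0:4000", api_key: str = "sk-1234"):
    """Build the POST request used in the curl example (hypothetical helper)."""
    payload = {
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "good morning good sir"}],
        "user": "ishaan-app",
        "temperature": 0.2,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

With the proxy running, send(build_chat_request()) returns the chat completion as a dict, and the custom callback prints its log on the proxy side.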
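Logging the proxy request object, headers, and URL
Here is how to access the url, headers, and request body sent to the proxy for each request
from litellm.integrations.custom_logger import CustomLogger

class MyCustomHandler(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("On Async Success!")
        # litellm_params may be missing, so fall back to an empty dict
        litellm_params = kwargs.get("litellm_params") or {}
        proxy_server_request = litellm_params.get("proxy_server_request")
        print(proxy_server_request)
Expected output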
{
"url": "http://testserver/chat/completions",
"method": "POST",
"headers": {
"host": "testserver",
"accept": "*/*",
"accept-encoding": "gzip, deflate",
"connection": "keep-alive",
"user-agent": "testclient",
"authorization": "Bearer None",
"content-length": "105",
"content-type": "application/json"
},
"body": {
"model": "Azure OpenAI GPT-4 Canada",
"messages": [
{
"role": "user",
"content": "hi"
}
],
"max_tokens": 10
}
}
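Because proxy_server_request includes the raw authorization header, you may want to mask credentials before printing or forwarding it. The following is a minimal sketch of one way to do that. The helper name and the list of sensitive headers are assumptions for illustration and are not part of LiteLLM.

```python
import copy

# Assumed list of headers worth masking; extend it for your deployment.
SENSITIVE_HEADERS = {"authorization", "cookie", "x-api-key"}


def redact_proxy_request(proxy_server_request: dict) -> dict:
    """Return a copy of the request dict with sensitive header values masked."""
    redacted = copy.deepcopy(proxy_server_request or {})
    headers = redacted.get("headers") or {}
    for name in headers:
        if name.lower() in SENSITIVE_HEADERS:
            headers[name] = "REDACTED"
    return redacted
```

Inside async_log_success_event, calling print(redact_proxy_request(proxy_server_request)) logs the url, method, and body unchanged while hiding the bearer token. The original dict is left untouched because the helper works on a deep copy.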
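Logging model_info set in config.yaml
Here is how to log the model_info set in your proxy config.yaml. See the documentation on setting model_info in config.yaml for details.
from litellm.integrations.custom_logger import CustomLogger

class MyCustomHandler(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("On Async Success!")
        # litellm_params may be missing, so fall back to an empty dict
        litellm_params = kwargs.get("litellm_params") or {}
        model_info = litellm_params.get("model_info")
        print(model_info)
Expected output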
{'mode': 'embedding', 'input_cost_per_token': 0.002}
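Once model_info is available in the callback, its fields can feed your own bookkeeping. As a small illustration, the sketch below multiplies input_cost_per_token by a prompt token count. The helper name is hypothetical, and this is only an example of reading the dict, not how LiteLLM computes response cost.

```python
from typing import Optional


def estimate_input_cost(model_info: Optional[dict], prompt_tokens: int) -> Optional[float]:
    """Estimate input cost from model_info, or None when no price is configured."""
    cost_per_token = (model_info or {}).get("input_cost_per_token")
    if cost_per_token is None:
        return None
    return cost_per_token * prompt_tokens
```

Returning None instead of 0 keeps "no price configured" distinguishable from a genuinely free model.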
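Logging responses from the proxy
Both /chat/completions and /embeddings responses are available as response_obj
Note: for /chat/completions, both stream=True and non-streaming responses are available as response_obj
from litellm.integrations.custom_logger import CustomLogger

class MyCustomHandler(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("On Async Success!")
        print(response_obj)
Expected output for /chat/completions [applies to both streaming and non-streaming responses]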
ModelResponse(
id='chatcmpl-8Tfu8GoMElwOZuj2JlHBhNHG01PPo',
choices=[
Choices(
finish_reason='stop',
index=0,
message=Message(
content='As an AI language model, I do not have a physical body and therefore do not possess any degree or educational qualifications. My knowledge and abilities come from the programming and algorithms that have been developed by my creators.',
role='assistant'
)
)
],
created=1702083284,
model='chatgpt-v-2',
object='chat.completion',
system_fingerprint=None,
usage=Usage(
completion_tokens=42,
prompt_tokens=5,
total_tokens=47
)
)
Expected output for /embeddings
{
'model': 'ada',
'data': [
{
'embedding': [
-0.035126980394124985, -0.020624293014407158, -0.015343423001468182,
-0.03980357199907303, -0.02750781551003456, 0.02111034281551838,
-0.022069307044148445, -0.019442008808255196, -0.00955679826438427,
-0.013143060728907585, 0.029583381488919258, -0.004725852981209755,
-0.015198921784758568, -0.014069183729588985, 0.00897879246622324,
0.01521205808967352,
# ... (truncated for brevity)
]
}
]
}
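Since one callback receives both response shapes, a small normalizing helper can keep the logging code uniform. This is a sketch under the assumption that chat responses expose choices and embedding responses expose data, as in the two outputs above. The helper names are hypothetical, and the field lookup accepts either dict-style or attribute-style objects.

```python
def _field(obj, name, default=None):
    """Read `name` from either a dict-style or an attribute-style object."""
    if isinstance(obj, dict):
        return obj.get(name, default)
    return getattr(obj, name, default)


def summarize_response(response_obj) -> dict:
    """Reduce a chat or embedding response to a few loggable fields."""
    usage = _field(response_obj, "usage")
    choices = _field(response_obj, "choices") or []
    data = _field(response_obj, "data") or []
    if choices:
        kind = "chat"
    elif data:
        kind = "embedding"
    else:
        kind = "unknown"
    return {
        "model": _field(response_obj, "model"),
        "kind": kind,
        "total_tokens": _field(usage, "total_tokens") if usage is not None else None,
        "num_items": len(choices) or len(data),
    }
```

Printing summarize_response(response_obj) instead of the full object keeps log lines short, which matters for embedding responses whose vectors can run to thousands of floats.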
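Custom Callback APIs [Async]
Send LiteLLM logs to a custom API endpoint
This is an Enterprise-only feature.
| Property | Details |
|---|---|
| Description | Log LLM input/output to a custom API endpoint |
| Logged payload | List[StandardLoggingPayload]: LiteLLM logs a list of StandardLoggingPayload objects to your endpoint |
Use this if you:
- Want to write custom callbacks in a non-Python programming language
- Want your callbacks to run on a different microservice
Usage
- Set success_callback: ["generic_api"] in the litellm config.yaml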
model_list:
  - model_name: openai/gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
litellm_settings:
  success_callback: ["generic_api"]
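- Set environment variables for the custom API endpoint
| Environment variable | Details | Required |
|---|---|---|
| GENERIC_LOGGER_ENDPOINT | The endpoint + route that callback logs are sent to | Yes |
| GENERIC_LOGGER_HEADERS | Headers to send to the custom API endpoint | No (optional) |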
GENERIC_LOGGER_ENDPOINT="https://webhook-test.com/30343bc33591bc5e6dc44217ceae3e0a"
# Optional: Set headers to be sent to the custom API endpoint
GENERIC_LOGGER_HEADERS="Authorization=Bearer <your-api-key>"
# if multiple headers, separate by commas
GENERIC_LOGGER_HEADERS="Authorization=Bearer <your-api-key>,X-Custom-Header=custom-header-value"
- Start the proxy
litellm --config /path/to/config.yaml
- Make a test request
curl -i --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-1234' \
--data '{
"model": "openai/gpt-4o",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}'
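To try the flow end to end without a third-party webhook, you can point GENERIC_LOGGER_ENDPOINT at a small local receiver. The sketch below uses only the Python standard library. It assumes the logs arrive as a JSON body containing a list of payload objects, as the table above describes, and that each payload carries id, model, and response_cost fields. Those field names, the port, and the function names are assumptions for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_batch(body: bytes) -> list:
    """Parse a POSTed JSON list of logging payloads into short summaries."""
    payloads = json.loads(body)
    if isinstance(payloads, dict):  # tolerate a single object as well as a list
        payloads = [payloads]
    return [
        {
            "id": p.get("id"),
            "model": p.get("model"),
            "response_cost": p.get("response_cost"),
        }
        for p in payloads
    ]


class LogReceiver(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            summaries = summarize_batch(self.rfile.read(length))
        except (json.JSONDecodeError, AttributeError):
            # Body was not JSON, or the items were not objects
            self.send_response(400)
            self.end_headers()
            return
        for summary in summaries:
            print(summary)
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the receiver until interrupted (port 8080 is an arbitrary choice)."""
    HTTPServer(("0.0.0.0", port), LogReceiver).serve_forever()
```

Run serve() in one terminal, set GENERIC_LOGGER_ENDPOINT="http://localhost:8080/" for the proxy, and each successful request should print one summary line per logged payload. Keeping the parsing in summarize_batch makes it easy to test without opening a socket.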
Langsmith
- Set success_callback: ["langsmith"] in the litellm config.yaml
If you are using a custom LangSmith instance, set the LANGSMITH_BASE_URL environment variable to point to it.
litellm_settings:
  success_callback: ["langsmith"]
environment_variables:
  LANGSMITH_API_KEY: "lsv2_pt_xxxxxxxx"
  LANGSMITH_PROJECT: "litellm-proxy"
  LANGSMITH_BASE_URL: "https://api.smith.langchain.com" # (Optional - only needed if you have a custom Langsmith instance)
- Start the proxy
litellm --config /path/to/config.yaml
- Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "fake-openai-endpoint",
"messages": [
{
"role": "user",
"content": "Hello, Claude gm!"
}
]
}'
Expect to see your logs on Langsmith
Arize AI
- Set callbacks: ["arize"] in the litellm config.yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
litellm_settings:
  callbacks: ["arize"]
environment_variables:
  ARIZE_SPACE_KEY: "d0*****"
  ARIZE_API_KEY: "141a****"
  ARIZE_ENDPOINT: "https://otlp.arize.com/v1" # OPTIONAL - your custom arize GRPC api endpoint
  ARIZE_HTTP_ENDPOINT: "https://otlp.arize.com/v1" # OPTIONAL - your custom arize HTTP api endpoint. Set either this or ARIZE_ENDPOINT
- Start the proxy
litellm --config /path/to/config.yaml
- Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "gpt-4",
"messages": [
{
"role": "user",
"content": "Hello, Claude gm!"
}
]
}'
Expect to see your logs on Arize
Langtrace
- Set callbacks: ["langtrace"] in the litellm config.yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/fake
      api_key: fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
litellm_settings:
  callbacks: ["langtrace"]
environment_variables:
  LANGTRACE_API_KEY: "141a****"
- Start the proxy
litellm --config /path/to/config.yaml
- Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "gpt-4",
"messages": [
{
"role": "user",
"content": "Hello, Claude gm!"
}
]
}'
Galileo
[BETA]
Log LLM input/output on www.rungalileo.io
Required environment variables
Galileo Cloud (app.galileo.ai)
export GALILEO_API_KEY=""
export GALILEO_PROJECT_ID=""
export GALILEO_LOG_STREAM_ID="" # optional
export GALILEO_BASE_URL="https://api.galileo.ai" # optional, defaults when GALILEO_API_KEY is set
Enterprise / self-hosted Observe
export GALILEO_BASE_URL="" # Replace 'console' with 'api' in your console URL (e.g. https://api.galileo.myenterprise.com)
export GALILEO_PROJECT_ID=""
export GALILEO_USERNAME=""
export GALILEO_PASSWORD=""
Quick Start
- Add to config.yaml
model_list:
  - litellm_params:
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
      api_key: my-fake-key
      model: openai/my-fake-model
    model_name: fake-openai-endpoint
environment_variables:
  GALILEO_API_KEY: "os.environ/GALILEO_API_KEY"
  GALILEO_PROJECT_ID: "your-project-id"
  GALILEO_LOG_STREAM_ID: "your-log-stream-id" # optional
litellm_settings:
  success_callback: ["galileo"] # 👈 KEY CHANGE
- Start the proxy
litellm --config /path/to/config.yaml
- Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "fake-openai-endpoint",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}'
🎉 That's it. Expect to see your logs on your Galileo dashboard
OpenMeter
Use OpenMeter to bill customers based on their LLM API usage
Required environment variables
# from https://openmeter.cloud
export OPENMETER_API_ENDPOINT="" # defaults to https://openmeter.cloud
export OPENMETER_API_KEY=""
Quick Start
- Add to config.yaml
model_list:
  - litellm_params:
      api_base: https://openai-function-calling-workers.tasslexyz.workers.dev/
      api_key: my-fake-key
      model: openai/my-fake-model
    model_name: fake-openai-endpoint
litellm_settings:
  success_callback: ["openmeter"] # 👈 KEY CHANGE
- Start the proxy
litellm --config /path/to/config.yaml
- Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "fake-openai-endpoint",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}'
DynamoDB
We will use --config to set
litellm.success_callback = ["dynamodb"]
litellm.dynamodb_table_name = "your-table-name"
This will log all successful LLM calls to DynamoDB
Step 1: Set AWS credentials in .env
AWS_ACCESS_KEY_ID = ""
AWS_SECRET_ACCESS_KEY = ""
AWS_REGION_NAME = ""
Step 2: Create a config.yaml file and set litellm_settings: success_callback
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
litellm_settings:
  success_callback: ["dynamodb"]
  dynamodb_table_name: your-table-name
Step 3: Start the proxy and make a test request
Start the proxy
litellm --config config.yaml --debug
Test request
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}'
Your logs should be available in DynamoDB
Data logged to DynamoDB for /chat/completions
{
"id": {
"S": "chatcmpl-8W15J4480a3fAQ1yQaMgtsKJAicen"
},
"call_type": {
"S": "acompletion"
},
"endTime": {
"S": "2023-12-15 17:25:58.424118"
},
"messages": {
"S": "[{'role': 'user', 'content': 'This is a test'}]"
},
"metadata": {
"S": "{}"
},
"model": {
"S": "gpt-3.5-turbo"
},
"modelParameters": {
"S": "{'temperature': 0.7, 'max_tokens': 100, 'user': 'ishaan-2'}"
},
"response": {
"S": "ModelResponse(id='chatcmpl-8W15J4480a3fAQ1yQaMgtsKJAicen', choices=[Choices(finish_reason='stop', index=0, message=Message(content='Great! What can I assist you with?', role='assistant'))], created=1702641357, model='gpt-3.5-turbo-0613', object='chat.completion', system_fingerprint=None, usage=Usage(completion_tokens=9, prompt_tokens=11, total_tokens=20))"
},
"startTime": {
"S": "2023-12-15 17:25:56.047035"
},
"usage": {
"S": "Usage(completion_tokens=9, prompt_tokens=11, total_tokens=20)"
},
"user": {
"S": "ishaan-2"
}
}
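Every attribute in the logged item is stored as a DynamoDB string (the "S" type), including the timestamps. The sketch below shows one way to flatten such an item and derive request latency from startTime and endTime. The helper names are hypothetical, and the timestamp format is inferred from the sample item above.

```python
from datetime import datetime

# Format inferred from the sample item, e.g. "2023-12-15 17:25:56.047035"
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def flatten_item(item: dict) -> dict:
    """Convert {"attr": {"S": "value"}} into {"attr": "value"}."""
    return {
        name: typed_value["S"]
        for name, typed_value in item.items()
        if "S" in typed_value
    }


def latency_seconds(item: dict) -> float:
    """Compute endTime - startTime in seconds for one logged call."""
    row = flatten_item(item)
    start = datetime.strptime(row["startTime"], TIMESTAMP_FORMAT)
    end = datetime.strptime(row["endTime"], TIMESTAMP_FORMAT)
    return (end - start).total_seconds()
```

Fields such as usage and response are stored as Python repr strings rather than JSON, so they need separate parsing if you want structured values from them.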
Data logged to DynamoDB for /embeddings
{
"id": {
"S": "4dec8d4d-4817-472d-9fc6-c7a6153eb2ca"
},
"call_type": {
"S": "aembedding"
},
"endTime": {
"S": "2023-12-15 17:25:59.890261"
},
"messages": {
"S": "['hi']"
},
"metadata": {
"S": "{}"
},
"model": {
"S": "text-embedding-ada-002"
},
"modelParameters": {
"S": "{'user': 'ishaan-2'}"
},
"response": {
"S": "EmbeddingResponse(model='text-embedding-ada-002-v2', data=[{'embedding': [-0.03503197431564331, -0.020601635798811913, -0.015375726856291294,
}
}
Sentry
If API calls fail (LLM or database), you can log the failures to Sentry
Step 1: Install Sentry
uv add --upgrade sentry-sdk
Step 2: Save your SENTRY_DSN and add litellm_settings: failure_callback
export SENTRY_DSN="your-sentry-dsn"
# Optional: Configure Sentry sampling rates
export SENTRY_API_SAMPLE_RATE="1.0" # Controls what percentage of errors are sent (default: 1.0 = 100%)
export SENTRY_API_TRACE_RATE="1.0" # Controls what percentage of transactions are sampled for performance monitoring (default: 1.0 = 100%)
export SENTRY_ENVIRONMENT="development" # Controls the Sentry Environment (default: production)
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
litellm_settings:
  # other settings
  failure_callback: ["sentry"]
general_settings:
  database_url: "my-bad-url" # set a fake url to trigger a sentry exception
Step 3: Start the proxy and make a test request
Start the proxy
litellm --config config.yaml --debug
Test request
litellm --test
Athina
Athina lets you log LLM input/output for monitoring, analytics, and observability.
We will use --config to set litellm.success_callback = ["athina"], which logs all successful LLM calls to Athina
Step 1: Set the Athina API key
ATHINA_API_KEY = "your-athina-api-key"
Step 2: Create a config.yaml file and set litellm_settings: success_callback
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
litellm_settings:
  success_callback: ["athina"]
Step 3: Start the proxy and make a test request
Start the proxy
litellm --config config.yaml --debug
Test request
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "which llm are you"
}
]
}'