Helicone - Open-Source LLM Observability Platform

Tip

This is community maintained. If you run into a bug, please open an issue: https://github.com/BerriAI/litellm

Helicone is an open-source observability platform that proxies your LLM requests and provides key insights into your usage, spend, latency, and more.

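Using Helicone with LiteLLM

LiteLLM provides the success_callback and failure_callback settings, which let you log data to Helicone based on the status of each response.

Supported LLM Providers

Helicone can log requests from a range of LLM providers, including: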

OpenAI
Azure
Anthropic
Gemini
Groq
Cohere
Replicate
and more

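Integration Methods

There are two main ways to integrate Helicone with LiteLLM:

Using callbacks
Using Helicone as a proxy

Let's look at each approach in detail.

Approach 1: Use Callbacks

A single line of code logs your responses across all providers to Helicone: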

litellm.success_callback = ["helicone"]

Complete code

import os
import litellm  # needed for litellm.success_callback below
from litellm import completion

## Set env variables
os.environ["HELICONE_API_KEY"] = "your-helicone-key"
os.environ["OPENAI_API_KEY"] = "your-openai-key"

# Set callbacks
litellm.success_callback = ["helicone"]

# OpenAI call
response = completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hi 👋 - I'm OpenAI"}],
)

print(response)
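
If the Helicone key is missing, the callback has nothing to authenticate with. One way to guard against that is to enable the callback only when the key is present. The sketch below is an illustration, not part of LiteLLM: `helicone_callbacks` is a hypothetical helper name, and it assumes the callback reads `HELICONE_API_KEY` from the environment, as in the example above.

```python
from typing import List, Mapping


def helicone_callbacks(env: Mapping[str, str]) -> List[str]:
    """Return ["helicone"] when HELICONE_API_KEY is set and non-empty, else []."""
    if env.get("HELICONE_API_KEY", "").strip():
        return ["helicone"]
    return []


# Intended use (assumes `import os` and `import litellm`):
#   litellm.success_callback = helicone_callbacks(os.environ)
callbacks = helicone_callbacks({"HELICONE_API_KEY": "your-helicone-key"})
print(callbacks)
```

With this guard, a missing key leaves the callback list empty instead of enabling a logger that cannot authenticate.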

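Approach 2: Use Helicone as a Proxy

Helicone's proxy provides advanced features such as caching, rate limiting, and LLM security through PromptArmor.

To use Helicone as a proxy for your LLM requests:

Set Helicone as your base URL via litellm.api_base
Pass the Helicone request headers via litellm.headers

Complete code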

import os
import litellm

litellm.api_base = "https://oai.hconeai.com/v1"
litellm.headers = {
    "Helicone-Auth": f"Bearer {os.getenv('HELICONE_API_KEY')}",  # Authenticate to send requests to Helicone API
}

response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "How does a court case get to the Supreme Court?"}]
)

print(response)
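
Note that `f"Bearer {os.getenv('HELICONE_API_KEY')}"` evaluates to the string "Bearer None" when the variable is unset, so a missing key only surfaces later as an authentication failure. A minimal sketch of a stricter alternative follows; `helicone_auth_headers` is a hypothetical helper, not a LiteLLM or Helicone function.

```python
import os
from typing import Dict, Mapping


def helicone_auth_headers(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build the Helicone-Auth header, failing early if the key is missing."""
    api_key = env.get("HELICONE_API_KEY")
    if not api_key:
        raise RuntimeError("HELICONE_API_KEY is not set")
    return {"Helicone-Auth": f"Bearer {api_key}"}


# Intended use (assumes `import litellm`):
#   litellm.headers = helicone_auth_headers()
print(helicone_auth_headers({"HELICONE_API_KEY": "your-helicone-key"}))
```

Raising at startup keeps the failure close to its cause rather than inside the first proxied request.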

Advanced Usage

You can attach custom metadata and properties to your requests using Helicone headers. Here are some examples:

litellm.metadata = {
    "Helicone-Auth": f"Bearer {os.getenv('HELICONE_API_KEY')}",  # Authenticate to send requests to Helicone API
    "Helicone-User-Id": "user-abc",  # Specify the user making the request
    "Helicone-Property-App": "web",  # Custom property to add additional information
    "Helicone-Property-Custom": "any-value",  # Add any custom property
    "Helicone-Prompt-Id": "prompt-supreme-court",  # Assign an ID to associate this prompt with future versions
    "Helicone-Cache-Enabled": "true",  # Enable caching of responses
    "Cache-Control": "max-age=3600",  # Set cache limit to 1 hour
    "Helicone-RateLimit-Policy": "10;w=60;s=user",  # Set rate limit policy
    "Helicone-Retry-Enabled": "true",  # Enable retry mechanism
    "helicone-retry-num": "3",  # Set number of retries
    "helicone-retry-factor": "2",  # Set exponential backoff factor
    "Helicone-Model-Override": "gpt-3.5-turbo-0613",  # Override the model used for cost calculation
    "Helicone-Session-Id": "session-abc-123",  # Set session ID for tracking
    "Helicone-Session-Path": "parent-trace/child-trace",  # Set session path for hierarchical tracking
    "Helicone-Omit-Response": "false",  # Include response in logging (default behavior)
    "Helicone-Omit-Request": "false",  # Include request in logging (default behavior)
    "Helicone-LLM-Security-Enabled": "true",  # Enable LLM security features
    "Helicone-Moderations-Enabled": "true",  # Enable content moderation
    "Helicone-Fallbacks": '["gpt-3.5-turbo", "gpt-4"]',  # Set fallback models
}
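
The custom properties above all follow the `Helicone-Property-<Name>` pattern, so they can be generated from a plain dictionary instead of being typed out by hand. This is a small sketch under that assumption; `property_headers` is a hypothetical helper name.

```python
from typing import Dict, Mapping


def property_headers(properties: Mapping[str, str]) -> Dict[str, str]:
    """Map {"App": "web"} to {"Helicone-Property-App": "web"}."""
    return {
        f"Helicone-Property-{name}": str(value)
        for name, value in properties.items()
    }


headers = property_headers({"App": "web", "Custom": "any-value"})
print(headers)
```

The result can be merged into a larger header dictionary with `{**other_headers, **headers}`.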

Caching and Rate Limiting

Enable caching and set up a rate-limit policy:

litellm.metadata = {
    "Helicone-Auth": f"Bearer {os.getenv('HELICONE_API_KEY')}",  # Authenticate to send requests to Helicone API
    "Helicone-Cache-Enabled": "true",  # Enable caching of responses
    "Cache-Control": "max-age=3600",  # Set cache limit to 1 hour
    "Helicone-RateLimit-Policy": "100;w=3600;s=user",  # Set rate limit policy
}
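
The policy strings in these examples appear to follow the shape `quota;w=window;s=segment`. Reading `w` as a window in seconds and `s` as the segment the quota applies to is an interpretation of the examples, so check it against the Helicone documentation. Under that assumption, a hypothetical helper such as `rate_limit_policy` can build the string and reject obviously invalid values.

```python
from typing import Optional


def rate_limit_policy(
    quota: int, window_seconds: int, segment: Optional[str] = None
) -> str:
    """Format a policy string like "100;w=3600;s=user"."""
    if quota <= 0 or window_seconds <= 0:
        raise ValueError("quota and window_seconds must be positive")
    policy = f"{quota};w={window_seconds}"
    if segment:
        policy += f";s={segment}"
    return policy


print(rate_limit_policy(100, 3600, "user"))  # prints 100;w=3600;s=user
```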

Session Tracking and Tracing

Track multi-step and agentic LLM interactions using session IDs and paths:

litellm.metadata = {
    "Helicone-Auth": f"Bearer {os.getenv('HELICONE_API_KEY')}",  # Authenticate to send requests to Helicone API
    "Helicone-Session-Id": "session-abc-123",  # The session ID you want to track
    "Helicone-Session-Path": "parent-trace/child-trace",  # The path of the session
}

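Helicone-Session-Id: The unique identifier of the session you want to track. It lets you group related requests together.
Helicone-Session-Path: Defines the path of the session, which lets you represent parent and child traces. For example, "parent/child" represents a child trace of a parent trace.

By using these two headers, you can group and visualize multi-step LLM interactions and gain insight into complex AI workflows.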
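
In a multi-step workflow, every request shares one session ID while the path grows by one segment per nesting level. The sketch below shows one way to derive both headers for each step; `session_headers` is a hypothetical helper, and joining segments with "/" follows the "parent/child" convention described above.

```python
from typing import Dict


def session_headers(session_id: str, *path_parts: str) -> Dict[str, str]:
    """Build session headers; each extra path part is one level of nesting."""
    if not path_parts:
        raise ValueError("at least one path part is required")
    return {
        "Helicone-Session-Id": session_id,
        "Helicone-Session-Path": "/".join(path_parts),
    }


parent = session_headers("session-abc-123", "parent-trace")
child = session_headers("session-abc-123", "parent-trace", "child-trace")
print(parent["Helicone-Session-Path"])  # prints parent-trace
print(child["Helicone-Session-Path"])   # prints parent-trace/child-trace
```

Both dictionaries carry the same session ID, which is what groups the parent and child requests together.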

Retry and Fallback Mechanisms

Set up retry mechanisms and fallback options:

litellm.metadata = {
    "Helicone-Auth": f"Bearer {os.getenv('HELICONE_API_KEY')}",  # Authenticate to send requests to Helicone API
    "Helicone-Retry-Enabled": "true",  # Enable retry mechanism
    "helicone-retry-num": "3",  # Set number of retries
    "helicone-retry-factor": "2",  # Set exponential backoff factor
    "Helicone-Fallbacks": '["gpt-3.5-turbo", "gpt-4"]',  # Set fallback models
}
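
Header values are strings, and the fallback list in this example is a JSON array encoded as a string. Generating these values from typed arguments avoids quoting mistakes. This is a minimal sketch: `retry_headers` is a hypothetical helper, and the header names are taken from the example above.

```python
import json
from typing import Dict, List


def retry_headers(num_retries: int, factor: int, fallbacks: List[str]) -> Dict[str, str]:
    """Build retry and fallback headers with string-encoded values."""
    if num_retries < 0 or factor <= 0:
        raise ValueError("num_retries must be >= 0 and factor must be > 0")
    return {
        "Helicone-Retry-Enabled": "true",
        "helicone-retry-num": str(num_retries),
        "helicone-retry-factor": str(factor),
        # json.dumps produces the JSON array string used in the example above
        "Helicone-Fallbacks": json.dumps(fallbacks),
    }


headers = retry_headers(3, 2, ["gpt-3.5-turbo", "gpt-4"])
print(headers["Helicone-Fallbacks"])  # prints ["gpt-3.5-turbo", "gpt-4"]
```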

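Supported Headers - For a full list of supported Helicone headers and their descriptions, see the Helicone documentation. By using these headers and metadata options, you can gain deeper insight into your LLM usage, optimize performance, and better manage your AI workflows with Helicone and LiteLLM.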
