# Custom Guardrails

Use this if you want to write code to run a custom guardrail.
## Quick Start

### 1. Write a `CustomGuardrail` class

A `CustomGuardrail` has 4 methods for enforcing guardrail rules:

- `async_pre_call_hook` - (Optional) modify the input or reject the request before the LLM API call
- `async_moderation_hook` - (Optional) reject the request; runs in parallel with the LLM API call (helps lower latency)
- `async_post_call_success_hook` - (Optional) apply guardrail rules to the input/output; runs after a successful LLM API call
- `async_post_call_streaming_iterator_hook` - (Optional) pass the entire stream to the guardrail

**Example `CustomGuardrail` class**

Create a new file called `custom_guardrail.py` and add this code to it:
```python
from typing import Any, AsyncGenerator, Dict, List, Literal, Optional, Union

import litellm
from litellm._logging import verbose_proxy_logger
from litellm.caching.caching import DualCache
from litellm.integrations.custom_guardrail import CustomGuardrail
from litellm.proxy._types import UserAPIKeyAuth
from litellm.proxy.guardrails.guardrail_helpers import should_proceed_based_on_metadata
from litellm.types.guardrails import GuardrailEventHooks
from litellm.types.utils import ModelResponseStream


class myCustomGuardrail(CustomGuardrail):
    def __init__(
        self,
        **kwargs,
    ):
        # store kwargs as optional_params
        self.optional_params = kwargs

        super().__init__(**kwargs)

    async def async_pre_call_hook(
        self,
        user_api_key_dict: UserAPIKeyAuth,
        cache: DualCache,
        data: dict,
        call_type: Literal[
            "completion",
            "text_completion",
            "embeddings",
            "image_generation",
            "moderation",
            "audio_transcription",
            "pass_through_endpoint",
            "rerank",
        ],
    ) -> Optional[Union[Exception, str, dict]]:
        """
        Runs before the LLM API call
        Runs on only Input
        Use this if you want to MODIFY the input
        """

        # In this guardrail, if a user inputs `litellm` we will mask it and then send it to the LLM
        _messages = data.get("messages")
        if _messages:
            for message in _messages:
                _content = message.get("content")
                if isinstance(_content, str):
                    if "litellm" in _content.lower():
                        _content = _content.replace("litellm", "********")
                        message["content"] = _content

        verbose_proxy_logger.debug(
            "async_pre_call_hook: Message after masking %s", _messages
        )

        return data

    async def async_moderation_hook(
        self,
        data: dict,
        user_api_key_dict: UserAPIKeyAuth,
        call_type: Literal[
            "completion",
            "embeddings",
            "image_generation",
            "moderation",
            "audio_transcription",
        ],
    ):
        """
        Runs in parallel to the LLM API call
        Runs on only Input

        This can NOT modify the input; it is only used to reject or accept a call before it goes to the LLM API
        """

        # this works the same as async_pre_call_hook, but just runs in parallel with the LLM API call
        # In this guardrail, if a user inputs `litellm` we will reject the request
        _messages = data.get("messages")
        if _messages:
            for message in _messages:
                _content = message.get("content")
                if isinstance(_content, str):
                    if "litellm" in _content.lower():
                        raise ValueError("Guardrail failed words - `litellm` detected")

    async def async_post_call_success_hook(
        self,
        data: dict,
        user_api_key_dict: UserAPIKeyAuth,
        response,
    ):
        """
        Runs on the response from the LLM API call

        It can be used to reject a response

        If a response contains the word "coffee" -> we will raise an exception
        """
        verbose_proxy_logger.debug("async_post_call_success_hook response: %s", response)

        if isinstance(response, litellm.ModelResponse):
            for choice in response.choices:
                if isinstance(choice, litellm.Choices):
                    verbose_proxy_logger.debug("async_post_call_success_hook choice: %s", choice)
                    if (
                        choice.message.content
                        and isinstance(choice.message.content, str)
                        and "coffee" in choice.message.content
                    ):
                        raise ValueError("Guardrail failed Coffee Detected")

    async def async_post_call_streaming_iterator_hook(
        self,
        user_api_key_dict: UserAPIKeyAuth,
        response: Any,
        request_data: dict,
    ) -> AsyncGenerator[ModelResponseStream, None]:
        """
        Passes the entire stream to the guardrail

        This is useful for guardrails that need to see the entire response, such as PII masking.

        See the Aim guardrail implementation for an example - https://github.com/BerriAI/litellm/blob/d0e022cfacb8e9ebc5409bb652059b6fd97b45c0/litellm/proxy/guardrails/guardrail_hooks/aim.py#L168

        Triggered by mode: 'post_call'
        """
        async for item in response:
            yield item
```
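The masking step inside `async_pre_call_hook` can be exercised on its own. Below is a minimal sketch of that logic extracted into a pure function (the name `mask_messages` is ours, not part of the litellm API), so you can unit-test the behavior without a running proxy:

```python
# Standalone sketch of the masking logic from async_pre_call_hook above.
# `mask_messages` is an illustrative helper, not part of litellm.
def mask_messages(messages, banned_word="litellm", mask="********"):
    """Replace every occurrence of `banned_word` in string message contents with `mask`."""
    for message in messages:
        content = message.get("content")
        if isinstance(content, str) and banned_word in content.lower():
            message["content"] = content.replace(banned_word, mask)
    return messages


messages = [{"role": "user", "content": "say the word - `litellm`"}]
masked = mask_messages(messages)
print(masked[0]["content"])  # → say the word - `********`
```

Note that, like the hook above, the check is case-insensitive but the replacement is case-sensitive.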
### 2. Pass your custom guardrail class in LiteLLM's `config.yaml`

In the config below, we point the guardrail at our custom guardrail by setting `guardrail: custom_guardrail.myCustomGuardrail`:

- Python filename: `custom_guardrail.py`
- Guardrail class name: `myCustomGuardrail` (defined in step 1)
```yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

guardrails:
  - guardrail_name: "custom-pre-guard"
    litellm_params:
      guardrail: custom_guardrail.myCustomGuardrail  # 👈 Key change
      mode: "pre_call"     # runs async_pre_call_hook
  - guardrail_name: "custom-during-guard"
    litellm_params:
      guardrail: custom_guardrail.myCustomGuardrail
      mode: "during_call"  # runs async_moderation_hook
  - guardrail_name: "custom-post-guard"
    litellm_params:
      guardrail: custom_guardrail.myCustomGuardrail
      mode: "post_call"    # runs async_post_call_success_hook
```
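Under the hood, a `guardrail: module.ClassName` string like the one above has to be resolved to a Python class. The sketch below shows how such dotted strings are typically loaded; it is illustrative only, not litellm's actual loader:

```python
import importlib


def load_class(dotted_path: str):
    """Resolve a 'module.ClassName' string to the class object it names."""
    module_name, class_name = dotted_path.rsplit(".", 1)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# "custom_guardrail.myCustomGuardrail" would import custom_guardrail.py and
# return the myCustomGuardrail class. Demonstrated here with a stdlib class:
cls = load_class("collections.OrderedDict")
print(cls.__name__)  # → OrderedDict
```

This is why the file must be importable from the proxy's working directory (see the Docker volume mount in step 3).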
### 3. Start the LiteLLM Gateway

**Docker run**

Mount your `custom_guardrail.py` file into the LiteLLM Docker container. This mounts the `custom_guardrail.py` file from your local directory into the container's `/app` directory, making it available to the LiteLLM Gateway.

```shell
docker run -d \
  -p 4000:4000 \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  --name my-app \
  -v $(pwd)/my_config.yaml:/app/config.yaml \
  -v $(pwd)/custom_guardrail.py:/app/custom_guardrail.py \
  my-app:latest \
  --config /app/config.yaml \
  --port 4000 \
  --detailed_debug
```

**litellm pip**

```shell
litellm --config config.yaml --detailed_debug
```
### 4. Test it

**Test `"custom-pre-guard"`**

*Modify input*

Expect this to mask the word `litellm` before the request is sent to the LLM API. This runs `async_pre_call_hook`.
```shell
curl -i -X POST http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {
        "role": "user",
        "content": "say the word - `litellm`"
      }
    ],
    "guardrails": ["custom-pre-guard"]
  }'
```
Expected response after the pre-call guardrail:

```json
{
  "id": "chatcmpl-9zREDkBIG20RJB4pMlyutmi1hXQWc",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "It looks like you've chosen a string of asterisks. This could be a way to censor or hide certain text. However, without more context, I can't provide a specific word or phrase. If there's something specific you'd like me to say or if you need help with a topic, feel free to let me know!",
        "role": "assistant",
        "tool_calls": null,
        "function_call": null
      }
    }
  ],
  "created": 1724429701,
  "model": "gpt-4o-2024-05-13",
  "object": "chat.completion",
  "system_fingerprint": "fp_3aa7262c27",
  "usage": {
    "completion_tokens": 65,
    "prompt_tokens": 14,
    "total_tokens": 79
  },
  "service_tier": null
}
```
*Successful call*

```shell
curl -i http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-npnwjPQciVRok5yNZgKmFQ" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "user", "content": "hi what is the weather"}
    ],
    "guardrails": ["custom-pre-guard"]
  }'
```
**Test `"custom-during-guard"`**

*Unsuccessful call*

Expect this to fail, since `litellm` is in the message content. This runs `async_moderation_hook`.
```shell
curl -i -X POST http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {
        "role": "user",
        "content": "say the word - `litellm`"
      }
    ],
    "guardrails": ["custom-during-guard"]
  }'
```
Expected response after the during-call guardrail:

```json
{
  "error": {
    "message": "Guardrail failed words - `litellm` detected",
    "type": "None",
    "param": "None",
    "code": "500"
  }
}
```
*Successful call*

```shell
curl -i http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-npnwjPQciVRok5yNZgKmFQ" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "user", "content": "hi what is the weather"}
    ],
    "guardrails": ["custom-during-guard"]
  }'
```
**Test `"custom-post-guard"`**

*Unsuccessful call*

Expect this to fail, since `coffee` will be in the response content. This runs `async_post_call_success_hook`.
```shell
curl -i -X POST http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {
        "role": "user",
        "content": "what is coffee"
      }
    ],
    "guardrails": ["custom-post-guard"]
  }'
```
Expected response after the post-call guardrail:

```json
{
  "error": {
    "message": "Guardrail failed Coffee Detected",
    "type": "None",
    "param": "None",
    "code": "500"
  }
}
```
*Successful call*

```shell
curl -i -X POST http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {
        "role": "user",
        "content": "what is tea"
      }
    ],
    "guardrails": ["custom-post-guard"]
  }'
```
## ✨ Pass additional parameters to the guardrail

✨ This is an Enterprise-only feature. Contact us for a free trial.

Use this to pass additional parameters to the guardrail API call, e.g. a success threshold.

**1. Use `get_guardrail_dynamic_request_body_params`**

`get_guardrail_dynamic_request_body_params` is a method of the `litellm.integrations.custom_guardrail.CustomGuardrail` class that fetches the dynamic guardrail parameters passed in the request body.
```python
from typing import Any, Dict, List, Literal, Optional, Union

import litellm
from litellm._logging import verbose_proxy_logger
from litellm.caching.caching import DualCache
from litellm.integrations.custom_guardrail import CustomGuardrail
from litellm.proxy._types import UserAPIKeyAuth


class myCustomGuardrail(CustomGuardrail):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    async def async_pre_call_hook(
        self,
        user_api_key_dict: UserAPIKeyAuth,
        cache: DualCache,
        data: dict,
        call_type: Literal[
            "completion",
            "text_completion",
            "embeddings",
            "image_generation",
            "moderation",
            "audio_transcription",
            "pass_through_endpoint",
            "rerank",
        ],
    ) -> Optional[Union[Exception, str, dict]]:
        # Get dynamic params from request body
        params = self.get_guardrail_dynamic_request_body_params(request_data=data)
        # params will contain: {"success_threshold": 0.9}
        verbose_proxy_logger.debug("Guardrail params: %s", params)
        return data
```
**2. Pass the parameters in your API request**

The LiteLLM Proxy allows you to pass `guardrails` in the request body, following the `guardrails` spec.

**OpenAI Python**
```python
import openai

client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a short poem"}],
    extra_body={
        "guardrails": [
            {
                "custom-pre-guard": {
                    "extra_body": {
                        "success_threshold": 0.9
                    }
                }
            }
        ]
    }
)
```
**Curl**

```shell
curl 'http://0.0.0.0:4000/chat/completions' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "user",
        "content": "Write a short poem"
      }
    ],
    "guardrails": [
      {
        "custom-pre-guard": {
          "extra_body": {
            "success_threshold": 0.9
          }
        }
      }
    ]
  }'
```
The `get_guardrail_dynamic_request_body_params` method will return:

```json
{
  "success_threshold": 0.9
}
```
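The behavior of `get_guardrail_dynamic_request_body_params` can be approximated as follows. This is a sketch under the assumption that `guardrails` entries are either plain names or `{name: {"extra_body": {...}}}` dicts, as in the request above; it is not litellm's actual implementation:

```python
def get_dynamic_params(request_data: dict, guardrail_name: str) -> dict:
    """Return the extra_body params passed for `guardrail_name`, or {} if none.

    Illustrative helper only; the real method lives on CustomGuardrail.
    """
    for entry in request_data.get("guardrails", []):
        # entries may be plain strings ("custom-pre-guard") or dicts
        if isinstance(entry, dict) and guardrail_name in entry:
            return entry[guardrail_name].get("extra_body", {})
    return {}


request = {
    "model": "gpt-3.5-turbo",
    "guardrails": [{"custom-pre-guard": {"extra_body": {"success_threshold": 0.9}}}],
}
print(get_dynamic_params(request, "custom-pre-guard"))  # → {'success_threshold': 0.9}
```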
## CustomGuardrail methods

| Component | Description | Optional | Checked Data | Can Modify Input | Can Modify Output | Can Fail Call |
|---|---|---|---|---|---|---|
| `async_pre_call_hook` | A hook that runs before the LLM API call | ✅ | INPUT | ✅ | ❌ | ✅ |
| `async_moderation_hook` | A hook that runs during the LLM API call | ✅ | INPUT | ❌ | ❌ | ✅ |
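To summarize the config-side wiring, each `mode` value from step 2 triggers one of the hooks above. The mapping below simply restates the comments in the step 2 config (the dict name is ours, for illustration):

```python
# Which hook each config "mode" triggers, per the comments in step 2.
MODE_TO_HOOK = {
    "pre_call": "async_pre_call_hook",          # before the LLM API call
    "during_call": "async_moderation_hook",     # in parallel with the LLM API call
    "post_call": "async_post_call_success_hook",  # after a successful LLM API call
}

for mode, hook in MODE_TO_HOOK.items():
    print(f"{mode} -> {hook}")
```

Note that `async_post_call_streaming_iterator_hook` is also triggered by `mode: "post_call"`, for streaming responses.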