跳至主要内容

Langchain、OpenAI SDK、LlamaIndex、Instructor、Curl 示例

LiteLLM Proxy 与 OpenAI 兼容,并支持

LiteLLM Proxy 与 Azure OpenAI 兼容

  • /chat/completions
  • /completions
  • /embeddings

LiteLLM Proxy 与 Anthropic 兼容

  • /messages

LiteLLM Proxy 与 Vertex AI 兼容

本文档涵盖

  • /chat/completion
  • /embedding

这些是 精选示例。LiteLLM Proxy 与 OpenAI 兼容,适用于所有调用 OpenAI 的项目。只需更改 base_urlapi_keymodel 即可。

要传递特定提供商的参数,请点击此处

要删除不支持的参数(例如,对于 librechat 的 bedrock 的 frequency_penalty),请点击此处

信息

输入、输出和异常映射到所有支持模型的 OpenAI 格式

如何将请求发送到代理,传递元数据,允许用户传递他们的 OpenAI API 密钥

/chat/completions

请求格式

extra_body={"metadata": { }} 设置为要传递的 metadata

import openai
client = openai.OpenAI(
api_key="anything",
base_url="http://0.0.0.0:4000"
)

# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(
model="gpt-3.5-turbo",
messages = [
{
"role": "user",
"content": "this is a test request, write a short poem"
}
],
extra_body={ # pass in any provider-specific param, if not supported by openai, https://docs.litellm.com.cn/docs/completion/input#provider-specific-params
"metadata": { # 👈 use for logging additional params (e.g. to langfuse)
"generation_name": "ishaan-generation-openai-client",
"generation_id": "openai-client-gen-id22",
"trace_id": "openai-client-trace-id22",
"trace_user_id": "openai-client-user-id2"
}
}
)

print(response)

使用标签进行分类和跟踪

标签允许您对 LLM 请求进行分类、过滤和跟踪。将标签添加到您的元数据中,以获得更好的组织和分析。

import openai
client = openai.OpenAI(
api_key="anything",
base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
model="gpt-3.5-turbo",
messages=[{"role": "user", "content": "Hello!"}],
extra_body={
"metadata": {
"tags": ["production", "customer-support", "urgent"],
"generation_name": "support-bot",
"trace_user_id": "user-123"
}
}
)

标签优势

  • 成本跟踪:按项目/团队/功能监控支出
  • 分析:在日志和仪表板中按标签过滤请求
  • 路由:使用标签进行条件模型路由
  • 调试:使用分类的请求更容易进行故障排除

响应格式

{
"id": "chatcmpl-8c5qbGTILZa1S4CK3b31yj5N40hFN",
"choices": [
{
"finish_reason": "stop",
"index": 0,
"message": {
"content": "As an AI language model, I do not have a physical form or personal preferences. However, I am programmed to assist with various topics and provide information on a wide range of subjects. Is there something specific you would like assistance with?",
"role": "assistant"
}
}
],
"created": 1704089632,
"model": "gpt-35-turbo",
"object": "chat.completion",
"system_fingerprint": null,
"usage": {
"completion_tokens": 47,
"prompt_tokens": 12,
"total_tokens": 59
},
"_response_ms": 1753.426
}

流式传输

curl http://0.0.0.0:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPTIONAL_YOUR_PROXY_KEY" \
-d '{
"model": "gpt-4-turbo",
"messages": [
{
"role": "user",
"content": "this is a test request, write a short poem"
}
],
"stream": true
}'

函数调用

以下是一些使用代理进行函数调用的示例。

您可以将代理用于与 任何openai 兼容项目的函数调用。

curl http://0.0.0.0:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPTIONAL_YOUR_PROXY_KEY" \
-d '{
"model": "gpt-4-turbo",
"messages": [
{
"role": "user",
"content": "What'\''s the weather like in Boston today?"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"]
}
},
"required": ["location"]
}
}
}
],
"tool_choice": "auto"
}'

/embeddings

请求格式

输入、输出和异常映射到所有支持模型的 OpenAI 格式

import openai
from openai import OpenAI

# set base_url to your proxy server
# set api_key to send to proxy server
client = OpenAI(api_key="<proxy-api-key>", base_url="http://0.0.0.0:4000")

response = client.embeddings.create(
input=["hello from litellm"],
model="text-embedding-ada-002"
)

print(response)

响应格式

{
"object": "list",
"data": [
{
"object": "embedding",
"embedding": [
0.0023064255,
-0.009327292,
....
-0.0028842222,
],
"index": 0
}
],
"model": "text-embedding-ada-002",
"usage": {
"prompt_tokens": 8,
"total_tokens": 8
}
}

/moderations

请求格式

输入、输出和异常映射到所有支持模型的 OpenAI 格式

import openai
from openai import OpenAI

# set base_url to your proxy server
# set api_key to send to proxy server
client = OpenAI(api_key="<proxy-api-key>", base_url="http://0.0.0.0:4000")

response = client.moderations.create(
input="hello from litellm",
model="text-moderation-stable"
)

print(response)

响应格式

{
"id": "modr-8sFEN22QCziALOfWTa77TodNLgHwA",
"model": "text-moderation-007",
"results": [
{
"categories": {
"harassment": false,
"harassment/threatening": false,
"hate": false,
"hate/threatening": false,
"self-harm": false,
"self-harm/instructions": false,
"self-harm/intent": false,
"sexual": false,
"sexual/minors": false,
"violence": false,
"violence/graphic": false
},
"category_scores": {
"harassment": 0.000019947197870351374,
"harassment/threatening": 5.5971017900446896e-6,
"hate": 0.000028560316422954202,
"hate/threatening": 2.2631787999216613e-8,
"self-harm": 2.9121162015144364e-7,
"self-harm/instructions": 9.314219084899378e-8,
"self-harm/intent": 8.093739012338119e-8,
"sexual": 0.00004414955765241757,
"sexual/minors": 0.0000156943697220413,
"violence": 0.00022354527027346194,
"violence/graphic": 8.804164281173144e-6
},
"flagged": false
}
]
}

与 OpenAI 兼容的项目一起使用

base_url 设置为 LiteLLM Proxy 服务器

import openai
client = openai.OpenAI(
api_key="anything",
base_url="http://0.0.0.0:4000"
)

# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(model="gpt-3.5-turbo", messages = [
{
"role": "user",
"content": "this is a test request, write a short poem"
}
])

print(response)

与 Vertex、Boto3、Anthropic SDK(本机格式)一起使用

👉 了解如何使用 litellm 代理与 Vertex、boto3、Anthropic SDK - 以本机格式

高级

(BETA) 批量完成 - 传递多个模型

当您想将 1 个请求发送到 N 个模型时使用此功能

预期的请求格式

将模型作为逗号分隔的值字符串传递。例如 "model"="llama3,gpt-3.5-turbo"

相同的请求将被发送到 litellm proxy config.yaml 上的以下模型组

  • model_name="llama3"
  • model_name="gpt-3.5-turbo"
import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.chat.completions.create(
model="gpt-3.5-turbo,llama3",
messages=[
{"role": "user", "content": "this is a test request, write a short poem"}
],
)

print(response)

预期的响应格式

model 作为列表传递时,获取响应列表

[
ChatCompletion(
id='chatcmpl-9NoYhS2G0fswot0b6QpoQgmRQMaIf',
choices=[
Choice(
finish_reason='stop',
index=0,
logprobs=None,
message=ChatCompletionMessage(
content='In the depths of my soul, a spark ignites\nA light that shines so pure and bright\nIt dances and leaps, refusing to die\nA flame of hope that reaches the sky\n\nIt warms my heart and fills me with bliss\nA reminder that in darkness, there is light to kiss\nSo I hold onto this fire, this guiding light\nAnd let it lead me through the darkest night.',
role='assistant',
function_call=None,
tool_calls=None
)
)
],
created=1715462919,
model='gpt-3.5-turbo-0125',
object='chat.completion',
system_fingerprint=None,
usage=CompletionUsage(
completion_tokens=83,
prompt_tokens=17,
total_tokens=100
)
),
ChatCompletion(
id='chatcmpl-4ac3e982-da4e-486d-bddb-ed1d5cb9c03c',
choices=[
Choice(
finish_reason='stop',
index=0,
logprobs=None,
message=ChatCompletionMessage(
content="A test request, and I'm delighted!\nHere's a short poem, just for you:\n\nMoonbeams dance upon the sea,\nA path of light, for you to see.\nThe stars up high, a twinkling show,\nA night of wonder, for all to know.\n\nThe world is quiet, save the night,\nA peaceful hush, a gentle light.\nThe world is full, of beauty rare,\nA treasure trove, beyond compare.\n\nI hope you enjoyed this little test,\nA poem born, of whimsy and jest.\nLet me know, if there's anything else!",
role='assistant',
function_call=None,
tool_calls=None
)
)
],
created=1715462919,
model='groq/llama3-8b-8192',
object='chat.completion',
system_fingerprint='fp_a2c8d063cb',
usage=CompletionUsage(
completion_tokens=120,
prompt_tokens=20,
total_tokens=140
)
)
]