
Langchain, OpenAI SDK, LlamaIndex, Instructor, Curl examples

LiteLLM Proxy is OpenAI-compatible.

LiteLLM Proxy is Azure OpenAI-compatible:

  • /chat/completions
  • /completions
  • /embeddings

LiteLLM Proxy is Anthropic-compatible:

  • /messages

LiteLLM Proxy is Vertex AI-compatible.

This doc covers:

  • /chat/completions
  • /embeddings

These are selected examples. Because LiteLLM Proxy is OpenAI-compatible, it works with any project that calls OpenAI. Just change the base_url, api_key, and model.
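To make the three settings concrete, the sketch below builds (but does not send) a raw chat request using only the Python standard library. The helper name build_chat_request is hypothetical, and the URL path is an assumption based on what the OpenAI SDK appends to base_url.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat request aimed at the proxy (hypothetical helper)."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Only these three values differ from a request sent directly to OpenAI.
req = build_chat_request("http://0.0.0.0:4000", "anything", "gpt-3.5-turbo", "hello")
print(req.full_url)
# Sending it requires a running proxy: urllib.request.urlopen(req)
```

Any client that lets you override the base URL, API key, and model name should be configurable in the same way.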

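To pass provider-specific parameters, see the provider-specific params documentation.

To drop unsupported parameters (for example, frequency_penalty for Bedrock when used with LibreChat), see the documentation on dropping unsupported params.

Info

Input, output, and exceptions are mapped to the OpenAI format for all supported models.

This page shows how to send requests to the proxy, pass metadata, and allow users to pass in their own OpenAI API key.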

/chat/completions

Request format

Set extra_body={"metadata": { }} to the metadata you want to pass.

import openai

client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000"
)

# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ],
    extra_body={  # pass in any provider-specific param, if not supported by openai, https://docs.litellm.com.cn/docs/completion/input#provider-specific-params
        "metadata": {  # 👈 use for logging additional params (e.g. to langfuse)
            "generation_name": "ishaan-generation-openai-client",
            "generation_id": "openai-client-gen-id22",
            "trace_id": "openai-client-trace-id22",
            "trace_user_id": "openai-client-user-id2"
        }
    }
)

print(response)
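For clients other than the OpenAI Python SDK, it may help to know that the SDK merges extra_body keys into the top level of the JSON request body, so metadata travels as an ordinary top-level field. The sketch below reproduces that merge with plain dictionaries; merge_extra_body is a hypothetical helper, not part of any library.

```python
import json


def merge_extra_body(params: dict, extra_body: dict) -> dict:
    """Return the JSON body with extra_body keys added at the top level."""
    body = dict(params)      # copy so the caller's dict is not mutated
    body.update(extra_body)  # extra keys sit next to "model" and "messages"
    return body


body = merge_extra_body(
    {
        "model": "gpt-3.5-turbo",
        "messages": [
            {"role": "user", "content": "this is a test request, write a short poem"}
        ],
    },
    {
        "metadata": {
            "generation_name": "ishaan-generation-openai-client",
            "trace_id": "openai-client-trace-id22",
        }
    },
)
print(json.dumps(body, indent=2))
```

A curl or other HTTP client would therefore send "metadata" as a sibling of "model" and "messages".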

Response format

{
  "id": "chatcmpl-8c5qbGTILZa1S4CK3b31yj5N40hFN",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "As an AI language model, I do not have a physical form or personal preferences. However, I am programmed to assist with various topics and provide information on a wide range of subjects. Is there something specific you would like assistance with?",
        "role": "assistant"
      }
    }
  ],
  "created": 1704089632,
  "model": "gpt-35-turbo",
  "object": "chat.completion",
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 47,
    "prompt_tokens": 12,
    "total_tokens": 59
  },
  "_response_ms": 1753.426
}
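As a minimal sketch of consuming this shape, the snippet below pulls the reply text, finish reason, and token usage out of a response dictionary. The function name summarize_chat_response is hypothetical, the sample content is shortened, and _response_ms is read defensively on the assumption that it may be absent.

```python
def summarize_chat_response(resp: dict) -> dict:
    """Extract the commonly used fields from an OpenAI-format chat response."""
    choice = resp["choices"][0]
    usage = resp.get("usage") or {}
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": usage.get("total_tokens"),
        "latency_ms": resp.get("_response_ms"),  # may be missing
    }


# Shortened copy of the sample response shown above.
sample_response = {
    "id": "chatcmpl-8c5qbGTILZa1S4CK3b31yj5N40hFN",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {"content": "As an AI language model, ...", "role": "assistant"},
        }
    ],
    "model": "gpt-35-turbo",
    "usage": {"completion_tokens": 47, "prompt_tokens": 12, "total_tokens": 59},
    "_response_ms": 1753.426,
}

summary = summarize_chat_response(sample_response)
print(summary)
```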

Streaming

curl http://0.0.0.0:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPTIONAL_YOUR_PROXY_KEY" \
  -d '{
    "model": "gpt-4-turbo",
    "messages": [
      {
        "role": "user",
        "content": "this is a test request, write a short poem"
      }
    ],
    "stream": true
  }'
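With "stream": true, an OpenAI-format endpoint returns server-sent events: each event is a line starting with data: that carries a JSON chunk, and the stream ends with data: [DONE]. The sketch below reassembles the text from such lines; collect_stream_text is a hypothetical helper and the sample chunks are made up for illustration.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Concatenate the delta.content pieces from OpenAI-style SSE lines."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            parts.append(delta.get("content") or "")
    return "".join(parts)


# Illustrative chunks only; real chunks carry more fields (id, model, ...).
sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Roses"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": " are red"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]}',
    "",
    "data: [DONE]",
]
stream_text = collect_stream_text(sample_lines)
print(stream_text)
```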

Function calling

Here are some examples of function calling with the proxy.

You can use the proxy for function calling with any OpenAI-compatible project.

curl http://0.0.0.0:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPTIONAL_YOUR_PROXY_KEY" \
  -d '{
    "model": "gpt-4-turbo",
    "messages": [
      {
        "role": "user",
        "content": "What'\''s the weather like in Boston today?"
      }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_current_weather",
          "description": "Get the current weather in a given location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
              },
              "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"]
              }
            },
            "required": ["location"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'
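When the model decides to call the tool, the assistant message in an OpenAI-format response carries a tool_calls list whose arguments field is a JSON string. The sketch below shows one way to dispatch such calls to a local function and build the follow-up tool messages. The weather function is a stand-in with fabricated return values, and the sample message and call id are invented for illustration.

```python
import json


def get_current_weather(location: str, unit: str = "fahrenheit") -> dict:
    """Stand-in implementation; a real one would query a weather service."""
    return {"location": location, "temperature": "72", "unit": unit}


# Maps tool names from the request to local callables.
AVAILABLE_TOOLS = {"get_current_weather": get_current_weather}


def run_tool_calls(assistant_message: dict) -> list:
    """Execute each requested tool and return OpenAI-style tool messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        func = AVAILABLE_TOOLS[call["function"]["name"]]
        kwargs = json.loads(call["function"]["arguments"])  # JSON string -> dict
        results.append(
            {
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(func(**kwargs)),
            }
        )
    return results


# Hypothetical assistant message, shaped like an OpenAI tool-call response.
sample_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_abc123",
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "arguments": "{\"location\": \"Boston, MA\"}",
            },
        }
    ],
}
tool_messages = run_tool_calls(sample_message)
print(tool_messages)
```

The returned tool messages would be appended to the conversation, after the assistant message, and sent back to /chat/completions so the model can produce its final answer.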

/embeddings

Request format

Input, output, and exceptions are mapped to the OpenAI format for all supported models.

from openai import OpenAI

# set base_url to your proxy server
# set api_key to send to proxy server
client = OpenAI(api_key="<proxy-api-key>", base_url="http://0.0.0.0:4000")

response = client.embeddings.create(
    input=["hello from litellm"],
    model="text-embedding-ada-002"
)

print(response)

Response format

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [
        0.0023064255,
        -0.009327292,
        ....
        -0.0028842222
      ],
      "index": 0
    }
  ],
  "model": "text-embedding-ada-002",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
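A common next step is to compare the returned vectors. The sketch below orders the data entries by index and computes cosine similarity with the standard library only. Both helper names are hypothetical, and the three-dimensional vectors are toy values rather than real embeddings.

```python
import math


def vectors_by_index(response: dict) -> list:
    """Return the embedding vectors ordered by their "index" field."""
    ordered = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in ordered]


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy response in the same shape as the sample above.
sample_embeddings = {
    "object": "list",
    "data": [
        {"object": "embedding", "embedding": [1.0, 0.0, 0.0], "index": 1},
        {"object": "embedding", "embedding": [0.6, 0.8, 0.0], "index": 0},
    ],
}
vecs = vectors_by_index(sample_embeddings)
similarity = cosine_similarity(vecs[0], vecs[1])
print(round(similarity, 6))
```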

/moderations

Request format

Input, output, and exceptions are mapped to the OpenAI format for all supported models.

from openai import OpenAI

# set base_url to your proxy server
# set api_key to send to proxy server
client = OpenAI(api_key="<proxy-api-key>", base_url="http://0.0.0.0:4000")

response = client.moderations.create(
    input="hello from litellm",
    model="text-moderation-stable"
)

print(response)

Response format

{
  "id": "modr-8sFEN22QCziALOfWTa77TodNLgHwA",
  "model": "text-moderation-007",
  "results": [
    {
      "categories": {
        "harassment": false,
        "harassment/threatening": false,
        "hate": false,
        "hate/threatening": false,
        "self-harm": false,
        "self-harm/instructions": false,
        "self-harm/intent": false,
        "sexual": false,
        "sexual/minors": false,
        "violence": false,
        "violence/graphic": false
      },
      "category_scores": {
        "harassment": 0.000019947197870351374,
        "harassment/threatening": 5.5971017900446896e-6,
        "hate": 0.000028560316422954202,
        "hate/threatening": 2.2631787999216613e-8,
        "self-harm": 2.9121162015144364e-7,
        "self-harm/instructions": 9.314219084899378e-8,
        "self-harm/intent": 8.093739012338119e-8,
        "sexual": 0.00004414955765241757,
        "sexual/minors": 0.0000156943697220413,
        "violence": 0.00022354527027346194,
        "violence/graphic": 8.804164281173144e-6
      },
      "flagged": false
    }
  ]
}
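One way to act on this response is to check the flagged field and rank the category scores. The sketch below does both on a trimmed copy of the sample; is_flagged and top_categories are hypothetical helper names.

```python
def is_flagged(moderation_response: dict) -> bool:
    """True if any result in the response was flagged."""
    return any(result["flagged"] for result in moderation_response["results"])


def top_categories(moderation_response: dict, n: int = 2) -> list:
    """Return the n highest-scoring category names of the first result."""
    scores = moderation_response["results"][0]["category_scores"]
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:n]]


# Trimmed copy of the sample response above (four of the eleven categories).
sample_moderation = {
    "id": "modr-8sFEN22QCziALOfWTa77TodNLgHwA",
    "model": "text-moderation-007",
    "results": [
        {
            "category_scores": {
                "harassment": 0.000019947197870351374,
                "hate": 0.000028560316422954202,
                "sexual": 0.00004414955765241757,
                "violence": 0.00022354527027346194,
            },
            "flagged": False,
        }
    ],
}
print(is_flagged(sample_moderation), top_categories(sample_moderation))
```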

Use with OpenAI-compatible projects

Set base_url to the LiteLLM Proxy server.

import openai

client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000"
)

# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ]
)

print(response)

Use with Vertex, Boto3, Anthropic SDK (native format)

👉 Here's how to use the litellm proxy with the Vertex, boto3, and Anthropic SDKs in their native format

Advanced

(BETA) Batch completions - pass multiple models

Use this when you want to send 1 request to N models.

Expected request format

Pass model as a comma-separated string of model names. Example: "model"="llama3,gpt-3.5-turbo"

The same request will be sent to the following model groups in the litellm proxy config.yaml:

  • model_name="llama3"
  • model_name="gpt-3.5-turbo"

import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.chat.completions.create(
    model="gpt-3.5-turbo,llama3",
    messages=[
        {"role": "user", "content": "this is a test request, write a short poem"}
    ],
)

print(response)

Expected response format

When model is passed as a comma-separated list of models, you get back a list of responses.

[
    ChatCompletion(
        id='chatcmpl-9NoYhS2G0fswot0b6QpoQgmRQMaIf',
        choices=[
            Choice(
                finish_reason='stop',
                index=0,
                logprobs=None,
                message=ChatCompletionMessage(
                    content='In the depths of my soul, a spark ignites\nA light that shines so pure and bright\nIt dances and leaps, refusing to die\nA flame of hope that reaches the sky\n\nIt warms my heart and fills me with bliss\nA reminder that in darkness, there is light to kiss\nSo I hold onto this fire, this guiding light\nAnd let it lead me through the darkest night.',
                    role='assistant',
                    function_call=None,
                    tool_calls=None
                )
            )
        ],
        created=1715462919,
        model='gpt-3.5-turbo-0125',
        object='chat.completion',
        system_fingerprint=None,
        usage=CompletionUsage(
            completion_tokens=83,
            prompt_tokens=17,
            total_tokens=100
        )
    ),
    ChatCompletion(
        id='chatcmpl-4ac3e982-da4e-486d-bddb-ed1d5cb9c03c',
        choices=[
            Choice(
                finish_reason='stop',
                index=0,
                logprobs=None,
                message=ChatCompletionMessage(
                    content="A test request, and I'm delighted!\nHere's a short poem, just for you:\n\nMoonbeams dance upon the sea,\nA path of light, for you to see.\nThe stars up high, a twinkling show,\nA night of wonder, for all to know.\n\nThe world is quiet, save the night,\nA peaceful hush, a gentle light.\nThe world is full, of beauty rare,\nA treasure trove, beyond compare.\n\nI hope you enjoyed this little test,\nA poem born, of whimsy and jest.\nLet me know, if there's anything else!",
                    role='assistant',
                    function_call=None,
                    tool_calls=None
                )
            )
        ],
        created=1715462919,
        model='groq/llama3-8b-8192',
        object='chat.completion',
        system_fingerprint='fp_a2c8d063cb',
        usage=CompletionUsage(
            completion_tokens=120,
            prompt_tokens=20,
            total_tokens=140
        )
    )
]
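A small sketch of working with this list follows. It joins model names into the comma-separated string the proxy expects and indexes the replies by the model field each response reports. Both helpers are hypothetical, and SimpleNamespace objects stand in for the ChatCompletion objects so the snippet runs without a proxy.

```python
from types import SimpleNamespace


def join_models(models: list) -> str:
    """Build the comma-separated model string for a batch request."""
    return ",".join(name.strip() for name in models)


def text_by_model(responses: list) -> dict:
    """Map each response's reported model name to its first reply text."""
    return {r.model: r.choices[0].message.content for r in responses}


def fake_completion(model: str, content: str) -> SimpleNamespace:
    """Stand-in with the same attribute layout as the objects shown above."""
    message = SimpleNamespace(content=content, role="assistant")
    return SimpleNamespace(model=model, choices=[SimpleNamespace(message=message)])


batch_model = join_models(["gpt-3.5-turbo", "llama3"])
responses = [
    fake_completion("gpt-3.5-turbo-0125", "poem A"),
    fake_completion("groq/llama3-8b-8192", "poem B"),
]
texts = text_by_model(responses)
print(batch_model, texts)
```

Note that, as in the sample above, the model reported by each response (for example groq/llama3-8b-8192) can differ from the model group name used in the request (llama3).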