IBM watsonx.ai
LiteLLM supports all IBM watsonx.ai foundation models and embedding models.
Environment Variables
import os
os.environ["WATSONX_URL"] = "" # (required) Base URL of your WatsonX instance
# (required) either one of the following:
os.environ["WATSONX_APIKEY"] = "" # IBM cloud API key
os.environ["WATSONX_TOKEN"] = "" # IAM auth token
# optional - can also be passed as params to completion() or embedding()
os.environ["WATSONX_PROJECT_ID"] = "" # Project ID of your WatsonX instance
os.environ["WATSONX_DEPLOYMENT_SPACE_ID"] = "" # ID of your deployment space to use deployed models
os.environ["WATSONX_ZENAPIKEY"] = "" # Zen API key (use for long-term api token)
For more information on how to get an access token to authenticate to watsonx.ai, see the IBM watsonx.ai documentation.
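The variables above can be checked before the first request is made. The following is a minimal sketch, assuming only the rules stated in the comments above (a URL plus either an API key or an IAM token); `missing_watsonx_settings` is a hypothetical helper and is not part of LiteLLM.

```python
import os


def missing_watsonx_settings(env=None):
    """Return a list of problems with the watsonx.ai settings found in `env`.

    `env` defaults to os.environ; passing a plain dict makes the check easy to try offline.
    """
    env = os.environ if env is None else env
    problems = []
    # WATSONX_URL is marked as required in the list above.
    if not env.get("WATSONX_URL"):
        problems.append("WATSONX_URL is not set")
    # Either an IBM Cloud API key or an IAM token is required.
    if not (env.get("WATSONX_APIKEY") or env.get("WATSONX_TOKEN")):
        problems.append("set WATSONX_APIKEY or WATSONX_TOKEN")
    return problems
```

Calling `missing_watsonx_settings()` with no arguments inspects the current process environment; an empty list means the required variables are present, not that the credentials are valid.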
Usage
import os
from litellm import completion
os.environ["WATSONX_URL"] = ""
os.environ["WATSONX_APIKEY"] = ""
## Call WATSONX `/text/chat` endpoint - supports function calling
response = completion(
    model="watsonx/meta-llama/llama-3-1-8b-instruct",
    messages=[{"content": "what is your favorite colour?", "role": "user"}],
    project_id="<my-project-id>" # or pass with os.environ["WATSONX_PROJECT_ID"]
)
## Call WATSONX `/text/generation` endpoint - not all models support /chat route.
response = completion(
    model="watsonx/ibm/granite-13b-chat-v2",
    messages=[{"content": "what is your favorite colour?", "role": "user"}],
    project_id="<my-project-id>"
)
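LiteLLM returns completions in the OpenAI response format, so the reply text sits on the first choice's message. The sketch below shows that access pattern on a stand-in object so it can be run offline; `first_reply_text` and `fake_response` are hypothetical names used only for illustration.

```python
from types import SimpleNamespace


def first_reply_text(response):
    """Return the text of the first choice in an OpenAI-format chat response."""
    return response.choices[0].message.content


# Stand-in with the same attribute layout as a chat response (not a real API result).
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(role="assistant", content="Blue."))]
)
print(first_reply_text(fake_response))  # prints: Blue.
```

With a real call, the `response` returned by `completion(...)` above would be passed in place of `fake_response`.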
Usage - Streaming
import os
from litellm import completion
os.environ["WATSONX_URL"] = ""
os.environ["WATSONX_APIKEY"] = ""
os.environ["WATSONX_PROJECT_ID"] = ""
response = completion(
    model="watsonx/meta-llama/llama-3-1-8b-instruct",
    messages=[{"content": "what is your favorite colour?", "role": "user"}],
    stream=True
)
for chunk in response:
    print(chunk)
Example Streaming Output Chunk
{
  "choices": [
    {
      "finish_reason": null,
      "index": 0,
      "delta": {
        "content": "I don't have a favorite color, but I do like the color blue. What's your favorite color?"
      }
    }
  ],
  "created": null,
  "model": "watsonx/ibm/granite-13b-chat-v2",
  "usage": {
    "prompt_tokens": null,
    "completion_tokens": null,
    "total_tokens": null
  }
}
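Each chunk carries only a fragment of the reply in `delta.content`, so the full text has to be assembled by the caller. The following is a minimal sketch that works on plain dicts shaped like the chunk above; that shape is an assumption taken from the example, and the chunk objects LiteLLM actually yields may need attribute access (for example `chunk.choices[0].delta.content`) instead. `join_stream_text` and `sample_chunks` are hypothetical.

```python
def join_stream_text(chunks):
    """Concatenate delta.content from the first choice of each chunk."""
    parts = []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            # Only follow choice 0 so multiple choices are not interleaved.
            if choice.get("index", 0) != 0:
                continue
            content = (choice.get("delta") or {}).get("content")
            if content:  # the final chunk may carry an empty delta
                parts.append(content)
    return "".join(parts)


# Hand-written chunks in the same shape as the example above (not real output).
sample_chunks = [
    {"choices": [{"finish_reason": None, "index": 0, "delta": {"content": "I like "}}]},
    {"choices": [{"finish_reason": None, "index": 0, "delta": {"content": "blue."}}]},
    {"choices": [{"finish_reason": "stop", "index": 0, "delta": {}}]},
]
print(join_stream_text(sample_chunks))  # prints: I like blue.
```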
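Usage - Models in deployment spaces
Models that have been deployed to a deployment space (e.g. tuned models) can be called using the deployment/<deployment_id> format, where <deployment_id> is the ID of the deployed model in your deployment space.
The ID of your deployment space must also be set in the environment variable WATSONX_DEPLOYMENT_SPACE_ID or passed to the function as space_id=<deployment_space_id>.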
import litellm
response = litellm.completion(
    model="watsonx/deployment/<deployment_id>",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
    space_id="<deployment_space_id>"
)
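The two rules above (the `deployment/<deployment_id>` model format, and the space ID coming from either a parameter or the environment) can be combined in one place. This is a sketch under those stated rules only; `deployment_call_args` is a hypothetical helper, not a LiteLLM function.

```python
import os


def deployment_call_args(deployment_id, space_id=None, env=None):
    """Build the model name and space_id for a model in a deployment space."""
    env = os.environ if env is None else env
    # An explicit space_id wins; otherwise fall back to the environment variable.
    space_id = space_id or env.get("WATSONX_DEPLOYMENT_SPACE_ID")
    if not space_id:
        raise ValueError("pass space_id or set WATSONX_DEPLOYMENT_SPACE_ID")
    return {"model": f"watsonx/deployment/{deployment_id}", "space_id": space_id}
```

The returned dict could then be unpacked into the call, for example `litellm.completion(messages=messages, **deployment_call_args("<deployment_id>"))`.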
Usage - Embeddings
LiteLLM also supports making requests to IBM watsonx.ai embedding models. The required credentials are the same as for completion.
from litellm import embedding
response = embedding(
    model="watsonx/ibm/slate-30m-english-rtrvr",
    input=["What is the capital of France?"],
    project_id="<my-project-id>"
)
print(response)
# EmbeddingResponse(model='ibm/slate-30m-english-rtrvr', data=[{'object': 'embedding', 'index': 0, 'embedding': [-0.037463713, -0.02141933, -0.02851813, 0.015519324, ..., -0.0021367231, -0.01704561, -0.001425816, 0.0035238306]}], object='list', usage=Usage(prompt_tokens=8, total_tokens=8))
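A common next step with embeddings is comparing two vectors. The sketch below computes cosine similarity over entries shaped like the `data` list printed above; the dict layout is taken from that output, while `vectors_from`, `cosine_similarity`, and `sample_data` are hypothetical names, and the sample vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def vectors_from(data):
    """Pull the vectors out of entries shaped like response.data, ordered by index."""
    return [item["embedding"] for item in sorted(data, key=lambda item: item["index"])]


# Made-up two-dimensional vectors in the same layout as the output above.
sample_data = [
    {"object": "embedding", "index": 1, "embedding": [0.0, 1.0]},
    {"object": "embedding", "index": 0, "embedding": [1.0, 0.0]},
]
first, second = vectors_from(sample_data)
print(cosine_similarity(first, second))  # prints: 0.0
```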
OpenAI Proxy Usage
Here is how to call IBM watsonx.ai with the LiteLLM Proxy Server.
1. Save keys in your environment
export WATSONX_URL=""
export WATSONX_APIKEY=""
export WATSONX_PROJECT_ID=""
2. Start the proxy
CLI:
$ litellm --model watsonx/meta-llama/llama-3-8b-instruct
# Server running on http://0.0.0.0:4000
config.yaml:
model_list:
  - model_name: llama-3-8b
    litellm_params:
      # all params accepted by litellm.completion()
      model: watsonx/meta-llama/llama-3-8b-instruct
      api_key: "os.environ/WATSONX_APIKEY" # does os.getenv("WATSONX_APIKEY")
3. Test it
Curl Request:
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
  "model": "llama-3-8b",
  "messages": [
    {
      "role": "user",
      "content": "what is your favorite colour?"
    }
  ]
}
'
OpenAI v1.0.0+:
import openai
client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000"
)
# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(model="llama-3-8b", messages=[
    {
        "role": "user",
        "content": "what is your favorite colour?"
    }
])
print(response)
Langchain:
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage
chat = ChatOpenAI(
    openai_api_base="http://0.0.0.0:4000", # set openai_api_base to the LiteLLM Proxy
    openai_api_key="anything", # placeholder, mirroring the OpenAI client example
    model="llama-3-8b",
    temperature=0.1
)
messages = [
    SystemMessage(
        content="You are a helpful assistant that I'm using to make a test request to."
    ),
    HumanMessage(
        content="test from litellm. tell me why it's amazing in 1 sentence"
    ),
]
response = chat(messages)
print(response)
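The config.yaml in step 2 gives `api_key` as a string starting with `os.environ/`, which its comment describes as an `os.getenv` lookup. The sketch below illustrates that convention only; it is not LiteLLM's implementation, and `resolve_config_value` is a hypothetical helper.

```python
import os

ENV_PREFIX = "os.environ/"


def resolve_config_value(value, env=None):
    """Resolve "os.environ/NAME" to the value of NAME; return other values unchanged."""
    env = os.environ if env is None else env
    if isinstance(value, str) and value.startswith(ENV_PREFIX):
        # Strip the prefix and look the remaining name up in the environment.
        return env.get(value[len(ENV_PREFIX):])
    return value
```

A missing variable resolves to `None` here, which is one reason to make sure the name in config.yaml matches the name exported in step 1.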
Authentication
Pass credentials as parameters
You can also pass the credentials as parameters to the completion and embedding functions.
import os
from litellm import completion
response = completion(
    model="watsonx/ibm/granite-13b-chat-v2",
    messages=[{"content": "What is your favorite color?", "role": "user"}],
    url="",
    api_key="",
    project_id=""
)
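When some credentials come from parameters and others from the environment, it can help to merge them in one place before the call. This is a sketch that assumes the parameter names shown above (`url`, `api_key`, `project_id`) and the environment variable names from the start of this page; `watsonx_credentials` is a hypothetical helper.

```python
import os


def watsonx_credentials(url=None, api_key=None, project_id=None, env=None):
    """Merge explicit credentials with environment fallbacks, dropping unset entries."""
    env = os.environ if env is None else env
    candidates = {
        "url": url or env.get("WATSONX_URL"),
        "api_key": api_key or env.get("WATSONX_APIKEY"),
        "project_id": project_id or env.get("WATSONX_PROJECT_ID"),
    }
    # Leave out anything that is still empty so it is not passed as a blank value.
    return {name: value for name, value in candidates.items() if value}
```

The result could be unpacked into either function, for example `completion(model=..., messages=messages, **watsonx_credentials(project_id="<my-project-id>"))`.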
Supported IBM watsonx.ai Models
Here are some examples of models available in IBM watsonx.ai that you can use with LiteLLM:
Model Name | Command |
---|---|
Flan T5 XXL | completion(model="watsonx/google/flan-t5-xxl", messages=messages) |
Flan Ul2 | completion(model="watsonx/google/flan-ul2", messages=messages) |
Mt0 XXL | completion(model="watsonx/bigscience/mt0-xxl", messages=messages) |
Gpt Neox | completion(model="watsonx/eleutherai/gpt-neox-20b", messages=messages) |
Mpt 7B Instruct2 | completion(model="watsonx/ibm/mpt-7b-instruct2", messages=messages) |
Starcoder | completion(model="watsonx/bigcode/starcoder", messages=messages) |
Llama 2 70B Chat | completion(model="watsonx/meta-llama/llama-2-70b-chat", messages=messages) |
Llama 2 13B Chat | completion(model="watsonx/meta-llama/llama-2-13b-chat", messages=messages) |
Granite 13B Instruct | completion(model="watsonx/ibm/granite-13b-instruct-v1", messages=messages) |
Granite 13B Chat | completion(model="watsonx/ibm/granite-13b-chat-v1", messages=messages) |
Flan T5 XL | completion(model="watsonx/google/flan-t5-xl", messages=messages) |
Granite 13B Chat V2 | completion(model="watsonx/ibm/granite-13b-chat-v2", messages=messages) |
Granite 13B Instruct V2 | completion(model="watsonx/ibm/granite-13b-instruct-v2", messages=messages) |
Elyza Japanese Llama 2 7B Instruct | completion(model="watsonx/elyza/elyza-japanese-llama-2-7b-instruct", messages=messages) |
Mixtral 8X7B Instruct V01 Q | completion(model="watsonx/ibm-mistralai/mixtral-8x7b-instruct-v01-q", messages=messages) |
For a list of all models available in watsonx.ai, see the IBM watsonx.ai documentation.
Supported IBM watsonx.ai Embedding Models
Model Name | Function Call |
---|---|
Slate 30m | embedding(model="watsonx/ibm/slate-30m-english-rtrvr", input=input) |
Slate 125m | embedding(model="watsonx/ibm/slate-125m-english-rtrvr", input=input) |
For a list of all embedding models available in watsonx.ai, see the IBM watsonx.ai documentation.