Model Fallbacks (w/ LiteLLM)
Here's how to implement model fallbacks across 3 LLM providers (OpenAI, Anthropic, Azure) using LiteLLM.
1. Installing LiteLLM
!pip install litellm
2. Basic Fallbacks Code
import os
import traceback

from litellm import completion

# set ENV variables for each provider
os.environ["OPENAI_API_KEY"] = ""
os.environ["ANTHROPIC_API_KEY"] = ""
os.environ["AZURE_API_KEY"] = ""
os.environ["AZURE_API_BASE"] = ""
os.environ["AZURE_API_VERSION"] = ""

model_fallback_list = ["claude-instant-1", "gpt-3.5-turbo", "chatgpt-test"]

user_message = "Hello, how are you?"
messages = [{"content": user_message, "role": "user"}]

for model in model_fallback_list:
    try:
        response = completion(model=model, messages=messages)
        break  # stop at the first model that succeeds
    except Exception:
        print(f"error occurred: {traceback.format_exc()}")
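The same loop can be factored into a small reusable helper. This is a minimal sketch, not a LiteLLM API: the `call_fn` parameter is a hypothetical stand-in for `litellm.completion`, which also lets the fallback logic be exercised without network access or API keys.

```python
def completion_with_fallbacks(models, messages, call_fn):
    """Try each model in order and return the first successful response.

    call_fn is a stand-in for litellm.completion so the fallback logic
    can be tested with a stub; in real use, pass litellm.completion.
    """
    last_error = None
    for model in models:
        try:
            return call_fn(model=model, messages=messages)
        except Exception as e:  # in practice, catch litellm's exception types
            last_error = e
    # every model failed; surface the last provider error
    raise last_error


# stub provider: first model "fails", second succeeds
def fake_completion(model, messages):
    if model == "claude-instant-1":
        raise RuntimeError("provider down")
    return {"model": model, "content": "Hi!"}

response = completion_with_fallbacks(
    ["claude-instant-1", "gpt-3.5-turbo"],
    [{"content": "Hello, how are you?", "role": "user"}],
    fake_completion,
)
```

Returning on the first success (rather than looping through every model) is the behavior you usually want; the stub makes that easy to verify.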
3. Context Window Exceptions
LiteLLM provides a subclass of its InvalidRequestError class for context-window-exceeded errors (docs), which you can catch to implement model fallbacks based on context window exceptions.
LiteLLM also exposes a get_max_tokens() function you can use to identify the context window limit that was exceeded.
import os

from litellm import completion, ContextWindowExceededError, get_max_tokens

# set ENV variables for each provider
os.environ["OPENAI_API_KEY"] = ""
os.environ["COHERE_API_KEY"] = ""
os.environ["ANTHROPIC_API_KEY"] = ""
os.environ["AZURE_API_KEY"] = ""
os.environ["AZURE_API_BASE"] = ""
os.environ["AZURE_API_VERSION"] = ""

context_window_fallback_list = [
    {"model": "gpt-3.5-turbo-16k", "max_tokens": 16385},
    {"model": "gpt-4-32k", "max_tokens": 32768},
    {"model": "claude-instant-1", "max_tokens": 100000},
]

user_message = "Hello, how are you?"
messages = [{"content": user_message, "role": "user"}]

initial_model = "command-nightly"
response = None
try:
    response = completion(model=initial_model, messages=messages)
except ContextWindowExceededError:
    # look up the limit we just exceeded, then only try models with a larger window
    model_max_tokens = get_max_tokens(initial_model)
    for fallback in context_window_fallback_list:
        if model_max_tokens < fallback["max_tokens"]:
            try:
                response = completion(model=fallback["model"], messages=messages)
                break  # stop at the first model that fits the context
            except ContextWindowExceededError:
                model_max_tokens = get_max_tokens(fallback["model"])
                continue

print(response)
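The candidate-selection step can be isolated into a pure function that works only on the token limits, so it is testable with plain data and no API calls. This is a sketch; `pick_fallbacks` is a hypothetical helper, and the list-of-dicts shape mirrors `context_window_fallback_list` above.

```python
def pick_fallbacks(failed_max_tokens, fallback_list):
    """Return fallback models whose context window is strictly larger
    than the limit that was just exceeded, smallest window first."""
    candidates = [m for m in fallback_list if m["max_tokens"] > failed_max_tokens]
    # try the cheapest adequate window first
    return sorted(candidates, key=lambda m: m["max_tokens"])


fallbacks = [
    {"model": "gpt-3.5-turbo-16k", "max_tokens": 16385},
    {"model": "gpt-4-32k", "max_tokens": 32768},
    {"model": "claude-instant-1", "max_tokens": 100000},
]

# a model with a 16,385-token window just failed; both larger models qualify
next_models = pick_fallbacks(16385, fallbacks)
```

Ordering candidates smallest-window-first means you only escalate to the largest (often most expensive) model when the smaller ones also overflow.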