使用视觉模型
快速入门
将图像传递给模型的示例
- LiteLLMPython SDK
- LiteLLM 代理服务器
import os
from litellm import completion
os.environ["OPENAI_API_KEY"] = "your-api-key"
# openai call
response = completion(
model = "gpt-4-vision-preview",
messages=[
{
"role": "user",
"content": [
{
"type": "text",
"text": "What’s in this image?"
},
{
"type": "image_url",
"image_url": {
"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
}
}
]
}
],
)
- 在 config.yaml 中定义视觉模型
model_list:
- model_name: gpt-4-vision-preview # OpenAI gpt-4-vision-preview
litellm_params:
model: openai/gpt-4-vision-preview
api_key: os.environ/OPENAI_API_KEY
- model_name: llava-hf # Custom OpenAI compatible model
litellm_params:
model: openai/llava-hf/llava-v1.6-vicuna-7b-hf
api_base: http://localhost:8000
api_key: fake-key
model_info:
supports_vision: True # set supports_vision to True so /model/info returns this attribute as True
- 运行代理服务器
litellm --config config.yaml
- 使用 OpenAI Python SDK 进行测试
import os
from openai import OpenAI
client = OpenAI(
api_key="sk-1234", # your litellm proxy api key
)
response = client.chat.completions.create(
model = "gpt-4-vision-preview", # use model="llava-hf" to test your custom OpenAI endpoint
messages=[
{
"role": "user",
"content": [
{
"type": "text",
"text": "What’s in this image?"
},
{
"type": "image_url",
"image_url": {
"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
}
}
]
}
],
)
检查模型是否支持 vision
- LiteLLM Python SDK
- LiteLLM 代理服务器
使用 litellm.supports_vision(model="")
-> 如果模型支持 vision
则返回 True
,否则返回 False
assert litellm.supports_vision(model="openai/gpt-4-vision-preview") == True
assert litellm.supports_vision(model="vertex_ai/gemini-1.0-pro-vision") == True
assert litellm.supports_vision(model="openai/gpt-3.5-turbo") == False
assert litellm.supports_vision(model="xai/grok-2-vision-latest") == True
assert litellm.supports_vision(model="xai/grok-2-latest") == False
- 在 config.yaml 中定义视觉模型
model_list:
- model_name: gpt-4-vision-preview # OpenAI gpt-4-vision-preview
litellm_params:
model: openai/gpt-4-vision-preview
api_key: os.environ/OPENAI_API_KEY
- model_name: llava-hf # Custom OpenAI compatible model
litellm_params:
model: openai/llava-hf/llava-v1.6-vicuna-7b-hf
api_base: http://localhost:8000
api_key: fake-key
model_info:
supports_vision: True # set supports_vision to True so /model/info returns this attribute as True
- 运行代理服务器
litellm --config config.yaml
- 调用
/model_group/info
检查您的模型是否支持vision
curl -X 'GET' \
'http://localhost:4000/model_group/info' \
-H 'accept: application/json' \
-H 'x-api-key: sk-1234'
预期响应
{
"data": [
{
"model_group": "gpt-4-vision-preview",
"providers": ["openai"],
"max_input_tokens": 128000,
"max_output_tokens": 4096,
"mode": "chat",
"supports_vision": true, # 👈 supports_vision is true
"supports_function_calling": false
},
{
"model_group": "llava-hf",
"providers": ["openai"],
"max_input_tokens": null,
"max_output_tokens": null,
"mode": null,
"supports_vision": true, # 👈 supports_vision is true
"supports_function_calling": false
}
]
}
明确指定图像类型
如果您的图像没有 mime-type,或者 litellm 错误地推断了您的图像 mime-type (例如,在使用 vertex ai 调用 gs://
URL 时),您可以通过 format
参数明确设置。
"image_url": {
"url": "gs://my-gs-image",
"format": "image/jpeg"
}
LiteLLM 将在支持指定 mime-type 的任何 API 端点 (例如 anthropic/bedrock/vertex ai) 上使用此设置。
对于其他 (例如 openai),此设置将被忽略。
- SDK
- PROXY
import os
from litellm import completion
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"
# openai call
response = completion(
model = "claude-3-7-sonnet-latest",
messages=[
{
"role": "user",
"content": [
{
"type": "text",
"text": "What’s in this image?"
},
{
"type": "image_url",
"image_url": {
"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
"format": "image/jpeg"
}
}
]
}
],
)
- 在 config.yaml 中定义视觉模型
model_list:
- model_name: gpt-4-vision-preview # OpenAI gpt-4-vision-preview
litellm_params:
model: openai/gpt-4-vision-preview
api_key: os.environ/OPENAI_API_KEY
- model_name: llava-hf # Custom OpenAI compatible model
litellm_params:
model: openai/llava-hf/llava-v1.6-vicuna-7b-hf
api_base: http://localhost:8000
api_key: fake-key
model_info:
supports_vision: True # set supports_vision to True so /model/info returns this attribute as True
- 运行代理服务器
litellm --config config.yaml
- 使用 OpenAI Python SDK 进行测试
import os
from openai import OpenAI
client = OpenAI(
api_key="sk-1234", # your litellm proxy api key
)
response = client.chat.completions.create(
model = "gpt-4-vision-preview", # use model="llava-hf" to test your custom OpenAI endpoint
messages=[
{
"role": "user",
"content": [
{
"type": "text",
"text": "What’s in this image?"
},
{
"type": "image_url",
"image_url": {
"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
"format": "image/jpeg"
}
}
]
}
],
)
规范
"image_url": str
OR
"image_url": {
"url": "url OR base64 encoded str",
"detail": "openai-only param",
"format": "specify mime-type of image"
}