
Vertex AI SDK

Pass-through endpoints for Vertex AI - call provider-specific endpoints in their native format (no translation needed).

Feature | Supported | Notes
--- | --- | ---
Cost Tracking | ✅ | supported for all models on the /generateContent endpoint
Logging | ✅ | works across all integrations
End-user Tracking | ❌ | tell us if you need this
Streaming | ✅ |

Supported Endpoints

LiteLLM supports 2 Vertex AI pass-through routes:

  1. /vertex_ai → routes to https://{vertex_location}-aiplatform.googleapis.com/
  2. /vertex_ai/discovery → routes to https://discoveryengine.googleapis.com

How to use

Just replace https://REGION-aiplatform.googleapis.com with LITELLM_PROXY_BASE_URL/vertex_ai
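The host swap above can be sketched as a small helper. This is illustrative only: the project, model, and proxy base URL below are placeholders, not values from your deployment.

```python
from urllib.parse import urlparse

# Hypothetical values, for illustration only.
LITELLM_PROXY_BASE_URL = "http://localhost:4000"
VERTEX_URL = (
    "https://us-central1-aiplatform.googleapis.com/v1/projects/my-project"
    "/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent"
)

def to_proxy_url(vertex_url: str, proxy_base: str) -> str:
    """Swap the regional Vertex AI host for <proxy_base>/vertex_ai, keeping the path."""
    return f"{proxy_base}/vertex_ai{urlparse(vertex_url).path}"

print(to_proxy_url(VERTEX_URL, LITELLM_PROXY_BASE_URL))
```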

LiteLLM supports 3 flows for calling Vertex AI endpoints via pass-through:

  1. Specific Credentials: Admins can set pass-through credentials for a specific project/region.

  2. Default Credentials: Admins can set default credentials.

  3. Client-Side Credentials: Users can send client-side credentials to Vertex AI (default behavior - if no default or mapped credentials are found, the request is passed through directly).

Usage Example

model_list:
  - model_name: gemini-1.0-pro
    litellm_params:
      model: vertex_ai/gemini-1.0-pro
      vertex_project: adroit-crow-413218
      vertex_region: us-central1
      vertex_credentials: /path/to/credentials.json
      use_in_pass_through: true # 👈 KEY CHANGE

Usage Example

curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/${MODEL_ID}:generateContent \
-H "Content-Type: application/json" \
-H "x-litellm-api-key: Bearer sk-1234" \
-d '{
"contents":[{
"role": "user",
"parts":[{"text": "How are you doing today?"}]
}]
}'
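For reference, a minimal Python sketch of the same request using only the standard library. PROJECT_ID and the virtual key are placeholders, and the urlopen call is commented out so nothing is actually sent:

```python
import json
import urllib.request

PROJECT_ID = "my-project"  # placeholder
url = (
    f"http://localhost:4000/vertex_ai/v1/projects/{PROJECT_ID}"
    "/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent"
)
body = {"contents": [{"role": "user", "parts": [{"text": "How are you doing today?"}]}]}

# Build the request with the same headers as the curl example above.
req = urllib.request.Request(
    url,
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json", "x-litellm-api-key": "Bearer sk-1234"},
)
# urllib.request.urlopen(req)  # uncomment to actually send the request
```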

Quick Start

Let's call the Vertex AI /generateContent endpoint

  1. Add Vertex AI credentials to your environment

export DEFAULT_VERTEXAI_PROJECT="" # "adroit-crow-413218"
export DEFAULT_VERTEXAI_LOCATION="" # "us-central1"
export DEFAULT_GOOGLE_APPLICATION_CREDENTIALS="" # "/Users/Downloads/adroit-crow-413218-a956eef1a2a8.json"

  2. Start the LiteLLM proxy

litellm

# RUNNING on http://0.0.0.0:4000

  3. Test it!

Let's call the Vertex AI /generateContent endpoint

curl http://localhost:4000/vertex-ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-1234" \
-d '{
"contents":[{
"role": "user",
"parts":[{"text": "How are you doing today?"}]
}]
}'

Supported API Endpoints

  • Gemini API
  • Embeddings API
  • Imagen API
  • Code Completion API
  • Batch prediction API
  • Tuning API
  • CountTokens API

Vertex AI Auth

LiteLLM Proxy Server supports two methods of authentication to Vertex AI:

  1. Pass Vertex credentials from the client to the proxy server

  2. Set Vertex AI credentials on the proxy server

Usage Examples

Gemini API (Generate Content)

curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.5-flash-001:generateContent \
-H "Content-Type: application/json" \
-H "x-litellm-api-key: Bearer sk-1234" \
-d '{"contents":[{"role": "user", "parts":[{"text": "hi"}]}]}'

Embeddings API

curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/textembedding-gecko@001:predict \
-H "Content-Type: application/json" \
-H "x-litellm-api-key: Bearer sk-1234" \
-d '{"instances":[{"content": "gm"}]}'
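The :predict body above wraps each input text in an instances entry. A small helper (illustrative only, not part of LiteLLM) makes the shape explicit:

```python
def embed_payload(texts):
    """Build the Vertex AI :predict body for a list of input strings."""
    return {"instances": [{"content": t} for t in texts]}

print(embed_payload(["gm"]))
```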

Imagen API

curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/imagen-3.0-generate-001:predict \
-H "Content-Type: application/json" \
-H "x-litellm-api-key: Bearer sk-1234" \
-d '{"instances":[{"prompt": "make an otter"}], "parameters": {"sampleCount": 1}}'

Count Tokens API

curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.5-flash-001:countTokens \
-H "Content-Type: application/json" \
-H "x-litellm-api-key: Bearer sk-1234" \
-d '{"contents":[{"role": "user", "parts":[{"text": "hi"}]}]}'

Tuning API

Create Fine Tuning Job

curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/tuningJobs \
-H "Content-Type: application/json" \
-H "x-litellm-api-key: Bearer sk-1234" \
-d '{
"baseModel": "gemini-1.0-pro-002",
"supervisedTuningSpec" : {
"training_dataset_uri": "gs://cloud-samples-data/ai-platform/generative_ai/sft_train_data.jsonl"
}
}'
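The tuning request body can likewise be assembled programmatically. This sketch only mirrors the fields shown in the curl example above:

```python
def tuning_job_payload(base_model: str, train_uri: str) -> dict:
    """Build the supervised tuning-job body used in the example above."""
    return {
        "baseModel": base_model,
        "supervisedTuningSpec": {"training_dataset_uri": train_uri},
    }

payload = tuning_job_payload(
    "gemini-1.0-pro-002",
    "gs://cloud-samples-data/ai-platform/generative_ai/sft_train_data.jsonl",
)
```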

Advanced

Pre-requisites

Use this to avoid giving developers raw Vertex AI credentials, while still letting them use Vertex AI endpoints.

Use with Virtual Keys

  1. Setup environment

export DATABASE_URL=""
export LITELLM_MASTER_KEY=""

# vertex ai credentials
export DEFAULT_VERTEXAI_PROJECT="" # "adroit-crow-413218"
export DEFAULT_VERTEXAI_LOCATION="" # "us-central1"
export DEFAULT_GOOGLE_APPLICATION_CREDENTIALS="" # "/Users/Downloads/adroit-crow-413218-a956eef1a2a8.json"

litellm

# RUNNING on http://0.0.0.0:4000

  2. Generate virtual key
curl -X POST 'http://0.0.0.0:4000/key/generate' \
-H 'x-litellm-api-key: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{}'

Expected Response

{
...
"key": "sk-1234ewknldferwedojwojw"
}
  3. Test it!
curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent \
-H "Content-Type: application/json" \
-H "x-litellm-api-key: Bearer sk-1234" \
-d '{
"contents":[{
"role": "user",
"parts":[{"text": "How are you doing today?"}]
}]
}'

Send tags in request headers

Use this if you want tags to be tracked in the LiteLLM DB and on logging callbacks.

Pass tags in request headers as a comma-separated list. In the example below, the following tags will be tracked:

tags: ["vertex-js-sdk", "pass-through-endpoint"]
curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent \
-H "Content-Type: application/json" \
-H "x-litellm-api-key: Bearer sk-1234" \
-H "tags: vertex-js-sdk,pass-through-endpoint" \
-d '{
"contents":[{
"role": "user",
"parts":[{"text": "How are you doing today?"}]
}]
}'
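The header is just a comma-separated string; splitting it, as sketched below, yields the tag list tracked from the example above:

```python
header_value = "vertex-js-sdk,pass-through-endpoint"
tags = [t.strip() for t in header_value.split(",") if t.strip()]
print(tags)  # → ['vertex-js-sdk', 'pass-through-endpoint']
```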