Vertex AI SDK
Pass-through endpoints for Vertex AI - call provider-specific endpoints in their native format (no translation).
Feature | Supported | Notes |
---|---|---|
Cost Tracking | ✅ | Supported for all models on the /generateContent endpoint |
Logging | ✅ | Works across all integrations |
End-user Tracking | ❌ | Tell us if you need this |
Streaming | ✅ | |
Supported Endpoints
LiteLLM supports 2 Vertex AI pass-through routes:
/vertex_ai
→ routes to https://{vertex_location}-aiplatform.googleapis.com/
/vertex_ai/discovery
→ routes to https://discoveryengine.googleapis.com
How to use
Just replace https://REGION-aiplatform.googleapis.com with LITELLM_PROXY_BASE_URL/vertex_ai.
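For example, a Gemini call maps as follows (an illustrative sketch: my-project is a placeholder, and LITELLM_PROXY_BASE_URL is wherever your proxy runs, e.g. http://localhost:4000):
# Native Vertex AI endpoint
https://us-central1-aiplatform.googleapis.com/v1/projects/my-project/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent
# The same request through the LiteLLM proxy
${LITELLM_PROXY_BASE_URL}/vertex_ai/v1/projects/my-project/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent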
LiteLLM supports 3 flows for calling Vertex AI endpoints via pass-through:
Specific credentials: an admin sets pass-through credentials for a specific project/region.
Default credentials: an admin sets default credentials.
Client-side credentials: users can send their own client-side credentials to Vertex AI (default behavior - if no default or mapped credentials are found, the request is passed through directly).
Usage Examples
- Specific Project/Region
- Default Credentials
- Client-Side Credentials
model_list:
- model_name: gemini-1.0-pro
litellm_params:
model: vertex_ai/gemini-1.0-pro
vertex_project: adroit-crow-413218
vertex_region: us-central1
vertex_credentials: /path/to/credentials.json
use_in_pass_through: true # 👈 KEY CHANGE
- Set in config.yaml
- Set in environment variables
default_vertex_config:
vertex_project: adroit-crow-413218
vertex_region: us-central1
vertex_credentials: /path/to/credentials.json
export DEFAULT_VERTEXAI_PROJECT="adroit-crow-413218"
export DEFAULT_VERTEXAI_LOCATION="us-central1"
export DEFAULT_GOOGLE_APPLICATION_CREDENTIALS="/path/to/credentials.json"
Try Gemini 2.0 Flash (curl)
MODEL_ID="gemini-2.0-flash-001"
PROJECT_ID="YOUR_PROJECT_ID"
curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json" \
"${LITELLM_PROXY_BASE_URL}/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/${MODEL_ID}:streamGenerateContent" -d \
$'{
"contents": {
"role": "user",
"parts": [
{
"fileData": {
"mimeType": "image/png",
"fileUri": "gs://generativeai-downloads/images/scones.jpg"
}
},
{
"text": "Describe this picture."
}
]
}
}'
Usage Example
- curl
- Vertex Node.js SDK
curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/${MODEL_ID}:generateContent \
-H "Content-Type: application/json" \
-H "x-litellm-api-key: Bearer sk-1234" \
-d '{
"contents":[{
"role": "user",
"parts":[{"text": "How are you doing today?"}]
}]
}'
const { VertexAI } = require('@google-cloud/vertexai');
const vertexAI = new VertexAI({
project: 'your-project-id', // enter your vertex project id
location: 'us-central1', // enter your vertex region
apiEndpoint: "localhost:4000/vertex_ai" // <proxy-server-url>/vertex_ai # note, do not include 'https://' in the url
});
const model = vertexAI.getGenerativeModel({
model: 'gemini-1.0-pro'
}, {
customHeaders: {
"x-litellm-api-key": "sk-1234" // Your litellm Virtual Key
}
});
async function generateContent() {
try {
const prompt = {
contents: [{
role: 'user',
parts: [{ text: 'How are you doing today?' }]
}]
};
const response = await model.generateContent(prompt);
console.log('Response:', response);
} catch (error) {
console.error('Error:', error);
}
}
generateContent();
Quick Start
Let's call the Vertex AI /generateContent endpoint.
- Add Vertex AI credentials to your environment
export DEFAULT_VERTEXAI_PROJECT="" # "adroit-crow-413218"
export DEFAULT_VERTEXAI_LOCATION="" # "us-central1"
export DEFAULT_GOOGLE_APPLICATION_CREDENTIALS="" # "/Users/Downloads/adroit-crow-413218-a956eef1a2a8.json"
- Start the LiteLLM proxy
litellm
# RUNNING on http://0.0.0.0:4000
- Test it!
Let's call the Vertex AI /generateContent endpoint.
curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-1234" \
-d '{
"contents":[{
"role": "user",
"parts":[{"text": "How are you doing today?"}]
}]
}'
Supported API Endpoints
- Gemini API
- Embeddings API
- Imagen API
- Code Completion API
- Batch Prediction API
- Tuning API
- CountTokens API
Vertex AI Authentication
The LiteLLM proxy server supports two methods of authenticating to Vertex AI:
Pass Vertex credentials from the client to the proxy server
Set Vertex AI credentials on the proxy server
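In practice the difference shows up in the request headers. A sketch, based on the examples in this doc (sk-1234 stands in for your own virtual key):
# 1) Client-side Vertex credentials: forward your own Google access token (requires the gcloud CLI)
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)"
# 2) Credentials set on the proxy: authenticate to LiteLLM with a virtual key; the proxy attaches the Vertex credentials it was configured with
-H "x-litellm-api-key: Bearer sk-1234"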
Usage Examples
Gemini API (Generate Content)
curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.5-flash-001:generateContent \
-H "Content-Type: application/json" \
-H "x-litellm-api-key: Bearer sk-1234" \
-d '{"contents":[{"role": "user", "parts":[{"text": "hi"}]}]}'
Embeddings API
curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/textembedding-gecko@001:predict \
-H "Content-Type: application/json" \
-H "x-litellm-api-key: Bearer sk-1234" \
-d '{"instances":[{"content": "gm"}]}'
Imagen API
curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/imagen-3.0-generate-001:predict \
-H "Content-Type: application/json" \
-H "x-litellm-api-key: Bearer sk-1234" \
-d '{"instances":[{"prompt": "make an otter"}], "parameters": {"sampleCount": 1}}'
Count Tokens API
curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.5-flash-001:countTokens \
-H "Content-Type: application/json" \
-H "x-litellm-api-key: Bearer sk-1234" \
-d '{"contents":[{"role": "user", "parts":[{"text": "hi"}]}]}'
Tuning API
Create a Fine-Tuning Job
curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/tuningJobs \
-H "Content-Type: application/json" \
-H "x-litellm-api-key: Bearer sk-1234" \
-d '{
"baseModel": "gemini-1.0-pro-002",
"supervisedTuningSpec" : {
"training_dataset_uri": "gs://cloud-samples-data/ai-platform/generative_ai/sft_train_data.jsonl"
}
}'
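Code Completion API
A sketch of a code-completion request through the pass-through route, assuming the standard Vertex AI code-gecko predict format; the prompt prefix and parameters are placeholders:
curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/code-gecko:predict \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer sk-1234" \
  -d '{"instances":[{"prefix": "def reverse_string(s):"}], "parameters": {"temperature": 0.2, "maxOutputTokens": 64}}'
Batch Prediction API
A sketch of creating a batch prediction job, assuming the standard Vertex AI batchPredictionJobs request body; the display name, model, and GCS URIs are placeholders:
curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/batchPredictionJobs \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer sk-1234" \
  -d '{
    "displayName": "my-batch-job",
    "model": "publishers/google/models/gemini-1.5-flash-001",
    "inputConfig": {
      "instancesFormat": "jsonl",
      "gcsSource": {"uris": ["gs://my-bucket/batch-input.jsonl"]}
    },
    "outputConfig": {
      "predictionsFormat": "jsonl",
      "gcsDestination": {"outputUriPrefix": "gs://my-bucket/batch-output/"}
    }
  }'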
Advanced
Prerequisites
- A LiteLLM proxy set up with a database (virtual keys are stored in the DB; see DATABASE_URL below)
Use this to avoid handing developers raw Vertex AI credentials while still letting them call Vertex AI endpoints.
Use with Virtual Keys
- Set up your environment
export DATABASE_URL=""
export LITELLM_MASTER_KEY=""
# vertex ai credentials
export DEFAULT_VERTEXAI_PROJECT="" # "adroit-crow-413218"
export DEFAULT_VERTEXAI_LOCATION="" # "us-central1"
export DEFAULT_GOOGLE_APPLICATION_CREDENTIALS="" # "/Users/Downloads/adroit-crow-413218-a956eef1a2a8.json"
litellm
# RUNNING on http://0.0.0.0:4000
- Generate a virtual key
curl -X POST 'http://0.0.0.0:4000/key/generate' \
-H 'x-litellm-api-key: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{}'
Expected response
{
...
"key": "sk-1234ewknldferwedojwojw"
}
- Test it!
curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent \
-H "Content-Type: application/json" \
-H "x-litellm-api-key: Bearer sk-1234" \
-d '{
"contents":[{
"role": "user",
"parts":[{"text": "How are you doing today?"}]
}]
}'
Send tags in request headers
Use this if you want tags to be tracked in the LiteLLM DB and in logging callbacks.
Pass tags as a comma-separated list in the request headers. In the example below, the following tags will be tracked:
tags: ["vertex-js-sdk", "pass-through-endpoint"]
- curl
- Vertex Node.js SDK
curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent \
-H "Content-Type: application/json" \
-H "x-litellm-api-key: Bearer sk-1234" \
-H "tags: vertex-js-sdk,pass-through-endpoint" \
-d '{
"contents":[{
"role": "user",
"parts":[{"text": "How are you doing today?"}]
}]
}'
const { VertexAI } = require('@google-cloud/vertexai');
const vertexAI = new VertexAI({
project: 'your-project-id', // enter your vertex project id
location: 'us-central1', // enter your vertex region
apiEndpoint: "localhost:4000/vertex_ai" // <proxy-server-url>/vertex_ai # note, do not include 'https://' in the url
});
const model = vertexAI.getGenerativeModel({
model: 'gemini-1.0-pro'
}, {
customHeaders: {
"x-litellm-api-key": "sk-1234", // Your litellm Virtual Key
"tags": "vertex-js-sdk,pass-through-endpoint"
}
});
async function generateContent() {
try {
const prompt = {
contents: [{
role: 'user',
parts: [{ text: 'How are you doing today?' }]
}]
};
const response = await model.generateContent(prompt);
console.log('Response:', response);
} catch (error) {
console.error('Error:', error);
}
}
generateContent();