Gemini - Google AI Studio

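Property | Details
Description | Google AI Studio is a fully managed AI development platform for building and using generative AI.
Provider route on LiteLLM | gemini/
Provider docs | Google AI Studio ↗
Provider API endpoint | https://generativelanguage.googleapis.com
Supported OpenAI endpoints | /chat/completions, /embeddings, /completions
Pass-through endpoint | Supported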

API Key

import os
os.environ["GEMINI_API_KEY"] = "your-api-key"

Example usage

from litellm import completion
import os

os.environ['GEMINI_API_KEY'] = ""
response = completion(
    model="gemini/gemini-pro",
    messages=[{"role": "user", "content": "write code for saying hi from LiteLLM"}]
)

Supported OpenAI parameters

  • temperature
  • top_p
  • max_tokens
  • max_completion_tokens
  • stream
  • tools
  • tool_choice
  • functions
  • response_format
  • n
  • stop
  • logprobs
  • frequency_penalty
  • modalities
  • reasoning_content

Anthropic parameters

  • thinking (used to set the maximum budget tokens across Anthropic/Gemini models)

See the updated list

Usage - Thinking / reasoning_content

LiteLLM translates OpenAI's reasoning_effort parameter into Gemini's thinking parameter.

Mapping

reasoning_effort | thinking
"low" | "budget_tokens": 1024
"medium" | "budget_tokens": 2048
"high" | "budget_tokens": 4096

from litellm import completion

resp = completion(
    model="gemini/gemini-2.5-flash-preview-04-17",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    reasoning_effort="low",
)

Expected response

ModelResponse(
    id='chatcmpl-c542d76d-f675-4e87-8e5f-05855f5d0f5e',
    created=1740470510,
    model='gemini-2.5-flash-preview-04-17',
    object='chat.completion',
    system_fingerprint=None,
    choices=[
        Choices(
            finish_reason='stop',
            index=0,
            message=Message(
                content="The capital of France is Paris.",
                role='assistant',
                tool_calls=None,
                function_call=None,
                reasoning_content='The capital of France is Paris. This is a very straightforward factual question.'
            ),
        )
    ],
    usage=Usage(
        completion_tokens=68,
        prompt_tokens=42,
        total_tokens=110,
        completion_tokens_details=None,
        prompt_tokens_details=PromptTokensDetailsWrapper(
            audio_tokens=None,
            cached_tokens=0,
            text_tokens=None,
            image_tokens=None
        ),
        cache_creation_input_tokens=0,
        cache_read_input_tokens=0
    )
)
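
The reasoning_effort mapping table earlier in this section can be written out as a small lookup. This is only an illustrative sketch: the helper name reasoning_effort_to_thinking is hypothetical and is not a LiteLLM function, and the {"type": "enabled", ...} shape is borrowed from the thinking example on this page.

```python
# Hypothetical helper (not part of LiteLLM): mirrors the mapping table
# from reasoning_effort values to thinking budgets documented on this page.
REASONING_EFFORT_TO_BUDGET = {"low": 1024, "medium": 2048, "high": 4096}


def reasoning_effort_to_thinking(reasoning_effort: str) -> dict:
    """Return the thinking dict that corresponds to a reasoning_effort value."""
    try:
        budget = REASONING_EFFORT_TO_BUDGET[reasoning_effort]
    except KeyError:
        raise ValueError(f"unsupported reasoning_effort: {reasoning_effort!r}") from None
    return {"type": "enabled", "budget_tokens": budget}


print(reasoning_effort_to_thinking("low"))
```

Under this sketch, passing reasoning_effort="low" would correspond to a thinking budget of 1024 tokens.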

Pass thinking to Gemini models

You can also pass the thinking parameter to Gemini models.

It is translated to Gemini's thinkingConfig parameter.

import litellm

response = litellm.completion(
    model="gemini/gemini-2.5-flash-preview-04-17",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    thinking={"type": "enabled", "budget_tokens": 1024},
)
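
To make the translation concrete, here is a rough sketch of what converting a thinking dict into a thinkingConfig payload might look like. The function name is hypothetical, and the output field names (thinkingBudget, includeThoughts) are an assumption based on the Gemini API's ThinkingConfig; this page does not show the exact payload LiteLLM sends.

```python
def thinking_to_thinking_config(thinking: dict) -> dict:
    """Hypothetical sketch of a thinking -> thinkingConfig translation.

    Assumption: the target uses the Gemini ThinkingConfig field names
    `thinkingBudget` and `includeThoughts`. LiteLLM's real logic may differ.
    """
    if thinking.get("type") != "enabled":
        # Nothing to translate when thinking is not enabled.
        return {}
    config = {"includeThoughts": True}
    if "budget_tokens" in thinking:
        config["thinkingBudget"] = thinking["budget_tokens"]
    return config


print(thinking_to_thinking_config({"type": "enabled", "budget_tokens": 1024}))
```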

Passing Gemini-specific parameters

Response schema

LiteLLM supports sending the response_schema parameter for Gemini-1.5-Pro models on Google AI Studio.

Response Schema

from litellm import completion
import json
import os

os.environ['GEMINI_API_KEY'] = ""

messages = [
    {
        "role": "user",
        "content": "List 5 popular cookie recipes."
    }
]

response_schema = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "recipe_name": {
                "type": "string",
            },
        },
        "required": ["recipe_name"],
    },
}

# Keep the return value so the content can be read afterwards
response = completion(
    model="gemini/gemini-1.5-pro",
    messages=messages,
    response_format={"type": "json_object", "response_schema": response_schema} # 👈 KEY CHANGE
)

print(json.loads(response.choices[0].message.content))

Validate Schema

To validate the response_schema, set enforce_validation: true.

from litellm import completion, JSONSchemaValidationError

# `messages` and `response_schema` are the ones defined in the previous example
try:
    completion(
        model="gemini/gemini-1.5-pro",
        messages=messages,
        response_format={
            "type": "json_object",
            "response_schema": response_schema,
            "enforce_validation": True # 👈 KEY CHANGE
        }
    )
except JSONSchemaValidationError as e:
    print("Raw Response: {}".format(e.raw_response))
    raise e

LiteLLM validates the response against the schema and raises a JSONSchemaValidationError if the response does not match it.

JSONSchemaValidationError inherits from openai.APIError.

Access the raw response with e.raw_response.
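
For intuition, the kind of check that schema validation performs can be sketched with a hand-written validator for the cookie-recipe schema used above. This is a simplified stand-in rather than LiteLLM's implementation: the function validate_recipes and the sample strings are hypothetical, and a real validator would interpret the JSON schema generically.

```python
import json


def validate_recipes(raw_response: str) -> list:
    """Check raw JSON text against the cookie-recipe schema (array of
    objects, each with a required string field `recipe_name`)."""
    data = json.loads(raw_response)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array")
    for i, item in enumerate(data):
        if not isinstance(item, dict):
            raise ValueError(f"item {i} is not an object")
        if not isinstance(item.get("recipe_name"), str):
            raise ValueError(f"item {i} is missing a string 'recipe_name'")
    return data


# Made-up sample payloads for illustration only
good = '[{"recipe_name": "Chocolate Chip"}, {"recipe_name": "Oatmeal Raisin"}]'
bad = '[{"name": "Snickerdoodle"}]'

print(len(validate_recipes(good)))
try:
    validate_recipes(bad)
except ValueError as err:
    print("validation failed:", err)
```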

GenerationConfig parameters

To pass additional GenerationConfig parameters, such as topK, include them in the completion call. LiteLLM passes them straight through as key-value pairs in the request body.

See Gemini GenerationConfigParams

from litellm import completion
import os

os.environ['GEMINI_API_KEY'] = ""

messages = [
    {
        "role": "user",
        "content": "List 5 popular cookie recipes."
    }
]

response = completion(
    model="gemini/gemini-1.5-pro",
    messages=messages,
    topK=1 # 👈 KEY CHANGE
)

# No JSON mode is requested here, so print the text content as-is
print(response.choices[0].message.content)

Specifying safety settings

In some use cases you may need to call the model with safety settings that differ from the defaults. To do so, pass the safety_settings argument to completion or acompletion. For example:

from litellm import completion

response = completion(
    model="gemini/gemini-pro",
    messages=[{"role": "user", "content": "write code for saying hi from LiteLLM"}],
    safety_settings=[
        {
            "category": "HARM_CATEGORY_HARASSMENT",
            "threshold": "BLOCK_NONE",
        },
        {
            "category": "HARM_CATEGORY_HATE_SPEECH",
            "threshold": "BLOCK_NONE",
        },
        {
            "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
            "threshold": "BLOCK_NONE",
        },
        {
            "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
            "threshold": "BLOCK_NONE",
        },
    ]
)
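
Since the safety_settings list repeats the same threshold for each category, it can be generated instead of written by hand. The helper below is a hypothetical convenience, not a LiteLLM API. The four categories are the ones from the example above. The "BLOCK_ONLY_HIGH" threshold in the usage line is assumed from the Gemini API's threshold names and does not appear on this page.

```python
# The four harm categories used in the safety_settings example above
HARM_CATEGORIES = [
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
]


def build_safety_settings(threshold: str = "BLOCK_NONE", categories=None) -> list:
    """Hypothetical helper: one {category, threshold} entry per category."""
    if categories is None:
        categories = HARM_CATEGORIES
    return [{"category": c, "threshold": threshold} for c in categories]


# Assumed alternative threshold name; check the Gemini docs before relying on it
for entry in build_safety_settings("BLOCK_ONLY_HIGH"):
    print(entry)
```

The returned list can then be passed as safety_settings=build_safety_settings() to completion.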

Tool calling

from litellm import completion
import os
# set env
os.environ["GEMINI_API_KEY"] = ".."

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]
messages = [{"role": "user", "content": "What's the weather like in Boston today?"}]

response = completion(
    model="gemini/gemini-1.5-flash",
    messages=messages,
    tools=tools,
)
# Add any assertions here to check the response args
print(response)
assert isinstance(response.choices[0].message.tool_calls[0].function.name, str)
assert isinstance(
    response.choices[0].message.tool_calls[0].function.arguments, str
)
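
As the assertions above indicate, a tool call carries the function name and a string of arguments. A minimal sketch of acting on such a call follows, assuming the arguments string is JSON in the OpenAI format. The stub get_current_weather, the AVAILABLE_FUNCTIONS registry, and the sample arguments are all hypothetical; no model is called here.

```python
import json


def get_current_weather(location: str, unit: str = "fahrenheit") -> str:
    """Stub tool: a real implementation would query a weather service."""
    return json.dumps({"location": location, "temperature": "72", "unit": unit})


# Hypothetical registry mapping tool names to local Python callables
AVAILABLE_FUNCTIONS = {"get_current_weather": get_current_weather}


def run_tool_call(name: str, arguments: str) -> str:
    """Decode the JSON arguments string and invoke the matching function."""
    func = AVAILABLE_FUNCTIONS[name]
    kwargs = json.loads(arguments)
    return func(**kwargs)


# Stand-in for tool_calls[0].function.name / .arguments from a real response
result = run_tool_call("get_current_weather", '{"location": "Boston, MA"}')
print(result)
```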


Google Search tool

from litellm import completion
import os

os.environ["GEMINI_API_KEY"] = ".."

tools = [{"googleSearch": {}}] # 👈 ADD GOOGLE SEARCH

response = completion(
    model="gemini/gemini-2.0-flash",
    messages=[{"role": "user", "content": "What is the weather in San Francisco?"}],
    tools=tools,
)

print(response)

Google Search retrieval

from litellm import completion
import os

os.environ["GEMINI_API_KEY"] = ".."

tools = [{"googleSearch": {}}] # 👈 ADD GOOGLE SEARCH

response = completion(
    model="gemini/gemini-2.0-flash",
    messages=[{"role": "user", "content": "What is the weather in San Francisco?"}],
    tools=tools,
)

print(response)

Code execution tool

from litellm import completion
import os

os.environ["GEMINI_API_KEY"] = ".."

tools = [{"codeExecution": {}}] # 👈 ADD CODE EXECUTION

response = completion(
    model="gemini/gemini-2.0-flash",
    messages=[{"role": "user", "content": "What is the weather in San Francisco?"}],
    tools=tools,
)

print(response)

JSON mode

from litellm import completion
import json
import os

os.environ['GEMINI_API_KEY'] = ""

messages = [
    {
        "role": "user",
        "content": "List 5 popular cookie recipes."
    }
]

# Keep the return value so the content can be read afterwards
response = completion(
    model="gemini/gemini-1.5-pro",
    messages=messages,
    response_format={"type": "json_object"} # 👈 KEY CHANGE
)

print(json.loads(response.choices[0].message.content))

Gemini-Pro-Vision

LiteLLM supports the following image types passed in `url`:

  • Images with direct links - https://storage.googleapis.com/github-repo/img/gemini/intro/landmark3.jpg
  • Images in local storage - ./localimage.jpeg

Sample usage

import os
import litellm
from dotenv import load_dotenv

# Load the environment variables (including GEMINI_API_KEY) from the .env file
load_dotenv()
if not os.getenv("GEMINI_API_KEY"):
    raise RuntimeError("GEMINI_API_KEY is not set")

prompt = 'Describe the image in a few sentences.'
# Note: You can pass the URL or the path of the image directly here.
image_url = 'https://storage.googleapis.com/github-repo/img/gemini/intro/landmark3.jpg'

# Create the messages payload according to the documentation
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": prompt
            },
            {
                "type": "image_url",
                "image_url": {"url": image_url}
            }
        ]
    }
]

# Make the API call to the Gemini model
response = litellm.completion(
    model="gemini/gemini-pro-vision",
    messages=messages,
)

# Extract the response content
content = response.get('choices', [{}])[0].get('message', {}).get('content')

# Print the result
print(content)

Usage - PDF / Video / other files

Inline data (e.g. audio stream)

LiteLLM follows the OpenAI format and accepts inline data sent as a base64-encoded string.

The format to follow is:

data:<mime_type>;base64,<encoded_data>
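
A small sketch of producing a string in this format from raw bytes, using only the standard library. The helper name to_data_uri is hypothetical.

```python
import base64


def to_data_uri(raw: bytes, mime_type: str) -> str:
    """Build a data:<mime_type>;base64,<encoded_data> string from raw bytes."""
    encoded = base64.b64encode(raw).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


print(to_data_uri(b"hi", "audio/mp3"))  # data:audio/mp3;base64,aGk=
```

In practice the bytes would come from a file, for example Path("speech_vertex.mp3").read_bytes().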

LiteLLM call

import litellm
from pathlib import Path
import base64
import os

os.environ["GEMINI_API_KEY"] = ""

litellm.set_verbose = True # 👈 See Raw call

audio_bytes = Path("speech_vertex.mp3").read_bytes()
encoded_data = base64.b64encode(audio_bytes).decode("utf-8")
# Print the size rather than dumping the raw bytes to the console
print("Audio bytes read = {}".format(len(audio_bytes)))
model = "gemini/gemini-1.5-flash"
response = litellm.completion(
    model=model,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Please summarize the audio."},
                {
                    "type": "file",
                    "file": {
                        "file_data": "data:audio/mp3;base64,{}".format(encoded_data), # 👈 SET MIME_TYPE + DATA
                    }
                },
            ],
        }
    ],
)

Equivalent Google API call

import pathlib
import google.generativeai as genai

# Initialize a Gemini model appropriate for your use case.
model = genai.GenerativeModel('models/gemini-1.5-flash')

# Create the prompt.
prompt = "Please summarize the audio."

# Read the samplesmall.mp3 file's bytes and pass them inline,
# together with the prompt, to Gemini.
response = model.generate_content([
    prompt,
    {
        "mime_type": "audio/mp3",
        "data": pathlib.Path('samplesmall.mp3').read_bytes()
    }
])

# Output Gemini's response to the prompt and the inline audio.
print(response.text)

https:// file

import litellm
import os

os.environ["GEMINI_API_KEY"] = ""

litellm.set_verbose = True # 👈 See Raw call

model = "gemini/gemini-1.5-flash"
response = litellm.completion(
    model=model,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Please summarize the file."},
                {
                    "type": "file",
                    "file": {
                        "file_id": "https://storage...", # 👈 SET THE FILE URL
                        "format": "application/pdf" # OPTIONAL
                    }
                },
            ],
        }
    ],
)

gs:// file

import litellm
import os

os.environ["GEMINI_API_KEY"] = ""

litellm.set_verbose = True # 👈 See Raw call

model = "gemini/gemini-1.5-flash"
response = litellm.completion(
    model=model,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Please summarize the file."},
                {
                    "type": "file",
                    "file": {
                        "file_id": "gs://storage...", # 👈 SET THE FILE URL
                        "format": "application/pdf" # OPTIONAL
                    }
                },
            ],
        }
    ],
)

Chat models

Tip

We support all Gemini models. When sending LiteLLM requests, just add the prefix: model=gemini/<any-gemini-model-name>

Model name | Function call | Required OS environment variable
gemini-pro | completion(model='gemini/gemini-pro', messages) | os.environ['GEMINI_API_KEY']
gemini-1.5-pro-latest | completion(model='gemini/gemini-1.5-pro-latest', messages) | os.environ['GEMINI_API_KEY']
gemini-2.0-flash | completion(model='gemini/gemini-2.0-flash', messages) | os.environ['GEMINI_API_KEY']
gemini-2.0-flash-exp | completion(model='gemini/gemini-2.0-flash-exp', messages) | os.environ['GEMINI_API_KEY']
gemini-2.0-flash-lite-preview-02-05 | completion(model='gemini/gemini-2.0-flash-lite-preview-02-05', messages) | os.environ['GEMINI_API_KEY']
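
The prefix rule from the tip above can be captured in a tiny helper. The function to_litellm_model is a hypothetical illustration and not a LiteLLM API. It only prepends gemini/ when the prefix is missing.

```python
def to_litellm_model(name: str) -> str:
    """Hypothetical helper: ensure a Gemini model name carries the gemini/ route prefix."""
    return name if name.startswith("gemini/") else f"gemini/{name}"


for name in ["gemini-pro", "gemini/gemini-2.0-flash"]:
    print(to_litellm_model(name))
```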

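Context caching

Google AI Studio context caching is supported by using

[
    {
        "role": "system",
        "content": ...,
        "cache_control": {"type": "ephemeral"} # 👈 KEY CHANGE
    },
    ...
]

in your message content block.

Architecture diagram

Notes

  • Relevant code

  • Gemini context caching only allows 1 contiguous block of messages to be cached.

  • If multiple non-contiguous blocks contain cache_control, the first contiguous block is used. (It is sent to /cachedContent in the Gemini format.)

  • The raw request sent to Gemini's /generateContent endpoint looks like this: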
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-001:generateContent?key=$GOOGLE_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [
      {
        "parts": [{
          "text": "Please summarize this transcript"
        }],
        "role": "user"
      }
    ],
    "cachedContent": "'$CACHE_NAME'"
  }'
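
The "first contiguous block" rule from the notes above can be illustrated with a short sketch. This is not LiteLLM's code: the function names are hypothetical, and it assumes cache_control may appear either on the message itself or on one of its content blocks, as in the examples on this page.

```python
def has_cache_control(message: dict) -> bool:
    """True if the message, or any of its content blocks, carries cache_control."""
    if "cache_control" in message:
        return True
    content = message.get("content")
    if isinstance(content, list):
        return any(isinstance(b, dict) and "cache_control" in b for b in content)
    return False


def split_first_cached_block(messages: list) -> tuple:
    """Split messages into (first contiguous cached block, everything else)."""
    cached, rest = [], []
    state = "before"  # before -> inside -> after the first cached block
    for message in messages:
        if state != "after" and has_cache_control(message):
            cached.append(message)
            state = "inside"
        else:
            if state == "inside":
                state = "after"  # the first block has ended; later marks are ignored
            rest.append(message)
    return cached, rest


demo = [
    {"role": "system", "content": "long document", "cache_control": {"type": "ephemeral"}},
    {"role": "user", "content": [{"type": "text", "text": "q1", "cache_control": {"type": "ephemeral"}}]},
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "q2", "cache_control": {"type": "ephemeral"}},
]
cached, rest = split_first_cached_block(demo)
print(len(cached), len(rest))
```

In this sketch the last message is marked with cache_control but is not cached, because it is separated from the first block by an unmarked message.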

Example usage

from litellm import completion

for _ in range(2):
    resp = completion(
        model="gemini/gemini-1.5-pro",
        messages=[
            # System Message
            {
                "role": "system",
                "content": [
                    {
                        "type": "text",
                        "text": "Here is the full text of a complex legal agreement" * 4000,
                        "cache_control": {"type": "ephemeral"}, # 👈 KEY CHANGE
                    }
                ],
            },
            # marked for caching with the cache_control parameter, so that this checkpoint can read from the previous cache.
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "What are the key terms and conditions in this agreement?",
                        "cache_control": {"type": "ephemeral"},
                    }
                ],
            }]
    )

    print(resp.usage) # 👈 2nd usage block will be less, since cached tokens used

Image generation

from litellm import completion

response = completion(
    model="gemini/gemini-2.0-flash-exp-image-generation",
    messages=[{"role": "user", "content": "Generate an image of a cat"}],
    modalities=["image", "text"],
)
assert response.choices[0].message.content is not None # ".."