Databricks
LiteLLM supports all models on Databricks
Tip
We support ALL Databricks models. Just set model=databricks/<any-model-on-databricks> as a prefix when sending litellm requests.
Usage
- SDK
- PROXY
Environment Variables
import os
os.environ["DATABRICKS_API_KEY"] = ""
os.environ["DATABRICKS_API_BASE"] = ""
Example Call
from litellm import completion
import os
## set ENV variables
os.environ["DATABRICKS_API_KEY"] = "databricks key"
os.environ["DATABRICKS_API_BASE"] = "databricks base url" # e.g.: https://adb-3064715882934586.6.azuredatabricks.net/serving-endpoints
# Databricks dbrx-instruct call
response = completion(
model="databricks/databricks-dbrx-instruct",
messages = [{ "content": "Hello, how are you?","role": "user"}]
)
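The same call can also be streamed; a minimal sketch, assuming the environment variables above are set (chunks follow the OpenAI-style delta format litellm returns):
# streaming variant of the call above; prints tokens as they arrive
stream = completion(
    model="databricks/databricks-dbrx-instruct",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:  # delta can be None on the final chunk
        print(delta, end="")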
Add models to config.yaml
model_list:
- model_name: dbrx-instruct
litellm_params:
model: databricks/databricks-dbrx-instruct
api_key: os.environ/DATABRICKS_API_KEY
api_base: os.environ/DATABRICKS_API_BASE
Start the proxy
$ litellm --config /path/to/config.yaml --debug
Send a request to the LiteLLM Proxy Server
- OpenAI Python v1.0.0+
- curl
import openai
client = openai.OpenAI(
api_key="sk-1234", # pass litellm proxy key, if you're using virtual keys
base_url="http://0.0.0.0:4000" # litellm-proxy-base url
)
response = client.chat.completions.create(
model="dbrx-instruct",
messages = [
{
"role": "system",
"content": "Be a good human!"
},
{
"role": "user",
"content": "What do you know about earth?"
}
]
)
print(response)
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Authorization: Bearer sk-1234' \
--header 'Content-Type: application/json' \
--data '{
"model": "dbrx-instruct",
"messages": [
{
"role": "system",
"content": "Be a good human!"
},
{
"role": "user",
"content": "What do you know about earth?"
}
]
}'
Passing additional params - max_tokens, temperature
See all litellm.completion supported params here.
# !pip install litellm
from litellm import completion
import os
## set ENV variables
os.environ["DATABRICKS_API_KEY"] = "databricks key"
os.environ["DATABRICKS_API_BASE"] = "databricks api base"
# databricks dbrx call
response = completion(
model="databricks/databricks-dbrx-instruct",
messages = [{ "content": "Hello, how are you?","role": "user"}],
max_tokens=20,
temperature=0.5
)
Proxy
model_list:
- model_name: llama-3
litellm_params:
model: databricks/databricks-meta-llama-3-70b-instruct
api_key: os.environ/DATABRICKS_API_KEY
max_tokens: 20
temperature: 0.5
Usage - Thinking / reasoning_content
LiteLLM translates OpenAI's reasoning_effort to Anthropic's thinking parameter. Code
| reasoning_effort | thinking |
|---|---|
| "low" | "budget_tokens": 1024 |
| "medium" | "budget_tokens": 2048 |
| "high" | "budget_tokens": 4096 |
Known Limitations
- Passing thinking blocks back to Claude is supported (see Issue).
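To continue a multi-turn conversation, the assistant turn (including its thinking blocks) can be passed back in messages; a minimal sketch, assuming resp is the response from the SDK example below (see the linked issue for current caveats):
# sketch only: append the assistant turn, then ask a follow-up
messages = [{"role": "user", "content": "What is the capital of France?"}]
messages.append(resp.choices[0].message)  # assistant turn, incl. thinking blocks
messages.append({"role": "user", "content": "And of Germany?"})
follow_up = completion(model="databricks/databricks-claude-3-7-sonnet", messages=messages)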
- SDK
- PROXY
from litellm import completion
import os
# set ENV variables (can also be passed in to .completion() - e.g. `api_base`, `api_key`)
os.environ["DATABRICKS_API_KEY"] = "databricks key"
os.environ["DATABRICKS_API_BASE"] = "databricks base url"
resp = completion(
model="databricks/databricks-claude-3-7-sonnet",
messages=[{"role": "user", "content": "What is the capital of France?"}],
reasoning_effort="low",
)
- Setup config.yaml
- model_name: claude-3-7-sonnet
litellm_params:
model: databricks/databricks-claude-3-7-sonnet
api_key: os.environ/DATABRICKS_API_KEY
api_base: os.environ/DATABRICKS_API_BASE
- Start proxy
litellm --config /path/to/config.yaml
- Test it!
curl http://0.0.0.0:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <YOUR-LITELLM-KEY>" \
-d '{
"model": "claude-3-7-sonnet",
"messages": [{"role": "user", "content": "What is the capital of France?"}],
"reasoning_effort": "low"
}'
Expected Response
ModelResponse(
id='chatcmpl-c542d76d-f675-4e87-8e5f-05855f5d0f5e',
created=1740470510,
model='claude-3-7-sonnet-20250219',
object='chat.completion',
system_fingerprint=None,
choices=[
Choices(
finish_reason='stop',
index=0,
message=Message(
content="The capital of France is Paris.",
role='assistant',
tool_calls=None,
function_call=None,
provider_specific_fields={
'citations': None,
'thinking_blocks': [
{
'type': 'thinking',
'thinking': 'The capital of France is Paris. This is a very straightforward factual question.',
'signature': 'EuYBCkQYAiJAy6...'
}
]
}
),
thinking_blocks=[
{
'type': 'thinking',
'thinking': 'The capital of France is Paris. This is a very straightforward factual question.',
'signature': 'EuYBCkQYAiJAy6AGB...'
}
],
reasoning_content='The capital of France is Paris. This is a very straightforward factual question.'
)
],
usage=Usage(
completion_tokens=68,
prompt_tokens=42,
total_tokens=110,
completion_tokens_details=None,
prompt_tokens_details=PromptTokensDetailsWrapper(
audio_tokens=None,
cached_tokens=0,
text_tokens=None,
image_tokens=None
),
cache_creation_input_tokens=0,
cache_read_input_tokens=0
)
)
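The reasoning text can be read directly off the response message; a quick sketch, assuming resp is the ModelResponse shown above:
msg = resp.choices[0].message
print(msg.content)            # final answer
print(msg.reasoning_content)  # flattened reasoning text
print(msg.thinking_blocks)    # raw thinking blocks with signatures, if present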
Passing thinking to Anthropic models
You can also pass the thinking parameter to Anthropic models.
- SDK
- PROXY
from litellm import completion
import os
# set ENV variables (can also be passed in to .completion() - e.g. `api_base`, `api_key`)
os.environ["DATABRICKS_API_KEY"] = "databricks key"
os.environ["DATABRICKS_API_BASE"] = "databricks base url"
response = completion(
model="databricks/databricks-claude-3-7-sonnet",
messages=[{"role": "user", "content": "What is the capital of France?"}],
thinking={"type": "enabled", "budget_tokens": 1024},
)
curl http://0.0.0.0:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $LITELLM_KEY" \
-d '{
"model": "databricks/databricks-claude-3-7-sonnet",
"messages": [{"role": "user", "content": "What is the capital of France?"}],
"thinking": {"type": "enabled", "budget_tokens": 1024}
}'
Supported Databricks Chat Completion Models
Tip
We support ALL Databricks models. Just set model=databricks/<any-model-on-databricks> as a prefix when sending litellm requests.
| Model Name | Command |
|---|---|
| databricks/databricks-claude-3-7-sonnet | completion(model='databricks/databricks-claude-3-7-sonnet', messages=messages) |
| databricks-meta-llama-3-1-70b-instruct | completion(model='databricks/databricks-meta-llama-3-1-70b-instruct', messages=messages) |
| databricks-meta-llama-3-1-405b-instruct | completion(model='databricks/databricks-meta-llama-3-1-405b-instruct', messages=messages) |
| databricks-dbrx-instruct | completion(model='databricks/databricks-dbrx-instruct', messages=messages) |
| databricks-meta-llama-3-70b-instruct | completion(model='databricks/databricks-meta-llama-3-70b-instruct', messages=messages) |
| databricks-llama-2-70b-chat | completion(model='databricks/databricks-llama-2-70b-chat', messages=messages) |
| databricks-mixtral-8x7b-instruct | completion(model='databricks/databricks-mixtral-8x7b-instruct', messages=messages) |
| databricks-mpt-30b-instruct | completion(model='databricks/databricks-mpt-30b-instruct', messages=messages) |
| databricks-mpt-7b-instruct | completion(model='databricks/databricks-mpt-7b-instruct', messages=messages) |
Embedding Models
Passing Databricks specific params - 'instruction'
For embedding models, Databricks lets you pass in an additional param 'instruction'. Full Spec
# !pip install litellm
from litellm import embedding
import os
## set ENV variables
os.environ["DATABRICKS_API_KEY"] = "databricks key"
os.environ["DATABRICKS_API_BASE"] = "databricks url"
# Databricks bge-large-en call
response = embedding(
model="databricks/databricks-bge-large-en",
input=["good morning from litellm"],
instruction="Represent this sentence for searching relevant passages:",
)
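The result follows the OpenAI embedding response format; a quick sanity check, assuming the call above succeeded:
# inspect the returned vector (field access assumes OpenAI-style output)
vector = response.data[0]["embedding"]
print(len(vector))  # embedding dimensionality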
Proxy
model_list:
- model_name: bge-large
litellm_params:
model: databricks/databricks-bge-large-en
api_key: os.environ/DATABRICKS_API_KEY
api_base: os.environ/DATABRICKS_API_BASE
instruction: "Represent this sentence for searching relevant passages:"
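Once the proxy is running, the alias from config.yaml can be called through any OpenAI-compatible client; a minimal sketch (the key and URL are placeholders for your proxy):
import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.embeddings.create(
    model="bge-large",  # alias defined in config.yaml above
    input=["good morning from litellm"],
)
print(response.data[0].embedding[:5])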
Supported Databricks Embedding Models
Tip
We support ALL Databricks models. Just set model=databricks/<any-model-on-databricks> as a prefix when sending litellm requests.
| Model Name | Command |
|---|---|
| databricks-bge-large-en | embedding(model='databricks/databricks-bge-large-en', input=input) |
| databricks-gte-large-en | embedding(model='databricks/databricks-gte-large-en', input=input) |