Cohere
API Keys
import os
os.environ["COHERE_API_KEY"] = ""
Usage
LiteLLM Python SDK
import os
from litellm import completion

## set ENV variables
os.environ["COHERE_API_KEY"] = "cohere key"

# cohere call
response = completion(
    model="command-r",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
)
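The call returns an OpenAI-format response object: the generated text lives at choices[0].message.content and token counts under usage. A minimal sketch of pulling both out, using a dict with an illustrative shape rather than real API output:

```python
# Extract the reply text and total token usage from an
# OpenAI-format chat completion response (dict form for illustration).
def extract_reply(response: dict) -> tuple:
    text = response["choices"][0]["message"]["content"]
    total_tokens = response["usage"]["total_tokens"]
    return text, total_tokens

# Illustrative response shape, not real API output
sample = {
    "choices": [{"message": {"role": "assistant", "content": "I'm doing well, thanks!"}}],
    "usage": {"prompt_tokens": 12, "completion_tokens": 7, "total_tokens": 19},
}
print(extract_reply(sample))  # ("I'm doing well, thanks!", 19)
```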
Streaming
import os
from litellm import completion

## set ENV variables
os.environ["COHERE_API_KEY"] = "cohere key"

# cohere call
response = completion(
    model="command-r",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
    stream=True,
)

for chunk in response:
    print(chunk)
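Each chunk is an OpenAI-format streaming delta, so the full reply is the concatenation of choices[0].delta.content across chunks (the final chunk may carry no content). A sketch of reassembling the text, using dict-shaped chunks with an illustrative shape rather than real API output:

```python
# Reassemble the full reply from OpenAI-format streaming chunks.
# The final chunk's delta may omit "content", so default to "".
def collect_stream(chunks) -> str:
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content") or "")
    return "".join(parts)

# Illustrative chunk shapes, not real API output
chunks = [
    {"choices": [{"delta": {"role": "assistant", "content": "Hel"}}]},
    {"choices": [{"delta": {"content": "lo!"}}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},
]
print(collect_stream(chunks))  # Hello!
```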
Usage with LiteLLM Proxy
Here's how to call Cohere with the LiteLLM Proxy Server
1. Save your key in your environment
export COHERE_API_KEY="your-api-key"
2. Start the proxy
Define the Cohere model you want to use in your config.yaml
model_list:
  - model_name: command-a-03-2025
    litellm_params:
      model: command-a-03-2025
      api_key: "os.environ/COHERE_API_KEY"
litellm --config /path/to/config.yaml
3. Test it
- Curl Request
- OpenAI v1.0.0+
curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer <your-litellm-api-key>' \
  --data '{
    "model": "command-a-03-2025",
    "messages": [
      {
        "role": "user",
        "content": "what llm are you"
      }
    ]
  }'
import openai

client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000"
)

# request sent to model set on litellm proxy
response = client.chat.completions.create(
    model="command-a-03-2025",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ]
)

print(response)
Supported Models
Model Name | Function Call |
---|---|
command-a-03-2025 | litellm.completion('command-a-03-2025', messages) |
command-r-plus-08-2024 | litellm.completion('command-r-plus-08-2024', messages) |
command-r-08-2024 | litellm.completion('command-r-08-2024', messages) |
command-r-plus | litellm.completion('command-r-plus', messages) |
command-r | litellm.completion('command-r', messages) |
command-light | litellm.completion('command-light', messages) |
command-nightly | litellm.completion('command-nightly', messages) |
Embedding
import os
from litellm import embedding

os.environ["COHERE_API_KEY"] = "cohere key"

# cohere call
response = embedding(
    model="embed-english-v3.0",
    input=["good morning from litellm", "this is another item"],
)
Setting Input Type for v3 models
v3 models have a required parameter: input_type. LiteLLM defaults to search_document. It can be one of the following four values:
- input_type="search_document": (default) Use this for text (documents) you want to store in your vector database
- input_type="search_query": Use this for search queries to find the most relevant documents in your vector database
- input_type="classification": Use this if you use the embeddings as an input for a classification system
- input_type="clustering": Use this if you use the embeddings for text clustering
https://txt.cohere.com/introducing-embed-v3/
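The four values amount to a small enum keyed by use case. A hypothetical helper (the function and mapping names are illustrative, not part of LiteLLM) that picks the required input_type:

```python
# Map an embedding use case to the required Cohere v3 input_type.
# Names here are illustrative, not part of LiteLLM.
INPUT_TYPES = {
    "store_document": "search_document",  # text stored in a vector DB (default)
    "query": "search_query",              # queries run against the vector DB
    "classify": "classification",         # embeddings fed to a classifier
    "cluster": "clustering",              # embeddings used for clustering
}

def input_type_for(use_case: str) -> str:
    # Fall back to LiteLLM's default when the use case is unknown
    return INPUT_TYPES.get(use_case, "search_document")

print(input_type_for("query"))  # search_query
```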
import os
from litellm import embedding

os.environ["COHERE_API_KEY"] = "cohere key"

# cohere call
response = embedding(
    model="embed-english-v3.0",
    input=["good morning from litellm", "this is another item"],
    input_type="search_document",
)
Supported Embedding Models
Model Name | Function Call |
---|---|
embed-english-v3.0 | embedding(model="embed-english-v3.0", input=["good morning from litellm", "this is another item"]) |
embed-english-light-v3.0 | embedding(model="embed-english-light-v3.0", input=["good morning from litellm", "this is another item"]) |
embed-multilingual-v3.0 | embedding(model="embed-multilingual-v3.0", input=["good morning from litellm", "this is another item"]) |
embed-multilingual-light-v3.0 | embedding(model="embed-multilingual-light-v3.0", input=["good morning from litellm", "this is another item"]) |
embed-english-v2.0 | embedding(model="embed-english-v2.0", input=["good morning from litellm", "this is another item"]) |
embed-english-light-v2.0 | embedding(model="embed-english-light-v2.0", input=["good morning from litellm", "this is another item"]) |
embed-multilingual-v2.0 | embedding(model="embed-multilingual-v2.0", input=["good morning from litellm", "this is another item"]) |
Rerank
Usage
LiteLLM supports both v1 and v2 clients for Cohere rerank. By default, the rerank endpoint uses the v2 client, but you can specify the v1 client by explicitly calling v1/rerank.
- LiteLLM SDK Usage
- LiteLLM Proxy Usage
from litellm import rerank
import os
os.environ["COHERE_API_KEY"] = "sk-.."
query = "What is the capital of the United States?"
documents = [
    "Carson City is the capital city of the American state of Nevada.",
    "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
    "Washington, D.C. is the capital of the United States.",
    "Capital punishment has existed in the United States since before it was a country.",
]

response = rerank(
    model="cohere/rerank-english-v3.0",
    query=query,
    documents=documents,
    top_n=3,
)
print(response)
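In the Cohere rerank format, each result carries the index of the document in the original list plus a relevance_score, so mapping results back to document text is a sort-and-index step. A sketch (the sample results below are an illustrative shape, not real API output):

```python
# Order documents by rerank results. Each result holds the original
# document index and a relevance_score (Cohere rerank format).
def top_documents(docs, results):
    ranked = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [docs[r["index"]] for r in ranked]

docs = [
    "Carson City is the capital city of the American state of Nevada.",
    "Washington, D.C. is the capital of the United States.",
]
# Illustrative result shapes, not real API output
results = [
    {"index": 0, "relevance_score": 0.11},
    {"index": 1, "relevance_score": 0.98},
]
print(top_documents(docs, results))
```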
LiteLLM provides a Cohere-API-compatible /rerank endpoint for rerank calls.
Setup
Add this to your litellm proxy config.yaml
model_list:
  - model_name: Salesforce/Llama-Rank-V1
    litellm_params:
      model: together_ai/Salesforce/Llama-Rank-V1
      api_key: os.environ/TOGETHERAI_API_KEY
  - model_name: rerank-english-v3.0
    litellm_params:
      model: cohere/rerank-english-v3.0
      api_key: os.environ/COHERE_API_KEY
Start litellm
litellm --config /path/to/config.yaml
# RUNNING on http://0.0.0.0:4000
Test request
curl http://0.0.0.0:4000/rerank \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "rerank-english-v3.0",
    "query": "What is the capital of the United States?",
    "documents": [
      "Carson City is the capital city of the American state of Nevada.",
      "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
      "Washington, D.C. is the capital of the United States.",
      "Capital punishment has existed in the United States since before it was a country."
    ],
    "top_n": 3
  }'