/completions

Usage

LiteLLM Python SDK

from litellm import text_completion

response = text_completion(
    model="gpt-3.5-turbo-instruct",
    prompt="Say this is a test",
    max_tokens=7
)

LiteLLM Proxy Server

Define the model in config.yaml

model_list:
  - model_name: gpt-3.5-turbo-instruct
    litellm_params:
      model: text-completion-openai/gpt-3.5-turbo-instruct # The `text-completion-openai/` prefix will call openai.completions.create
      api_key: os.environ/OPENAI_API_KEY
  - model_name: text-davinci-003
    litellm_params:
      model: text-completion-openai/text-davinci-003
      api_key: os.environ/OPENAI_API_KEY

Start the litellm proxy server

litellm --config config.yaml

OpenAI Python SDK

from openai import OpenAI

# set base_url to your proxy server
# set api_key to the key you send to the proxy server
client = OpenAI(api_key="<proxy-api-key>", base_url="http://0.0.0.0:4000")

response = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt="Say this is a test",
    max_tokens=7
)

print(response)

Curl request

curl --location 'http://0.0.0.0:4000/completions' \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer sk-1234' \
    --data '{
        "model": "gpt-3.5-turbo-instruct",
        "prompt": "Say this is a test",
        "max_tokens": 7
    }'
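
For environments without the OpenAI SDK or curl, the same request can be sketched with Python's standard library. This is a minimal illustration, assuming the proxy address and the `sk-1234` key from the curl example above; `build_request` and `send` are hypothetical helper names, not part of LiteLLM.

```python
import json
import urllib.request

# Values copied from the curl example; adjust them for your own proxy.
PROXY_URL = "http://0.0.0.0:4000/completions"
PROXY_KEY = "sk-1234"


def build_request(model: str, prompt: str, max_tokens: int) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl call."""
    body = json.dumps(
        {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    ).encode("utf-8")
    return urllib.request.Request(
        PROXY_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {PROXY_KEY}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (needs a running proxy)."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))


request = build_request("gpt-3.5-turbo-instruct", "Say this is a test", 7)
# response = send(request)  # uncomment once the proxy is running
```

Building the request separately from sending it makes it possible to inspect the headers and body before any network traffic occurs.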

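Input Params

LiteLLM accepts and translates the OpenAI text completion params across all supported providers.

Required Fields

model: string - ID of the model to use
prompt: string or array - The prompt(s) to generate completions for

Optional Fields

best_of: integer - Generates best_of completions server-side and returns the "best" one
echo: boolean - Echo back the prompt in addition to the completion.
frequency_penalty: number - Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far.
logit_bias: map - Modify the likelihood of specified tokens appearing in the completion
logprobs: integer - Include the log probabilities of the logprobs most likely tokens. The maximum value is 5
max_tokens: integer - The maximum number of tokens to generate.
n: integer - How many completions to generate for each prompt.
presence_penalty: number - Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they already appear in the text so far.
seed: integer - If specified, the system will attempt to sample deterministically
stop: string or array - Up to 4 sequences where the API will stop generating tokens
stream: boolean - Whether to stream back partial progress. Defaults to false
suffix: string - The suffix that comes after a completion of inserted text
temperature: number - The sampling temperature to use, between 0 and 2.
top_p: number - An alternative to sampling with temperature, called nucleus sampling.
user: string - A unique identifier representing your end user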
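
The documented limits above can be checked locally before a request is sent. The following is a minimal sketch: `build_completion_params` is a hypothetical helper, and validating on the client side is an assumption made for illustration, since LiteLLM and the underlying provider may enforce these limits differently.

```python
from typing import Any, Union


def build_completion_params(
    model: str, prompt: Union[str, list], **optional: Any
) -> dict:
    """Assemble a /completions request body and check the documented limits."""
    if not model:
        raise ValueError("model is required")
    if not isinstance(prompt, (str, list)):
        raise ValueError("prompt must be a string or an array")

    # frequency_penalty and presence_penalty: between -2.0 and 2.0
    for name in ("frequency_penalty", "presence_penalty"):
        value = optional.get(name)
        if value is not None and not -2.0 <= value <= 2.0:
            raise ValueError(f"{name} must be between -2.0 and 2.0")

    # temperature: between 0 and 2
    temperature = optional.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    # logprobs: maximum value is 5
    logprobs = optional.get("logprobs")
    if logprobs is not None and logprobs > 5:
        raise ValueError("logprobs must be at most 5")

    # stop: a single string, or up to 4 sequences
    stop = optional.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("stop accepts at most 4 sequences")

    params = {"model": model, "prompt": prompt}
    # Leave out unset options so provider defaults apply.
    params.update({k: v for k, v in optional.items() if v is not None})
    return params


params = build_completion_params(
    "gpt-3.5-turbo-instruct",
    "Say this is a test",
    max_tokens=7,
    temperature=0.2,
    stop=["\n"],
)
```

The resulting dictionary can be passed on as keyword arguments, for example `text_completion(**params)`.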

Output Format

Here is the exact JSON output format you can expect from completion calls.

Follows OpenAI's output format

Non-streaming response

{
  "id": "cmpl-uqkvlQyYK7bGYrRHQ0eXlWi7",
  "object": "text_completion",
  "created": 1589478378,
  "model": "gpt-3.5-turbo-instruct",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [
    {
      "text": "\n\nThis is indeed a test",
      "index": 0,
      "logprobs": null,
      "finish_reason": "length"
    }
  ],
  "usage": {
    "prompt_tokens": 5,
    "completion_tokens": 7,
    "total_tokens": 12
  }
}

Streaming response

{
  "id": "cmpl-7iA7iJjj8V2zOkCGvWF2hAkDWBQZe",
  "object": "text_completion",
  "created": 1690759702,
  "choices": [
    {
      "text": "This",
      "index": 0,
      "logprobs": null,
      "finish_reason": null
    }
  ],
  "model": "gpt-3.5-turbo-instruct",
  "system_fingerprint": "fp_44709d6fcb"
}
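
Because each streaming chunk carries only a fragment of text, the client has to reassemble the completion. The sketch below works on plain dictionaries shaped like the JSON above. The first chunk mirrors the documented example; the two later chunks are made up for illustration, and `completion_text` and `join_stream` are hypothetical helper names.

```python
from typing import Iterable


def completion_text(response: dict) -> str:
    """Return the text of the first choice in a non-streaming response."""
    return response["choices"][0]["text"]


def join_stream(chunks: Iterable[dict], index: int = 0) -> str:
    """Concatenate the text fragments of one choice across streamed chunks."""
    parts = []
    for chunk in chunks:
        for choice in chunk["choices"]:
            if choice["index"] == index and choice["text"]:
                parts.append(choice["text"])
    return "".join(parts)


def make_chunk(text: str, finish_reason=None) -> dict:
    """Build a chunk shaped like the streaming example (illustration only)."""
    return {
        "id": "cmpl-7iA7iJjj8V2zOkCGvWF2hAkDWBQZe",
        "object": "text_completion",
        "created": 1690759702,
        "choices": [
            {"text": text, "index": 0, "logprobs": None, "finish_reason": finish_reason}
        ],
        "model": "gpt-3.5-turbo-instruct",
        "system_fingerprint": "fp_44709d6fcb",
    }


# Only the first chunk comes from the example; the rest are invented.
chunks = [make_chunk("This"), make_chunk(" is"), make_chunk(" a test", "length")]
streamed_text = join_stream(chunks)

non_streaming = {
    "choices": [
        {
            "text": "\n\nThis is indeed a test",
            "index": 0,
            "logprobs": None,
            "finish_reason": "length",
        }
    ]
}
full_text = completion_text(non_streaming)
```

A `finish_reason` of `null` marks an intermediate chunk; the final chunk for a choice carries a non-null value such as `"length"`.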

Supported Providers

Provider	Link to Usage
OpenAI	Usage
Azure OpenAI	Usage
