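The Completions API endpoint received its final update in July 2023 and has a different interface from the newer Chat Completions endpoint. Instead of a list of messages, the input is a freeform text string called a prompt.
An example legacy Completions API call looks like the following: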
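from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

# The input is a single prompt string rather than a list of messages.
response = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt="Write a tagline for an ice cream shop.",
)

See the full API reference documentation to learn more.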
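Inserting text
The Completions endpoint also supports inserting text by providing a suffix in addition to the standard prompt, which is treated as a prefix. This need naturally arises when writing long-form text, transitioning between paragraphs, following an outline, or guiding the model towards an ending. It also works on code, and can be used to insert in the middle of a function or file.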
To illustrate how suffix context affects generated text, consider the prompt, “Today I decided to make a big change.” There are many ways one could imagine completing the sentence. But if we now supply the ending of the story, “I’ve gotten many compliments on my new hair!”, the intended completion becomes clear.
I went to college at Boston University. After getting my degree, I decided to make a change**. A big change!**
I packed my bags and moved to the west coast of the United States.
Now, I can’t get enough of the Pacific Ocean!
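A minimal sketch of an insertion request with the Python SDK. It assumes the `suffix` parameter of `client.completions.create` and a model that supports insertion; the helper name `insert_between` and its default values are illustrative and not part of the API.

```python
def insert_between(client, prefix, suffix,
                   model="gpt-3.5-turbo-instruct", max_tokens=300):
    """Ask the model for text that connects `prefix` to `suffix`.

    `client` is expected to be an `openai.OpenAI()` instance.
    `insert_between` is a hypothetical helper, not an SDK function.
    """
    response = client.completions.create(
        model=model,
        prompt=prefix,          # text before the insertion point
        suffix=suffix,          # text after the insertion point
        max_tokens=max_tokens,  # leave room for a longer insertion
    )
    choice = response.choices[0]
    # Return the stitched document and why generation stopped.
    return prefix + choice.text + suffix, choice.finish_reason
```

With a real client, this could be called as `insert_between(OpenAI(), "I decided to make a change.", "\nI packed my bags and moved to the west coast of the United States.")`.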
Providing the model with additional context makes it much more steerable. However, insertion is a more constrained and challenging task for the model. To get the best results, we recommend the following:
Use max_tokens > 256. The model is better at inserting longer completions. If max_tokens is too small, the model may be cut off before it is able to connect to the suffix. Note that you are only charged for the number of tokens produced, even when using a larger max_tokens.
Prefer finish_reason == "stop". When the model reaches a natural stopping point or a user-provided stop sequence, it sets finish_reason to "stop". This indicates that the model has managed to connect to the suffix well and is a good signal for the quality of a completion. This is especially relevant when choosing between a few completions with n > 1 or when resampling (see the next point).
Resample 3-5 times. While almost all completions connect to the prefix, the model may struggle to connect to the suffix in harder cases. We find that resampling 3 or 5 times (or using best_of with k=3,5) and picking the samples with "stop" as their finish_reason can be effective in such cases. While resampling, you would typically want a higher temperature to increase diversity.
Note: if all the returned samples have finish_reason == "length", it is likely that max_tokens is too small and the model runs out of tokens before it manages to connect the prompt and the suffix naturally. Consider increasing max_tokens before resampling.
Try giving more clues. In some cases, you can help the model's generation by giving a few examples of patterns that it can follow to decide on a natural place to stop.
How to make a delicious hot chocolate:
1. **Boil water**
2. Put hot chocolate in a cup
3. Add boiling water to the cup
4. Enjoy the hot chocolate
- Dogs are loyal animals.
- Lions are ferocious animals.
- Dolphins **are playful animals.**
- Horses are majestic animals.
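The resampling advice above can be sketched as a small filter over the returned choices. This is an illustration under assumptions: `select_stopped` is a hypothetical helper, and the choices are taken to be plain dicts shaped like the entries of a response's "choices" array (for example, from a request made with n=3 and a higher temperature).

```python
def select_stopped(choices):
    """Split out the samples that ended naturally.

    Returns (stopped, all_truncated):
      stopped       - choices whose finish_reason is "stop"
      all_truncated - True when every sample hit the token limit, which
                      suggests raising max_tokens before resampling again
    """
    stopped = [c for c in choices if c["finish_reason"] == "stop"]
    all_truncated = bool(choices) and all(
        c["finish_reason"] == "length" for c in choices
    )
    return stopped, all_truncated


# Made-up samples for illustration only; real text comes from the API.
samples = [
    {"index": 0, "finish_reason": "length", "text": " A big"},
    {"index": 1, "finish_reason": "stop", "text": " A big change!"},
]
candidates, need_more_tokens = select_stopped(samples)
```

Keeping the two signals separate lets the caller decide whether to resample at the same max_tokens or to raise it first.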
Completions response format
An example Completions API response looks as follows:
{
  "choices": [
    {
      "finish_reason": "length",
      "index": 0,
      "logprobs": null,
      "text": "\n\n\"Let Your Sweet Tooth Run Wild at Our Creamy Ice Cream Shack"
    }
  ],
  "created": 1683130927,
  "id": "cmpl-7C9Wxi9Du4j1lQjdjhxBlO22M61LD",
  "model": "gpt-3.5-turbo-instruct",
  "object": "text_completion",
  "usage": {
    "completion_tokens": 16,
    "prompt_tokens": 10,
    "total_tokens": 26
  }
}

In Python, with the client object used in the call above, the output can be extracted with response.choices[0].text.
The response format is similar to the response format of the Chat Completions API.
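When working with the raw JSON body instead of an SDK object (for example, after a direct HTTP call), the same fields can be read with dictionary keys. The sketch below uses a trimmed copy of the example response; `summarize` is a hypothetical helper name.

```python
import json

# Trimmed copy of the example response, as a raw JSON string.
raw_body = r'''
{
  "choices": [
    {"finish_reason": "length", "index": 0, "logprobs": null,
     "text": "\n\n\"Let Your Sweet Tooth Run Wild at Our Creamy Ice Cream Shack"}
  ],
  "usage": {"completion_tokens": 16, "prompt_tokens": 10, "total_tokens": 26}
}
'''


def summarize(body):
    """Return (text, finish_reason, total_tokens) from a JSON response body."""
    data = json.loads(body)
    first = data["choices"][0]
    # Completions often begin with blank lines, so strip the whitespace.
    return first["text"].strip(), first["finish_reason"], data["usage"]["total_tokens"]


text, finish_reason, total_tokens = summarize(raw_body)
```

Here finish_reason is "length", so the tagline was cut off by the token limit rather than ending naturally, which is why its closing quotation mark is missing.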
Chat Completions vs. Completions
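The Chat Completions format can be made similar to the completions format by constructing a request with a single user message. For example, one can translate from English to French with the following completions prompt:
Translate the following English text to French: "{text}"
An equivalent chat prompt would be:
[{"role": "user", "content": 'Translate the following English text to French: "{text}"'}]
Likewise, the completions API can be used to simulate a chat between a user and an assistant by formatting the input accordingly.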
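Both directions of this conversion can be sketched with two small helpers. The function names and the "User:" / "Assistant:" labels are assumptions chosen for illustration; the API does not require any particular transcript format.

```python
def as_chat_messages(prompt):
    """Wrap a completions-style prompt in a single user message."""
    return [{"role": "user", "content": prompt}]


def as_completion_prompt(messages):
    """Flatten chat messages into one prompt string for the completions API.

    Each message becomes a labeled line, and the trailing "Assistant:" line
    cues the model to write the next assistant turn.
    """
    lines = []
    for message in messages:
        label = message["role"].capitalize()  # "user" -> "User"
        lines.append(f'{label}: {message["content"]}')
    lines.append("Assistant:")
    return "\n".join(lines)


text = "Good morning"
prompt = f'Translate the following English text to French: "{text}"'
messages = as_chat_messages(prompt)
```

When sending a flattened transcript to the completions API, a stop sequence such as "\nUser:" (passed through the `stop` parameter) is one way to keep the model from writing the user's next turn as well.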
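The difference between these APIs is the underlying models that are available in each. The Chat Completions API is the interface to our most capable model (gpt-4o) and our most cost-effective model (gpt-4o-mini).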