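Background mode | OpenAI API

Agents like Codex and Deep Research show that reasoning models can take several minutes to solve complex problems. Background mode lets you run long-running tasks reliably on models like GPT-5.2 and GPT-5.2 pro, without worrying about timeouts or other connectivity issues.

Background mode starts these tasks asynchronously, and developers can poll the response object to check its status over time. To start response generation in the background, make an API request with background set to true:

Because background mode stores response data for roughly 10 minutes to enable polling, it is not compatible with Zero Data Retention (ZDR). For legacy reasons, requests from ZDR projects with background=true are still accepted, but using it breaks ZDR guarantees. Modified Abuse Monitoring (MAM) projects can safely use background mode.

Generate a response in the background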

curl

curl https://api.openai.com/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
  "model": "gpt-5.5",
  "input": "Write a very long novel about otters in space.",
  "background": true
}'

javascript

import OpenAI from "openai";
const client = new OpenAI();

const resp = await client.responses.create({
  model: "gpt-5.5",
  input: "Write a very long novel about otters in space.",
  background: true,
});

console.log(resp.status);

python

from openai import OpenAI

client = OpenAI()

resp = client.responses.create(
  model="gpt-5.5",
  input="Write a very long novel about otters in space.",
  background=True,
)

print(resp.status)

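Polling background responses

To check the status of a background request, use the GET endpoint for Responses. Keep polling while the request is in the queued or in_progress state. Once it leaves these states, it has reached a final (terminal) state.

Retrieve a response executing in the background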

curl

curl https://api.openai.com/v1/responses/resp_123 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY"

javascript

import OpenAI from "openai";
const client = new OpenAI();

let resp = await client.responses.create({
  model: "gpt-5.5",
  input: "Write a very long novel about otters in space.",
  background: true,
});

while (resp.status === "queued" || resp.status === "in_progress") {
  console.log("Current status: " + resp.status);
  await new Promise(resolve => setTimeout(resolve, 2000)); // wait 2 seconds
  resp = await client.responses.retrieve(resp.id);
}

console.log("Final status: " + resp.status + "\nOutput:\n" + resp.output_text);

python

from openai import OpenAI
from time import sleep

client = OpenAI()

resp = client.responses.create(
  model="gpt-5.5",
  input="Write a very long novel about otters in space.",
  background=True,
)

while resp.status in {"queued", "in_progress"}:
  print(f"Current status: {resp.status}")
  sleep(2)
  resp = client.responses.retrieve(resp.id)

print(f"Final status: {resp.status}\nOutput:\n{resp.output_text}")
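
The loops above poll every two seconds with no upper bound. For long jobs, a deadline and a growing delay between polls may be preferable. The following is a minimal sketch of that idea, not part of the SDK: the name wait_for_response, its parameters, and the default timings are all assumptions chosen for illustration. It relies only on the rule stated above, that queued and in_progress are the non-terminal states.

```python
import time
from typing import Any, Callable

# Per the text above, these are the only non-terminal states.
PENDING_STATES = {"queued", "in_progress"}


def wait_for_response(
    retrieve: Callable[[str], Any],
    response_id: str,
    *,
    timeout_s: float = 600.0,       # hypothetical default, tune for your workload
    initial_delay_s: float = 1.0,   # hypothetical default
    max_delay_s: float = 15.0,      # hypothetical default
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Any:
    """Poll until the response leaves the pending states or the deadline passes.

    `retrieve` is any callable that takes a response ID and returns an object
    with a `.status` attribute, for example `client.responses.retrieve`.
    """
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        resp = retrieve(response_id)
        if resp.status not in PENDING_STATES:
            return resp  # terminal state reached
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError(
                f"{response_id} still {resp.status!r} after {timeout_s} seconds"
            )
        # Never sleep past the deadline; double the delay up to the cap.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay_s)
```

With the client from the examples above, a call might look like wait_for_response(client.responses.retrieve, resp.id). Passing sleep and clock as parameters keeps the helper testable without real waiting.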

Cancelling a background response

You can also cancel an in-flight response like this:

Cancel an ongoing response

curl

curl -X POST https://api.openai.com/v1/responses/resp_123/cancel \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY"

javascript

import OpenAI from "openai";
const client = new OpenAI();

const resp = await client.responses.cancel("resp_123");

console.log(resp.status);

python

from openai import OpenAI
client = OpenAI()

resp = client.responses.cancel("resp_123")

print(resp.status)

Cancelling is idempotent: subsequent calls simply return the final Response object.
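
Because cancellation is idempotent, repeating a cancel call after a dropped connection should be harmless. The sketch below illustrates that design consequence under stated assumptions: cancel_with_retry is a hypothetical helper, and ConnectionError stands in for whatever transient error your HTTP client or SDK actually raises.

```python
import time
from typing import Any, Callable


def cancel_with_retry(
    cancel: Callable[[str], Any],
    response_id: str,
    *,
    attempts: int = 3,      # hypothetical default
    delay_s: float = 1.0,   # hypothetical default
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call `cancel` until it succeeds or `attempts` is exhausted.

    Retrying is only safe because cancellation is idempotent: a repeated
    call returns the final Response object rather than failing.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return cancel(response_id)
        except ConnectionError:
            # Assumed transient failure type; a real client may raise a
            # library-specific exception instead.
            if attempt == attempts:
                raise
            sleep(delay_s)
```

With the client from the examples above, this might be called as cancel_with_retry(client.responses.cancel, "resp_123").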

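Streaming a background response

You can create a background response and start streaming events from it right away. This is useful if you expect clients to drop the stream and want the option of picking it back up later. To do this, create a Response with both background and stream set to true. You will want to keep track of a "cursor" corresponding to the sequence_number you receive in each streaming event.

Currently, the time to first token from a background response is higher than from a synchronous one. We are working to reduce this latency gap in the coming weeks.

Generate and stream a background response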

curl

curl https://api.openai.com/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
  "model": "gpt-5.5",
  "input": "Write a very long novel about otters in space.",
  "background": true,
  "stream": true
}'

# To resume:
curl "https://api.openai.com/v1/responses/resp_123?stream=true&starting_after=42" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY"

javascript

import OpenAI from "openai";
const client = new OpenAI();

const stream = await client.responses.create({
  model: "gpt-5.5",
  input: "Write a very long novel about otters in space.",
  background: true,
  stream: true,
});

let cursor = null;
for await (const event of stream) {
  console.log(event);
  cursor = event.sequence_number;
}

// If the connection drops, you can resume streaming from the last cursor (SDK support coming soon):
// const resumedStream = await client.responses.stream(resp.id, { starting_after: cursor });
// for await (const event of resumedStream) { ... }

python

from openai import OpenAI

client = OpenAI()

# Fire off an async response but also start streaming immediately
stream = client.responses.create(
  model="gpt-5.5",
  input="Write a very long novel about otters in space.",
  background=True,
  stream=True,
)

cursor = None
for event in stream:
  print(event)
  cursor = event.sequence_number

# If your connection drops, the response continues running and you can reconnect:
# SDK support for resuming the stream is coming soon.
# for event in client.responses.stream(resp.id, starting_after=cursor):
#     print(event)
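
Until the SDK supports resuming, the cursor can be combined with the raw GET endpoint shown in the curl example. The sketch below shows one way to track the cursor and build that URL. CursorTracker and resume_url are hypothetical names, not SDK features, and omitting starting_after when no event has been received yet is an assumption rather than documented behavior.

```python
from typing import Any, Iterable, Iterator, Optional
from urllib.parse import urlencode

API_BASE = "https://api.openai.com/v1"


class CursorTracker:
    """Remember the sequence_number of the last fully handled event."""

    def __init__(self) -> None:
        self.cursor: Optional[int] = None

    def watch(self, events: Iterable[Any]) -> Iterator[Any]:
        for event in events:
            yield event
            # Runs only when the consumer asks for the next event, so the
            # cursor never advances past an event that was not processed.
            self.cursor = event.sequence_number


def resume_url(response_id: str, cursor: Optional[int]) -> str:
    """Build the GET URL that resumes a stream after `cursor`.

    Assumption: with no cursor yet, starting_after is simply left out.
    """
    params = {"stream": "true"}
    if cursor is not None:
        params["starting_after"] = str(cursor)
    return f"{API_BASE}/responses/{response_id}?{urlencode(params)}"
```

Wrapping the stream as tracker.watch(stream) keeps the cursor up to date; if the loop raises on a dropped connection, resume_url(response_id, tracker.cursor) gives the URL to request with your Authorization header.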

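Limits

Background sampling requires store=true; stateless requests are rejected.
To cancel a synchronous response, terminate the connection.
You can only stream from a background response if it was created with stream=true.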
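
These constraints can be checked on the client before a request is sent. The following is a minimal sketch: both function names are hypothetical, and it assumes that leaving store out of the request is acceptable (that is, that the server default satisfies the requirement), so only an explicit store=False is flagged.

```python
from typing import Any, Dict, List


def check_background_request(params: Dict[str, Any]) -> List[str]:
    """Return a list of known conflicts; an empty list means none were found."""
    problems: List[str] = []
    # Stateless background requests are rejected, so flag them early.
    if params.get("background") is True and params.get("store") is False:
        problems.append("background=true requires store=true")
    return problems


def can_resume_stream(params: Dict[str, Any]) -> bool:
    """True if a response created with `params` can be streamed again later."""
    return params.get("background") is True and params.get("stream") is True
```

For example, check_background_request({"background": True, "store": False}) returns a one-item list naming the conflict, while the same request without the store key returns an empty list.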
