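Cache diagnostics
Diagnose unexpected prompt cache misses by comparing consecutive requests and pinpointing exactly where the prompt prefix diverges.
This feature is eligible for Zero Data Retention (ZDR) with limited technical retention. For details on what is retained and why, see the Data retention section.
Prompt caching can significantly reduce latency and cost, but only when the beginning of your prompt is byte-for-byte identical to a recent request. Reordered tools, a timestamp inserted into the system prompt, or an edit to an earlier message can each silently invalidate the cache. Without cache diagnostics, the only signal is usage.cache_read_input_tokens dropping to zero, with no indication of what changed.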
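To make the byte-for-byte requirement concrete, here is a minimal local sketch (no API calls) of how a timestamp placed in the system prompt shortens the prefix two requests share. The helper names shared_prefix_len and system_with_timestamp are hypothetical and exist only for this illustration; the sketch compares raw strings and does not model how the API tokenizes or caches prompts.

```python
from datetime import datetime, timezone

STATIC_SYSTEM = "You are an AI assistant analyzing a large document. <document>...</document>"


def shared_prefix_len(a: str, b: str) -> int:
    """Count the leading UTF-8 bytes that two prompts have in common."""
    count = 0
    for x, y in zip(a.encode("utf-8"), b.encode("utf-8")):
        if x != y:
            break
        count += 1
    return count


def system_with_timestamp(now: datetime) -> str:
    """Anti-pattern: volatile text placed ahead of the stable instructions."""
    return f"Current time: {now.isoformat()}\n{STATIC_SYSTEM}"


# Two requests sent five seconds apart.
t1 = datetime(2026, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
t2 = datetime(2026, 1, 1, 12, 0, 5, tzinfo=timezone.utc)

volatile_a = system_with_timestamp(t1)
volatile_b = system_with_timestamp(t2)

# The shared prefix ends inside the timestamp, so nothing after it can match.
print("volatile:", shared_prefix_len(volatile_a, volatile_b), "of", len(volatile_a.encode("utf-8")))

# A system prompt with no volatile text is identical on every turn.
print("stable:  ", shared_prefix_len(STATIC_SYSTEM, STATIC_SYSTEM), "of", len(STATIC_SYSTEM.encode("utf-8")))
```

If a request needs the current time, one option is to place it after the stable content (for example, in the latest user message) so the earlier bytes stay unchanged.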
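Cache diagnostics closes this gap. Pass the id of the previous response, and the API compares the two requests and tells you where they diverge (model, system prompt, tools, or message history), so you can fix the root cause instead of guessing.
Cache diagnostics is currently in beta. To use it, include the beta header cache-diagnosis-2026-04-07 in your API requests.
Cache diagnostics is currently available only on the Claude API. It is not supported on Amazon Bedrock or Vertex AI.
How cache diagnostics works
When the beta header is present, the API stores a lightweight fingerprint for each request, keyed by the response id. On the next request, pass that id as diagnostics.previous_message_id. The API rebuilds the fingerprint for the new request, compares it with the stored one, and attaches a diagnostics object to the response describing the first point of divergence.
The comparison looks at request structure and is independent of whether the cache was actually hit. To learn how to combine the diagnostics result with usage.cache_read_input_tokens, see Reading diagnostics alongside usage.
The fingerprint contains only hashes and token-count estimates (never raw prompt content), is retained for a limited time, is scoped to your organization and workspace, and is not used for any other purpose.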
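The idea of comparing hashed fingerprints can be sketched locally. The following is a toy model, not the API's actual algorithm: the segment order, the use of SHA-256 over canonical JSON, and the function names fingerprint and first_divergence are all assumptions made for illustration. It shows how a comparison can report the first point of divergence while storing only hashes.

```python
import hashlib
import json
from typing import Any, Optional

# Assumed segment order for this sketch, mirroring the categories named above.
TOP_LEVEL_SEGMENTS = ("model", "system", "tools")


def _digest(value: Any) -> str:
    """Hash a JSON-serializable value; only the hash is kept, never the content."""
    # Assumption: keys are sorted so dict ordering does not affect the hash.
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def fingerprint(request: dict) -> dict:
    """Build a hash-only fingerprint of a request (toy model)."""
    fp = {name: _digest(request.get(name)) for name in TOP_LEVEL_SEGMENTS}
    fp["messages"] = [_digest(message) for message in request.get("messages", [])]
    return fp


def first_divergence(previous: dict, current: dict) -> Optional[str]:
    """Return the first segment where two fingerprints differ, or None."""
    for name in TOP_LEVEL_SEGMENTS:
        if previous[name] != current[name]:
            return name
    # A follow-up turn may append messages; only the earlier ones must match.
    for index, old_hash in enumerate(previous["messages"]):
        if index >= len(current["messages"]) or current["messages"][index] != old_hash:
            return f"messages[{index}]"
    return None


turn1 = {
    "model": "claude-opus-4-7",
    "system": "You are an AI assistant analyzing a large document.",
    "tools": [{"name": "search"}, {"name": "lookup"}],
    "messages": [{"role": "user", "content": "Summarize section 1."}],
}
follow_up = turn1["messages"] + [
    {"role": "assistant", "content": "Section 1 covers..."},
    {"role": "user", "content": "Now summarize section 2."},
]

turn2_ok = {**turn1, "messages": follow_up}
turn2_reordered = {**turn2_ok, "tools": list(reversed(turn1["tools"]))}
turn2_edited = {**turn2_ok, "messages": [{"role": "user", "content": "Summarize section one."}] + follow_up[1:]}

stored = fingerprint(turn1)
print(first_divergence(stored, fingerprint(turn2_ok)))
print(first_divergence(stored, fingerprint(turn2_reordered)))
print(first_divergence(stored, fingerprint(turn2_edited)))
```

In this toy model, appending new messages leaves the earlier fingerprint intact, while reordering tools or editing an earlier message is reported at the first segment that changed.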
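Basic usage
Send the beta header on every turn of the conversation. On the first turn, pass "previous_message_id": null to opt in when there is no earlier message to compare against. On subsequent turns, pass the id of the previous response.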
# Turn 1: establish the cache and opt in to diagnostics
response=$(curl -sS --fail-with-body https://api.anthropic.com/v1/messages \
--header "x-api-key: $ANTHROPIC_API_KEY" \
--header "anthropic-version: 2023-06-01" \
--header "anthropic-beta: cache-diagnosis-2026-04-07" \
--header "content-type: application/json" \
--data '{
"model": "claude-opus-4-7",
"max_tokens": 1024,
"cache_control": {"type": "ephemeral"},
"system": "You are an AI assistant analyzing a large document. <document>...</document>",
"messages": [{"role": "user", "content": "Summarize section 1."}],
"diagnostics": {"previous_message_id": null}
}')
jq '{id, diagnostics}' <<< "$response"
message_id=$(jq -r '.id' <<< "$response")
# Turn 2: reference the previous turn so the API can compare prefixes
curl -sS --fail-with-body https://api.anthropic.com/v1/messages \
--header "x-api-key: $ANTHROPIC_API_KEY" \
--header "anthropic-version: 2023-06-01" \
--header "anthropic-beta: cache-diagnosis-2026-04-07" \
--header "content-type: application/json" \
--data @- <<EOF | jq '{id, diagnostics}' # diagnostics: null means no divergence was found
{
"model": "claude-opus-4-7",
"max_tokens": 1024,
"cache_control": {"type": "ephemeral"},
"system": "You are an AI assistant analyzing a large document. <document>...</document>",
"messages": [
{"role": "user", "content": "Summarize section 1."},
{"role": "assistant", "content": "Section 1 covers..."},
{"role": "user", "content": "Now summarize section 2."}
],
"diagnostics": {"previous_message_id": "$message_id"}
}
EOF
# Turn 1
turn1=$(ant beta:messages create \
  --beta cache-diagnosis-2026-04-07 \
  --transform '{id,usage,diagnostics}' <<'YAML'
model: claude-opus-4-7
max_tokens: 1024
cache_control:
  type: ephemeral
system: "You are an AI assistant analyzing a large document. <document>...</document>"
messages:
  - role: user
    content: Summarize section 1.
diagnostics:
  previous_message_id: null
YAML
)
printf '%s\n' "$turn1"
# Turn 2: pass the turn 1 id as previous_message_id
message_id=$(jq -r '.id' <<<"$turn1")
ant beta:messages create \
  --beta cache-diagnosis-2026-04-07 \
  --transform '{id,usage,diagnostics}' <<YAML
model: claude-opus-4-7
max_tokens: 1024
cache_control:
  type: ephemeral
system: "You are an AI assistant analyzing a large document. <document>...</document>"
messages:
  - role: user
    content: Summarize section 1.
  - role: assistant
    content: Section 1 covers...
  - role: user
    content: Now summarize section 2.
diagnostics:
  previous_message_id: $message_id
YAML
import anthropic
client = anthropic.Anthropic()
SYSTEM = "You are an AI assistant analyzing a large document. <document>...</document>"
# Turn 1: opt in with previous_message_id=None
r1 = client.beta.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
cache_control={"type": "ephemeral"},
system=SYSTEM,
messages=[{"role": "user", "content": "Summarize section 1."}],
diagnostics={"previous_message_id": None},
betas=["cache-diagnosis-2026-04-07"],
)
# Turn 2: reference the previous response id
r2 = client.beta.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
cache_control={"type": "ephemeral"},
system=SYSTEM,
messages=[
{"role": "user", "content": "Summarize section 1."},
{"role": "assistant", "content": r1.content},
{"role": "user", "content": "Now summarize section 2."},
],
diagnostics={"previous_message_id": r1.id},
betas=["cache-diagnosis-2026-04-07"],
)
diagnostics = r2.diagnostics
if diagnostics is None:
    print("No divergence detected.")
elif diagnostics.cache_miss_reason is None:
    print("Comparison still in progress.")
else:
    print(f"cache_miss_reason: {diagnostics.cache_miss_reason.type}")
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
const SYSTEM = "You are an AI assistant analyzing a large document. <document>...</document>";
// Turn 1: opt in with previous_message_id: null
const r1 = await client.beta.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
cache_control: { type: "ephemeral" },
system: SYSTEM,
messages: [{ role: "user", content: "Summarize section 1." }],
diagnostics: { previous_message_id: null },
betas: ["cache-diagnosis-2026-04-07"]
});
// Turn 2: reference the previous response id
const r2 = await client.beta.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
cache_control: { type: "ephemeral" },
system: SYSTEM,
messages: [
{ role: "user", content: "Summarize section 1." },
{ role: "assistant", content: r1.content },
{ role: "user", content: "Now summarize section 2." }
],
diagnostics: { previous_message_id: r1.id },
betas: ["cache-diagnosis-2026-04-07"]
});
if (r2.diagnostics === null) {
console.log("No divergence detected.");
} else if (r2.diagnostics.cache_miss_reason === null) {
console.log("Comparison still in progress.");
} else {
console.log(`cache_miss_reason: ${r2.diagnostics.cache_miss_reason.type}`);
}
using Anthropic;
using Anthropic.Models.Beta;
using Anthropic.Models.Beta.Messages;
using Messages = Anthropic.Models.Messages;
using Role = Anthropic.Models.Beta.Messages.Role;
AnthropicClient client = new();
var system = "You are an AI assistant analyzing a large document. <document>...</document>";
var r1 = await client.Beta.Messages.Create(
new()
{
Model = Messages::Model.ClaudeOpus4_7,
MaxTokens = 1024,
CacheControl = new(),
System = system,
Messages =
[
new() { Role = Role.User, Content = "Summarize section 1." },
],
Diagnostics = new() { PreviousMessageID = null },
Betas = [AnthropicBeta.CacheDiagnosis2026_04_07],
}
);
var r2 = await client.Beta.Messages.Create(
new()
{
Model = Messages::Model.ClaudeOpus4_7,
MaxTokens = 1024,
CacheControl = new(),
System = system,
Messages =
[
new() { Role = Role.User, Content = "Summarize section 1." },
new()
{
Role = Role.Assistant,
Content = r1.Content.Select(block => new BetaContentBlockParam(block.Json)).ToList(),
},
new() { Role = Role.User, Content = "Now summarize section 2." },
],
Diagnostics = new() { PreviousMessageID = r1.ID },
Betas = [AnthropicBeta.CacheDiagnosis2026_04_07],
}
);
Console.WriteLine(r2.Diagnostics switch
{
null => "No divergence detected.",
{ CacheMissReason: null } => "Comparison still in progress.",
{ CacheMissReason.Type: var type } => $"cache_miss_reason: {type.GetString()}",
});
package main
import (
"context"
"fmt"
"github.com/anthropics/anthropic-sdk-go"
"github.com/anthropics/anthropic-sdk-go/packages/param"
)
func main() {
client := anthropic.NewClient()
ctx := context.Background()
system := []anthropic.BetaTextBlockParam{
{Text: "You are an AI assistant analyzing a large document. <document>...</document>"},
}
r1, err := client.Beta.Messages.New(ctx, anthropic.BetaMessageNewParams{
Model: anthropic.ModelClaudeOpus4_7,
MaxTokens: 1024,
CacheControl: anthropic.BetaCacheControlEphemeralParam{},
System: system,
Messages: []anthropic.BetaMessageParam{
anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock("Summarize section 1.")),
},
Diagnostics: anthropic.BetaDiagnosticsParam{
PreviousMessageID: param.Null[string](),
},
Betas: []anthropic.AnthropicBeta{anthropic.AnthropicBetaCacheDiagnosis2026_04_07},
})
if err != nil {
panic(err)
}
r2, err := client.Beta.Messages.New(ctx, anthropic.BetaMessageNewParams{
Model: anthropic.ModelClaudeOpus4_7,
MaxTokens: 1024,
CacheControl: anthropic.BetaCacheControlEphemeralParam{},
System: system,
Messages: []anthropic.BetaMessageParam{
anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock("Summarize section 1.")),
r1.ToParam(),
anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock("Now summarize section 2.")),
},
Diagnostics: anthropic.BetaDiagnosticsParam{
PreviousMessageID: anthropic.String(r1.ID),
},
Betas: []anthropic.AnthropicBeta{anthropic.AnthropicBetaCacheDiagnosis2026_04_07},
})
if err != nil {
panic(err)
}
switch {
case !r2.JSON.Diagnostics.Valid():
fmt.Println("No divergence detected.")
case !r2.Diagnostics.JSON.CacheMissReason.Valid():
fmt.Println("Comparison still in progress.")
default:
fmt.Printf("cache_miss_reason: %s\n", r2.Diagnostics.CacheMissReason.Type)
}
}
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.core.JsonValue;
import com.anthropic.models.beta.AnthropicBeta;
import com.anthropic.models.beta.messages.BetaCacheControlEphemeral;
import com.anthropic.models.beta.messages.BetaDiagnosticsParam;
import com.anthropic.models.beta.messages.MessageCreateParams;
import com.anthropic.models.messages.Model;
void main() {
var client = AnthropicOkHttpClient.fromEnv();
var system = "You are an AI assistant analyzing a large document. <document>...</document>";
var r1 = client.beta().messages().create(
MessageCreateParams.builder()
.model(Model.CLAUDE_OPUS_4_7)
.maxTokens(1024)
.cacheControl(BetaCacheControlEphemeral.builder().build())
.system(system)
.addUserMessage("Summarize section 1.")
// Pass null on the first turn to opt in when there is no earlier message to compare against
.diagnostics(BetaDiagnosticsParam.builder().previousMessageId((String) null).build())
.addBeta(AnthropicBeta.CACHE_DIAGNOSIS_2026_04_07)
.build()
);
var r2 = client.beta().messages().create(
MessageCreateParams.builder()
.model(Model.CLAUDE_OPUS_4_7)
.maxTokens(1024)
.cacheControl(BetaCacheControlEphemeral.builder().build())
.system(system)
.addUserMessage("Summarize section 1.")
.addMessage(r1)
.addUserMessage("Now summarize section 2.")
.diagnostics(BetaDiagnosticsParam.builder().previousMessageId(r1.id()).build())
.addBeta(AnthropicBeta.CACHE_DIAGNOSIS_2026_04_07)
.build()
);
if (r2.diagnostics().isEmpty()) {
IO.println("No divergence detected.");
} else if (r2.diagnostics().get().cacheMissReason().isEmpty()) {
IO.println("Comparison still in progress.");
} else {
var reason = r2.diagnostics().get().cacheMissReason().get();
// CacheMissReason does not expose a typed .type() accessor; read it from the raw JSON
@SuppressWarnings("unchecked")
var json = (Map<String, JsonValue>) reason._json().orElseThrow().asObject().orElseThrow();
IO.println("cache_miss_reason: " + json.get("type").asStringOrThrow());
}
}
<?php
use Anthropic\Client;
use Anthropic\Beta\AnthropicBeta;
use Anthropic\Beta\Messages\BetaCacheControlEphemeral;
use Anthropic\Beta\Messages\BetaDiagnosticsParam;
use Anthropic\Messages\Model;
$client = new Client();
$system = 'You are an AI assistant analyzing a large document. <document>...</document>';
$r1 = $client->beta->messages->create(
model: Model::CLAUDE_OPUS_4_7,
maxTokens: 1024,
cacheControl: new BetaCacheControlEphemeral,
system: $system,
messages: [
['role' => 'user', 'content' => 'Summarize section 1.'],
],
diagnostics: (new BetaDiagnosticsParam)->withPreviousMessageID(null),
betas: [AnthropicBeta::CACHE_DIAGNOSIS_2026_04_07],
);
$r2 = $client->beta->messages->create(
model: Model::CLAUDE_OPUS_4_7,
maxTokens: 1024,
cacheControl: new BetaCacheControlEphemeral,
system: $system,
messages: [
['role' => 'user', 'content' => 'Summarize section 1.'],
['role' => 'assistant', 'content' => $r1->content],
['role' => 'user', 'content' => 'Now summarize section 2.'],
],
diagnostics: (new BetaDiagnosticsParam)->withPreviousMessageID($r1->id),
betas: [AnthropicBeta::CACHE_DIAGNOSIS_2026_04_07],
);
echo match (true) {
$r2->diagnostics === null => "No divergence detected.\n",
$r2->diagnostics->cacheMissReason === null => "Comparison still in progress.\n",
default => "cache_miss_reason: {$r2->diagnostics->cacheMissReason->type}\n",
};
require "anthropic"
client = Anthropic::Client.new
SYSTEM = "You are an AI assistant analyzing a large document. <document>...</document>"
r1 = client.beta.messages.create(
model: :"claude-opus-4-7",
max_tokens: 1024,
cache_control: {type: "ephemeral"},
system_: SYSTEM,
messages: [
{role: "user", content: "Summarize section 1."}
],
diagnostics: {previous_message_id: nil},
betas: ["cache-diagnosis-2026-04-07"]
)
r2 = client.beta.messages.create(
model: :"claude-opus-4-7",
max_tokens: 1024,
cache_control: {type: "ephemeral"},
system_: SYSTEM,
messages: [
{role: "user", content: "Summarize section 1."},
{role: "assistant", content: r1.content},
{role: "user", content: "Now summarize section 2."}
],
diagnostics: {previous_message_id: r1.id},
betas: ["cache-diagnosis-2026-04-07"]
)
case r2.diagnostics
in nil
puts "No divergence detected."
in {cache_miss_reason: nil}
puts "Comparison still in progress."
in {cache_miss_reason: {type:}}
puts "cache_miss_reason: #{type}"
end
Streaming
In a streaming response, diagnostics appears in the message_start event.
# Turn 1: establish the cache and opt in to diagnostics (non-streaming)
response=$(curl -sS --fail-with-body https://api.anthropic.com/v1/messages \
--header "x-api-key: $ANTHROPIC_API_KEY" \
--header "anthropic-version: 2023-06-01" \
--header "anthropic-beta: cache-diagnosis-2026-04-07" \
--header "content-type: application/json" \
--data '{
"model": "claude-opus-4-7",
"max_tokens": 1024,
"cache_control": {"type": "ephemeral"},
"system": "You are an AI assistant analyzing a large document. <document>...</document>",
"messages": [{"role": "user", "content": "Summarize section 1."}],
"diagnostics": {"previous_message_id": null}
}')
message_id=$(jq -r '.id' <<< "$response")
# Turn 2: stream the response. diagnostics arrives in the message_start event;
# a null value means no divergence was found.
curl -sS --fail-with-body https://api.anthropic.com/v1/messages \
--header "x-api-key: $ANTHROPIC_API_KEY" \
--header "anthropic-version: 2023-06-01" \
--header "anthropic-beta: cache-diagnosis-2026-04-07" \
--header "content-type: application/json" \
--data @- <<EOF | sed -n 's/^data: //p' | jq -s '.[] | select(.type == "message_start") | .message.diagnostics'
{
"model": "claude-opus-4-7",
"max_tokens": 1024,
"stream": true,
"cache_control": {"type": "ephemeral"},
"system": "You are an AI assistant analyzing a large document. <document>...</document>",
"messages": [
{"role": "user", "content": "Summarize section 1."},
{"role": "assistant", "content": "Section 1 covers..."},
{"role": "user", "content": "Now summarize section 2."}
],
"diagnostics": {"previous_message_id": "$message_id"}
}
EOF
#!/usr/bin/env bash
set -euo pipefail
# Turn 1
turn1=$(ant beta:messages create \
  --beta cache-diagnosis-2026-04-07 \
  --transform '{id,usage,diagnostics}' <<'YAML'
model: claude-opus-4-7
max_tokens: 1024
cache_control:
  type: ephemeral
system: "You are an AI assistant analyzing a large document. <document>...</document>"
messages:
  - role: user
    content: Summarize section 1.
diagnostics:
  previous_message_id: null
YAML
)
printf '%s\n' "$turn1"
message_id=$(jq -r '.id' <<<"$turn1")
# Turn 2: streaming. With --stream, the CLI emits each SSE event as one JSON object.
# diagnostics arrives in the message_start event; extract it with jq.
ant beta:messages create \
  --beta cache-diagnosis-2026-04-07 \
  --stream --format jsonl <<YAML |
model: claude-opus-4-7
max_tokens: 1024
cache_control:
  type: ephemeral
system: "You are an AI assistant analyzing a large document. <document>...</document>"
messages:
  - role: user
    content: Summarize section 1.
  - role: assistant
    content: Section 1 covers...
  - role: user
    content: Now summarize section 2.
diagnostics:
  previous_message_id: $message_id
YAML
  jq -c 'select(.type == "message_start") | .message | {id,usage,diagnostics}'
import anthropic
client = anthropic.Anthropic()
SYSTEM = "You are an AI assistant analyzing a large document. <document>...</document>"
# Turn 1: opt in with previous_message_id=None
r1 = client.beta.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
cache_control={"type": "ephemeral"},
system=SYSTEM,
messages=[{"role": "user", "content": "Summarize section 1."}],
diagnostics={"previous_message_id": None},
betas=["cache-diagnosis-2026-04-07"],
)
# Turn 2: stream, referencing the previous response id
with client.beta.messages.stream(
model="claude-opus-4-7",
max_tokens=1024,
cache_control={"type": "ephemeral"},
system=SYSTEM,
messages=[
{"role": "user", "content": "Summarize section 1."},
{"role": "assistant", "content": r1.content},
{"role": "user", "content": "Now summarize section 2."},
],
diagnostics={"previous_message_id": r1.id},
betas=["cache-diagnosis-2026-04-07"],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
    print()
    r2 = stream.get_final_message()
diagnostics = r2.diagnostics
if diagnostics is None:
    print("No divergence detected.")
elif diagnostics.cache_miss_reason is None:
    print("Comparison still in progress.")
else:
    print(f"cache_miss_reason: {diagnostics.cache_miss_reason.type}")
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
const SYSTEM = "You are an AI assistant analyzing a large document. <document>...</document>";
// Turn 1: opt in with previous_message_id: null
const r1 = await client.beta.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
cache_control: { type: "ephemeral" },
system: SYSTEM,
messages: [{ role: "user", content: "Summarize section 1." }],
diagnostics: { previous_message_id: null },
betas: ["cache-diagnosis-2026-04-07"]
});
// Turn 2: stream, referencing the previous response id
const stream = client.beta.messages.stream({
model: "claude-opus-4-7",
max_tokens: 1024,
cache_control: { type: "ephemeral" },
system: SYSTEM,
messages: [
{ role: "user", content: "Summarize section 1." },
{ role: "assistant", content: r1.content },
{ role: "user", content: "Now summarize section 2." }
],
diagnostics: { previous_message_id: r1.id },
betas: ["cache-diagnosis-2026-04-07"]
});
for await (const event of stream) {
if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
process.stdout.write(event.delta.text);
}
}
process.stdout.write("\n");
// diagnostics arrives in message_start and carries through to the final message
const r2 = await stream.finalMessage();
if (r2.diagnostics === null) {
console.log("No divergence detected.");
} else if (r2.diagnostics.cache_miss_reason === null) {
console.log("Comparison still in progress.");
} else {
console.log(`cache_miss_reason: ${r2.diagnostics.cache_miss_reason.type}`);
}
using Anthropic;
using Anthropic.Models.Beta;
using Anthropic.Models.Beta.Messages;
using Messages = Anthropic.Models.Messages;
using Role = Anthropic.Models.Beta.Messages.Role;
AnthropicClient client = new();
var system = "You are an AI assistant analyzing a large document. <document>...</document>";
// Turn 1: opt in with PreviousMessageID = null
var r1 = await client.Beta.Messages.Create(
new()
{
Model = Messages::Model.ClaudeOpus4_7,
MaxTokens = 1024,
CacheControl = new(),
System = system,
Messages =
[
new() { Role = Role.User, Content = "Summarize section 1." },
],
Diagnostics = new() { PreviousMessageID = null },
Betas = [AnthropicBeta.CacheDiagnosis2026_04_07],
}
);
// Turn 2: stream, referencing the previous response id
BetaDiagnostics? diagnostics = null;
var stream = client.Beta.Messages.CreateStreaming(
new()
{
Model = Messages::Model.ClaudeOpus4_7,
MaxTokens = 1024,
CacheControl = new(),
System = system,
Messages =
[
new() { Role = Role.User, Content = "Summarize section 1." },
new()
{
Role = Role.Assistant,
Content = r1.Content.Select(block => new BetaContentBlockParam(block.Json)).ToList(),
},
new() { Role = Role.User, Content = "Now summarize section 2." },
],
Diagnostics = new() { PreviousMessageID = r1.ID },
Betas = [AnthropicBeta.CacheDiagnosis2026_04_07],
}
);
await foreach (var streamEvent in stream)
{
if (streamEvent.TryPickStart(out var start))
{
// diagnostics arrives in the message_start event
diagnostics = start.Message.Diagnostics;
}
else if (streamEvent.TryPickContentBlockDelta(out var delta) && delta.Delta.TryPickText(out var textDelta))
{
Console.Write(textDelta.Text);
}
}
Console.WriteLine();
Console.WriteLine(diagnostics switch
{
null => "No divergence detected.",
{ CacheMissReason: null } => "Comparison still in progress.",
{ CacheMissReason.Type: var type } => $"cache_miss_reason: {type.GetString()}",
});
package main
import (
"context"
"fmt"
"github.com/anthropics/anthropic-sdk-go"
"github.com/anthropics/anthropic-sdk-go/packages/param"
)
func main() {
client := anthropic.NewClient()
ctx := context.Background()
system := []anthropic.BetaTextBlockParam{
{Text: "You are an AI assistant analyzing a large document. <document>...</document>"},
}
// Turn 1: opt in with a null PreviousMessageID
r1, err := client.Beta.Messages.New(ctx, anthropic.BetaMessageNewParams{
Model: anthropic.ModelClaudeOpus4_7,
MaxTokens: 1024,
CacheControl: anthropic.BetaCacheControlEphemeralParam{},
System: system,
Messages: []anthropic.BetaMessageParam{
anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock("Summarize section 1.")),
},
Diagnostics: anthropic.BetaDiagnosticsParam{
PreviousMessageID: param.Null[string](),
},
Betas: []anthropic.AnthropicBeta{anthropic.AnthropicBetaCacheDiagnosis2026_04_07},
})
if err != nil {
panic(err)
}
// Turn 2: stream, referencing the previous response id
stream := client.Beta.Messages.NewStreaming(ctx, anthropic.BetaMessageNewParams{
Model: anthropic.ModelClaudeOpus4_7,
MaxTokens: 1024,
CacheControl: anthropic.BetaCacheControlEphemeralParam{},
System: system,
Messages: []anthropic.BetaMessageParam{
anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock("Summarize section 1.")),
r1.ToParam(),
anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock("Now summarize section 2.")),
},
Diagnostics: anthropic.BetaDiagnosticsParam{
PreviousMessageID: anthropic.String(r1.ID),
},
Betas: []anthropic.AnthropicBeta{anthropic.AnthropicBetaCacheDiagnosis2026_04_07},
})
defer stream.Close()
// diagnostics arrives in message_start; Accumulate carries it into r2
var r2 anthropic.BetaMessage
for stream.Next() {
if err := r2.Accumulate(stream.Current()); err != nil {
panic(err)
}
}
if err := stream.Err(); err != nil {
panic(err)
}
switch {
case !r2.JSON.Diagnostics.Valid():
fmt.Println("No divergence detected.")
case !r2.Diagnostics.JSON.CacheMissReason.Valid():
fmt.Println("Comparison still in progress.")
default:
fmt.Printf("cache_miss_reason: %s\n", r2.Diagnostics.CacheMissReason.Type)
}
}
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.core.JsonValue;
import com.anthropic.helpers.BetaMessageAccumulator;
import com.anthropic.models.beta.AnthropicBeta;
import com.anthropic.models.beta.messages.BetaCacheControlEphemeral;
import com.anthropic.models.beta.messages.BetaDiagnosticsParam;
import com.anthropic.models.beta.messages.MessageCreateParams;
import com.anthropic.models.messages.Model;
void main() {
var client = AnthropicOkHttpClient.fromEnv();
var system = "You are an AI assistant analyzing a large document. <document>...</document>";
// Turn 1: opt in with previousMessageId = null
var r1 = client.beta().messages().create(
MessageCreateParams.builder()
.model(Model.CLAUDE_OPUS_4_7)
.maxTokens(1024)
.cacheControl(BetaCacheControlEphemeral.builder().build())
.system(system)
.addUserMessage("Summarize section 1.")
.diagnostics(BetaDiagnosticsParam.builder().previousMessageId((String) null).build())
.addBeta(AnthropicBeta.CACHE_DIAGNOSIS_2026_04_07)
.build()
);
// Turn 2: stream, referencing the previous response id
var params = MessageCreateParams.builder()
.model(Model.CLAUDE_OPUS_4_7)
.maxTokens(1024)
.cacheControl(BetaCacheControlEphemeral.builder().build())
.system(system)
.addUserMessage("Summarize section 1.")
.addMessage(r1)
.addUserMessage("Now summarize section 2.")
.diagnostics(BetaDiagnosticsParam.builder().previousMessageId(r1.id()).build())
.addBeta(AnthropicBeta.CACHE_DIAGNOSIS_2026_04_07)
.build();
var accumulator = BetaMessageAccumulator.create();
try (var streamResponse = client.beta().messages().createStreaming(params)) {
streamResponse.stream()
.peek(accumulator::accumulate)
.flatMap(event -> event.contentBlockDelta().stream())
.flatMap(deltaEvent -> deltaEvent.delta().text().stream())
.forEach(textDelta -> IO.print(textDelta.text()));
IO.println("");
}
// diagnostics arrives in message_start and carries through to the accumulated message
var diagnostics = accumulator.message().diagnostics();
if (diagnostics.isEmpty()) {
IO.println("No divergence detected.");
} else if (diagnostics.get().cacheMissReason().isEmpty()) {
IO.println("Comparison still in progress.");
} else {
var reason = diagnostics.get().cacheMissReason().get();
// CacheMissReason does not expose a typed .type() accessor; read it from the raw JSON
@SuppressWarnings("unchecked")
var json = (Map<String, JsonValue>) reason._json().orElseThrow().asObject().orElseThrow();
IO.println("cache_miss_reason: " + json.get("type").asStringOrThrow());
}
}
<?php
use Anthropic\Client;
use Anthropic\Beta\AnthropicBeta;
use Anthropic\Beta\Messages\BetaCacheControlEphemeral;
use Anthropic\Beta\Messages\BetaDiagnosticsParam;
use Anthropic\Beta\Messages\BetaRawContentBlockDeltaEvent;
use Anthropic\Beta\Messages\BetaRawMessageStartEvent;
use Anthropic\Beta\Messages\BetaTextDelta;
use Anthropic\Messages\Model;
$client = new Client();
$system = 'You are an AI assistant analyzing a large document. <document>...</document>';
// Turn 1: opt in with a null previousMessageID
$r1 = $client->beta->messages->create(
model: Model::CLAUDE_OPUS_4_7,
maxTokens: 1024,
cacheControl: new BetaCacheControlEphemeral,
system: $system,
messages: [
['role' => 'user', 'content' => 'Summarize section 1.'],
],
diagnostics: (new BetaDiagnosticsParam)->withPreviousMessageID(null),
betas: [AnthropicBeta::CACHE_DIAGNOSIS_2026_04_07],
);
// Turn 2: stream, referencing the previous response id
$stream = $client->beta->messages->createStream(
model: Model::CLAUDE_OPUS_4_7,
maxTokens: 1024,
cacheControl: new BetaCacheControlEphemeral,
system: $system,
messages: [
['role' => 'user', 'content' => 'Summarize section 1.'],
['role' => 'assistant', 'content' => $r1->content],
['role' => 'user', 'content' => 'Now summarize section 2.'],
],
diagnostics: (new BetaDiagnosticsParam)->withPreviousMessageID($r1->id),
betas: [AnthropicBeta::CACHE_DIAGNOSIS_2026_04_07],
);
$diagnostics = null;
foreach ($stream as $event) {
if ($event instanceof BetaRawMessageStartEvent) {
// diagnostics arrives in the BetaMessage embedded in the message_start event
$diagnostics = $event->message->diagnostics;
} elseif ($event instanceof BetaRawContentBlockDeltaEvent && $event->delta instanceof BetaTextDelta) {
echo $event->delta->text;
}
}
echo PHP_EOL;
echo match (true) {
$diagnostics === null => "No divergence detected.\n",
$diagnostics->cacheMissReason === null => "Comparison still in progress.\n",
default => "cache_miss_reason: {$diagnostics->cacheMissReason->type}\n",
};
require "anthropic"
client = Anthropic::Client.new
SYSTEM = "You are an AI assistant analyzing a large document. <document>...</document>"
# Turn 1: opt in with previous_message_id: nil
r1 = client.beta.messages.create(
model: :"claude-opus-4-7",
max_tokens: 1024,
cache_control: {type: "ephemeral"},
system_: SYSTEM,
messages: [{role: "user", content: "Summarize section 1."}],
diagnostics: {previous_message_id: nil},
betas: ["cache-diagnosis-2026-04-07"]
)
# Turn 2: stream, referencing the previous response id
stream = client.beta.messages.stream(
model: :"claude-opus-4-7",
max_tokens: 1024,
cache_control: {type: "ephemeral"},
system_: SYSTEM,
messages: [
{role: "user", content: "Summarize section 1."},
{role: "assistant", content: r1.content},
{role: "user", content: "Now summarize section 2."}
],
diagnostics: {previous_message_id: r1.id},
betas: ["cache-diagnosis-2026-04-07"]
)
stream.each do |event|
print(event.text) if event.is_a?(Anthropic::Streaming::TextEvent)
end
puts
# diagnostics arrives in message_start and is retained on the accumulated message
r2 = stream.accumulated_message
case r2.diagnostics
in nil
puts "No divergence detected."
in {cache_miss_reason: nil}
puts "Comparison still in progress."
in {cache_miss_reason: {type:}}
puts "cache_miss_reason: #{type}"
end
The message_start event carries the complete diagnostics field. For the possible values, see Response format.
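If you consume the raw event stream yourself rather than through an SDK, the extraction done by the sed and jq pipeline in the curl example can be sketched in Python as follows. The function name diagnostics_from_sse and the sample events are hypothetical; in particular, "placeholder_reason" is not a real cache_miss_reason type, only a stand-in for this illustration.

```python
import json
from typing import Any, Iterable, Optional, Tuple

DATA_PREFIX = "data: "


def diagnostics_from_sse(lines: Iterable[str]) -> Tuple[bool, Optional[Any]]:
    """Find the first message_start event and return (found, diagnostics).

    diagnostics may legitimately be None (no divergence found), so the
    boolean distinguishes that case from "no message_start event seen".
    """
    for line in lines:
        if not line.startswith(DATA_PREFIX):
            continue  # skip "event:" lines and blank separators
        event = json.loads(line[len(DATA_PREFIX):])
        if event.get("type") == "message_start":
            return True, event["message"].get("diagnostics")
    return False, None


# Illustrative sample stream; field values are made up for this sketch.
SAMPLE_STREAM = [
    "event: message_start",
    'data: {"type": "message_start", "message": {"id": "msg_example", '
    '"diagnostics": {"cache_miss_reason": {"type": "placeholder_reason"}}}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, '
    '"delta": {"type": "text_delta", "text": "Section 2 covers..."}}',
]

found, diagnostics = diagnostics_from_sse(SAMPLE_STREAM)
print(found, diagnostics)
```

Because diagnostics arrives in the first event, the function can return as soon as it sees message_start without reading the rest of the stream.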
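Carrying diagnostics through a conversation loop
In a multi-turn conversation, pass the most recent response id as previous_message_id on every turn. The first iteration passes null to opt in; each subsequent iteration passes the id of the previous response.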
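The bookkeeping described above can be isolated in a small helper, sketched here under stated assumptions. DiagnosticsChain and summarize are hypothetical names, not part of any SDK, and the responses are SimpleNamespace stand-ins so the sketch runs without network access. The three-way reading of diagnostics (no object, object without a reason, object with a reason) follows the SDK samples in this document.

```python
from types import SimpleNamespace
from typing import Any, Optional


def summarize(diagnostics: Optional[Any]) -> str:
    """Describe a diagnostics value using the same three cases as the SDK samples."""
    if diagnostics is None:
        return "no divergence detected"
    if diagnostics.cache_miss_reason is None:
        return "comparison still in progress"
    return f"cache_miss_reason: {diagnostics.cache_miss_reason.type}"


class DiagnosticsChain:
    """Tracks the id to send as previous_message_id on the next turn."""

    def __init__(self) -> None:
        self.previous_message_id: Optional[str] = None  # None opts in on turn 1

    def request_param(self) -> dict:
        """Value to pass as the diagnostics request parameter."""
        return {"previous_message_id": self.previous_message_id}

    def record(self, response: Any) -> str:
        """Remember this response's id and return a summary of its diagnostics."""
        self.previous_message_id = response.id
        return summarize(getattr(response, "diagnostics", None))


# Stand-in responses; "placeholder_reason" is not a real reason type.
fake_responses = [
    SimpleNamespace(id="msg_1", diagnostics=None),
    SimpleNamespace(id="msg_2", diagnostics=SimpleNamespace(cache_miss_reason=None)),
    SimpleNamespace(
        id="msg_3",
        diagnostics=SimpleNamespace(cache_miss_reason=SimpleNamespace(type="placeholder_reason")),
    ),
]

chain = DiagnosticsChain()
for turn, response in enumerate(fake_responses, start=1):
    sent = chain.request_param()  # what a real call would send on this turn
    print(f"turn {turn}: sent {sent}, result: {chain.record(response)}")
```

In a real loop, the value from request_param() would be passed as the diagnostics argument of each create call, and record() would be called with the response that comes back.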
This workflow is a poor fit for one-off shell commands. See the SDK tabs for the loop pattern; on each turn, the HTTP request or CLI call is the same as in Basic usage.
import anthropic
client = anthropic.Anthropic()
SYSTEM = "You are an AI assistant analyzing a large document. <document>...</document>"
messages = []
prev_id = None
for i, user_message in enumerate(
    ["Summarize section 1.", "Now section 2.", "Now section 3."]
):
    messages.append({"role": "user", "content": user_message})
    r = client.beta.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        cache_control={"type": "ephemeral"},
        system=SYSTEM,
        messages=messages,
        diagnostics={"previous_message_id": prev_id},
        betas=["cache-diagnosis-2026-04-07"],
    )
    if r.diagnostics is not None and r.diagnostics.cache_miss_reason is not None:
        print(f"Turn {i + 1} cache_miss_reason: {r.diagnostics.cache_miss_reason.type}")
    messages.append({"role": "assistant", "content": r.content})
    prev_id = r.id
import Anthropic from "@anthropic-ai/sdk";
import type { BetaMessageParam } from "@anthropic-ai/sdk/resources/beta";
const client = new Anthropic();
const SYSTEM = "You are an AI assistant analyzing a large document. <document>...</document>";
const prompts = ["Summarize section 1.", "Now section 2.", "Now section 3."];
const messages: BetaMessageParam[] = [];
let prevId: string | null = null;
for (const [i, prompt] of prompts.entries()) {
messages.push({ role: "user", content: prompt });
const r = await client.beta.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
cache_control: { type: "ephemeral" },
system: SYSTEM,
messages,
diagnostics: { previous_message_id: prevId },
betas: ["cache-diagnosis-2026-04-07"]
});
if (r.diagnostics?.cache_miss_reason) {
console.log(`Turn ${i + 1} cache_miss_reason: ${r.diagnostics.cache_miss_reason.type}`);
}
messages.push({ role: "assistant", content: r.content });
prevId = r.id;
}
using Anthropic;
using Anthropic.Models.Beta;
using Anthropic.Models.Beta.Messages;
using Messages = Anthropic.Models.Messages;
using Role = Anthropic.Models.Beta.Messages.Role;
AnthropicClient client = new();
var system = "You are an AI assistant analyzing a large document. <document>...</document>";
List<BetaMessageParam> messages = [];
string? prevId = null;
string[] prompts = ["Summarize section 1.", "Now section 2.", "Now section 3."];
for (int i = 0; i < prompts.Length; i++)
{
messages.Add(new() { Role = Role.User, Content = prompts[i] });
var r = await client.Beta.Messages.Create(
new()
{
Model = Messages::Model.ClaudeOpus4_7,
MaxTokens = 1024,
CacheControl = new(),
System = system,
Messages = messages,
Diagnostics = new() { PreviousMessageID = prevId },
Betas = [AnthropicBeta.CacheDiagnosis2026_04_07],
}
);
if (r.Diagnostics?.CacheMissReason is { Type: var type })
{
Console.WriteLine($"Turn {i + 1} cache_miss_reason: {type.GetString()}");
}
messages.Add(
new()
{
Role = Role.Assistant,
Content = r.Content.Select(block => new BetaContentBlockParam(block.Json)).ToList(),
}
);
prevId = r.ID;
}
package main
import (
"context"
"fmt"
"github.com/anthropics/anthropic-sdk-go"
"github.com/anthropics/anthropic-sdk-go/packages/param"
)
func main() {
client := anthropic.NewClient()
ctx := context.Background()
system := []anthropic.BetaTextBlockParam{
{Text: "You are an AI assistant analyzing a large document. <document>...</document>"},
}
prompts := []string{"Summarize section 1.", "Now section 2.", "Now section 3."}
var messages []anthropic.BetaMessageParam
prevID := param.Null[string]()
for turn, prompt := range prompts {
messages = append(messages, anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock(prompt)))
r, err := client.Beta.Messages.New(ctx, anthropic.BetaMessageNewParams{
Model: anthropic.ModelClaudeOpus4_7,
MaxTokens: 1024,
CacheControl: anthropic.BetaCacheControlEphemeralParam{},
System: system,
Messages: messages,
Diagnostics: anthropic.BetaDiagnosticsParam{
PreviousMessageID: prevID,
},
Betas: []anthropic.AnthropicBeta{anthropic.AnthropicBetaCacheDiagnosis2026_04_07},
})
if err != nil {
panic(err)
}
if r.JSON.Diagnostics.Valid() && r.Diagnostics.JSON.CacheMissReason.Valid() {
fmt.Printf("Turn %d cache_miss_reason: %s\n", turn+1, r.Diagnostics.CacheMissReason.Type)
}
messages = append(messages, r.ToParam())
prevID = anthropic.String(r.ID)
}
}
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.core.JsonValue;
import com.anthropic.models.beta.AnthropicBeta;
import com.anthropic.models.beta.messages.BetaCacheControlEphemeral;
import com.anthropic.models.beta.messages.BetaDiagnosticsParam;
import com.anthropic.models.beta.messages.BetaMessageParam;
import com.anthropic.models.beta.messages.MessageCreateParams;
import com.anthropic.models.messages.Model;
void main() {
var client = AnthropicOkHttpClient.fromEnv();
var system = "You are an AI assistant analyzing a large document. <document>...</document>";
var prompts = List.of("Summarize section 1.", "Now section 2.", "Now section 3.");
var messages = new ArrayList<BetaMessageParam>();
String prevId = null;
for (var turn = 0; turn < prompts.size(); turn++) {
messages.add(
BetaMessageParam.builder()
.role(BetaMessageParam.Role.USER)
.content(prompts.get(turn))
.build()
);
var r = client.beta().messages().create(
MessageCreateParams.builder()
.model(Model.CLAUDE_OPUS_4_7)
.maxTokens(1024)
.cacheControl(BetaCacheControlEphemeral.builder().build())
.system(system)
.messages(messages)
.diagnostics(BetaDiagnosticsParam.builder().previousMessageId(prevId).build())
.addBeta(AnthropicBeta.CACHE_DIAGNOSIS_2026_04_07)
.build()
);
if (r.diagnostics().isPresent() && r.diagnostics().get().cacheMissReason().isPresent()) {
var reason = r.diagnostics().get().cacheMissReason().get();
// CacheMissReason does not expose a typed .type() accessor; read it from the raw JSON
@SuppressWarnings("unchecked")
var json = (Map<String, JsonValue>) reason._json().orElseThrow().asObject().orElseThrow();
IO.println("Turn " + (turn + 1) + " cache_miss_reason: " + json.get("type").asStringOrThrow());
}
messages.add(r.toParam());
prevId = r.id();
}
}
<?php
use Anthropic\Client;
use Anthropic\Beta\AnthropicBeta;
use Anthropic\Beta\Messages\BetaCacheControlEphemeral;
use Anthropic\Beta\Messages\BetaDiagnosticsParam;
use Anthropic\Messages\Model;
$client = new Client();
$system = 'You are an AI assistant analyzing a large document. <document>...</document>';
$messages = [];
$prevId = null;
foreach (['Summarize section 1.', 'Now section 2.', 'Now section 3.'] as $i => $userMsg) {
$turn = $i + 1;
$messages[] = ['role' => 'user', 'content' => $userMsg];
$r = $client->beta->messages->create(
model: Model::CLAUDE_OPUS_4_7,
maxTokens: 1024,
cacheControl: new BetaCacheControlEphemeral,
system: $system,
messages: $messages,
diagnostics: (new BetaDiagnosticsParam)->withPreviousMessageID($prevId),
betas: [AnthropicBeta::CACHE_DIAGNOSIS_2026_04_07],
);
if ($r->diagnostics?->cacheMissReason !== null) {
echo "Turn {$turn} cache_miss_reason: {$r->diagnostics->cacheMissReason->type}\n";
}
$messages[] = ['role' => 'assistant', 'content' => $r->content];
$prevId = $r->id;
}
require "anthropic"
client = Anthropic::Client.new
SYSTEM = "You are an AI assistant analyzing a large document. <document>...</document>"
messages = []
prev_id = nil
["Summarize section 1.", "Now section 2.", "Now section 3."].each_with_index do |user_msg, i|
messages << {role: "user", content: user_msg}
r = client.beta.messages.create(
model: :"claude-opus-4-7",
max_tokens: 1024,
cache_control: {type: "ephemeral"},
system_: SYSTEM,
messages: messages,
diagnostics: {previous_message_id: prev_id},
betas: ["cache-diagnosis-2026-04-07"]
)
if (reason = r.diagnostics&.cache_miss_reason)
puts "Turn #{i + 1} cache_miss_reason: #{reason.type}"
end
messages << {role: "assistant", content: r.content}
prev_id = r.id
end
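Response format
The diagnostics field on the response Message has four possible states:
| Value | Meaning |
|---|---|
| Field absent | The request did not include diagnostics, or the beta header was missing. |
| null | previous_message_id was null (first turn, nothing to compare against), or the comparison ran and found no divergence. |
| {"cache_miss_reason": null} | The comparison was still running when the response was serialized. This can happen when the response starts very quickly. Treat it as inconclusive and check again on the next turn. |
| {"cache_miss_reason": {...}} | A cache_miss_reason is attached. For the *_changed types, it identifies the first point of divergence; previous_message_not_found and unavailable are cases where no comparison was produced. |
When cache_miss_reason is non-null, the response has the following shape: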
{
"id": "msg_01Xyz...",
"type": "message",
"role": "assistant",
"content": [{ "type": "text", "text": "..." }],
"usage": {
"input_tokens": 42,
"cache_read_input_tokens": 0,
"cache_creation_input_tokens": 41850,
"output_tokens": 210
},
"diagnostics": {
"cache_miss_reason": {
"type": "system_changed",
"cache_missed_input_tokens": 41850
}
}
}
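The four states in the table above can be told apart in client code with a few checks. The following is a minimal sketch, assuming the response has already been parsed into a plain Python dict. The function name classify_diagnostics and the labels it returns are hypothetical and not part of the API.

```python
from typing import Any, Mapping


def classify_diagnostics(message: Mapping[str, Any]) -> str:
    """Map a parsed response to one of the four diagnostics states.

    Returns a hypothetical label: "absent", "null", "pending",
    or the cache_miss_reason type string reported by the API.
    """
    # State 1: the field is missing entirely (diagnostics not requested,
    # or the beta header was not sent).
    if "diagnostics" not in message:
        return "absent"

    diagnostics = message["diagnostics"]

    # State 2: first turn, or the comparison found no divergence.
    if diagnostics is None:
        return "null"

    reason = diagnostics.get("cache_miss_reason")

    # State 3: the comparison was still running; inconclusive.
    if reason is None:
        return "pending"

    # State 4: a reason is attached; its "type" is the discriminator.
    return reason["type"]
```

Checking for key presence before checking for None matters here, because an absent field and an explicit null carry different meanings.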
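Cache miss reason types
cache_miss_reason is a discriminated union keyed on type. The response reports only the earliest divergence, so fix that one first; later divergences may be hidden behind it.
| Type | Meaning | What to change |
|---|---|---|
| model_changed | The model differs from the previous request (for example, a router, A/B test, or fallback picked a different model). Caches are per-model. | Keep the model fixed within a cached conversation. |
| system_changed | The system parameter differs. Usually a timestamp, request ID, or other per-request value was interpolated into the system prompt. | Make the system prompt a byte-stable constant and move dynamic data into the first user message after the cache breakpoint. |
| tools_changed | The tools array differs: tools were added, removed, or reordered between turns, or the tool input_schema JSON was not serialized deterministically. | Send the same tool list in a fixed order on every turn, with deterministically serialized schemas (for example, sorted keys). |
| messages_changed | Model, system, and tools all match, but earlier entries in messages were changed, reordered, or removed rather than appended to. Usually the conversation history was truncated or edited, or assistant turns and tool_result blocks were serialized differently when resent. | Treat history as append-only; echo assistant content and tool results back verbatim. |
| previous_message_not_found | No fingerprint is stored for the supplied previous_message_id. This does not prove that your request changed. Usually the previous request did not carry the beta header, came from a different workspace, or was sent too long ago. | Send the beta header on every turn and keep consecutive turns close together in time. |
| unavailable | Diagnostics are not available for this request. This covers the case where model, system, and tools match but another prompt-affecting request parameter (tool_choice, thinking, context_management, output_config, output_format, or the set of active anthropic-beta headers) differs, as well as very long conversations where the divergence falls outside the comparison range. Your request was processed normally. | Keep prompt-affecting request parameters fixed for the lifetime of a cached conversation. If the result persists, check manually by following the common-issues troubleshooting steps on the prompt caching page. |
The four *_changed types also carry a cache_missed_input_tokens integer: an estimate of how many input tokens follow the divergence point, which tells you how much of the cacheable prefix was lost. It is derived from byte lengths before tokenization, so treat it as an order-of-magnitude indicator rather than a billing number. It can differ from, and sometimes exceed, usage.input_tokens.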
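For the tools_changed row, one way to get a deterministic schema serialization is to normalize key order before each request and to keep a local digest for spotting drift between turns. This is a sketch only: normalize_tools and tools_digest are hypothetical helper names, and the digest is a local debugging aid, not the fingerprint the API computes.

```python
import hashlib
import json
from typing import Any


def normalize_tools(tools: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a copy of the tool list whose dicts have sorted keys.

    List order is preserved on purpose: reordering tools is itself a
    divergence, so the caller stays responsible for a fixed order.
    """
    # Round-tripping through JSON with sort_keys=True rebuilds every nested
    # dict with keys inserted in sorted order, so a serializer that keeps
    # insertion order emits the same bytes on every turn.
    return json.loads(json.dumps(tools, sort_keys=True))


def tools_digest(tools: list[dict[str, Any]]) -> str:
    """Hash a canonical serialization of the tool list for local comparison."""
    canonical = json.dumps(tools, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Logging tools_digest(tools) on each turn and comparing it with the previous turn's value gives an early local signal that the tool list changed before the request is sent.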
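Reading diagnostics alongside usage
diagnostics answers "did my request change?", while usage.cache_read_input_tokens answers "did the cache hit?". Combining the two tells you where to look.
This matrix applies to turns where you passed a real previous_message_id. On the first turn (previous_message_id: null), diagnostics is always null and cache_read_input_tokens is usually zero, because the cache is being written rather than read; there is nothing to troubleshoot. The matrix also does not apply when cache_miss_reason is null (the comparison is still running; check the next turn) or when its type is previous_message_not_found or unavailable (no comparison was produced).
| Diagnostics result | Cache read tokens | Interpretation |
|---|---|---|
| null | High | As expected. Your prefix is stable and the cache hit. |
| null | Low or zero | Your request matched, but the cache entry is no longer available. Consider shortening the gap between turns or using the 1-hour cache TTL. |
| cache_miss_reason with a *_changed type | Low or zero | A bug on your side. The request changed; fix the cause indicated by type. |
| cache_miss_reason with a *_changed type | High | Rare. Something changed late in the prompt, but an earlier cache_control breakpoint still hit. Worth fixing, but the impact is small. |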
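The matrix above can be expressed as a small lookup over a parsed response. The sketch below assumes the response is a plain Python dict and that the turn was sent with a real previous_message_id. The function name interpret_turn, the returned labels, and the 1024-token cutoff for "high" are all assumptions made for illustration; choose a cutoff that suits the size of your cached prefix.

```python
from typing import Any, Mapping

CHANGED_TYPES = frozenset(
    {"model_changed", "system_changed", "tools_changed", "messages_changed"}
)


def interpret_turn(message: Mapping[str, Any], high_threshold: int = 1024) -> str:
    """Classify a turn according to the diagnostics/usage matrix.

    Call this only for turns that passed a real previous_message_id.
    high_threshold is an arbitrary, hypothetical cutoff for "high".
    """
    # Without a diagnostics field there is nothing to combine with usage.
    if "diagnostics" not in message:
        return "not_applicable"

    usage = message.get("usage") or {}
    cache_read = usage.get("cache_read_input_tokens") or 0
    cache_hit = cache_read >= high_threshold

    diagnostics = message["diagnostics"]
    if diagnostics is None:
        # No divergence found: either a healthy hit or an expired entry.
        return "expected" if cache_hit else "cache_entry_unavailable"

    reason = diagnostics.get("cache_miss_reason")
    # Pending comparison, previous_message_not_found, and unavailable
    # all fall outside the matrix.
    if reason is None or reason.get("type") not in CHANGED_TYPES:
        return "not_applicable"

    return "late_change" if cache_hit else "request_changed"
```

A "request_changed" result is the case to act on first, since it points at a cause in your own request that the type field names.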
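Limitations
- Beta: Field names and semantics may change before general availability.
- Claude API only: Not available on Amazon Bedrock or Vertex AI.
- Limited retention: Fingerprints used for previous_message_id lookup expire after a short time. Run diagnostic comparisons between closely spaced requests.
- Same workspace: The previous request must have been made with an API key from the same organization and workspace.
- Comparison range: For very long conversations where the only change is deep in the message list, the response may be unavailable rather than a precise location.
- Best effort: Diagnostics never block or fail your request. If diagnostics are not available, the response returns unavailable, or cache_miss_reason: null while the comparison is still running.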
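Data retention
Cache diagnostics is ZDR-eligible (qualified). Anthropic does not store the raw text of your prompts or Claude's outputs for this feature.
The fingerprint stored for each request consists only of cryptographic hashes and token count estimates, is keyed by the response id, and is scoped to your organization and workspace. Fingerprints expire after a short time and are not used for any other purpose.
For ZDR eligibility across all features, see API and data retention.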