Compaction

Server-side context compaction for managing long conversations that approach the context window limit.


Note

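This feature is eligible for Zero Data Retention (ZDR). When your organization has a ZDR arrangement, data sent through this feature is not stored after the API response has been returned.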

Tip

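Server-side compaction is the recommended strategy for managing context in long-running conversations and agentic workflows. It handles context management automatically with minimal integration work.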

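Compaction extends the effective context length of long-running conversations and tasks by automatically summarizing older context as the conversation approaches the context window limit. This is not only about staying under a token limit. As conversations grow longer, models struggle to maintain focus across the full history. Compaction keeps the active context focused and efficient by replacing stale content with a concise summary.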

Tip

For a deeper look at why long contexts degrade and how compaction helps, see Effective context engineering.

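This feature is ideal for:

  • Chat-based, multi-turn conversations where you want users to keep using the same chat for a long time
  • Task-oriented prompts that require a lot of follow-up work (often tool use) and may exceed the context window

Note

Compaction is currently in beta. Include the beta header compact-2026-01-12 in your API requests to use this feature.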

Supported models

Compaction is supported on the following models:

  • Claude Mythos Preview (claude-mythos-preview)
  • Claude Opus 4.7 (claude-opus-4-7)
  • Claude Opus 4.6 (claude-opus-4-6)
  • Claude Sonnet 4.6 (claude-sonnet-4-6)

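How compaction works

When compaction is enabled, Claude automatically summarizes your conversation as it approaches the configured token threshold. The API:

  1. Detects when input tokens exceed your specified trigger threshold.
  2. Generates a summary of the current conversation.
  3. Creates a compaction block containing the summary.
  4. Continues the response with the compacted context.

In subsequent requests, append the response to your messages. The API automatically drops all message blocks before the compaction block and continues the conversation from the summary.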

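Figure: flowchart of the compaction process. When input tokens exceed the trigger threshold, Claude generates a summary in a compaction block and continues the response with the compacted context.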
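
The drop rule described above can be pictured on the client side. The sketch below is only a local illustration of that rule, not the server's implementation. `effective_context` is a hypothetical helper, and it assumes messages are plain dicts whose content is either a string or a list of block dicts with a "type" key, as in the JSON examples on this page.

```python
from typing import Any

Message = dict[str, Any]


def effective_context(messages: list[Message]) -> list[Message]:
    """Return the messages that stay visible once everything before the
    last compaction block is dropped (hypothetical local model)."""
    last_msg = None
    last_block = None
    for i, message in enumerate(messages):
        content = message["content"]
        if isinstance(content, str):
            continue  # plain-string content cannot hold a compaction block
        for j, block in enumerate(content):
            if block.get("type") == "compaction":
                last_msg, last_block = i, j
    if last_msg is None:
        return list(messages)  # no compaction yet: the full history is used
    # Keep the compaction block and anything after it in the same turn.
    head = dict(messages[last_msg])
    head["content"] = messages[last_msg]["content"][last_block:]
    return [head] + list(messages[last_msg + 1:])


history = [
    {"role": "user", "content": "Help me build a website"},
    {"role": "assistant", "content": [
        {"type": "compaction", "content": "Summary: the user wants a website..."},
        {"type": "text", "text": "Let's continue with the layout."},
    ]},
    {"role": "user", "content": "Now add a contact form"},
]

visible = effective_context(history)
print(len(visible))  # the compacted assistant turn plus the follow-up
```

You still send the full message list on each request. The sketch only shows which part of it the model would work from under the assumptions above.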

Basic usage

Enable compaction by adding the compact_20260112 strategy to context_management.edits in your Messages API request.

curl https://api.anthropic.com/v1/messages \
     --header "x-api-key: $ANTHROPIC_API_KEY" \
     --header "anthropic-version: 2023-06-01" \
     --header "anthropic-beta: compact-2026-01-12" \
     --header "content-type: application/json" \
     --data \
'{
    "model": "claude-opus-4-7",
    "max_tokens": 4096,
    "messages": [
        {
            "role": "user",
            "content": "Help me build a website"
        }
    ],
    "context_management": {
        "edits": [
            {
                "type": "compact_20260112"
            }
        ]
    }
}'
ant beta:messages create --beta compact-2026-01-12 <<'YAML'
model: claude-opus-4-7
max_tokens: 4096
messages:
  - role: user
    content: Help me build a website
context_management:
  edits:
    - type: compact_20260112
YAML
import anthropic

client = anthropic.Anthropic()

messages = [{"role": "user", "content": "Help me build a website"}]

response = client.beta.messages.create(
    betas=["compact-2026-01-12"],
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=messages,
    context_management={"edits": [{"type": "compact_20260112"}]},
)

# Append the response (including any compaction block) to continue the conversation
messages.append({"role": "assistant", "content": response.content})
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const messages: Anthropic.Beta.Messages.BetaMessageParam[] = [
  { role: "user", content: "Help me build a website" }
];

const response = await client.beta.messages.create({
  betas: ["compact-2026-01-12"],
  model: "claude-opus-4-7",
  max_tokens: 4096,
  messages,
  context_management: {
    edits: [
      {
        type: "compact_20260112"
      }
    ]
  }
});

// Append the response (including any compaction block) to continue the conversation
messages.push({
  role: "assistant",
  content: response.content
});
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Anthropic;
using Anthropic.Models.Beta.Messages;

class Program
{
    static async Task Main(string[] args)
    {
        AnthropicClient client = new();

        var messages = new List<BetaMessageParam>
        {
            new() { Role = Role.User, Content = "Help me build a website" }
        };

        var parameters = new MessageCreateParams
        {
            Betas = ["compact-2026-01-12"],
            Model = "claude-opus-4-7",
            MaxTokens = 4096,
            Messages = messages,
            ContextManagement = new BetaContextManagementConfig
            {
                Edits = [new BetaCompact20260112Edit()]
            }
        };

        var response = await client.Beta.Messages.Create(parameters);

        // Append the response (including any compaction block) to continue the conversation
        messages.Add(new BetaMessageParam
        {
            Role = Role.Assistant,
            Content = response.Content.Select(b => new BetaContentBlockParam(b.Json)).ToList()
        });

        Console.WriteLine(response);
    }
}
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/anthropics/anthropic-sdk-go"
)

func main() {
	client := anthropic.NewClient()

	messages := []anthropic.BetaMessageParam{
		anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock("Help me build a website")),
	}

	response, err := client.Beta.Messages.New(context.TODO(), anthropic.BetaMessageNewParams{
		Model:     anthropic.ModelClaudeOpus4_7,
		MaxTokens: 4096,
		Messages:  messages,
		ContextManagement: anthropic.BetaContextManagementConfigParam{
			Edits: []anthropic.BetaContextManagementConfigEditUnionParam{
				{OfCompact20260112: &anthropic.BetaCompact20260112EditParam{}},
			},
		},
		Betas: []anthropic.AnthropicBeta{"compact-2026-01-12"},
	})
	if err != nil {
		log.Fatal(err)
	}

	// Append the response (including any compaction block) to continue the conversation
	messages = append(messages, response.ToParam())

	fmt.Println(response)
}
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.beta.messages.MessageCreateParams;
import com.anthropic.models.beta.messages.BetaMessage;
import com.anthropic.models.beta.messages.BetaContextManagementConfig;
import com.anthropic.models.beta.messages.BetaCompact20260112Edit;

public class CompactionExample {
    public static void main(String[] args) {
        AnthropicClient client = AnthropicOkHttpClient.fromEnv();

        MessageCreateParams params = MessageCreateParams.builder()
            .addBeta("compact-2026-01-12")
            .model("claude-opus-4-7")
            .maxTokens(4096L)
            .addUserMessage("Help me build a website")
            .contextManagement(BetaContextManagementConfig.builder()
                .addEdit(BetaCompact20260112Edit.builder().build())
                .build())
            .build();

        BetaMessage response = client.beta().messages().create(params);

        // To continue the conversation, include the response (including any
        // compaction block) in the messages of the next request
        System.out.println(response);
    }
}
<?php

use Anthropic\Client;

$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));

$messages = [
    ['role' => 'user', 'content' => 'Help me build a website']
];

$response = $client->beta->messages->create(
    maxTokens: 4096,
    messages: $messages,
    model: 'claude-opus-4-7',
    betas: ['compact-2026-01-12'],
    contextManagement: [
        'edits' => [
            ['type' => 'compact_20260112']
        ]
    ]
);

// Append the response (including any compaction block) to continue the conversation
$messages[] = ['role' => 'assistant', 'content' => $response->content];

echo $response->content[0]->text;
require "anthropic"

client = Anthropic::Client.new

messages = [
  { role: "user", content: "Help me build a website" }
]

response = client.beta.messages.create(
  betas: ["compact-2026-01-12"],
  model: "claude-opus-4-7",
  max_tokens: 4096,
  messages: messages,
  context_management: {
    edits: [{ type: "compact_20260112" }]
  }
)

# Append the response (including any compaction block) to continue the conversation
messages << { role: "assistant", content: response.content }

puts response

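Parameters

Parameter                Type      Default          Description
type                     string    Required         Must be "compact_20260112".
trigger                  object    150,000 tokens   When to trigger compaction. Must be at least 50,000 tokens.
pause_after_compaction   boolean   false            Whether to pause after generating the compaction summary.
instructions             string    null             Custom summarization prompt. Completely replaces the default prompt when provided.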
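
As a rough illustration of how these parameters fit together, the sketch below builds one entry for context_management.edits and checks the 50,000-token minimum before a request is sent. `build_compaction_edit` is a hypothetical helper, not part of any SDK. The dict keys and the trigger shape are taken from the examples on this page.

```python
from typing import Any, Optional

MIN_TRIGGER_TOKENS = 50_000  # minimum stated in the parameter table


def build_compaction_edit(
    trigger_tokens: Optional[int] = None,
    pause_after_compaction: bool = False,
    instructions: Optional[str] = None,
) -> dict[str, Any]:
    """Build one context_management.edits entry (hypothetical helper)."""
    edit: dict[str, Any] = {"type": "compact_20260112"}
    if trigger_tokens is not None:
        if trigger_tokens < MIN_TRIGGER_TOKENS:
            raise ValueError(
                f"trigger must be at least {MIN_TRIGGER_TOKENS} tokens, "
                f"got {trigger_tokens}"
            )
        edit["trigger"] = {"type": "input_tokens", "value": trigger_tokens}
    if pause_after_compaction:
        edit["pause_after_compaction"] = True
    if instructions is not None:
        edit["instructions"] = instructions
    return edit


context_management = {"edits": [build_compaction_edit(trigger_tokens=100_000)]}
print(context_management)
```

Omitted parameters are left out of the dict so that the API applies its own defaults. Validating locally only gives an earlier error message; the API remains the authority on what it accepts.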

Trigger configuration

Use the trigger parameter to configure when compaction is triggered:

ant beta:messages create --beta compact-2026-01-12 <<'YAML'
model: claude-opus-4-7
max_tokens: 4096
messages:
  - role: user
    content: Hello, Claude
context_management:
  edits:
    - type: compact_20260112
      trigger:
        type: input_tokens
        value: 150000
YAML
import anthropic

client = anthropic.Anthropic()
messages = [{"role": "user", "content": "Hello, Claude"}]
response = client.beta.messages.create(
    betas=["compact-2026-01-12"],
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=messages,
    context_management={
        "edits": [
            {
                "type": "compact_20260112",
                "trigger": {"type": "input_tokens", "value": 150000},
            }
        ]
    },
)
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();
const messages: Anthropic.Beta.Messages.BetaMessageParam[] = [
  { role: "user", content: "Hello, Claude" }
];

const response = await client.beta.messages.create({
  betas: ["compact-2026-01-12"],
  model: "claude-opus-4-7",
  max_tokens: 4096,
  messages,
  context_management: {
    edits: [
      {
        type: "compact_20260112",
        trigger: {
          type: "input_tokens",
          value: 150000
        }
      }
    ]
  }
});
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Anthropic;
using Anthropic.Models.Beta.Messages;

class Program
{
    static async Task Main(string[] args)
    {
        AnthropicClient client = new();
        List<BetaMessageParam> messages = [new() { Role = Role.User, Content = "Hello" }];

        var parameters = new MessageCreateParams
        {
            Model = "claude-opus-4-7",
            MaxTokens = 4096,
            Betas = ["compact-2026-01-12"],
            Messages = messages,
            ContextManagement = new BetaContextManagementConfig
            {
                Edits = [new BetaCompact20260112Edit
                {
                    Trigger = new BetaInputTokensTrigger(150000)
                }]
            }
        };

        var message = await client.Beta.Messages.Create(parameters);
        Console.WriteLine(message);
    }
}
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/anthropics/anthropic-sdk-go"
)

func main() {
	client := anthropic.NewClient()
	messages := []anthropic.BetaMessageParam{anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock("Hello, Claude"))}

	response, err := client.Beta.Messages.New(context.TODO(), anthropic.BetaMessageNewParams{
		Model:     anthropic.ModelClaudeOpus4_7,
		MaxTokens: 4096,
		Messages:  messages,
		ContextManagement: anthropic.BetaContextManagementConfigParam{
			Edits: []anthropic.BetaContextManagementConfigEditUnionParam{
				{OfCompact20260112: &anthropic.BetaCompact20260112EditParam{
					Trigger: anthropic.BetaInputTokensTriggerParam{Value: 150000},
				}},
			},
		},
		Betas: []anthropic.AnthropicBeta{"compact-2026-01-12"},
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(response)
}
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.beta.messages.MessageCreateParams;
import com.anthropic.models.beta.messages.BetaMessage;
import com.anthropic.models.beta.messages.BetaContextManagementConfig;
import com.anthropic.models.beta.messages.BetaCompact20260112Edit;
import com.anthropic.models.beta.messages.BetaInputTokensTrigger;

public class CompactionExample {
    public static void main(String[] args) {
        AnthropicClient client = AnthropicOkHttpClient.fromEnv();

        MessageCreateParams params = MessageCreateParams.builder()
            .model("claude-opus-4-7")
            .maxTokens(4096L)
            .addBeta("compact-2026-01-12")
            .addUserMessage("Hello, Claude")
            .contextManagement(BetaContextManagementConfig.builder()
                .addEdit(BetaCompact20260112Edit.builder()
                    .trigger(BetaInputTokensTrigger.builder()
                        .value(150000L)
                        .build())
                    .build())
                .build())
            .build();

        BetaMessage response = client.beta().messages().create(params);
        System.out.println(response);
    }
}
<?php

use Anthropic\Client;

$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));
$messages = [['role' => 'user', 'content' => 'Hello, Claude']];

$message = $client->beta->messages->create(
    maxTokens: 4096,
    messages: $messages,
    model: 'claude-opus-4-7',
    betas: ['compact-2026-01-12'],
    contextManagement: [
        'edits' => [
            [
                'type' => 'compact_20260112',
                'trigger' => [
                    'type' => 'input_tokens',
                    'value' => 150000
                ]
            ]
        ]
    ]
);

echo $message;
require "anthropic"

client = Anthropic::Client.new
messages = [{ role: "user", content: "Hello, Claude" }]

response = client.beta.messages.create(
  betas: ["compact-2026-01-12"],
  model: "claude-opus-4-7",
  max_tokens: 4096,
  messages: messages,
  context_management: {
    edits: [
      {
        type: "compact_20260112",
        trigger: {
          type: "input_tokens",
          value: 150000
        }
      }
    ]
  }
)
puts response

Custom summarization instructions

By default, compaction uses the following summarization prompt:

You have written a partial transcript for the initial task above. Please write a summary of the transcript. The purpose of this summary is to provide continuity so you can continue to make progress towards solving the task in a future context, where the raw history above may not be accessible and will be replaced with this summary. Write down anything that would be helpful, including the state, next steps, learnings etc. You must wrap your summary in a <summary></summary> block.

You can supply custom instructions through the instructions parameter. Custom instructions do not supplement the default prompt; they replace it entirely:

ant beta:messages create --beta compact-2026-01-12 <<'YAML'
model: claude-opus-4-7
max_tokens: 4096
messages:
  - role: user
    content: Hello, Claude
context_management:
  edits:
    - type: compact_20260112
      instructions: >-
        Focus on preserving code snippets, variable names, and
        technical decisions.
YAML
import anthropic

client = anthropic.Anthropic()
messages = [{"role": "user", "content": "Hello, Claude"}]
response = client.beta.messages.create(
    betas=["compact-2026-01-12"],
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=messages,
    context_management={
        "edits": [
            {
                "type": "compact_20260112",
                "instructions": "Focus on preserving code snippets, variable names, and technical decisions.",
            }
        ]
    },
)
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();
const messages: Anthropic.Beta.Messages.BetaMessageParam[] = [
  { role: "user", content: "Hello, Claude" }
];

const response = await client.beta.messages.create({
  betas: ["compact-2026-01-12"],
  model: "claude-opus-4-7",
  max_tokens: 4096,
  messages,
  context_management: {
    edits: [
      {
        type: "compact_20260112",
        instructions:
          "Focus on preserving code snippets, variable names, and technical decisions."
      }
    ]
  }
});
using System;
using System.Threading.Tasks;
using Anthropic;
using Anthropic.Models.Beta.Messages;

class Program
{
    static async Task Main(string[] args)
    {
        AnthropicClient client = new();

        var parameters = new MessageCreateParams
        {
            Betas = ["compact-2026-01-12"],
            Model = "claude-opus-4-7",
            MaxTokens = 4096,
            Messages =
            [
                new BetaMessageParam { Role = Role.User, Content = "Help me build a Python web scraper" },
                new BetaMessageParam { Role = Role.Assistant, Content = "I'll help you build a web scraper..." },
                new BetaMessageParam { Role = Role.User, Content = "Add support for JavaScript-rendered pages" }
            ],
            ContextManagement = new BetaContextManagementConfig
            {
                Edits = [new BetaCompact20260112Edit
                {
                    Instructions = "Focus on preserving code snippets, variable names, and technical decisions."
                }]
            }
        };

        var message = await client.Beta.Messages.Create(parameters);
        Console.WriteLine(message);
    }
}
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/anthropics/anthropic-sdk-go"
)

func main() {
	client := anthropic.NewClient()

	response, err := client.Beta.Messages.New(context.TODO(), anthropic.BetaMessageNewParams{
		Model:     anthropic.ModelClaudeOpus4_7,
		MaxTokens: 4096,
		Messages: []anthropic.BetaMessageParam{
			anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock("Help me build a Python web scraper")),
			{Role: anthropic.BetaMessageParamRoleAssistant, Content: []anthropic.BetaContentBlockParamUnion{anthropic.NewBetaTextBlock("I'll help you build a web scraper...")}},
			anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock("Add support for JavaScript-rendered pages")),
		},
		ContextManagement: anthropic.BetaContextManagementConfigParam{
			Edits: []anthropic.BetaContextManagementConfigEditUnionParam{
				{OfCompact20260112: &anthropic.BetaCompact20260112EditParam{
					Instructions: anthropic.String("Focus on preserving code snippets, variable names, and technical decisions."),
				}},
			},
		},
		Betas: []anthropic.AnthropicBeta{"compact-2026-01-12"},
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(response)
}
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.beta.messages.MessageCreateParams;
import com.anthropic.models.beta.messages.BetaMessage;
import com.anthropic.models.beta.messages.BetaContextManagementConfig;
import com.anthropic.models.beta.messages.BetaCompact20260112Edit;

public class CompactionExample {
    public static void main(String[] args) {
        AnthropicClient client = AnthropicOkHttpClient.fromEnv();

        MessageCreateParams params = MessageCreateParams.builder()
            .addBeta("compact-2026-01-12")
            .model("claude-opus-4-7")
            .maxTokens(4096L)
            .addUserMessage("Help me build a Python web scraper")
            .addAssistantMessage("I'll help you build a web scraper...")
            .addUserMessage("Add support for JavaScript-rendered pages")
            .contextManagement(BetaContextManagementConfig.builder()
                .addEdit(BetaCompact20260112Edit.builder()
                    .instructions("Focus on preserving code snippets, variable names, and technical decisions.")
                    .build())
                .build())
            .build();

        BetaMessage response = client.beta().messages().create(params);
        System.out.println(response);
    }
}
<?php
use Anthropic\Client;

$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));

$response = $client->beta->messages->create(
    maxTokens: 4096,
    messages: [
        ['role' => 'user', 'content' => 'Help me build a Python web scraper'],
        ['role' => 'assistant', 'content' => "I'll help you build a web scraper..."],
        ['role' => 'user', 'content' => 'Add support for JavaScript-rendered pages']
    ],
    model: 'claude-opus-4-7',
    betas: ['compact-2026-01-12'],
    contextManagement: [
        'edits' => [
            [
                'type' => 'compact_20260112',
                'instructions' => 'Focus on preserving code snippets, variable names, and technical decisions.'
            ]
        ]
    ]
);

echo $response->content[0]->text;
require "anthropic"

client = Anthropic::Client.new

response = client.beta.messages.create(
  betas: ["compact-2026-01-12"],
  model: "claude-opus-4-7",
  max_tokens: 4096,
  messages: [
    { role: "user", content: "Help me build a Python web scraper" },
    { role: "assistant", content: "I'll help you build a web scraper..." },
    { role: "user", content: "Add support for JavaScript-rendered pages" }
  ],
  context_management: {
    edits: [
      {
        type: "compact_20260112",
        instructions:
          "Focus on preserving code snippets, variable names, and technical decisions."
      }
    ]
  }
)

puts response
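
Because custom instructions replace the default prompt rather than extend it, one option for keeping the default guidance is to include its text yourself. The sketch below assumes the default prompt is exactly the text quoted in this section, which may change over time. `extended_instructions` is a hypothetical helper.

```python
# Default summarization prompt as quoted on this page (assumed current).
DEFAULT_PROMPT = (
    "You have written a partial transcript for the initial task above. "
    "Please write a summary of the transcript. The purpose of this summary "
    "is to provide continuity so you can continue to make progress towards "
    "solving the task in a future context, where the raw history above may "
    "not be accessible and will be replaced with this summary. Write down "
    "anything that would be helpful, including the state, next steps, "
    "learnings etc. You must wrap your summary in a <summary></summary> block."
)


def extended_instructions(extra: str) -> str:
    """Return the default prompt followed by extra guidance (hypothetical)."""
    return f"{DEFAULT_PROMPT}\n\n{extra}"


edit = {
    "type": "compact_20260112",
    "instructions": extended_instructions(
        "Focus on preserving code snippets, variable names, and technical decisions."
    ),
}
print(edit["instructions"])
```

The resulting dict can be passed as an entry in context_management.edits in the same way as the examples above.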

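Pause after compaction

Use pause_after_compaction to pause the API after it generates the compaction summary. This lets you add additional content blocks (for example, to preserve recent messages or add specific instruction messages) before the API continues with the response.

When enabled, the API returns a message with the compaction stop reason after it generates the compaction block: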

ant beta:messages create --beta compact-2026-01-12 \
  --transform '{stop_reason,content}' --format jsonl <<'YAML' > resp.json
model: claude-opus-4-7
max_tokens: 4096
messages:
  - role: user
    content: "Hello, Claude"
context_management:
  edits:
    - type: compact_20260112
      pause_after_compaction: true
YAML

# Check whether compaction triggered a pause
if grep -q '"stop_reason":"compaction"' resp.json; then
  # The response contains only the compaction block
  RESP=$(cat resp.json)
  CONTENT="${RESP#*\"content\":}"
  printf '%s' "${CONTENT%\}}" > content.json

  # Continue the request
  ant beta:messages create --beta compact-2026-01-12 <<YAML > /dev/null
model: claude-opus-4-7
max_tokens: 4096
messages:
  - role: user
    content: "Hello, Claude"
  - role: assistant
    content: $(cat content.json)
context_management:
  edits:
    - type: compact_20260112
YAML
fi
import anthropic

client = anthropic.Anthropic()
messages = [{"role": "user", "content": "Hello, Claude"}]
response = client.beta.messages.create(
    betas=["compact-2026-01-12"],
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=messages,
    context_management={
        "edits": [{"type": "compact_20260112", "pause_after_compaction": True}]
    },
)

# Check whether compaction triggered a pause
if response.stop_reason == "compaction":
    # The response contains only the compaction block
    messages.append({"role": "assistant", "content": response.content})

    # Continue the request
    response = client.beta.messages.create(
        betas=["compact-2026-01-12"],
        model="claude-opus-4-7",
        max_tokens=4096,
        messages=messages,
        context_management={"edits": [{"type": "compact_20260112"}]},
    )
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();
const messages: Anthropic.Beta.Messages.BetaMessageParam[] = [
  { role: "user", content: "Hello, Claude" }
];

let response = await client.beta.messages.create({
  betas: ["compact-2026-01-12"],
  model: "claude-opus-4-7",
  max_tokens: 4096,
  messages,
  context_management: {
    edits: [
      {
        type: "compact_20260112",
        pause_after_compaction: true
      }
    ]
  }
});

// Check whether compaction triggered a pause
if (response.stop_reason === "compaction") {
  // The response contains only the compaction block
  messages.push({
    role: "assistant",
    content: response.content
  });

  // Continue the request
  response = await client.beta.messages.create({
    betas: ["compact-2026-01-12"],
    model: "claude-opus-4-7",
    max_tokens: 4096,
    messages,
    context_management: {
      edits: [{ type: "compact_20260112" }]
    }
  });
}
using Anthropic;
using Anthropic.Models.Beta.Messages;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class Program
{
    static async Task Main(string[] args)
    {
        var client = new AnthropicClient();
        var messages = new List<BetaMessageParam>
        {
            new() { Role = Role.User, Content = "Hello, Claude" }
        };

        var parameters = new MessageCreateParams
        {
            Model = "claude-opus-4-7",
            MaxTokens = 4096,
            Betas = ["compact-2026-01-12"],
            Messages = messages,
            ContextManagement = new BetaContextManagementConfig
            {
                Edits = [new BetaCompact20260112Edit
                {
                    PauseAfterCompaction = true
                }]
            }
        };

        var response = await client.Beta.Messages.Create(parameters);

        if (response.StopReason == BetaStopReason.Compaction)
        {
            messages.Add(new BetaMessageParam
            {
                Role = Role.Assistant,
                Content = response.Content.Select(b => new BetaContentBlockParam(b.Json)).ToList()
            });

            parameters = new()
            {
                Model = "claude-opus-4-7",
                MaxTokens = 4096,
                Betas = ["compact-2026-01-12"],
                Messages = messages,
                ContextManagement = new BetaContextManagementConfig
                {
                    Edits = [new BetaCompact20260112Edit()]
                }
            };

            response = await client.Beta.Messages.Create(parameters);
        }

        Console.WriteLine(response);
    }
}
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/anthropics/anthropic-sdk-go"
)

func main() {
	client := anthropic.NewClient()
	messages := []anthropic.BetaMessageParam{anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock("Hello, Claude"))}

	compactEdit := anthropic.BetaContextManagementConfigParam{
		Edits: []anthropic.BetaContextManagementConfigEditUnionParam{
			{OfCompact20260112: &anthropic.BetaCompact20260112EditParam{
				PauseAfterCompaction: anthropic.Bool(true),
			}},
		},
	}

	response, err := client.Beta.Messages.New(context.TODO(), anthropic.BetaMessageNewParams{
		Model:             anthropic.ModelClaudeOpus4_7,
		MaxTokens:         4096,
		Messages:          messages,
		ContextManagement: compactEdit,
		Betas:             []anthropic.AnthropicBeta{"compact-2026-01-12"},
	})
	if err != nil {
		log.Fatal(err)
	}

	if response.StopReason == "compaction" {
		messages = append(messages, response.ToParam())

		response, err = client.Beta.Messages.New(context.TODO(), anthropic.BetaMessageNewParams{
			Model:     anthropic.ModelClaudeOpus4_7,
			MaxTokens: 4096,
			Messages:  messages,
			ContextManagement: anthropic.BetaContextManagementConfigParam{
				Edits: []anthropic.BetaContextManagementConfigEditUnionParam{
					{OfCompact20260112: &anthropic.BetaCompact20260112EditParam{}},
				},
			},
			Betas: []anthropic.AnthropicBeta{"compact-2026-01-12"},
		})
		if err != nil {
			log.Fatal(err)
		}
	}

	fmt.Println(response)
}
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.beta.messages.MessageCreateParams;
import com.anthropic.models.beta.messages.BetaMessage;
import com.anthropic.models.beta.messages.BetaContextManagementConfig;
import com.anthropic.models.beta.messages.BetaCompact20260112Edit;
import com.anthropic.models.beta.messages.BetaStopReason;

public class CompactionPauseExample {
    public static void main(String[] args) {
        AnthropicClient client = AnthropicOkHttpClient.fromEnv();

        MessageCreateParams params = MessageCreateParams.builder()
            .model("claude-opus-4-7")
            .maxTokens(4096L)
            .addBeta("compact-2026-01-12")
            .addUserMessage("Help me build a website")
            .contextManagement(BetaContextManagementConfig.builder()
                .addEdit(BetaCompact20260112Edit.builder()
                    .pauseAfterCompaction(true)
                    .build())
                .build())
            .build();

        BetaMessage response = client.beta().messages().create(params);

        // Check whether compaction triggered a pause
        if (response.stopReason().isPresent()
                && response.stopReason().get().equals(BetaStopReason.COMPACTION)) {
            // Append the compaction block and continue by building
            // a new request that uses the compacted context
            MessageCreateParams continueParams = MessageCreateParams.builder()
                .model("claude-opus-4-7")
                .maxTokens(4096L)
                .addBeta("compact-2026-01-12")
                .addUserMessage("Help me build a website")
                .addMessage(response)
                .contextManagement(BetaContextManagementConfig.builder()
                    .addEdit(BetaCompact20260112Edit.builder().build())
                    .build())
                .build();

            response = client.beta().messages().create(continueParams);
        }

        System.out.println(response);
    }
}
<?php

use Anthropic\Client;

// The PHP SDK does not yet expose a typed constant for the `compaction` stop reason; compare against the string value directly.
$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));
$messages = [['role' => 'user', 'content' => 'Hello, Claude']];

$response = $client->beta->messages->create(
    maxTokens: 4096,
    messages: $messages,
    model: 'claude-opus-4-7',
    betas: ['compact-2026-01-12'],
    contextManagement: [
        'edits' => [
            [
                'type' => 'compact_20260112',
                'pause_after_compaction' => true
            ]
        ]
    ]
);

if ($response->stopReason === 'compaction') {
    $messages[] = [
        'role' => 'assistant',
        'content' => $response->content
    ];

    $response = $client->beta->messages->create(
        maxTokens: 4096,
        messages: $messages,
        model: 'claude-opus-4-7',
        betas: ['compact-2026-01-12'],
        contextManagement: [
            'edits' => [
                ['type' => 'compact_20260112']
            ]
        ]
    );
}

echo $response;
require "anthropic"

client = Anthropic::Client.new
messages = [{ role: "user", content: "Hello, Claude" }]

response = client.beta.messages.create(
  betas: ["compact-2026-01-12"],
  model: "claude-opus-4-7",
  max_tokens: 4096,
  messages: messages,
  context_management: {
    edits: [
      {
        type: "compact_20260112",
        pause_after_compaction: true
      }
    ]
  }
)

if response.stop_reason == :compaction
  messages << { role: "assistant", content: response.content }

  response = client.beta.messages.create(
    betas: ["compact-2026-01-12"],
    model: "claude-opus-4-7",
    max_tokens: 4096,
    messages: messages,
    context_management: {
      edits: [{ type: "compact_20260112" }]
    }
  )
end

puts response
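
The pause-then-continue pattern can be factored into a small helper that leaves the choice of what to add to the caller. The sketch below is an assumption-laden illustration: `send_with_pause` and `fake_send` are hypothetical, responses are modeled as plain dicts with "stop_reason" and "content" keys, and `send` stands in for a wrapper around the real API call so that the flow can be followed without network access.

```python
from typing import Any, Callable, Optional

Message = dict[str, Any]
Response = dict[str, Any]  # assumed shape: {"stop_reason": ..., "content": [...]}


def send_with_pause(
    send: Callable[[list[Message]], Response],
    messages: list[Message],
    on_pause: Optional[Callable[[list[Message]], list[Message]]] = None,
) -> Response:
    """Send once; if compaction paused the response, append the compaction
    block, let on_pause add extra messages, then send again."""
    response = send(messages)
    if response["stop_reason"] != "compaction":
        return response
    # The paused response holds only the compaction block; keep it in history.
    messages.append({"role": "assistant", "content": response["content"]})
    if on_pause is not None:
        messages.extend(on_pause(messages))
    return send(messages)


def fake_send(messages: list[Message]) -> Response:
    """Stand-in for the API: pauses on the first call, answers afterwards."""
    has_compaction = any(
        isinstance(m["content"], list)
        and any(b.get("type") == "compaction" for b in m["content"])
        for m in messages
    )
    if not has_compaction:
        return {
            "stop_reason": "compaction",
            "content": [{"type": "compaction", "content": "Summary so far..."}],
        }
    return {
        "stop_reason": "end_turn",
        "content": [{"type": "text", "text": "Continuing from the summary."}],
    }


messages = [{"role": "user", "content": "Hello, Claude"}]
final = send_with_pause(
    fake_send,
    messages,
    on_pause=lambda _: [
        {"role": "user", "content": "Keep the API design notes in mind."}
    ],
)
print(final["stop_reason"], len(messages))
```

In real use, `send` would wrap client.beta.messages.create with the compaction edit configured, and `on_pause` would return whatever you want to carry over, such as an instruction message.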

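Enforcing a total token budget

When a model works on a long task with many tool-use iterations, total token consumption can grow significantly. You can combine pause_after_compaction with a compaction counter to estimate cumulative usage and gracefully wrap up the task once the budget is reached: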

import anthropic

client = anthropic.Anthropic()
messages = [{"role": "user", "content": "Hello, Claude"}]
TRIGGER_THRESHOLD = 100_000
TOTAL_TOKEN_BUDGET = 3_000_000
n_compactions = 0

response = client.beta.messages.create(
    betas=["compact-2026-01-12"],
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=messages,
    context_management={
        "edits": [
            {
                "type": "compact_20260112",
                "trigger": {"type": "input_tokens", "value": TRIGGER_THRESHOLD},
                "pause_after_compaction": True,
            }
        ]
    },
)

if response.stop_reason == "compaction":
    n_compactions += 1
    messages.append({"role": "assistant", "content": response.content})

    # Estimate total tokens consumed; ask the model to wrap up if over budget
    if n_compactions * TRIGGER_THRESHOLD >= TOTAL_TOKEN_BUDGET:
        messages.append(
            {
                "role": "user",
                "content": "Please wrap up your current work and summarize the final state.",
            }
        )
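
To make the arithmetic in this check concrete, the sketch below isolates the budget test and computes how many compactions it takes to reach the budget. The helper names are hypothetical, and the product of compaction count and trigger threshold is only the rough estimate used in the example above, not a measurement of actual token usage.

```python
def should_wrap_up(n_compactions: int, trigger_threshold: int, total_budget: int) -> bool:
    """Same estimate as above: each compaction counts as one threshold's worth."""
    return n_compactions * trigger_threshold >= total_budget


def compactions_until_budget(trigger_threshold: int, total_budget: int) -> int:
    """Smallest compaction count for which should_wrap_up is true."""
    return -(-total_budget // trigger_threshold)  # ceiling division


TRIGGER_THRESHOLD = 100_000
TOTAL_TOKEN_BUDGET = 3_000_000

limit = compactions_until_budget(TRIGGER_THRESHOLD, TOTAL_TOKEN_BUDGET)
print(limit)  # 30 with the values above
print(should_wrap_up(limit - 1, TRIGGER_THRESHOLD, TOTAL_TOKEN_BUDGET))  # False
print(should_wrap_up(limit, TRIGGER_THRESHOLD, TOTAL_TOKEN_BUDGET))  # True
```

With a 100,000-token trigger and a 3,000,000-token budget, the wrap-up message would be added at the 30th compaction under this estimate.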

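Working with compaction blocks

When compaction is triggered, the API returns a compaction block at the start of the assistant response.

Long-running conversations may go through multiple compactions. The last compaction block reflects the final state of the prompt, replacing everything before it with the generated summary.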

{
  "content": [
    {
      "type": "compaction",
      "content": "Summary of the conversation: The user requested help building a web scraper..."
    },
    {
      "type": "text",
      "text": "Based on our conversation so far..."
    }
  ]
}
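
If you want to inspect or log the summary, you can read it from the compaction block. The sketch below assumes the response body has been parsed into plain dicts shaped like the JSON above; SDK response objects expose attributes instead of dict keys. `latest_compaction_summary` is a hypothetical helper.

```python
from typing import Any, Optional


def latest_compaction_summary(content: list[dict[str, Any]]) -> Optional[str]:
    """Return the summary text of the last compaction block, if any."""
    summary = None
    for block in content:
        if block.get("type") == "compaction":
            summary = block.get("content")
    return summary


response_body = {
    "content": [
        {
            "type": "compaction",
            "content": "Summary of the conversation: The user requested help building a web scraper...",
        },
        {"type": "text", "text": "Based on our conversation so far..."},
    ]
}

summary = latest_compaction_summary(response_body["content"])
print(summary)
```

The helper returns None when no compaction happened on that turn, which is the common case until the trigger threshold is reached.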

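Passing compaction blocks back

You must pass the compaction block back to the API in subsequent requests to continue the conversation with the shortened prompt. The simplest approach is to append the entire response content to your messages: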

ant beta:messages create --beta compact-2026-01-12 \
  --transform content --format jsonl <<'YAML' > content.json
model: claude-opus-4-7
max_tokens: 4096
messages:
  - role: user
    content: Hello, Claude
context_management:
  edits:
    - type: compact_20260112
YAML

# After receiving a response with a compaction block, append it as an assistant turn
# and continue the conversation
ant beta:messages create --beta compact-2026-01-12 <<YAML
model: claude-opus-4-7
max_tokens: 4096
messages:
  - role: user
    content: Hello, Claude
  - role: assistant
    content: $(cat content.json)
  - role: user
    content: Now add error handling
context_management:
  edits:
    - type: compact_20260112
YAML
import anthropic

client = anthropic.Anthropic()
messages = [{"role": "user", "content": "Hello, Claude"}]
response = client.beta.messages.create(
    betas=["compact-2026-01-12"],
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=messages,
    context_management={"edits": [{"type": "compact_20260112"}]},
)
# After receiving a response with a compaction block
messages.append({"role": "assistant", "content": response.content})

# Continue the conversation
messages.append({"role": "user", "content": "Now add error handling"})

response = client.beta.messages.create(
    betas=["compact-2026-01-12"],
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=messages,
    context_management={"edits": [{"type": "compact_20260112"}]},
)
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();
const messages: Anthropic.Beta.Messages.BetaMessageParam[] = [
  { role: "user", content: "Hello, Claude" }
];

// Assume this is the response from a previous request
const response = await client.beta.messages.create({
  betas: ["compact-2026-01-12"],
  model: "claude-opus-4-7",
  max_tokens: 4096,
  messages,
  context_management: {
    edits: [{ type: "compact_20260112" }]
  }
});

// After receiving a response that contains a compaction block
messages.push({
  role: "assistant",
  content: response.content
});

// Continue the conversation
messages.push({ role: "user", content: "Now add error handling" });

const nextResponse = await client.beta.messages.create({
  betas: ["compact-2026-01-12"],
  model: "claude-opus-4-7",
  max_tokens: 4096,
  messages,
  context_management: {
    edits: [{ type: "compact_20260112" }]
  }
});
using Anthropic;
using Anthropic.Models.Beta.Messages;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class Program
{
    static async Task Main(string[] args)
    {
        AnthropicClient client = new();

        var messages = new List<BetaMessageParam>
        {
            new() { Role = Role.User, Content = "Help me build a web scraper" }
        };

        var response = await client.Beta.Messages.Create(new()
        {
            Betas = ["compact-2026-01-12"],
            Model = "claude-opus-4-7",
            MaxTokens = 4096,
            Messages = messages,
            ContextManagement = new BetaContextManagementConfig
            {
                Edits = [new BetaCompact20260112Edit()]
            }
        });

        messages.Add(new BetaMessageParam
        {
            Role = Role.Assistant,
            Content = response.Content.Select(b => new BetaContentBlockParam(b.Json)).ToList()
        });

        messages.Add(new BetaMessageParam { Role = Role.User, Content = "Now add error handling" });

        var nextResponse = await client.Beta.Messages.Create(new()
        {
            Betas = ["compact-2026-01-12"],
            Model = "claude-opus-4-7",
            MaxTokens = 4096,
            Messages = messages,
            ContextManagement = new BetaContextManagementConfig
            {
                Edits = [new BetaCompact20260112Edit()]
            }
        });

        Console.WriteLine(nextResponse);
    }
}
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/anthropics/anthropic-sdk-go"
)

func main() {
	client := anthropic.NewClient()

	messages := []anthropic.BetaMessageParam{
		anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock("Help me build a web scraper")),
	}

	compactEdit := anthropic.BetaContextManagementConfigParam{
		Edits: []anthropic.BetaContextManagementConfigEditUnionParam{
			{OfCompact20260112: &anthropic.BetaCompact20260112EditParam{}},
		},
	}

	response, err := client.Beta.Messages.New(context.TODO(), anthropic.BetaMessageNewParams{
		Model:             anthropic.ModelClaudeOpus4_7,
		MaxTokens:         4096,
		Messages:          messages,
		ContextManagement: compactEdit,
		Betas:             []anthropic.AnthropicBeta{"compact-2026-01-12"},
	})
	if err != nil {
		log.Fatal(err)
	}

	messages = append(messages, response.ToParam())

	messages = append(messages, anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock("Now add error handling")))

	nextResponse, err := client.Beta.Messages.New(context.TODO(), anthropic.BetaMessageNewParams{
		Model:             anthropic.ModelClaudeOpus4_7,
		MaxTokens:         4096,
		Messages:          messages,
		ContextManagement: compactEdit,
		Betas:             []anthropic.AnthropicBeta{"compact-2026-01-12"},
	})
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(nextResponse)
}
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.beta.messages.MessageCreateParams;
import com.anthropic.models.beta.messages.BetaMessage;
import com.anthropic.models.beta.messages.BetaContextManagementConfig;
import com.anthropic.models.beta.messages.BetaCompact20260112Edit;

public class CompactionExample {
    public static void main(String[] args) {
        AnthropicClient client = AnthropicOkHttpClient.fromEnv();

        // First request
        BetaMessage response = client.beta().messages().create(
            MessageCreateParams.builder()
                .addBeta("compact-2026-01-12")
                .model("claude-opus-4-7")
                .maxTokens(4096L)
                .addUserMessage("Help me build a web scraper")
                .contextManagement(BetaContextManagementConfig.builder()
                    .addEdit(BetaCompact20260112Edit.builder().build())
                    .build())
                .build());

        // After receiving a response that contains a compaction block, append the full content
        // (including the compaction block) and continue the conversation
        BetaMessage nextResponse = client.beta().messages().create(
            MessageCreateParams.builder()
                .addBeta("compact-2026-01-12")
                .model("claude-opus-4-7")
                .maxTokens(4096L)
                .addUserMessage("Help me build a web scraper")
                .addMessage(response)
                .addUserMessage("Now add error handling")
                .contextManagement(BetaContextManagementConfig.builder()
                    .addEdit(BetaCompact20260112Edit.builder().build())
                    .build())
                .build());

        System.out.println(nextResponse);
    }
}
<?php

use Anthropic\Client;

$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));

$messages = [
    ['role' => 'user', 'content' => 'Help me build a web scraper']
];

$response = $client->beta->messages->create(
    maxTokens: 4096,
    messages: $messages,
    model: 'claude-opus-4-7',
    betas: ['compact-2026-01-12'],
    contextManagement: [
        'edits' => [['type' => 'compact_20260112']]
    ]
);

$messages[] = ['role' => 'assistant', 'content' => $response->content];

$messages[] = ['role' => 'user', 'content' => 'Now add error handling'];

$nextResponse = $client->beta->messages->create(
    maxTokens: 4096,
    messages: $messages,
    model: 'claude-opus-4-7',
    betas: ['compact-2026-01-12'],
    contextManagement: [
        'edits' => [['type' => 'compact_20260112']]
    ]
);

echo $nextResponse->content[0]->text;
require "anthropic"

client = Anthropic::Client.new

messages = [
  { role: "user", content: "Help me build a web scraper" }
]

response = client.beta.messages.create(
  betas: ["compact-2026-01-12"],
  model: "claude-opus-4-7",
  max_tokens: 4096,
  messages: messages,
  context_management: {
    edits: [{ type: "compact_20260112" }]
  }
)

messages << { role: "assistant", content: response.content }

messages << { role: "user", content: "Now add error handling" }

next_response = client.beta.messages.create(
  betas: ["compact-2026-01-12"],
  model: "claude-opus-4-7",
  max_tokens: 4096,
  messages: messages,
  context_management: {
    edits: [{ type: "compact_20260112" }]
  }
)

puts next_response.content

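When the API receives a compaction block, it ignores every content block that precedes it. You can either:

  • Keep the original message list and let the API remove the compacted content
  • Drop the compacted messages yourself and include only the content from the compaction block onward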
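
The second option, trimming the history on the client, can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: it assumes messages and content blocks are plain dicts, and the helper name `drop_compacted_history` is hypothetical.

```python
def drop_compacted_history(messages: list[dict]) -> list[dict]:
    """Keep only the content from the most recent compaction block onward."""
    # Search from the newest message backwards for a compaction block.
    for i in range(len(messages) - 1, -1, -1):
        content = messages[i].get("content")
        if not isinstance(content, list):
            continue  # plain-string content cannot hold a compaction block
        for j in range(len(content) - 1, -1, -1):
            if content[j].get("type") == "compaction":
                # Keep the compaction block and whatever follows it in this turn.
                trimmed = {**messages[i], "content": content[j:]}
                return [trimmed] + messages[i + 1:]
    # No compaction block found: return an unchanged copy.
    return list(messages)


history = [
    {"role": "user", "content": "Help me build a web scraper"},
    {"role": "assistant", "content": [{"type": "text", "text": "Sure..."}]},
    {"role": "user", "content": "Add retries"},
    {
        "role": "assistant",
        "content": [
            {"type": "compaction", "content": "Summary of the conversation: ..."},
            {"type": "text", "text": "Based on our conversation so far..."},
        ],
    },
    {"role": "user", "content": "Now add error handling"},
]
trimmed_history = drop_compacted_history(history)
print(len(trimmed_history))  # 2
```

Trimming is optional, since the API ignores the earlier blocks either way. It mainly reduces the size of the request payload you send.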

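Streaming

When you stream a response with compaction enabled, a compaction block streams differently from a text block. You receive a content_block_start event when compaction begins, then a single content_block_delta event that carries the complete summary (the summary is not streamed incrementally), and finally a content_block_stop event.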

ant beta:messages create --stream --format jsonl \
  --beta compact-2026-01-12 <<'YAML'
model: claude-opus-4-7
max_tokens: 4096
messages:
  - role: user
    content: Hello, Claude
context_management:
  edits:
    - type: compact_20260112
YAML
import anthropic

client = anthropic.Anthropic()
messages = [{"role": "user", "content": "Hello, Claude"}]

with client.beta.messages.stream(
    betas=["compact-2026-01-12"],
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=messages,
    context_management={"edits": [{"type": "compact_20260112"}]},
) as stream:
    for event in stream:
        if event.type == "content_block_start":
            if event.content_block.type == "compaction":
                print("Compaction started...")
            elif event.content_block.type == "text":
                print("Text response started...")

        elif event.type == "content_block_delta":
            if event.delta.type == "compaction_delta":
                print(f"Compaction complete: {len(event.delta.content or '')} chars")
            elif event.delta.type == "text_delta":
                print(event.delta.text, end="", flush=True)

    # Get the final accumulated message
    message = stream.get_final_message()
    messages.append({"role": "assistant", "content": message.content})
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();
const messages: Anthropic.Beta.Messages.BetaMessageParam[] = [
  { role: "user", content: "Hello, Claude" }
];

const stream = await client.beta.messages.stream({
  betas: ["compact-2026-01-12"],
  model: "claude-opus-4-7",
  max_tokens: 4096,
  messages,
  context_management: {
    edits: [{ type: "compact_20260112" }]
  }
});

for await (const event of stream) {
  if (event.type === "content_block_start") {
    if (event.content_block.type === "compaction") {
      console.log("Compaction started...");
    } else if (event.content_block.type === "text") {
      console.log("Text response started...");
    }
  } else if (event.type === "content_block_delta") {
    if (event.delta.type === "compaction_delta") {
      console.log(`Compaction complete: ${event.delta.content?.length ?? 0} chars`);
    } else if (event.delta.type === "text_delta") {
      process.stdout.write(event.delta.text);
    }
  }
}

// Get the final accumulated message
const message = await stream.finalMessage();
messages.push({
  role: "assistant",
  content: message.content
});
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Anthropic;
using Anthropic.Models.Beta.Messages;

class Program
{
    static async Task Main(string[] args)
    {
        var client = new AnthropicClient();
        List<BetaMessageParam> messages = [new() { Role = Role.User, Content = "Hello" }];

        var parameters = new MessageCreateParams
        {
            Betas = ["compact-2026-01-12"],
            Model = "claude-opus-4-7",
            MaxTokens = 4096,
            Messages = messages,
            ContextManagement = new BetaContextManagementConfig
            {
                Edits = [new BetaCompact20260112Edit()]
            }
        };

        await foreach (var streamEvent in client.Beta.Messages.CreateStreaming(parameters))
        {
            if (streamEvent.TryPickContentBlockStart(out var startEvent))
            {
                if (startEvent.ContentBlock.TryPickBetaCompaction(out _))
                {
                    Console.WriteLine("Compaction started...");
                }
                else if (startEvent.ContentBlock.TryPickBetaText(out _))
                {
                    Console.WriteLine("Text response started...");
                }
            }
            else if (streamEvent.TryPickContentBlockDelta(out var deltaEvent))
            {
                if (deltaEvent.Delta.TryPickCompaction(out var compactionDelta))
                {
                    Console.WriteLine($"Compaction complete: {compactionDelta.Content?.Length ?? 0} chars");
                }
                else if (deltaEvent.Delta.TryPickText(out var textDelta))
                {
                    Console.Write(textDelta.Text);
                }
            }
        }
    }
}
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/anthropics/anthropic-sdk-go"
)

func main() {
	client := anthropic.NewClient()
	messages := []anthropic.BetaMessageParam{anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock("Hello, Claude"))}

	stream := client.Beta.Messages.NewStreaming(context.TODO(), anthropic.BetaMessageNewParams{
		Model:     anthropic.ModelClaudeOpus4_7,
		MaxTokens: 4096,
		Messages:  messages,
		ContextManagement: anthropic.BetaContextManagementConfigParam{
			Edits: []anthropic.BetaContextManagementConfigEditUnionParam{
				{OfCompact20260112: &anthropic.BetaCompact20260112EditParam{}},
			},
		},
		Betas: []anthropic.AnthropicBeta{"compact-2026-01-12"},
	})

	for stream.Next() {
		event := stream.Current()
		switch eventVariant := event.AsAny().(type) {
		case anthropic.BetaRawContentBlockStartEvent:
			switch eventVariant.ContentBlock.AsAny().(type) {
			case anthropic.BetaCompactionBlock:
				fmt.Println("Compaction started...")
			case anthropic.BetaTextBlock:
				fmt.Println("Text response started...")
			}
		case anthropic.BetaRawContentBlockDeltaEvent:
			switch deltaVariant := eventVariant.Delta.AsAny().(type) {
			case anthropic.BetaCompactionContentBlockDelta:
				fmt.Printf("Compaction complete: %d chars\n", len(deltaVariant.Content))
			case anthropic.BetaTextDelta:
				fmt.Print(deltaVariant.Text)
			}
		}
	}
	if err := stream.Err(); err != nil {
		log.Fatal(err)
	}
}
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.beta.messages.MessageCreateParams;
import com.anthropic.models.beta.messages.BetaContextManagementConfig;
import com.anthropic.models.beta.messages.BetaCompact20260112Edit;

public class CompactionStreamingExample {
    public static void main(String[] args) {
        AnthropicClient client = AnthropicOkHttpClient.fromEnv();

        MessageCreateParams params = MessageCreateParams.builder()
            .model("claude-opus-4-7")
            .maxTokens(4096L)
            .addBeta("compact-2026-01-12")
            .addUserMessage("Hello, Claude")
            .contextManagement(BetaContextManagementConfig.builder()
                .addEdit(BetaCompact20260112Edit.builder().build())
                .build())
            .build();

        try (var streamResponse = client.beta().messages().createStreaming(params)) {
            streamResponse.stream().forEach(event -> {
                event.contentBlockStart().ifPresent(startEvent -> {
                    startEvent.contentBlock().compaction().ifPresent(c ->
                        System.out.println("Compaction started...")
                    );
                    startEvent.contentBlock().text().ifPresent(t ->
                        System.out.println("Text response started...")
                    );
                });

                event.contentBlockDelta().ifPresent(deltaEvent -> {
                    deltaEvent.delta().compaction().ifPresent(cd ->
                        System.out.println("Compaction complete: " + cd.content().map(String::length).orElse(0) + " chars")
                    );
                    deltaEvent.delta().text().ifPresent(td ->
                        System.out.print(td.text())
                    );
                });
            });
        }
    }
}
<?php

use Anthropic\Client;

$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));
$messages = [['role' => 'user', 'content' => 'Hello, Claude']];

$stream = $client->beta->messages->createStream(
    maxTokens: 4096,
    messages: $messages,
    model: 'claude-opus-4-7',
    betas: ['compact-2026-01-12'],
    contextManagement: [
        'edits' => [
            ['type' => 'compact_20260112']
        ]
    ]
);

foreach ($stream as $event) {
    if ($event->type === 'content_block_start') {
        if ($event->contentBlock->type === 'compaction') {
            echo "Compaction started...\n";
        } elseif ($event->contentBlock->type === 'text') {
            echo "Text response started...\n";
        }
    } elseif ($event->type === 'content_block_delta') {
        if ($event->delta->type === 'compaction_delta') {
            echo "Compaction complete: " . strlen($event->delta->content ?? '') . " chars\n";
        } elseif ($event->delta->type === 'text_delta') {
            echo $event->delta->text;
        }
    }
}
require "anthropic"

client = Anthropic::Client.new
messages = [{ role: "user", content: "Hello, Claude" }]

stream = client.beta.messages.stream(
  betas: ["compact-2026-01-12"],
  model: "claude-opus-4-7",
  max_tokens: 4096,
  messages: messages,
  context_management: {
    edits: [{ type: "compact_20260112" }]
  }
)

stream.each do |event|
  case event.type
  when :content_block_start
    if event.content_block.type == :compaction
      puts "Compaction started..."
    elsif event.content_block.type == :text
      puts "Text response started..."
    end
  when :content_block_delta
    if event.delta.type == :compaction_delta
      puts "Compaction complete: #{(event.delta.content || "").length} chars"
    elsif event.delta.type == :text_delta
      print event.delta.text
    end
  end
end
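
The event sequence for a compaction block (content_block_start, one content_block_delta carrying the whole summary, content_block_stop) can also be handled without an SDK. The following is a minimal sketch, assuming the stream's events have already been decoded into plain dicts; the function name `collect_compaction_summary` and the sample events are hypothetical.

```python
from typing import Iterable, Optional


def collect_compaction_summary(events: Iterable[dict]) -> Optional[str]:
    """Return the summary delivered by a compaction block, or None if none was streamed."""
    summary = None
    in_compaction = False
    for event in events:
        kind = event.get("type")
        if kind == "content_block_start":
            in_compaction = event["content_block"]["type"] == "compaction"
        elif kind == "content_block_delta" and in_compaction:
            delta = event["delta"]
            if delta.get("type") == "compaction_delta":
                # The single delta carries the complete summary, so assign rather than append.
                summary = delta.get("content")
        elif kind == "content_block_stop":
            in_compaction = False
    return summary


# Sample events for illustration only.
sample_events = [
    {"type": "content_block_start", "content_block": {"type": "compaction"}},
    {"type": "content_block_delta", "delta": {"type": "compaction_delta", "content": "Summary..."}},
    {"type": "content_block_stop"},
    {"type": "content_block_start", "content_block": {"type": "text"}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Hello"}},
    {"type": "content_block_stop"},
]
print(collect_compaction_summary(sample_events))  # Summary...
```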

Prompt caching

Compaction works well with prompt caching. You can add a cache_control breakpoint to the compaction block to cache the summary. The original compacted content is ignored.

{
  "role": "assistant",
  "content": [
    {
      "type": "compaction",
      "content": "[summary text]",
      "cache_control": { "type": "ephemeral" }
    },
    {
      "type": "text",
      "text": "Based on our conversation..."
    }
  ]
}
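
Adding the breakpoint before you send the next request can be sketched as follows. This is an illustration under stated assumptions: messages and content blocks are plain dicts shaped like the JSON above, and the helper name `mark_latest_compaction_for_caching` is hypothetical.

```python
import copy


def mark_latest_compaction_for_caching(messages: list[dict]) -> list[dict]:
    """Return a copy of messages with cache_control set on the newest compaction block."""
    result = copy.deepcopy(messages)  # leave the caller's list untouched
    for message in reversed(result):
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-string content cannot hold a compaction block
        for block in reversed(content):
            if block.get("type") == "compaction":
                block["cache_control"] = {"type": "ephemeral"}
                return result
    return result


conversation = [
    {"role": "user", "content": "Hello, Claude"},
    {
        "role": "assistant",
        "content": [
            {"type": "compaction", "content": "[summary text]"},
            {"type": "text", "text": "Based on our conversation..."},
        ],
    },
]
cached = mark_latest_compaction_for_caching(conversation)
print(cached[1]["content"][0]["cache_control"])  # {'type': 'ephemeral'}
```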

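Maximizing cache hits with the system prompt

When compaction occurs, the summary becomes new content that must be written to the cache. Without an additional cache breakpoint, this also invalidates the cached system prompt, which must then be re-cached along with the compaction summary.

To maximize cache hit rates, add a cache_control breakpoint at the end of your system prompt. This caches the system prompt separately from the conversation, so when compaction occurs:

  • The system prompt cache stays valid and is read from the cache
  • Only the compaction summary needs to be written as a new cache entry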
ant beta:messages create --beta compact-2026-01-12 <<'YAML'
model: claude-opus-4-7
max_tokens: 4096
system:
  - type: text
    text: You are a helpful coding assistant...
    cache_control:
      type: ephemeral
messages:
  - role: user
    content: Hello, Claude
context_management:
  edits:
    - type: compact_20260112
YAML
import anthropic

client = anthropic.Anthropic()
messages = [{"role": "user", "content": "Hello, Claude"}]
response = client.beta.messages.create(
    betas=["compact-2026-01-12"],
    model="claude-opus-4-7",
    max_tokens=4096,
    system=[
        {
            "type": "text",
            "text": "You are a helpful coding assistant...",
            "cache_control": {
                "type": "ephemeral"
            },  # Cache the system prompt separately
        }
    ],
    messages=messages,
    context_management={"edits": [{"type": "compact_20260112"}]},
)
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();
const messages: Anthropic.Beta.Messages.BetaMessageParam[] = [
  { role: "user", content: "Hello, Claude" }
];

const response = await client.beta.messages.create({
  betas: ["compact-2026-01-12"],
  model: "claude-opus-4-7",
  max_tokens: 4096,
  system: [
    {
      type: "text",
      text: "You are a helpful coding assistant...",
      cache_control: { type: "ephemeral" } // Cache the system prompt separately
    }
  ],
  messages,
  context_management: {
    edits: [{ type: "compact_20260112" }]
  }
});
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Anthropic;
using Anthropic.Models.Beta.Messages;

class Program
{
    static async Task Main(string[] args)
    {
        var client = new AnthropicClient();

        var parameters = new MessageCreateParams
        {
            Betas = ["compact-2026-01-12"],
            Model = "claude-opus-4-7",
            MaxTokens = 4096,
            System = new List<BetaTextBlockParam>
            {
                new()
                {
                    Text = "You are a helpful coding assistant...",
                    CacheControl = new BetaCacheControlEphemeral()
                }
            },
            Messages = [new() { Role = Role.User, Content = "Hello, Claude" }],
            ContextManagement = new BetaContextManagementConfig
            {
                Edits = [new BetaCompact20260112Edit()]
            }
        };

        var response = await client.Beta.Messages.Create(parameters);
        Console.WriteLine(response);
    }
}
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/anthropics/anthropic-sdk-go"
)

func main() {
	client := anthropic.NewClient()

	response, err := client.Beta.Messages.New(context.TODO(), anthropic.BetaMessageNewParams{
		Model:     anthropic.ModelClaudeOpus4_7,
		MaxTokens: 4096,
		System: []anthropic.BetaTextBlockParam{
			{
				Text:         "You are a helpful coding assistant...",
				CacheControl: anthropic.NewBetaCacheControlEphemeralParam(),
			},
		},
		Messages: []anthropic.BetaMessageParam{anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock("Hello, Claude"))},
		ContextManagement: anthropic.BetaContextManagementConfigParam{
			Edits: []anthropic.BetaContextManagementConfigEditUnionParam{
				{OfCompact20260112: &anthropic.BetaCompact20260112EditParam{}},
			},
		},
		Betas: []anthropic.AnthropicBeta{"compact-2026-01-12"},
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(response)
}
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.beta.messages.MessageCreateParams;
import com.anthropic.models.beta.messages.BetaMessage;
import com.anthropic.models.beta.messages.BetaTextBlockParam;
import com.anthropic.models.beta.messages.BetaContextManagementConfig;
import com.anthropic.models.beta.messages.BetaCompact20260112Edit;
import com.anthropic.models.beta.messages.BetaCacheControlEphemeral;
import java.util.List;

public class CompactionExample {
    public static void main(String[] args) {
        AnthropicClient client = AnthropicOkHttpClient.fromEnv();

        MessageCreateParams params = MessageCreateParams.builder()
            .model("claude-opus-4-7")
            .maxTokens(4096L)
            .addBeta("compact-2026-01-12")
            .systemOfBetaTextBlockParams(List.of(
                BetaTextBlockParam.builder()
                    .text("You are a helpful coding assistant...")
                    .cacheControl(BetaCacheControlEphemeral.builder().build())
                    .build()
            ))
            .addUserMessage("Hello, Claude")
            .contextManagement(BetaContextManagementConfig.builder()
                .addEdit(BetaCompact20260112Edit.builder().build())
                .build())
            .build();

        BetaMessage response = client.beta().messages().create(params);
        System.out.println(response);
    }
}
<?php
use Anthropic\Client;

$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));

$response = $client->beta->messages->create(
    maxTokens: 4096,
    messages: [['role' => 'user', 'content' => 'Hello, Claude']],
    model: 'claude-opus-4-7',
    betas: ['compact-2026-01-12'],
    system: [
        [
            'type' => 'text',
            'text' => 'You are a helpful coding assistant...',
            'cache_control' => [
                'type' => 'ephemeral'
            ]
        ]
    ],
    contextManagement: [
        'edits' => [
            ['type' => 'compact_20260112']
        ]
    ]
);

echo $response->content[0]->text;
require "anthropic"

client = Anthropic::Client.new

response = client.beta.messages.create(
  betas: ["compact-2026-01-12"],
  model: "claude-opus-4-7",
  max_tokens: 4096,
  system: [
    {
      type: "text",
      text: "You are a helpful coding assistant...",
      cache_control: {
        type: "ephemeral"
      }
    }
  ],
  messages: [{ role: "user", content: "Hello, Claude" }],
  context_management: {
    edits: [{ type: "compact_20260112" }]
  }
)
puts response

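This approach is especially beneficial for long system prompts, because they stay cached even when compaction occurs several times during a conversation.

Understanding usage

Compaction requires an additional sampling step, which counts toward rate limits and billing. The API returns detailed usage information in the response: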

{
  "usage": {
    "input_tokens": 23000,
    "output_tokens": 1000,
    "iterations": [
      {
        "type": "compaction",
        "input_tokens": 180000,
        "output_tokens": 3500
      },
      {
        "type": "message",
        "input_tokens": 23000,
        "output_tokens": 1000
      }
    ]
  }
}

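The iterations array shows the usage for each sampling iteration. When compaction occurs, you see a compaction iteration followed by the main message iteration. In this example, the top-level input_tokens and output_tokens match the message iteration exactly, because there is only one non-compaction iteration. The token counts of the final iteration reflect the effective context size after compaction.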

Note

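The top-level input_tokens and output_tokens do not include the usage of compaction iterations. They reflect the sum of all non-compaction iterations. To calculate the total tokens consumed and billed for a request, sum all entries in the usage.iterations array.

If you previously relied on usage.input_tokens and usage.output_tokens for cost tracking or auditing, update your tracking logic to aggregate across usage.iterations when compaction is enabled. The iterations array is populated only when a new compaction is triggered during the request. Reapplying a previous compaction block incurs no additional compaction cost, and in that case the top-level usage fields remain accurate.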
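
The aggregation described above can be sketched as follows. This is a minimal sketch, assuming the usage object is available as a plain dict shaped like the JSON example in this section; the helper name `total_billed_tokens` is hypothetical, and the sketch ignores any cache-related usage fields.

```python
def total_billed_tokens(usage: dict) -> tuple[int, int]:
    """Return (input_tokens, output_tokens) summed across all sampling iterations."""
    iterations = usage.get("iterations") or []
    if not iterations:
        # No new compaction in this request: the top-level fields are accurate.
        return usage["input_tokens"], usage["output_tokens"]
    total_input = sum(it["input_tokens"] for it in iterations)
    total_output = sum(it["output_tokens"] for it in iterations)
    return total_input, total_output


# The usage object from the example in this section.
usage = {
    "input_tokens": 23000,
    "output_tokens": 1000,
    "iterations": [
        {"type": "compaction", "input_tokens": 180000, "output_tokens": 3500},
        {"type": "message", "input_tokens": 23000, "output_tokens": 1000},
    ],
}
print(total_billed_tokens(usage))  # (203000, 4500)
```

For the example response, the top-level fields report 23,000 input tokens, while the request as a whole consumed 203,000 once the compaction iteration is counted.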

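Combining with other features

Server tools

When you use server tools (such as web search), the compaction trigger is checked at the start of each sampling iteration. Depending on your trigger threshold and the amount of output generated, compaction can occur multiple times within a single request.

Token counting

The token counting endpoint (/v1/messages/count_tokens) applies any compaction blocks already present in the prompt but does not trigger new compaction. Use it to check the effective token count after a previous compaction: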

cat > request.yaml <<'YAML'
model: claude-opus-4-7
messages:
  - role: user
    content: Hello, Claude
context_management:
  edits:
    - type: compact_20260112
YAML

CURRENT=$(ant beta:messages count-tokens \
  --beta compact-2026-01-12 \
  --transform input_tokens --raw-output < request.yaml)

ORIGINAL=$(ant beta:messages count-tokens \
  --beta compact-2026-01-12 \
  --transform context_management.original_input_tokens \
  --raw-output < request.yaml)

printf 'Current tokens: %s\n' "$CURRENT"
printf 'Original tokens: %s\n' "$ORIGINAL"
import anthropic

client = anthropic.Anthropic()
messages = [{"role": "user", "content": "Hello, Claude"}]
count_response = client.beta.messages.count_tokens(
    betas=["compact-2026-01-12"],
    model="claude-opus-4-7",
    messages=messages,
    context_management={"edits": [{"type": "compact_20260112"}]},
)

print(f"Current tokens: {count_response.input_tokens}")
print(f"Original tokens: {count_response.context_management.original_input_tokens}")
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();
const messages: Anthropic.Beta.Messages.BetaMessageParam[] = [
  { role: "user", content: "Summarize the key points of our conversation so far." }
];

const countResponse = await client.beta.messages.countTokens({
  betas: ["compact-2026-01-12"],
  model: "claude-opus-4-7",
  messages,
  context_management: {
    edits: [{ type: "compact_20260112" }]
  }
});

console.log(`Current tokens: ${countResponse.input_tokens}`);
console.log(`Original tokens: ${countResponse.context_management!.original_input_tokens}`);
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Anthropic;
using Anthropic.Models.Beta.Messages;

class Program
{
    static async Task Main(string[] args)
    {
        AnthropicClient client = new();
        List<BetaMessageParam> messages = [new() { Role = Role.User, Content = "Hello" }];

        var countParams = new MessageCountTokensParams
        {
            Model = "claude-opus-4-7",
            Messages = messages,
            ContextManagement = new BetaContextManagementConfig
            {
                Edits = [new BetaCompact20260112Edit()]
            },
            Betas = ["compact-2026-01-12"]
        };

        var countResponse = await client.Beta.Messages.CountTokens(countParams);
        Console.WriteLine($"Current tokens: {countResponse.InputTokens}");
        Console.WriteLine($"Original tokens: {countResponse.ContextManagement?.OriginalInputTokens}");
    }
}
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/anthropics/anthropic-sdk-go"
)

func main() {
	client := anthropic.NewClient()
	messages := []anthropic.BetaMessageParam{anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock("Hello, Claude"))}

	countResponse, err := client.Beta.Messages.CountTokens(context.TODO(), anthropic.BetaMessageCountTokensParams{
		Model:    anthropic.ModelClaudeOpus4_7,
		Messages: messages,
		ContextManagement: anthropic.BetaContextManagementConfigParam{
			Edits: []anthropic.BetaContextManagementConfigEditUnionParam{
				{OfCompact20260112: &anthropic.BetaCompact20260112EditParam{}},
			},
		},
		Betas: []anthropic.AnthropicBeta{"compact-2026-01-12"},
	})
	if err != nil {
		log.Fatal(err)
	}

	fmt.Printf("Current tokens: %d\n", countResponse.InputTokens)
	fmt.Printf("Original tokens: %d\n", countResponse.ContextManagement.OriginalInputTokens)
}
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.beta.messages.BetaMessageTokensCount;
import com.anthropic.models.beta.messages.MessageCountTokensParams;
import com.anthropic.models.beta.messages.BetaContextManagementConfig;
import com.anthropic.models.beta.messages.BetaCompact20260112Edit;

public class Main {
    public static void main(String[] args) {
        AnthropicClient client = AnthropicOkHttpClient.fromEnv();

        MessageCountTokensParams params = MessageCountTokensParams.builder()
            .model("claude-opus-4-7")
            .addUserMessage("Hello, Claude")
            .contextManagement(BetaContextManagementConfig.builder()
                .addEdit(BetaCompact20260112Edit.builder().build())
                .build())
            .addBeta("compact-2026-01-12")
            .build();

        BetaMessageTokensCount countResponse = client.beta().messages().countTokens(params);
        System.out.println("Current tokens: " + countResponse.inputTokens());
        System.out.println("Original tokens: " + countResponse.contextManagement().get().originalInputTokens());
    }
}
<?php

use Anthropic\Client;

$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));
$messages = [['role' => 'user', 'content' => 'Hello, Claude']];

$countResponse = $client->beta->messages->countTokens(
    messages: $messages,
    model: 'claude-opus-4-7',
    betas: ['compact-2026-01-12'],
    contextManagement: [
        'edits' => [
            ['type' => 'compact_20260112']
        ]
    ]
);

echo "Current tokens: " . $countResponse->inputTokens . "\n";
echo "Original tokens: " . $countResponse->contextManagement->originalInputTokens . "\n";
require "anthropic"

client = Anthropic::Client.new
messages = [{ role: "user", content: "Hello, Claude" }]

count_response = client.beta.messages.count_tokens(
  betas: ["compact-2026-01-12"],
  model: "claude-opus-4-7",
  messages: messages,
  context_management: {
    edits: [{ type: "compact_20260112" }]
  }
)

puts "Current tokens: #{count_response.input_tokens}"
puts "Original tokens: #{count_response.context_management.original_input_tokens}"

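Example

Here is a complete example of a long conversation that uses compaction:

# The CLI handles a single turn; maintain the messages array in the calling script.
# See the SDK tabs for the full chat() loop. Single-turn request form: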
ant beta:messages create --beta compact-2026-01-12 \
  --transform 'content.#(type=="text").text' --raw-output <<'YAML'
model: claude-opus-4-7
max_tokens: 4096
messages:
  - role: user
    content: Help me build a Python web scraper
context_management:
  edits:
    - type: compact_20260112
      trigger:
        type: input_tokens
        value: 100000
YAML
import anthropic

client = anthropic.Anthropic()

messages: list[dict] = []


def chat(user_message: str) -> str:
    messages.append({"role": "user", "content": user_message})

    response = client.beta.messages.create(
        betas=["compact-2026-01-12"],
        model="claude-opus-4-7",
        max_tokens=4096,
        messages=messages,
        context_management={
            "edits": [
                {
                    "type": "compact_20260112",
                    "trigger": {"type": "input_tokens", "value": 100000},
                }
            ]
        },
    )

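    # Append the response (any compaction block is included automatically)
    messages.append({"role": "assistant", "content": response.content})

    # Return the text content, or an empty string if there is no text block
    return next((block.text for block in response.content if block.type == "text"), "")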


# Run a long conversation
print(chat("Help me build a Python web scraper"))
print(chat("Add support for JavaScript-rendered pages"))
print(chat("Now add rate limiting and error handling"))
# ... continue as needed
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const messages: Anthropic.Beta.BetaMessageParam[] = [];

async function chat(userMessage: string): Promise<string> {
  messages.push({ role: "user", content: userMessage });

  const response = await client.beta.messages.create({
    betas: ["compact-2026-01-12"],
    model: "claude-opus-4-7",
    max_tokens: 4096,
    messages,
    context_management: {
      edits: [
        {
          type: "compact_20260112",
          trigger: { type: "input_tokens", value: 100000 }
        }
      ]
    }
  });

  // Append the response (any compaction block is included automatically)
  messages.push({ role: "assistant", content: response.content });

  // Return the text content
  const textBlock = response.content.find((block) => block.type === "text");
  return textBlock?.text ?? "";
}

// Run a long conversation
console.log(await chat("Help me build a Python web scraper"));
console.log(await chat("Add support for JavaScript-rendered pages"));
console.log(await chat("Now add rate limiting and error handling"));
// ... continue as needed
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Anthropic;
using Anthropic.Models.Beta.Messages;

public class Program
{
    static AnthropicClient client = new();
    static List<BetaMessageParam> messages = new();

    static async Task Main(string[] args)
    {
        Console.WriteLine(await Chat("Help me build a Python web scraper"));
        Console.WriteLine(await Chat("Add support for JavaScript-rendered pages"));
        Console.WriteLine(await Chat("Now add rate limiting and error handling"));
    }

    static async Task<string> Chat(string userMessage)
    {
        messages.Add(new() { Role = Role.User, Content = userMessage });

        var parameters = new MessageCreateParams
        {
            Betas = ["compact-2026-01-12"],
            Model = "claude-opus-4-7",
            MaxTokens = 4096,
            Messages = messages,
            ContextManagement = new BetaContextManagementConfig
            {
                Edits = [new BetaCompact20260112Edit
                {
                    Trigger = new BetaInputTokensTrigger(100000)
                }]
            }
        };

        var response = await client.Beta.Messages.Create(parameters);

        messages.Add(new()
        {
            Role = Role.Assistant,
            Content = response.Content.Select(b => new BetaContentBlockParam(b.Json)).ToList()
        });

        return response.Content
            .Select(b => b.Value)
            .OfType<BetaTextBlock>()
            .Select(tb => tb.Text)
            .FirstOrDefault() ?? "";
    }
}
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/anthropics/anthropic-sdk-go"
)

var (
	client   = anthropic.NewClient()
	messages []anthropic.BetaMessageParam
)

func chat(userMessage string) string {
	messages = append(messages, anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock(userMessage)))

	response, err := client.Beta.Messages.New(context.TODO(), anthropic.BetaMessageNewParams{
		Model:     anthropic.ModelClaudeOpus4_7,
		MaxTokens: 4096,
		Messages:  messages,
		ContextManagement: anthropic.BetaContextManagementConfigParam{
			Edits: []anthropic.BetaContextManagementConfigEditUnionParam{
				{OfCompact20260112: &anthropic.BetaCompact20260112EditParam{
					Trigger: anthropic.BetaInputTokensTriggerParam{Value: 100000},
				}},
			},
		},
		Betas: []anthropic.AnthropicBeta{"compact-2026-01-12"},
	})
	if err != nil {
		log.Fatal(err)
	}

	messages = append(messages, response.ToParam())

	for _, block := range response.Content {
		if variant, ok := block.AsAny().(anthropic.BetaTextBlock); ok {
			return variant.Text
		}
	}
	return ""
}

func main() {
	fmt.Println(chat("Help me build a Python web scraper"))
	fmt.Println(chat("Add support for JavaScript-rendered pages"))
	fmt.Println(chat("Now add rate limiting and error handling"))
}
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.beta.messages.MessageCreateParams;
import com.anthropic.models.beta.messages.BetaMessage;
import com.anthropic.models.beta.messages.BetaMessageParam;
import com.anthropic.models.beta.messages.BetaContextManagementConfig;
import com.anthropic.models.beta.messages.BetaCompact20260112Edit;
import com.anthropic.models.beta.messages.BetaInputTokensTrigger;
import java.util.ArrayList;
import java.util.List;

public class CompactionExample {
    private static final AnthropicClient client = AnthropicOkHttpClient.fromEnv();
    private static final List<BetaMessageParam> messages = new ArrayList<>();

    public static void main(String[] args) {
        System.out.println(chat("Help me build a Python web scraper"));
        System.out.println(chat("Add support for JavaScript-rendered pages"));
        System.out.println(chat("Now add rate limiting and error handling"));
    }

    private static String chat(String userMessage) {
        messages.add(BetaMessageParam.builder()
            .role(BetaMessageParam.Role.USER)
            .content(userMessage)
            .build());

        MessageCreateParams params = MessageCreateParams.builder()
            .addBeta("compact-2026-01-12")
            .model("claude-opus-4-7")
            .maxTokens(4096L)
            .messages(messages)
            .contextManagement(BetaContextManagementConfig.builder()
                .addEdit(BetaCompact20260112Edit.builder()
                    .trigger(BetaInputTokensTrigger.builder()
                        .value(100000L)
                        .build())
                    .build())
                .build())
            .build();

        BetaMessage response = client.beta().messages().create(params);

        // Append the response (any compaction block is included automatically)
        messages.add(response.toParam());

        return response.content().stream()
            .filter(block -> block.text().isPresent())
            .map(block -> block.text().get().text())
            .findFirst()
            .orElse("");
    }
}
<?php
use Anthropic\Client;

$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));
$messages = [];

function chat($client, &$messages, $userMessage) {
    $messages[] = ['role' => 'user', 'content' => $userMessage];

    $response = $client->beta->messages->create(
        maxTokens: 4096,
        messages: $messages,
        model: 'claude-opus-4-7',
        betas: ['compact-2026-01-12'],
        contextManagement: [
            'edits' => [
                [
                    'type' => 'compact_20260112',
                    'trigger' => ['type' => 'input_tokens', 'value' => 100000]
                ]
            ]
        ]
    );

    $messages[] = ['role' => 'assistant', 'content' => $response->content];

    foreach ($response->content as $block) {
        if ($block->type === 'text') {
            return $block->text;
        }
    }
    return '';
}

echo chat($client, $messages, "Help me build a Python web scraper") . "\n";
echo chat($client, $messages, "Add support for JavaScript-rendered pages") . "\n";
echo chat($client, $messages, "Now add rate limiting and error handling") . "\n";
require "anthropic"

client = Anthropic::Client.new
messages = []

def chat(client, messages, user_message)
  messages << { role: "user", content: user_message }

  response = client.beta.messages.create(
    betas: ["compact-2026-01-12"],
    model: "claude-opus-4-7",
    max_tokens: 4096,
    messages: messages,
    context_management: {
      edits: [
        {
          type: "compact_20260112",
          trigger: { type: "input_tokens", value: 100000 }
        }
      ]
    }
  )

  messages << { role: "assistant", content: response.content }

  response.content.find { |block| block.type == :text }&.text || ""
end

puts chat(client, messages, "Help me build a Python web scraper")
puts chat(client, messages, "Add support for JavaScript-rendered pages")
puts chat(client, messages, "Now add rate limiting and error handling")

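The following example uses pause_after_compaction to keep the previous exchange and the current user message (three messages in total) verbatim instead of summarizing them: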

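# The CLI handles a single turn; maintain the messages array in the calling script.
# See the SDK tabs for the full chat() loop with pause and preservation handling.
# Single-turn request form: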
ant beta:messages create --beta compact-2026-01-12 \
  --transform 'content.#(type=="text").text' --raw-output <<'YAML'
model: claude-opus-4-7
max_tokens: 4096
messages:
  - role: user
    content: Help me build a Python web scraper
context_management:
  edits:
    - type: compact_20260112
      trigger:
        type: input_tokens
        value: 100000
      pause_after_compaction: true
YAML
import anthropic
from typing import Any

client = anthropic.Anthropic()

messages: list[dict[str, Any]] = []


def chat(user_message: str) -> str:
    messages.append({"role": "user", "content": user_message})

    response = client.beta.messages.create(
        betas=["compact-2026-01-12"],
        model="claude-opus-4-7",
        max_tokens=4096,
        messages=messages,
        context_management={
            "edits": [
                {
                    "type": "compact_20260112",
                    "trigger": {"type": "input_tokens", "value": 100000},
                    "pause_after_compaction": True,
                }
            ]
        },
    )

    # Check whether compaction occurred and the response paused
    if response.stop_reason == "compaction":
        # Get the compaction block from the response
        compaction_block = response.content[0]

        # Preserve the previous exchange + the current user message (3 messages)
        # by including them after the compaction block
        preserved_messages = messages[-3:] if len(messages) >= 3 else messages

        # Build the new message list: compaction block + preserved messages
        new_assistant_content = [compaction_block]
        messages_after_compaction = [
            {"role": "assistant", "content": new_assistant_content}
        ] + preserved_messages

        # Continue the request with the compacted context + preserved messages
        response = client.beta.messages.create(
            betas=["compact-2026-01-12"],
            model="claude-opus-4-7",
            max_tokens=4096,
            messages=messages_after_compaction,
            context_management={"edits": [{"type": "compact_20260112"}]},
        )

        # Update the message list to reflect the compaction
        messages.clear()
        messages.extend(messages_after_compaction)

    # Append the final response
    messages.append({"role": "assistant", "content": response.content})

    # Return the first text block, or an empty string if there is none
    return next((block.text for block in response.content if block.type == "text"), "")


# Run a long conversation
print(chat("Help me build a Python web scraper"))
print(chat("Add support for JavaScript-rendered pages"))
print(chat("Now add rate limiting and error handling"))
# ... continue as needed
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

let messages: Anthropic.Beta.BetaMessageParam[] = [];

async function chat(userMessage: string): Promise<string> {
  messages.push({ role: "user", content: userMessage });

  let response = await client.beta.messages.create({
    betas: ["compact-2026-01-12"],
    model: "claude-opus-4-7",
    max_tokens: 4096,
    messages,
    context_management: {
      edits: [
        {
          type: "compact_20260112",
          trigger: { type: "input_tokens", value: 100000 },
          pause_after_compaction: true
        }
      ]
    }
  });

  // Check whether compaction occurred and the response paused
  if (response.stop_reason === "compaction") {
    // Get the compaction block from the response
    const compactionBlock = response.content[0];

    // Preserve the previous exchange + the current user message (3 messages)
    // by including them after the compaction block
    const preservedMessages = messages.length >= 3 ? messages.slice(-3) : [...messages];

    // Build the new message list: compaction block + preserved messages
    const messagesAfterCompaction: Anthropic.Beta.BetaMessageParam[] = [
      { role: "assistant", content: [compactionBlock] },
      ...preservedMessages
    ];

    // Continue the request with the compacted context + preserved messages
    response = await client.beta.messages.create({
      betas: ["compact-2026-01-12"],
      model: "claude-opus-4-7",
      max_tokens: 4096,
      messages: messagesAfterCompaction,
      context_management: {
        edits: [{ type: "compact_20260112" }]
      }
    });

    // Update the message list to reflect the compaction
    messages = messagesAfterCompaction;
  }

  // Append the final response
  messages.push({ role: "assistant", content: response.content });

  // Return the text content
  const textBlock = response.content.find((block) => block.type === "text");
  return textBlock?.text ?? "";
}

// Run a long conversation
console.log(await chat("Help me build a Python web scraper"));
console.log(await chat("Add support for JavaScript-rendered pages"));
console.log(await chat("Now add rate limiting and error handling"));
// ... continue as needed
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Anthropic;
using Anthropic.Models.Beta.Messages;

public class CompactionExample
{
    private static AnthropicClient client = new();
    private static List<BetaMessageParam> messages = new();

    static async Task<string> Chat(string userMessage)
    {
        messages.Add(new() { Role = Role.User, Content = userMessage });

        var response = await client.Beta.Messages.Create(new()
        {
            Betas = ["compact-2026-01-12"],
            Model = "claude-opus-4-7",
            MaxTokens = 4096,
            Messages = messages,
            ContextManagement = new BetaContextManagementConfig
            {
                Edits = [new BetaCompact20260112Edit
                {
                    Trigger = new BetaInputTokensTrigger(100000),
                    PauseAfterCompaction = true
                }]
            }
        });

        if (response.StopReason == BetaStopReason.Compaction)
        {
            if (!response.Content[0].TryPickCompaction(out var cb))
                throw new InvalidOperationException("Expected compaction block");

            var preserved = messages.Count >= 3
                ? messages.Skip(messages.Count - 3).ToList()
                : new List<BetaMessageParam>(messages);

            var messagesAfterCompaction = new List<BetaMessageParam>
            {
                new()
                {
                    Role = Role.Assistant,
                    Content = new List<BetaContentBlockParam> { new BetaCompactionBlockParam(cb.Content) }
                }
            };
            messagesAfterCompaction.AddRange(preserved);

            response = await client.Beta.Messages.Create(new()
            {
                Betas = ["compact-2026-01-12"],
                Model = "claude-opus-4-7",
                MaxTokens = 4096,
                Messages = messagesAfterCompaction,
                ContextManagement = new BetaContextManagementConfig
                {
                    Edits = [new BetaCompact20260112Edit()]
                }
            });

            messages = messagesAfterCompaction;
        }

        messages.Add(new()
        {
            Role = Role.Assistant,
            Content = response.Content.Select(b => new BetaContentBlockParam(b.Json)).ToList()
        });

        return response.Content
            .Select(b => b.Value)
            .OfType<BetaTextBlock>()
            .Select(tb => tb.Text)
            .FirstOrDefault() ?? "";
    }

    static async Task Main()
    {
        Console.WriteLine(await Chat("Help me build a Python web scraper"));
        Console.WriteLine(await Chat("Add support for JavaScript-rendered pages"));
        Console.WriteLine(await Chat("Now add rate limiting and error handling"));
    }
}
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/anthropics/anthropic-sdk-go"
)

var (
	client   = anthropic.NewClient()
	messages []anthropic.BetaMessageParam
)

func chat(userMessage string) string {
	messages = append(messages, anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock(userMessage)))

	compactEdit := anthropic.BetaContextManagementConfigParam{
		Edits: []anthropic.BetaContextManagementConfigEditUnionParam{
			{OfCompact20260112: &anthropic.BetaCompact20260112EditParam{
				Trigger:              anthropic.BetaInputTokensTriggerParam{Value: 100000},
				PauseAfterCompaction: anthropic.Bool(true),
			}},
		},
	}

	response, err := client.Beta.Messages.New(context.TODO(), anthropic.BetaMessageNewParams{
		Model:             anthropic.ModelClaudeOpus4_7,
		MaxTokens:         4096,
		Messages:          messages,
		ContextManagement: compactEdit,
		Betas:             []anthropic.AnthropicBeta{"compact-2026-01-12"},
	})
	if err != nil {
		log.Fatal(err)
	}

	if response.StopReason == "compaction" {
		compactionParam := response.Content[0].ToParam()

		var preserved []anthropic.BetaMessageParam
		if len(messages) >= 3 {
			preserved = messages[len(messages)-3:]
		} else {
			preserved = messages
		}

		messagesAfterCompaction := []anthropic.BetaMessageParam{
			{Role: anthropic.BetaMessageParamRoleAssistant, Content: []anthropic.BetaContentBlockParamUnion{compactionParam}},
		}
		messagesAfterCompaction = append(messagesAfterCompaction, preserved...)

		response, err = client.Beta.Messages.New(context.TODO(), anthropic.BetaMessageNewParams{
			Model:     anthropic.ModelClaudeOpus4_7,
			MaxTokens: 4096,
			Messages:  messagesAfterCompaction,
			ContextManagement: anthropic.BetaContextManagementConfigParam{
				Edits: []anthropic.BetaContextManagementConfigEditUnionParam{
					{OfCompact20260112: &anthropic.BetaCompact20260112EditParam{}},
				},
			},
			Betas: []anthropic.AnthropicBeta{"compact-2026-01-12"},
		})
		if err != nil {
			log.Fatal(err)
		}

		messages = messagesAfterCompaction
	}

	messages = append(messages, response.ToParam())

	for _, block := range response.Content {
		if textBlock, ok := block.AsAny().(anthropic.BetaTextBlock); ok {
			return textBlock.Text
		}
	}
	return ""
}

func main() {
	fmt.Println(chat("Help me build a Python web scraper"))
	fmt.Println(chat("Add support for JavaScript-rendered pages"))
	fmt.Println(chat("Now add rate limiting and error handling"))
}
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.beta.messages.MessageCreateParams;
import com.anthropic.models.beta.messages.BetaMessage;
import com.anthropic.models.beta.messages.BetaMessageParam;
import com.anthropic.models.beta.messages.BetaContextManagementConfig;
import com.anthropic.models.beta.messages.BetaCompact20260112Edit;
import com.anthropic.models.beta.messages.BetaInputTokensTrigger;
import com.anthropic.models.beta.messages.BetaStopReason;
import java.util.ArrayList;
import java.util.List;

public class CompactionExample {
    private static final AnthropicClient client = AnthropicOkHttpClient.fromEnv();
    private static final List<BetaMessageParam> messages = new ArrayList<>();

    public static String chat(String userMessage) {
        messages.add(BetaMessageParam.builder()
            .role(BetaMessageParam.Role.USER)
            .content(userMessage)
            .build());

        MessageCreateParams params = MessageCreateParams.builder()
            .addBeta("compact-2026-01-12")
            .model("claude-opus-4-7")
            .maxTokens(4096L)
            .messages(messages)
            .contextManagement(BetaContextManagementConfig.builder()
                .addEdit(BetaCompact20260112Edit.builder()
                    .trigger(BetaInputTokensTrigger.builder()
                        .value(100000L)
                        .build())
                    .pauseAfterCompaction(true)
                    .build())
                .build())
            .build();

        BetaMessage response = client.beta().messages().create(params);

        // Check whether compaction occurred and the response paused
        if (response.stopReason().isPresent()
                && response.stopReason().get().equals(BetaStopReason.COMPACTION)) {
            // Preserve the previous exchange + the current user message (3 messages)
            List<BetaMessageParam> preservedMessages = messages.size() >= 3
                ? new ArrayList<>(messages.subList(messages.size() - 3, messages.size()))
                : new ArrayList<>(messages);

            // Build the new message list: compaction block + preserved messages
            List<BetaMessageParam> messagesAfterCompaction = new ArrayList<>();
            messagesAfterCompaction.add(response.toParam());
            messagesAfterCompaction.addAll(preservedMessages);

            // Continue the request with the compacted context + preserved messages
            MessageCreateParams continueParams = MessageCreateParams.builder()
                .addBeta("compact-2026-01-12")
                .model("claude-opus-4-7")
                .maxTokens(4096L)
                .messages(messagesAfterCompaction)
                .contextManagement(BetaContextManagementConfig.builder()
                    .addEdit(BetaCompact20260112Edit.builder().build())
                    .build())
                .build();

            response = client.beta().messages().create(continueParams);

            // Update the message list to reflect the compaction
            messages.clear();
            messages.addAll(messagesAfterCompaction);
        }

        // Append the final response
        messages.add(response.toParam());

        return response.content().stream()
            .filter(block -> block.text().isPresent())
            .map(block -> block.text().get().text())
            .findFirst()
            .orElse("");
    }

    public static void main(String[] args) {
        System.out.println(chat("Help me build a Python web scraper"));
        System.out.println(chat("Add support for JavaScript-rendered pages"));
        System.out.println(chat("Now add rate limiting and error handling"));
    }
}
<?php

use Anthropic\Client;

// The PHP SDK does not yet expose a typed constant for the `compaction` stop reason; compare against the string value directly.
$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));
$messages = [];

function chat($client, &$messages, $userMessage) {
    $messages[] = ['role' => 'user', 'content' => $userMessage];

    $response = $client->beta->messages->create(
        maxTokens: 4096,
        messages: $messages,
        model: 'claude-opus-4-7',
        betas: ['compact-2026-01-12'],
        contextManagement: [
            'edits' => [
                [
                    'type' => 'compact_20260112',
                    'trigger' => ['type' => 'input_tokens', 'value' => 100000],
                    'pause_after_compaction' => true
                ]
            ]
        ]
    );

    if ($response->stopReason === 'compaction') {
        $compactionBlock = $response->content[0];

        $preserved = count($messages) >= 3
            ? array_slice($messages, -3)
            : $messages;

        $messagesAfterCompaction = array_merge(
            [['role' => 'assistant', 'content' => [$compactionBlock]]],
            $preserved
        );

        $response = $client->beta->messages->create(
            maxTokens: 4096,
            messages: $messagesAfterCompaction,
            model: 'claude-opus-4-7',
            betas: ['compact-2026-01-12'],
            contextManagement: [
                'edits' => [['type' => 'compact_20260112']]
            ]
        );

        $messages = $messagesAfterCompaction;
    }

    $messages[] = ['role' => 'assistant', 'content' => $response->content];

    foreach ($response->content as $block) {
        if ($block->type === 'text') {
            return $block->text;
        }
    }
    return '';
}

echo chat($client, $messages, "Help me build a Python web scraper") . "\n";
echo chat($client, $messages, "Add support for JavaScript-rendered pages") . "\n";
echo chat($client, $messages, "Now add rate limiting and error handling") . "\n";
require "anthropic"

client = Anthropic::Client.new
messages = []

def chat(client, messages, user_message)
  messages << { role: "user", content: user_message }

  response = client.beta.messages.create(
    betas: ["compact-2026-01-12"],
    model: "claude-opus-4-7",
    max_tokens: 4096,
    messages: messages,
    context_management: {
      edits: [
        {
          type: "compact_20260112",
          trigger: { type: "input_tokens", value: 100000 },
          pause_after_compaction: true
        }
      ]
    }
  )

  if response.stop_reason == :compaction
    compaction_block = response.content[0]

    preserved = messages.length >= 3 ? messages[-3..-1] : messages.dup

    messages_after_compaction = [
      { role: "assistant", content: [compaction_block] }
    ] + preserved

    response = client.beta.messages.create(
      betas: ["compact-2026-01-12"],
      model: "claude-opus-4-7",
      max_tokens: 4096,
      messages: messages_after_compaction,
      context_management: {
        edits: [{ type: "compact_20260112" }]
      }
    )

    messages.clear
    messages.concat(messages_after_compaction)
  end

  messages << { role: "assistant", content: response.content }

  response.content.find { |block| block.type == :text }&.text || ""
end

puts chat(client, messages, "Help me build a Python web scraper")
puts chat(client, messages, "Add support for JavaScript-rendered pages")
puts chat(client, messages, "Now add rate limiting and error handling")
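
The preservation step in the pause_after_compaction examples above can be pulled out into a small pure function, which makes it easier to test without calling the API. The following is a minimal sketch, not part of any SDK: the function name rebuild_after_compaction, the keep_last parameter, and the dictionary used as a stand-in compaction block are all hypothetical names chosen for illustration.

```python
from typing import Any

Message = dict[str, Any]


def rebuild_after_compaction(
    messages: list[Message],
    compaction_block: Any,
    keep_last: int = 3,
) -> list[Message]:
    """Return a new list: one assistant turn holding the compaction block,
    followed by the last `keep_last` messages kept verbatim.

    Hypothetical helper; it mirrors the list handling in the examples above.
    """
    if keep_last < 0:
        raise ValueError("keep_last must be non-negative")
    # messages[-0:] would return the whole list, so handle zero explicitly.
    # For keep_last > len(messages), slicing simply returns every message.
    preserved = messages[-keep_last:] if keep_last else []
    return [{"role": "assistant", "content": [compaction_block]}] + preserved


# Stand-in data: the block shape below is a placeholder, not a real API response.
history: list[Message] = [
    {"role": "user", "content": "u1"},
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "u2"},
    {"role": "assistant", "content": "a2"},
    {"role": "user", "content": "u3"},
]
summary = {"type": "compaction", "content": "Summary of the earlier turns."}

rebuilt = rebuild_after_compaction(history, summary)
print([m["role"] for m in rebuilt])
# → ['assistant', 'user', 'assistant', 'user']
```

The function returns a new list rather than mutating its input, so the caller decides when to swap the rebuilt history in, typically after the follow-up request succeeds.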

Current limitations

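  • Same model for summarization: The model specified in your request is also used for summarization. There is no option to use a different (for example, cheaper) model for the summary.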

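  • Compaction can fail when tools are defined: When your request includes tools, the model occasionally calls a tool during the internal summarization step instead of writing a summary. When this happens, the response contains a compaction block with content: null. To prevent this, set instructions to a prompt that explicitly tells the model not to call tools, for example: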

    Summarize the transcript inside <summary></summary> tags. Include relevant information in the summary for continuing the task in the next context window. Do not call any tools while writing this summary; respond with text only.
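
As a rough illustration of the workaround above, the sketch below builds a context_management value that carries the no-tools prompt. It assumes that instructions is set on the compact_20260112 edit next to trigger; the helper name compaction_config and the constant NO_TOOLS_INSTRUCTIONS are hypothetical and not part of any SDK.

```python
from typing import Any, Optional

# The prompt text is the one suggested above.
NO_TOOLS_INSTRUCTIONS = (
    "Summarize the transcript inside <summary></summary> tags. "
    "Include relevant information in the summary for continuing the task "
    "in the next context window. Do not call any tools while writing this "
    "summary; respond with text only."
)


def compaction_config(
    trigger_tokens: int = 100000,
    instructions: Optional[str] = NO_TOOLS_INSTRUCTIONS,
) -> dict[str, Any]:
    """Build a context_management dict for a request that also defines tools.

    Assumption: `instructions` is a field of the compact_20260112 edit.
    """
    edit: dict[str, Any] = {
        "type": "compact_20260112",
        "trigger": {"type": "input_tokens", "value": trigger_tokens},
    }
    if instructions is not None:
        edit["instructions"] = instructions
    return {"edits": [edit]}


config = compaction_config()
print(sorted(config["edits"][0]))
# → ['instructions', 'trigger', 'type']
```

The resulting dictionary would be passed as the context_management argument in the same way as in the chat() examples earlier in this section.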
    

Next steps