Adaptive thinking

Let Claude dynamically decide when to use extended thinking, and how much, with adaptive thinking mode.

Note

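This feature is eligible for Zero Data Retention (ZDR). When your organization has a ZDR arrangement, data sent through this feature is not stored after the API response is returned.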

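Adaptive thinking is the recommended way to use extended thinking on Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6, and it is the default mode on Claude Mythos Preview (applied automatically when thinking is not set). Instead of requiring you to set a thinking token budget manually, adaptive thinking lets Claude decide dynamically when to use extended thinking, and how much, based on the complexity of each request. On Claude Opus 4.7, adaptive thinking is the only supported thinking mode; manually setting thinking: {type: "enabled", budget_tokens: N} is no longer accepted.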

Tip

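For many workloads, adaptive thinking can deliver better performance than extended thinking with a fixed budget_tokens, especially for bimodal tasks and long-running agentic workflows. No beta header is required.

If your workload needs predictable latency or precise control over thinking costs, extended thinking with budget_tokens is still available on Claude Opus 4.6 and Claude Sonnet 4.6, but it is deprecated and no longer recommended. See the warning below.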

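Supported models

The following models support adaptive thinking:

Claude Mythos Preview (claude-mythos-preview): adaptive thinking is the default mode; thinking: {type: "disabled"} is not supported.
Claude Opus 4.7 (claude-opus-4-7): adaptive thinking is the only supported thinking mode. Thinking is off unless you explicitly set thinking: {type: "adaptive"} in the request; manually setting thinking: {type: "enabled"} returns a 400 error.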
Claude Opus 4.6 (claude-opus-4-6)
Claude Sonnet 4.6 (claude-sonnet-4-6)

Warning

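thinking.type: "enabled" and budget_tokens are deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in a future model release. Use thinking.type: "adaptive" with the effort parameter instead. Existing budget_tokens configurations still work but are no longer recommended; plan to migrate.

Older models (Sonnet 4.5, Opus 4.5, and so on) do not support adaptive thinking and require thinking.type: "enabled" with budget_tokens.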
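The model-by-model rules above can be captured in a small helper. This is a minimal sketch and not part of any SDK: the function name thinking_config_for and the default budget of 10000 tokens are illustrative assumptions, and the helper assumes that any model ID outside the adaptive list is an older model that accepts manual thinking. Only the four model IDs and the two thinking shapes come from this page.

```python
# Hypothetical helper (not an SDK function): choose a `thinking` config by model ID.
ADAPTIVE_MODELS = {
    "claude-mythos-preview",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
}


def thinking_config_for(model: str, budget_tokens: int = 10000) -> dict:
    """Return a `thinking` request parameter suited to the given model."""
    if model in ADAPTIVE_MODELS:
        return {"type": "adaptive"}
    # Assumption: every other model ID is an older model that needs a manual budget.
    return {"type": "enabled", "budget_tokens": budget_tokens}
```

The returned dictionary can be passed as the thinking argument of a Messages API request.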

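How adaptive thinking works

In adaptive mode, thinking is optional for the model. Claude evaluates the complexity of each request and decides whether to use extended thinking and how much. At the default effort level (high), Claude almost always thinks. At lower effort levels, Claude may skip thinking for simpler problems.

Adaptive thinking also automatically enables interleaved thinking. This means Claude can think between tool calls, which makes it particularly effective for agentic workflows.

How to use adaptive thinking

Set thinking.type to "adaptive" in your API request: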

curl https://api.anthropic.com/v1/messages \
     --header "x-api-key: $ANTHROPIC_API_KEY" \
     --header "anthropic-version: 2023-06-01" \
     --header "content-type: application/json" \
     --data \
'{
    "model": "claude-opus-4-7",
    "max_tokens": 16000,
    "thinking": {
        "type": "adaptive"
    },
    "messages": [
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }
    ]
}'

ant messages create \
  --model claude-opus-4-7 \
  --max-tokens 16000 \
  --thinking '{type: adaptive}' \
  --message '{role: user, content: Explain why the sum of two even numbers is always even.}' \
  --transform content --format jsonl |
  jq -r '
    if   .type == "thinking" then "\nThinking: \(.thinking)"
    elif .type == "text"     then "\nResponse: \(.text)"
    else empty end'

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even.",
        }
    ],
)

for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking: {block.thinking}")
    elif block.type == "text":
        print(f"\nResponse: {block.text}")

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const response = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 16000,
  thinking: {
    type: "adaptive"
  },
  messages: [
    {
      role: "user",
      content: "Explain why the sum of two even numbers is always even."
    }
  ]
});

for (const block of response.content) {
  if (block.type === "thinking") {
    console.log(`\nThinking: ${block.thinking}`);
  } else if (block.type === "text") {
    console.log(`\nResponse: ${block.text}`);
  }
}

using System;
using System.Threading.Tasks;
using Anthropic;
using Anthropic.Models.Messages;

class Program
{
    static async Task Main(string[] args)
    {
        AnthropicClient client = new();

        var parameters = new MessageCreateParams
        {
            Model = Model.ClaudeOpus4_7,
            MaxTokens = 16000,
            Thinking = new ThinkingConfigAdaptive(),
            Messages = [
                new() {
                    Role = Role.User,
                    Content = "Explain why the sum of two even numbers is always even."
                }
            ]
        };

        var message = await client.Messages.Create(parameters);

        foreach (var block in message.Content)
        {
            if (block.TryPickThinking(out ThinkingBlock? thinking))
            {
                Console.WriteLine($"\nThinking: {thinking.Thinking}");
            }
            else if (block.TryPickText(out TextBlock? text))
            {
                Console.WriteLine($"\nResponse: {text.Text}");
            }
        }
    }
}

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/anthropics/anthropic-sdk-go"
)

func main() {
	client := anthropic.NewClient()

	response, err := client.Messages.New(context.TODO(), anthropic.MessageNewParams{
		Model:     anthropic.ModelClaudeOpus4_7,
		MaxTokens: 16000,
		Thinking: anthropic.ThinkingConfigParamUnion{
			OfAdaptive: &anthropic.ThinkingConfigAdaptiveParam{},
		},
		Messages: []anthropic.MessageParam{
			anthropic.NewUserMessage(anthropic.NewTextBlock("Explain why the sum of two even numbers is always even.")),
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	for _, block := range response.Content {
		switch v := block.AsAny().(type) {
		case anthropic.ThinkingBlock:
			fmt.Printf("\nThinking: %s", v.Thinking)
		case anthropic.TextBlock:
			fmt.Printf("\nResponse: %s", v.Text)
		}
	}
}

import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.messages.MessageCreateParams;
import com.anthropic.models.messages.Message;
import com.anthropic.models.messages.Model;
import com.anthropic.models.messages.ThinkingConfigAdaptive;

public class ExtendedThinkingExample {
    public static void main(String[] args) {
        AnthropicClient client = AnthropicOkHttpClient.fromEnv();

        MessageCreateParams params = MessageCreateParams.builder()
            .model(Model.CLAUDE_OPUS_4_7)
            .maxTokens(16000L)
            .thinking(ThinkingConfigAdaptive.builder().build())
            .addUserMessage("Explain why the sum of two even numbers is always even.")
            .build();

        Message response = client.messages().create(params);

        response.content().forEach(block -> {
            block.thinking().ifPresent(thinkingBlock ->
                System.out.println("\nThinking: " + thinkingBlock.thinking())
            );
            block.text().ifPresent(textBlock ->
                System.out.println("\nResponse: " + textBlock.text())
            );
        });
    }
}

<?php

use Anthropic\Client;

$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));

$message = $client->messages->create(
    maxTokens: 16000,
    messages: [
        [
            'role' => 'user',
            'content' => 'Explain why the sum of two even numbers is always even.'
        ]
    ],
    model: 'claude-opus-4-7',
    thinking: ['type' => 'adaptive'],
);

foreach ($message->content as $block) {
    if ($block->type === 'thinking') {
        echo "\nThinking: " . $block->thinking;
    } elseif ($block->type === 'text') {
        echo "\nResponse: " . $block->text;
    }
}

require "anthropic"

client = Anthropic::Client.new

message = client.messages.create(
  model: "claude-opus-4-7",
  max_tokens: 16000,
  thinking: {
    type: "adaptive"
  },
  messages: [
    {
      role: "user",
      content: "Explain why the sum of two even numbers is always even."
    }
  ]
)

message.content.each do |block|
  case block.type
  when :thinking
    puts "\nThinking: #{block.thinking}"
  when :text
    puts "\nResponse: #{block.text}"
  end
end

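Adaptive thinking with the effort parameter

You can combine adaptive thinking with the effort parameter to guide how much Claude thinks. The effort level acts as soft guidance for Claude's thinking allocation:

Effort level	Thinking behavior
`max`	Claude always thinks, with no constraint on thinking depth. Available on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6.
`xhigh`	Claude always thinks deeply, with extended exploration. Available on Claude Opus 4.7.
`high` (default)	Claude always thinks. Provides deep reasoning on complex tasks.
`medium`	Claude uses moderate thinking. May skip thinking for very simple queries.
`low`	Claude minimizes thinking. Skips thinking for simple tasks where speed matters most.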
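Because two of the levels in the table above are limited to specific models, it can help to validate the effort value before sending a request. The following is a hedged sketch: output_config_for is a hypothetical helper, not an SDK function, and the availability sets simply restate the table.

```python
# Availability sets restate the effort table; the helper itself is hypothetical.
EFFORT_LEVELS = ("low", "medium", "high", "xhigh", "max")
XHIGH_MODELS = {"claude-opus-4-7"}
MAX_MODELS = {
    "claude-mythos-preview",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
}


def output_config_for(model: str, effort: str = "high") -> dict:
    """Return an `output_config` value, rejecting efforts the model does not list."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"unknown effort level: {effort!r}")
    if effort == "xhigh" and model not in XHIGH_MODELS:
        raise ValueError(f"'xhigh' is not listed for {model}")
    if effort == "max" and model not in MAX_MODELS:
        raise ValueError(f"'max' is not listed for {model}")
    return {"effort": effort}
```

Failing early in client code is a design choice for this sketch; the API may report unsupported combinations in its own way.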

curl https://api.anthropic.com/v1/messages \
     --header "x-api-key: $ANTHROPIC_API_KEY" \
     --header "anthropic-version: 2023-06-01" \
     --header "content-type: application/json" \
     --data \
'{
    "model": "claude-opus-4-7",
    "max_tokens": 16000,
    "thinking": {
        "type": "adaptive"
    },
    "output_config": {
        "effort": "medium"
    },
    "messages": [
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ]
}'

ant messages create \
  --transform 'content.0.text' --raw-output <<'YAML'
model: claude-opus-4-7
max_tokens: 16000
thinking:
  type: adaptive
output_config:
  effort: medium
messages:
  - role: user
    content: What is the capital of France?
YAML

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    output_config={"effort": "medium"},
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)

print(response.content[0].text)

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const response = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 16000,
  thinking: {
    type: "adaptive"
  },
  output_config: {
    effort: "medium"
  },
  messages: [
    {
      role: "user",
      content: "What is the capital of France?"
    }
  ]
});

for (const block of response.content) {
  if (block.type === "text") {
    console.log(block.text);
  }
}

using System;
using System.Threading.Tasks;
using Anthropic;
using Anthropic.Models.Messages;

class Program
{
    static async Task Main(string[] args)
    {
        AnthropicClient client = new();

        var parameters = new MessageCreateParams
        {
            Model = Model.ClaudeOpus4_7,
            MaxTokens = 16000,
            Thinking = new ThinkingConfigAdaptive(),
            OutputConfig = new OutputConfig { Effort = Effort.Medium },
            Messages = [new() { Role = Role.User, Content = "What is the capital of France?" }]
        };

        var message = await client.Messages.Create(parameters);
        Console.WriteLine(message);
    }
}

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/anthropics/anthropic-sdk-go"
)

func main() {
	client := anthropic.NewClient()

	response, err := client.Messages.New(context.TODO(), anthropic.MessageNewParams{
		Model:     anthropic.ModelClaudeOpus4_7,
		MaxTokens: 16000,
		Thinking: anthropic.ThinkingConfigParamUnion{
			OfAdaptive: &anthropic.ThinkingConfigAdaptiveParam{},
		},
		OutputConfig: anthropic.OutputConfigParam{
			Effort: anthropic.OutputConfigEffortMedium,
		},
		Messages: []anthropic.MessageParam{
			anthropic.NewUserMessage(anthropic.NewTextBlock("What is the capital of France?")),
		},
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(response.Content[0].Text)
}

import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.messages.MessageCreateParams;
import com.anthropic.models.messages.Message;
import com.anthropic.models.messages.Model;
import com.anthropic.models.messages.OutputConfig;
import com.anthropic.models.messages.ThinkingConfigAdaptive;

public class Main {
    public static void main(String[] args) {
        AnthropicClient client = AnthropicOkHttpClient.fromEnv();

        MessageCreateParams params = MessageCreateParams.builder()
            .model(Model.CLAUDE_OPUS_4_7)
            .maxTokens(16000L)
            .thinking(ThinkingConfigAdaptive.builder().build())
            .outputConfig(OutputConfig.builder()
                .effort(OutputConfig.Effort.MEDIUM)
                .build())
            .addUserMessage("What is the capital of France?")
            .build();

        Message response = client.messages().create(params);
        response.content().stream()
            .flatMap(block -> block.text().stream())
            .forEach(textBlock -> System.out.println(textBlock.text()));
    }
}

<?php

use Anthropic\Client;

$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));

$message = $client->messages->create(
    maxTokens: 16000,
    messages: [
        ['role' => 'user', 'content' => 'What is the capital of France?']
    ],
    model: 'claude-opus-4-7',
    thinking: ['type' => 'adaptive'],
    outputConfig: ['effort' => 'medium'],
);

echo $message->content[0]->text;

require "anthropic"

client = Anthropic::Client.new

message = client.messages.create(
  model: "claude-opus-4-7",
  max_tokens: 16000,
  thinking: {
    type: "adaptive"
  },
  output_config: {
    effort: "medium"
  },
  messages: [
    { role: "user", content: "What is the capital of France?" }
  ]
)

puts message.content.first.text

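Streaming with adaptive thinking

Adaptive thinking works seamlessly with streaming. Thinking blocks are streamed through thinking_delta events, just as in manual thinking mode: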

ant messages create --stream --format jsonl \
  --model claude-opus-4-7 \
  --max-tokens 16000 \
  --thinking '{type: adaptive}' \
  --message '{role: user, content: What is the greatest common divisor of 1071 and 462?}'

import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "What is the greatest common divisor of 1071 and 462?",
        }
    ],
) as stream:
    for event in stream:
        if event.type == "content_block_start":
            print(f"\nStarting {event.content_block.type} block...")
        elif event.type == "content_block_delta":
            if event.delta.type == "thinking_delta":
                print(event.delta.thinking, end="", flush=True)
            elif event.delta.type == "text_delta":
                print(event.delta.text, end="", flush=True)

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const stream = await client.messages.stream({
  model: "claude-opus-4-7",
  max_tokens: 16000,
  thinking: { type: "adaptive" },
  messages: [{ role: "user", content: "What is the greatest common divisor of 1071 and 462?" }]
});

for await (const event of stream) {
  if (event.type === "content_block_start") {
    console.log(`\nStarting ${event.content_block.type} block...`);
  } else if (event.type === "content_block_delta") {
    if (event.delta.type === "thinking_delta") {
      process.stdout.write(event.delta.thinking);
    } else if (event.delta.type === "text_delta") {
      process.stdout.write(event.delta.text);
    }
  }
}

using System;
using System.Threading.Tasks;
using Anthropic;
using Anthropic.Models.Messages;

class Program
{
    static async Task Main(string[] args)
    {
        AnthropicClient client = new();

        var parameters = new MessageCreateParams
        {
            Model = Model.ClaudeOpus4_7,
            MaxTokens = 16000,
            Thinking = new ThinkingConfigAdaptive(),
            Messages = [new() { Role = Role.User, Content = "What is the greatest common divisor of 1071 and 462?" }]
        };

        await foreach (var msg in client.Messages.CreateStreaming(parameters))
        {
            Console.Write(msg);
        }
    }
}

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/anthropics/anthropic-sdk-go"
)

func main() {
	client := anthropic.NewClient()

	stream := client.Messages.NewStreaming(context.TODO(), anthropic.MessageNewParams{
		Model:     anthropic.ModelClaudeOpus4_7,
		MaxTokens: 16000,
		Thinking: anthropic.ThinkingConfigParamUnion{
			OfAdaptive: &anthropic.ThinkingConfigAdaptiveParam{},
		},
		Messages: []anthropic.MessageParam{
			anthropic.NewUserMessage(anthropic.NewTextBlock("What is the greatest common divisor of 1071 and 462?")),
		},
	})

	for stream.Next() {
		event := stream.Current()
		switch eventVariant := event.AsAny().(type) {
		case anthropic.ContentBlockStartEvent:
			fmt.Printf("\nStarting %s block...\n", eventVariant.ContentBlock.Type)
		case anthropic.ContentBlockDeltaEvent:
			switch deltaVariant := eventVariant.Delta.AsAny().(type) {
			case anthropic.ThinkingDelta:
				fmt.Print(deltaVariant.Thinking)
			case anthropic.TextDelta:
				fmt.Print(deltaVariant.Text)
			}
		}
	}
	if err := stream.Err(); err != nil {
		log.Fatal(err)
	}
}

import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.messages.MessageCreateParams;
import com.anthropic.models.messages.Model;
import com.anthropic.models.messages.ThinkingConfigAdaptive;

public class StreamingThinkingExample {
    public static void main(String[] args) {
        AnthropicClient client = AnthropicOkHttpClient.fromEnv();

        MessageCreateParams params = MessageCreateParams.builder()
            .model(Model.CLAUDE_OPUS_4_7)
            .maxTokens(16000L)
            .thinking(ThinkingConfigAdaptive.builder().build())
            .addUserMessage("What is the greatest common divisor of 1071 and 462?")
            .build();

        try (var streamResponse = client.messages().createStreaming(params)) {
            streamResponse.stream().forEach(event -> {
                if (event.contentBlockStart().isPresent()) {
                    var startEvent = event.contentBlockStart().get();
                    var block = startEvent.contentBlock();
                    if (block.isThinking()) {
                        System.out.println("\nStarting thinking block...");
                    } else if (block.isText()) {
                        System.out.println("\nStarting text block...");
                    }
                } else if (event.contentBlockDelta().isPresent()) {
                    var deltaEvent = event.contentBlockDelta().get();
                    deltaEvent.delta().thinking().ifPresent(td ->
                        System.out.print(td.thinking())
                    );
                    deltaEvent.delta().text().ifPresent(td ->
                        System.out.print(td.text())
                    );
                }
            });
        }
    }
}

<?php

use Anthropic\Client;

$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));

$stream = $client->messages->createStream(
    maxTokens: 16000,
    messages: [
        ['role' => 'user', 'content' => 'What is the greatest common divisor of 1071 and 462?']
    ],
    model: 'claude-opus-4-7',
    thinking: ['type' => 'adaptive'],
);

foreach ($stream as $event) {
    if ($event->type === 'content_block_start') {
        echo "\nStarting {$event->contentBlock->type} block...\n";
    } elseif ($event->type === 'content_block_delta') {
        if ($event->delta->type === 'thinking_delta') {
            echo $event->delta->thinking;
        } elseif ($event->delta->type === 'text_delta') {
            echo $event->delta->text;
        }
    }
}

require "anthropic"

client = Anthropic::Client.new

stream = client.messages.stream(
  model: "claude-opus-4-7",
  max_tokens: 16000,
  thinking: { type: "adaptive" },
  messages: [
    { role: "user", content: "What is the greatest common divisor of 1071 and 462?" }
  ]
)

stream.each do |event|
  case event
  when Anthropic::Streaming::ThinkingEvent
    print event.thinking
  when Anthropic::Streaming::TextEvent
    print event.text
  end
end

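Comparing adaptive, manual, and disabled thinking modes

Mode	Configuration	Availability	When to use
Adaptive	`thinking: {type: "adaptive"}`	Claude Mythos Preview (default), Opus 4.7 (only mode), Opus 4.6, Sonnet 4.6	Claude decides when to use extended thinking and how much. Use `effort` to guide it.
Manual	`thinking: {type: "enabled", budget_tokens: N}`	All models except Claude Opus 4.7, which rejects it. Deprecated on Opus 4.6 and Sonnet 4.6 (use adaptive mode instead).	When you need precise control over thinking token spend.
Disabled	Omit the `thinking` parameter or pass `{type: "disabled"}`	All models except Claude Mythos Preview	When you don't need extended thinking and want the lowest latency.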

Note

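Adaptive thinking is available on Claude Mythos Preview, Claude Opus 4.7, Opus 4.6, and Sonnet 4.6. On Mythos Preview, adaptive thinking is the default mode and is applied automatically when thinking is not set. On Claude Opus 4.7, adaptive thinking is the only supported mode, and type: "enabled" with budget_tokens is rejected. Older models support only type: "enabled" with budget_tokens. On Opus 4.6 and Sonnet 4.6, type: "enabled" with budget_tokens still works but is deprecated.

Interleaved thinking availability by mode:

Adaptive mode: interleaved thinking is enabled automatically on Claude Mythos Preview, Claude Opus 4.7, Opus 4.6, and Sonnet 4.6. On Mythos Preview and Opus 4.7, reasoning between tool calls always appears inside thinking blocks.
Manual mode on Sonnet 4.6: interleaved thinking is supported through the interleaved-thinking-2025-05-14 beta header.
Manual mode on Opus 4.6: interleaved thinking is not supported. If your agentic workflow needs thinking between tool calls on Opus 4.6, use adaptive mode.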
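The availability list above can be expressed as a lookup that returns any extra request headers needed for interleaved thinking. This is only a sketch: interleaved_thinking_headers is a hypothetical helper, and combinations this page does not describe raise an error rather than guess. The anthropic-beta header name is the standard way to send a beta flag.

```python
from typing import Optional

ADAPTIVE_MODELS = {
    "claude-mythos-preview",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
}


def interleaved_thinking_headers(model: str, thinking_type: str) -> Optional[dict]:
    """Return extra headers for interleaved thinking, or None if it is unsupported."""
    if thinking_type == "adaptive" and model in ADAPTIVE_MODELS:
        return {}  # enabled automatically, no header needed
    if thinking_type == "enabled" and model == "claude-sonnet-4-6":
        return {"anthropic-beta": "interleaved-thinking-2025-05-14"}
    if thinking_type == "enabled" and model == "claude-opus-4-6":
        return None  # not supported in manual mode; use adaptive mode instead
    # Anything else is outside what this page documents.
    raise ValueError(f"combination not covered here: {model}, {thinking_type}")
```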

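Important considerations

Validation changes

With adaptive thinking, previous assistant turns do not need to start with a thinking block. This is more flexible than manual mode, where the API enforces that thinking-enabled turns begin with a thinking block.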

Prompt caching

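Consecutive requests that use adaptive thinking preserve prompt cache breakpoints. However, switching between adaptive and enabled/disabled thinking modes breaks the cache breakpoints for messages. System prompts and tool definitions remain cached regardless of mode changes.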
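As a rough illustration of the caching rule above, the check below compares the thinking mode of two consecutive requests. It is a sketch under stated assumptions: both function names are hypothetical, an omitted thinking parameter is treated as disabled (which does not hold on Claude Mythos Preview, where omission means adaptive), and any change of mode is treated as breaking message breakpoints even though this page only names switches to or from adaptive.

```python
from typing import Optional


def thinking_mode(thinking: Optional[dict]) -> str:
    """Normalize a `thinking` parameter to its mode name."""
    # Assumption: an omitted parameter means thinking is disabled.
    return "disabled" if thinking is None else thinking["type"]


def message_cache_survives(prev: Optional[dict], nxt: Optional[dict]) -> bool:
    """True if message cache breakpoints should survive between two requests."""
    # System prompts and tool definitions stay cached either way.
    return thinking_mode(prev) == thinking_mode(nxt)
```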

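Tuning thinking behavior

Adaptive thinking's triggering behavior is promptable. If Claude thinks more or less often than you would like, you can add guidance to your system prompt: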

Extended thinking adds latency and should only be used when it
will meaningfully improve answer quality — typically for problems
that require multi-step reasoning. When in doubt, respond directly.

Warning

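Steering Claude to think less often may reduce quality on tasks that benefit from reasoning. Measure the impact on your specific workloads before deploying prompt-based tuning to production. Consider testing with lower effort levels first.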
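To show where such guidance goes in a request, here is a minimal sketch that builds the keyword arguments for a Messages API call. The helper name build_request is hypothetical, the guidance string paraphrases the example prompt, and no request is sent; the field names match the request examples earlier on this page, plus the standard system parameter.

```python
THINKING_GUIDANCE = (
    "Extended thinking adds latency and should only be used when it "
    "will meaningfully improve answer quality, typically for problems "
    "that require multi-step reasoning. When in doubt, respond directly."
)


def build_request(user_text: str, model: str = "claude-opus-4-7", effort: str = "high") -> dict:
    """Build request arguments that pair adaptive thinking with system-prompt guidance."""
    return {
        "model": model,
        "max_tokens": 16000,
        "thinking": {"type": "adaptive"},
        "output_config": {"effort": effort},
        "system": THINKING_GUIDANCE,  # guidance lives in the system prompt
        "messages": [{"role": "user", "content": user_text}],
    }
```

With the Python SDK, the result could be passed as client.messages.create(**build_request("What is the capital of France?")).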

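Cost control

Use max_tokens as a hard limit on total output (thinking plus response text). The effort parameter provides additional soft guidance on how much thinking Claude allocates. Together, they give you effective control over cost.

At the high and max effort levels, Claude may think more extensively and is more likely to exhaust the max_tokens budget. If you observe stop_reason: "max_tokens" in responses, consider increasing max_tokens to give the model more room, or lowering the effort level.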
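One way to act on a stop_reason of "max_tokens" is a small retry policy: first give the model more room, then step the effort level down. The sketch below is illustrative only. The function name, the doubling strategy, and the cap of 64000 tokens are assumptions, and xhigh is left out of the default level order because it is listed only for Claude Opus 4.7.

```python
def adjust_after_truncation(
    stop_reason: str,
    max_tokens: int,
    effort: str,
    max_tokens_cap: int = 64000,  # hypothetical ceiling chosen by the caller
    levels: tuple = ("low", "medium", "high", "max"),
) -> tuple:
    """Suggest (max_tokens, effort) for the next attempt."""
    if stop_reason != "max_tokens":
        return max_tokens, effort  # response finished normally; keep settings
    if max_tokens * 2 <= max_tokens_cap:
        return max_tokens * 2, effort  # first choice: more output room
    # At the cap already: lower effort by one step, staying at the lowest level.
    lower = levels[max(levels.index(effort) - 1, 0)]
    return max_tokens, lower
```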

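Working with thinking blocks

The following concepts apply to all models that support extended thinking, whether you use adaptive or manual mode.

Summarized thinking

With extended thinking enabled, the Messages API for Claude 4 models returns a summary of Claude's full thinking process. Summarized thinking provides the full intelligence benefits of extended thinking while preventing misuse. This is the default behavior on Claude 4 models when the display field of the thinking configuration is unset or set to "summarized". On Claude Opus 4.7 and Claude Mythos Preview, display defaults to "omitted", so you must explicitly set display: "summarized" to receive summarized thinking.

Here are some important considerations for summarized thinking:

You are charged for the full thinking tokens generated by the original request, not the summary tokens.
The billed output token count will not match the count of tokens you see in the response.
On Claude 4 models, the first few lines of thinking output are more verbose and provide detailed reasoning that is particularly helpful for prompt engineering. Claude Mythos Preview summarizes from the first token, so its thinking blocks do not show this verbose preamble.
As Anthropic continues to improve extended thinking, summarization behavior is subject to change.
Summarization preserves the key ideas of Claude's thinking process with minimal added latency, enabling a streamable user experience.
Summarization is handled by a different model than the one you target in your request. The thinking model does not see the summarized output.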

Note

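In the rare cases where you need access to the full thinking output of Claude 4 models, contact the Anthropic sales team.

Controlling thinking display

The display field on the thinking configuration controls how thinking content is returned in API responses. It accepts two values:

"summarized": thinking blocks contain summarized thinking text. See Summarized thinking for details. This is the default on Claude Opus 4.6, Claude Sonnet 4.6, and earlier Claude 4 models.
"omitted": thinking blocks are returned with an empty thinking field. The signature field still carries the encrypted full thinking for multi-turn continuity (see Thinking encryption). This is the default on Claude Opus 4.7 and Claude Mythos Preview.

Setting display: "omitted" is useful when your application does not surface thinking content to users. The main benefit is a faster time to first text token when streaming: the server skips streaming thinking tokens entirely and delivers only the signature, so the final text response starts streaming sooner.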

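Here are some important considerations for omitted thinking:

You are still charged for the full thinking tokens. Omitting reduces latency, not cost.
If you pass thinking blocks back in a multi-turn conversation, pass them back unchanged. The server decrypts the signature to reconstruct the original thinking for prompt construction (see Preserving thinking blocks). Any text you place in the thinking field of a round-tripped omitted block is ignored.
display is not compatible with thinking.type: "disabled" (there is nothing to display).
When you use thinking.type: "adaptive" and the model skips thinking for a simple request, no thinking block is produced regardless of display.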

Note

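The signature field is the same whether display is "summarized" or "omitted". Switching display values between turns in a conversation is supported.

Note

On Claude Opus 4.7, thinking.display defaults to "omitted". Thinking blocks still appear in the response stream, but their thinking field is empty unless you explicitly opt in. This is a silent change from the Claude Opus 4.6 default of "summarized". To restore summarized thinking text on Claude Opus 4.7, explicitly set thinking.display to "summarized":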

thinking = {
    "type": "adaptive",
    "display": "summarized",
}

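For code examples and streaming behavior with display: "omitted", see Controlling thinking display on the extended thinking page. The examples there use type: "enabled"; with adaptive thinking, use: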

thinking = {"type": "adaptive", "display": "omitted"}

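Thinking encryption

The full thinking content is encrypted and returned in the signature field. This field is used to verify that thinking blocks were generated by Claude when they are passed back to the API.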

Note

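You only need to pass thinking blocks back when you use tools with extended thinking. Otherwise, you can omit thinking blocks from previous turns. If you do pass them back, whether the API keeps or strips them depends on the model: Opus 4.5+ and Sonnet 4.6+ keep them in context by default; earlier Opus/Sonnet models and all Haiku models strip them. See context editing to configure this.

If you pass thinking blocks back, pass everything back unchanged to ensure consistency and avoid potential issues.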
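In a tool-use loop, passing blocks back unchanged means copying the assistant content verbatim into the next request. The following is a minimal sketch using plain dictionaries. The helper name append_assistant_turn, the get_weather tool, the tool ID, and the signature string are all placeholders for illustration, not real values.

```python
import copy


def append_assistant_turn(messages: list, response_content: list) -> list:
    """Return a new history with the assistant turn appended exactly as received."""
    # Deep copy so later edits to the history cannot alter thinking or signature fields.
    return list(messages) + [
        {"role": "assistant", "content": copy.deepcopy(response_content)}
    ]


history = [{"role": "user", "content": "What is the weather in Paris?"}]

# Placeholder assistant content: an omitted thinking block followed by a tool call.
assistant_blocks = [
    {"type": "thinking", "thinking": "", "signature": "EXAMPLE_SIGNATURE"},
    {"type": "tool_use", "id": "toolu_example", "name": "get_weather",
     "input": {"city": "Paris"}},
]

next_messages = append_assistant_turn(history, assistant_blocks)
next_messages.append({
    "role": "user",
    "content": [{"type": "tool_result", "tool_use_id": "toolu_example",
                 "content": "18 C, cloudy"}],
})
```

The thinking block stays in place, ahead of the tool_use block, and neither its thinking text nor its signature is modified.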

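Here are some important considerations for thinking encryption:

When streaming responses, the signature is added through a signature_delta inside a content_block_delta event, just before the content_block_stop event.
signature values are significantly longer in Claude 4 models than in previous models.
The signature field is opaque and should not be interpreted or parsed.
signature values are compatible across platforms (Claude API, Amazon Bedrock, and Vertex AI). Values generated on one platform are compatible with another.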
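To make the streaming note above concrete, the sketch below rebuilds content blocks from raw stream events represented as dictionaries. It is an illustration under assumptions: accumulate_blocks is a hypothetical helper, the sample events and signature are made up, and the signature is assumed to arrive in a single signature_delta that is stored as-is without being parsed.

```python
def accumulate_blocks(events: list) -> list:
    """Rebuild content blocks from content_block_start and content_block_delta events."""
    blocks = {}
    for event in events:
        if event["type"] == "content_block_start":
            blocks[event["index"]] = dict(event["content_block"])
        elif event["type"] == "content_block_delta":
            block = blocks[event["index"]]
            delta = event["delta"]
            if delta["type"] == "thinking_delta":
                block["thinking"] = block.get("thinking", "") + delta["thinking"]
            elif delta["type"] == "text_delta":
                block["text"] = block.get("text", "") + delta["text"]
            elif delta["type"] == "signature_delta":
                # Opaque value: keep it verbatim, never interpret it.
                block["signature"] = delta["signature"]
    return [blocks[i] for i in sorted(blocks)]


# Made-up events for illustration only.
sample_events = [
    {"type": "content_block_start", "index": 0,
     "content_block": {"type": "thinking", "thinking": "", "signature": ""}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "thinking_delta", "thinking": "Apply Euclid's algorithm."}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "signature_delta", "signature": "EXAMPLE_SIGNATURE"}},
    {"type": "content_block_stop", "index": 0},
    {"type": "content_block_start", "index": 1,
     "content_block": {"type": "text", "text": ""}},
    {"type": "content_block_delta", "index": 1,
     "delta": {"type": "text_delta", "text": "The GCD is "}},
    {"type": "content_block_delta", "index": 1,
     "delta": {"type": "text_delta", "text": "21."}},
    {"type": "content_block_stop", "index": 1},
]

blocks = accumulate_blocks(sample_events)
```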

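Pricing

For complete pricing information, including base rates, cache writes, cache hits, and output tokens, see the pricing page.

The thinking process incurs charges for:

Tokens used during thinking (output tokens)
Thinking blocks from previous assistant turns that are kept in context: only the last turn on earlier Opus/Sonnet models and all Haiku models; all turns by default on Opus 4.5+ and Sonnet 4.6+ (input tokens)
Standard text output tokens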

Note

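When extended thinking is enabled, a specialized system prompt is automatically included to support this feature.

When using summarized thinking:

Input tokens: tokens in your original request (excluding thinking tokens from previous turns)
Output tokens (billed): the original thinking tokens that Claude generated internally
Output tokens (visible): the summarized thinking tokens you see in the response
No charge: tokens used to generate the summary

When using display: "omitted":

Input tokens: tokens in your original request (same as summarized)
Output tokens (billed): the original thinking tokens that Claude generated internally (same as summarized)
Output tokens (visible): zero thinking tokens (the thinking field is empty)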

Warning

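The billed output token count will not match the visible token count in the response. You are billed for the full thinking process, not the thinking content visible in the response.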
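A short worked example may help show why billed and visible counts differ. The helper below is a hypothetical sketch and the token counts are invented for illustration; it only encodes the rules listed above (raw thinking tokens are billed, summary generation is not charged, and omitted display shows zero thinking tokens).

```python
def output_token_view(
    raw_thinking_tokens: int,
    text_tokens: int,
    summary_tokens: int = 0,
    display: str = "summarized",
) -> dict:
    """Compare billed output tokens with the output tokens visible in a response."""
    if display not in ("summarized", "omitted"):
        raise ValueError(f"unknown display value: {display!r}")
    visible_thinking = summary_tokens if display == "summarized" else 0
    return {
        # Billing follows the raw thinking, not what is shown.
        "billed_output_tokens": raw_thinking_tokens + text_tokens,
        "visible_output_tokens": visible_thinking + text_tokens,
    }


# Invented numbers: 2000 raw thinking tokens, a 400-token summary, 300 text tokens.
summarized = output_token_view(2000, 300, summary_tokens=400)
omitted = output_token_view(2000, 300, display="omitted")
```

In both cases the billed total is 2300 output tokens, while the response shows 700 tokens with summarized display and 300 with omitted display.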

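Additional topics

The extended thinking page covers several topics in more detail, with mode-specific code examples:

Tool use with thinking: the same rules apply to adaptive thinking. Preserve thinking blocks between tool calls, and be aware of tool_choice limitations when thinking is active.
Prompt caching: with adaptive thinking, consecutive requests that use the same thinking mode preserve cache breakpoints. Switching between adaptive and enabled/disabled modes breaks cache breakpoints for messages (system prompts and tool definitions remain cached).
Context windows: how thinking tokens interact with max_tokens and context window limits.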

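Next steps

◆Extended thinking

Learn more about extended thinking, including manual mode, tool use, and prompt caching.

◆Effort parameter

Control how detailed Claude's responses are with the effort parameter.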