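PDF support
Process PDFs with Claude. Extract text, analyze charts, and understand visual content from your documents.
This feature is eligible for Zero Data Retention (ZDR). When your organization has a ZDR arrangement, data sent through this feature is not stored after the API response is returned.
You can ask Claude about any text, pictures, charts, and tables in a PDF. Some example use cases:
- Analyzing financial reports and understanding charts/tables
- Extracting key information from legal documents
- Translation assistance for documents
- Converting document information into structured formats
Before you begin
Check PDF requirements
Claude works with any standard PDF. Make sure your request meets the following requirements: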
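| Requirement | Limit |
|---|---|
| Maximum request size | 32 MB (varies by platform) |
| Maximum pages per request | 600 (100 for models with a 200k-token context window) |
| Format | Standard PDF (no passwords/encryption) |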
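The size and page limits both apply to the entire request payload, including any other content sent alongside the PDF. For large PDFs, consider uploading with the Files API and referencing the file by file_id to keep the request payload small.
Dense PDFs (many pages of small type, complex tables, or heavy graphics) can fill the context window before the page limit is reached. When using the Files API, requests with large PDFs can also fail before reaching the page limit. Try splitting the document into sections; for large files, downsampling embedded images also helps, because each page is processed as an image.
Because PDF support relies on Claude's vision capabilities, it is subject to the same limitations and considerations as other vision tasks.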
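Base64 encoding expands data by a factor of about 4/3, so a PDF sent inline has to be noticeably smaller than the request size limit. The following is a minimal pre-flight sketch, assuming the 32 MB limit from the table above and interpreting it as 32 * 1024 * 1024 bytes. The function names are hypothetical helpers, not part of any SDK, and the sketch neither counts pages nor detects encryption.

```python
# Assumption: the 32 MB request limit from the requirements table,
# interpreted here as 32 * 1024 * 1024 bytes. Limits vary by platform.
MAX_REQUEST_BYTES = 32 * 1024 * 1024


def base64_size(raw_bytes: int) -> int:
    """Length of the padded base64 encoding of `raw_bytes` bytes."""
    # Every 3 input bytes (rounded up) become 4 output characters.
    return 4 * ((raw_bytes + 2) // 3)


def fits_inline(raw_bytes: int, other_payload_bytes: int = 0) -> bool:
    """True if the encoded PDF plus the rest of the request payload
    stays within the assumed request size limit."""
    return base64_size(raw_bytes) + other_payload_bytes <= MAX_REQUEST_BYTES


def looks_like_pdf(data: bytes) -> bool:
    """Cheap sanity check: PDF files begin with the '%PDF-' header."""
    return data.startswith(b"%PDF-")
```

Under these assumptions, a raw PDF of roughly 24 MB already consumes the whole 32 MB budget once encoded, which is one reason to prefer the Files API for large documents.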
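Supported platforms and models
PDF support is available on the Claude API, Claude Platform on AWS, Amazon Bedrock (see Amazon Bedrock PDF support), Vertex AI, and Microsoft Foundry. All active models support PDF processing.
Amazon Bedrock PDF support
When you use PDF support through Bedrock's Converse API, there are two distinct document processing modes:
Important: To access Claude's full visual PDF understanding in the Converse API, you must enable citations. Without citations enabled, the API falls back to basic text extraction only. Learn more about working with citations.
Document processing modes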
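- Converse Document Chat (original mode - text extraction only)
  - Provides basic text extraction from PDFs
  - Cannot analyze images, charts, or visual layouts within PDFs
  - Uses approximately 1,000 tokens for a 3-page PDF
  - Used automatically when citations are not enabled
- Claude PDF Chat (new mode - full visual understanding)
  - Provides complete visual analysis of PDFs
  - Can understand and analyze charts, graphs, images, and visual layouts
  - Processes each page as both text and image for comprehensive understanding
  - Uses approximately 7,000 tokens for a 3-page PDF
  - Requires citations to be enabled in the Converse API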
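Key limitations
- Converse API: Visual PDF analysis requires citations to be enabled. There is currently no option to use visual analysis without citations (unlike the InvokeModel API).
- InvokeModel API: Provides full control over PDF processing without forced citations.
Common issues
If Claude cannot see images or charts in your PDF when you use the Converse API, you likely need to enable the citations flag. Without it, Converse falls back to basic text extraction only.
This is a known limitation of the Converse API. For applications that need visual PDF analysis without citations, consider using the InvokeModel API instead.
For non-PDF files such as .csv, .xlsx, .docx, .md, or .txt, see Working with other file formats.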
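Process PDFs with Claude
Send your first PDF request
Let's start with a simple example using the Messages API. You can provide a PDF to Claude in three ways:
- As a URL reference to a PDF hosted online
- As a base64-encoded PDF in a document content block
- By a file_id from the Files API
On Amazon Bedrock and Vertex AI, only base64-encoded sources are currently supported.
Option 1: URL-based PDF document
The simplest approach is to reference a PDF directly from a URL: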
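The three source types differ only in the source object inside the document content block. As a rough side-by-side sketch, the helpers below build each variant as a plain dictionary, using the same field names as the request examples in this guide. The helper names themselves are hypothetical and not part of the SDK.

```python
import base64
from typing import Any


def pdf_block_from_url(url: str) -> dict[str, Any]:
    """Document block that points at a PDF hosted online."""
    return {"type": "document", "source": {"type": "url", "url": url}}


def pdf_block_from_bytes(pdf_bytes: bytes) -> dict[str, Any]:
    """Document block that embeds the PDF as base64 text."""
    data = base64.standard_b64encode(pdf_bytes).decode("utf-8")
    return {
        "type": "document",
        "source": {
            "type": "base64",
            "media_type": "application/pdf",
            "data": data,
        },
    }


def pdf_block_from_file_id(file_id: str) -> dict[str, Any]:
    """Document block that references a file uploaded via the Files API."""
    return {"type": "document", "source": {"type": "file", "file_id": file_id}}


def user_message(document_block: dict[str, Any], question: str) -> dict[str, Any]:
    """User turn with the PDF placed before the text prompt."""
    return {
        "role": "user",
        "content": [document_block, {"type": "text", "text": question}],
    }
```

Any of the three blocks can be dropped into the content list of a user message; the rest of the request stays the same.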
curl https://api.anthropic.com/v1/messages \
-H "content-type: application/json" \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "claude-opus-4-7",
"max_tokens": 1024,
"messages": [{
"role": "user",
"content": [{
"type": "document",
"source": {
"type": "url",
"url": "https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf"
}
},
{
"type": "text",
"text": "What are the key findings in this document?"
}]
}]
}'
ant messages create --transform content --format yaml <<'YAML'
model: claude-opus-4-7
max_tokens: 1024
messages:
  - role: user
    content:
      - type: document
        source:
          type: url
          url: https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf
      - type: text
        text: What are the key findings in this document?
YAML
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
messages=[
{
"role": "user",
"content": [
{
"type": "document",
"source": {
"type": "url",
"url": "https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf",
},
},
{"type": "text", "text": "What are the key findings in this document?"},
],
}
],
)
print(message.content)
import Anthropic from "@anthropic-ai/sdk";
const anthropic = new Anthropic();
const response = await anthropic.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [
{
role: "user",
content: [
{
type: "document",
source: {
type: "url",
url: "https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf"
}
},
{
type: "text",
text: "What are the key findings in this document?"
}
]
}
]
});
console.log(response);
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.messages.*;
import java.util.List;
public class PdfUrlExample {
public static void main(String[] args) {
AnthropicClient client = AnthropicOkHttpClient.fromEnv();
// Create document block with URL
DocumentBlockParam documentParam = DocumentBlockParam.builder()
.source(
UrlPdfSource.builder()
.url(
"https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf"
)
.build()
)
.build();
// Create a message with document and text content blocks
MessageCreateParams params = MessageCreateParams.builder()
.model(Model.CLAUDE_OPUS_4_7)
.maxTokens(1024)
.addUserMessageOfBlockParams(
List.of(
ContentBlockParam.ofDocument(documentParam),
ContentBlockParam.ofText(
TextBlockParam.builder()
.text("What are the key findings in this document?")
.build()
)
)
)
.build();
Message message = client.messages().create(params);
System.out.println(message.content());
}
}
Option 2: Base64-encoded PDF document
If you need to send a PDF from your local system, or when a URL isn't available:
cd "$(mktemp -d)"
# Method 1: Fetch and encode a remote PDF
curl -s "https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf" | base64 | tr -d '\n' > pdf_base64.txt
# Method 2: Encode a local PDF file
# base64 document.pdf | tr -d '\n' > pdf_base64.txt
# Create a JSON request file using the pdf_base64.txt content
jq -n --rawfile PDF_BASE64 pdf_base64.txt '{
"model": "claude-opus-4-7",
"max_tokens": 1024,
"messages": [{
"role": "user",
"content": [{
"type": "document",
"source": {
"type": "base64",
"media_type": "application/pdf",
"data": $PDF_BASE64
}
},
{
"type": "text",
"text": "What are the key findings in this document?"
}]
}]
}' > request.json
# Send the API request using the JSON file
curl https://api.anthropic.com/v1/messages \
-H "content-type: application/json" \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-d @request.json
cd "$(mktemp -d)"
curl -sSo document.pdf https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf
ant messages create \
--model claude-opus-4-7 \
--max-tokens 1024 \
--transform content --format yaml <<'YAML'
messages:
  - role: user
    content:
      - type: document
        source:
          type: base64
          media_type: application/pdf
          data: "@./document.pdf"
      - type: text
        text: What are the key findings in this document?
YAML
import anthropic
import base64
import httpx
# First, load and encode the PDF
pdf_url = "https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf"
pdf_data = base64.standard_b64encode(httpx.get(pdf_url).content).decode("utf-8")
# Alternative: Load from a local file
# with open("document.pdf", "rb") as f:
# pdf_data = base64.standard_b64encode(f.read()).decode("utf-8")
# Send to Claude using base64 encoding
client = anthropic.Anthropic()
message = client.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
messages=[
{
"role": "user",
"content": [
{
"type": "document",
"source": {
"type": "base64",
"media_type": "application/pdf",
"data": pdf_data,
},
},
{"type": "text", "text": "What are the key findings in this document?"},
],
}
],
)
print(message.content)
import Anthropic from "@anthropic-ai/sdk";
async function main() {
// Method 1: Fetch and encode a remote PDF
const pdfURL =
"https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf";
const pdfResponse = await fetch(pdfURL);
const arrayBuffer = await pdfResponse.arrayBuffer();
const pdfBase64 = Buffer.from(arrayBuffer).toString("base64");
// Method 2: Load from a local file
// import { readFile } from "node:fs/promises";
// const pdfBase64 = (await readFile('document.pdf')).toString('base64');
// Send the API request with base64-encoded PDF
const anthropic = new Anthropic();
const response = await anthropic.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [
{
role: "user",
content: [
{
type: "document",
source: {
type: "base64",
media_type: "application/pdf",
data: pdfBase64
}
},
{
type: "text",
text: "What are the key findings in this document?"
}
]
}
]
});
console.log(response);
}
main();
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.messages.Base64PdfSource;
import com.anthropic.models.messages.ContentBlockParam;
import com.anthropic.models.messages.DocumentBlockParam;
import com.anthropic.models.messages.Message;
import com.anthropic.models.messages.MessageCreateParams;
import com.anthropic.models.messages.Model;
import com.anthropic.models.messages.TextBlockParam;
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;
import java.util.List;
public class PdfBase64Example {
public static void main(String[] args) throws IOException, InterruptedException {
AnthropicClient client = AnthropicOkHttpClient.fromEnv();
// Method 1: Download and encode a remote PDF
String pdfUrl =
"https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf";
HttpClient httpClient = HttpClient.newHttpClient();
HttpRequest request = HttpRequest.newBuilder().uri(URI.create(pdfUrl)).GET().build();
HttpResponse<byte[]> response = httpClient.send(
request,
HttpResponse.BodyHandlers.ofByteArray()
);
String pdfBase64 = Base64.getEncoder().encodeToString(response.body());
// Method 2: Load from a local file
// byte[] fileBytes = Files.readAllBytes(Path.of("document.pdf"));
// String pdfBase64 = Base64.getEncoder().encodeToString(fileBytes);
// Create document block with base64 data
DocumentBlockParam documentParam = DocumentBlockParam.builder()
.source(Base64PdfSource.builder().data(pdfBase64).build())
.build();
// Create a message with document and text content blocks
MessageCreateParams params = MessageCreateParams.builder()
.model(Model.CLAUDE_OPUS_4_7)
.maxTokens(1024)
.addUserMessageOfBlockParams(
List.of(
ContentBlockParam.ofDocument(documentParam),
ContentBlockParam.ofText(
TextBlockParam.builder()
.text("What are the key findings in this document?")
.build()
)
)
)
.build();
Message message = client.messages().create(params);
System.out.println(message.content());
}
}
Option 3: Files API
For PDFs you'll use repeatedly, or when you want to avoid encoding overhead, use the Files API:
cd "$(mktemp -d)"
curl -sSo document.pdf https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf
# First, upload your PDF to the Files API
curl -X POST https://api.anthropic.com/v1/files \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "anthropic-beta: files-api-2025-04-14" \
-F "file=@document.pdf"
# Then use the returned file_id in your message
curl https://api.anthropic.com/v1/messages \
-H "content-type: application/json" \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "anthropic-beta: files-api-2025-04-14" \
-d '{
"model": "claude-opus-4-7",
"max_tokens": 1024,
"messages": [{
"role": "user",
"content": [{
"type": "document",
"source": {
"type": "file",
"file_id": "file_abc123"
}
},
{
"type": "text",
"text": "What are the key findings in this document?"
}]
}]
}'
cd "$(mktemp -d)"
curl -sSo document.pdf https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf
# First, upload your PDF to the Files API
FILE_ID=$(ant beta:files upload \
--file ./document.pdf \
--transform id --raw-output)
# Then use the returned file_id in your message
ant beta:messages create \
--beta files-api-2025-04-14 \
--transform content --format yaml <<YAML
model: claude-opus-4-7
max_tokens: 1024
messages:
  - role: user
    content:
      - type: document
        source:
          type: file
          file_id: $FILE_ID
      - type: text
        text: What are the key findings in this document?
YAML
import anthropic
client = anthropic.Anthropic()
# Upload the PDF file
with open("document.pdf", "rb") as f:
    file_upload = client.beta.files.upload(file=("document.pdf", f, "application/pdf"))
# Use the uploaded file in a message
message = client.beta.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
betas=["files-api-2025-04-14"],
messages=[
{
"role": "user",
"content": [
{
"type": "document",
"source": {"type": "file", "file_id": file_upload.id},
},
{"type": "text", "text": "What are the key findings in this document?"},
],
}
],
)
print(message.content)
import Anthropic, { toFile } from "@anthropic-ai/sdk";
import fs from "fs";
const anthropic = new Anthropic();
// Upload the PDF file
const fileUpload = await anthropic.beta.files.upload({
file: await toFile(fs.createReadStream("document.pdf"), undefined, {
type: "application/pdf"
})
});
// Use the uploaded file in a message
const response = await anthropic.beta.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
betas: ["files-api-2025-04-14"],
messages: [
{
role: "user",
content: [
{
type: "document",
source: {
type: "file",
file_id: fileUpload.id
}
},
{
type: "text",
text: "What are the key findings in this document?"
}
]
}
]
});
console.log(response);
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.messages.Model;
import com.anthropic.models.beta.files.FileMetadata;
import com.anthropic.models.beta.files.FileUploadParams;
import com.anthropic.models.beta.messages.BetaContentBlockParam;
import com.anthropic.models.beta.messages.BetaFileDocumentSource;
import com.anthropic.models.beta.messages.BetaMessage;
import com.anthropic.models.beta.messages.BetaRequestDocumentBlock;
import com.anthropic.models.beta.messages.BetaTextBlockParam;
import com.anthropic.models.beta.messages.MessageCreateParams;
import java.nio.file.Path;
import java.util.List;
public class PdfFilesExample {
public static void main(String[] args) {
AnthropicClient client = AnthropicOkHttpClient.fromEnv();
// Upload the PDF file
FileMetadata file = client
.beta()
.files()
.upload(FileUploadParams.builder().file(Path.of("document.pdf")).build());
// Use the uploaded file in a message
MessageCreateParams params = MessageCreateParams.builder()
.model(Model.CLAUDE_OPUS_4_7)
.addBeta("files-api-2025-04-14")
.maxTokens(1024)
.addUserMessageOfBetaContentBlockParams(
List.of(
BetaContentBlockParam.ofDocument(
BetaRequestDocumentBlock.builder()
.source(
BetaFileDocumentSource.builder()
.fileId(file.id())
.build()
)
.build()
),
BetaContentBlockParam.ofText(
BetaTextBlockParam.builder()
.text("What are the key findings in this document?")
.build()
)
)
)
.build();
BetaMessage message = client.beta().messages().create(params);
System.out.println(message.content());
}
}
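How PDF support works
When you send a PDF to Claude, the following steps occur:
1. The system extracts the contents of the document.
   - The system converts each page of the document into an image.
   - The text from each page is extracted and provided alongside that page's image.
2. Claude analyzes both the text and the images to better understand the document.
   - Documents are provided as a combination of text and images for analysis.
   - This lets users ask for insights on the visual elements of a PDF, such as charts, diagrams, and other non-textual content.
3. Claude responds, referencing the PDF's contents where relevant.
Claude can reference both textual and visual content when it responds. You can further improve performance by integrating PDF support with:
- Prompt caching: Improves performance for repeated analysis.
- Batch processing: For high-volume document processing.
- Tool use: Extracts specific information from documents for use as tool inputs.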
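Estimate your costs
The token count of a PDF file depends on the total text extracted from the document and on the number of pages:
- Text token costs: Each page typically uses 1,500-3,000 tokens, depending on content density. Standard API pricing applies, with no additional PDF fees.
- Image token costs: Because each page is converted into an image, the same image-based cost calculations apply.
You can use token counting to estimate costs for your specific PDFs.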
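As a back-of-the-envelope illustration of the text portion only, the sketch below multiplies a page count by the typical 1,500-3,000 tokens-per-page range quoted above. The function name is hypothetical, the result is a rough range rather than a billing figure, and image tokens for each page come on top of it.

```python
# Typical text tokens per page, taken from the range quoted in this section.
TEXT_TOKENS_PER_PAGE_LOW = 1_500
TEXT_TOKENS_PER_PAGE_HIGH = 3_000


def estimate_text_tokens(page_count: int) -> tuple[int, int]:
    """Return a (low, high) estimate of text tokens for a PDF.

    Image tokens are NOT included; each page is also processed as an
    image and billed with the usual image-based calculation.
    """
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    return (
        page_count * TEXT_TOKENS_PER_PAGE_LOW,
        page_count * TEXT_TOKENS_PER_PAGE_HIGH,
    )


low, high = estimate_text_tokens(40)
print(f"40 pages: roughly {low:,} to {high:,} text tokens")
```

For an actual figure on a specific document, token counting remains the more reliable route.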
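Optimize PDF processing
Improve performance
Follow these best practices for optimal results:
- Place PDFs before text in your requests
- Use standard fonts
- Ensure text is clear and legible
- Rotate pages to the proper upright orientation
- Use logical page numbers (from the PDF viewer) in prompts
- Split large PDFs into chunks when needed
- Enable prompt caching for repeated analysis
Scale your implementation
For high-volume processing, consider these approaches:
Use prompt caching
Cache PDFs to improve performance on repeated queries: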
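One way to plan the chunking mentioned in the list above is to compute page ranges first and split the file afterwards. The sketch below is a hypothetical helper that only does the range arithmetic, using 0-indexed pages. Its 100-page default mirrors the per-request page limit listed for models with a 200k-token context window, and is an assumption you should adjust for your model and document density.

```python
def page_chunks(total_pages: int, max_pages: int = 100) -> list[range]:
    """Split `total_pages` pages into consecutive 0-indexed ranges,
    each covering at most `max_pages` pages."""
    if total_pages < 0:
        raise ValueError("total_pages must be non-negative")
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    return [
        range(start, min(start + max_pages, total_pages))
        for start in range(0, total_pages, max_pages)
    ]


# Example: a 250-page document in chunks of at most 100 pages.
for chunk in page_chunks(250):
    print(f"pages {chunk.start + 1}-{chunk.stop} ({len(chunk)} pages)")
```

Each range can then be written to its own PDF with a PDF library and sent as a separate request. Dense documents may need a smaller max_pages than the page limit suggests.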
cd "$(mktemp -d)"
curl -s "https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf" | base64 | tr -d '\n' > pdf_base64.txt
# Create a JSON request file using the pdf_base64.txt content
jq -n --rawfile PDF_BASE64 pdf_base64.txt '{
"model": "claude-opus-4-7",
"max_tokens": 1024,
"messages": [{
"role": "user",
"content": [{
"type": "document",
"source": {
"type": "base64",
"media_type": "application/pdf",
"data": $PDF_BASE64
},
"cache_control": {
"type": "ephemeral"
}
},
{
"type": "text",
"text": "Which model has the highest human preference win rates across each use-case?"
}]
}]
}' > request.json
# Then make the API call using the JSON file
curl https://api.anthropic.com/v1/messages \
-H "content-type: application/json" \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-d @request.json
cd "$(mktemp -d)"
curl -sSo document.pdf https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf
ant messages create <<'YAML'
model: claude-opus-4-7
max_tokens: 1024
messages:
  - role: user
    content:
      - type: document
        source:
          type: base64
          media_type: application/pdf
          data: "@./document.pdf"
        cache_control:
          type: ephemeral
      - type: text
        text: Which model has the highest human preference win rates across each use-case?
YAML
import anthropic
import base64
from pypdf import PdfWriter
import io
client = anthropic.Anthropic()
buf = io.BytesIO()
writer = PdfWriter()
writer.add_blank_page(width=72, height=72)
writer.write(buf)
pdf_data = base64.standard_b64encode(buf.getvalue()).decode("utf-8")
message = client.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
messages=[
{
"role": "user",
"content": [
{
"type": "document",
"source": {
"type": "base64",
"media_type": "application/pdf",
"data": pdf_data,
},
"cache_control": {"type": "ephemeral"},
},
{"type": "text", "text": "Analyze this document."},
],
}
],
)
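import Anthropic from "@anthropic-ai/sdk";
import { readFile } from "node:fs/promises";
const anthropic = new Anthropic();
// Base64-encode a local PDF, as in the base64 example earlier
const pdfBase64 = (await readFile("document.pdf")).toString("base64");
const response = await anthropic.messages.create({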
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [
{
content: [
{
type: "document",
source: {
media_type: "application/pdf",
type: "base64",
data: pdfBase64
},
cache_control: { type: "ephemeral" }
},
{
type: "text",
text: "Which model has the highest human preference win rates across each use-case?"
}
],
role: "user"
}
]
});
console.log(response);
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.messages.Base64PdfSource;
import com.anthropic.models.messages.CacheControlEphemeral;
import com.anthropic.models.messages.ContentBlockParam;
import com.anthropic.models.messages.DocumentBlockParam;
import com.anthropic.models.messages.Message;
import com.anthropic.models.messages.MessageCreateParams;
import com.anthropic.models.messages.Model;
import com.anthropic.models.messages.TextBlockParam;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
public class MessagesDocumentExample {
public static void main(String[] args) throws IOException {
AnthropicClient client = AnthropicOkHttpClient.fromEnv();
// Read the already base64-encoded PDF text (pdf_base64.txt from the shell example)
byte[] pdfBytes = Files.readAllBytes(Paths.get("pdf_base64.txt"));
String pdfBase64 = new String(pdfBytes);
MessageCreateParams params = MessageCreateParams.builder()
.model(Model.CLAUDE_OPUS_4_7)
.maxTokens(1024)
.addUserMessageOfBlockParams(
List.of(
ContentBlockParam.ofDocument(
DocumentBlockParam.builder()
.source(Base64PdfSource.builder().data(pdfBase64).build())
.cacheControl(CacheControlEphemeral.builder().build())
.build()
),
ContentBlockParam.ofText(
TextBlockParam.builder()
.text(
"Which model has the highest human preference win rates across each use-case?"
)
.build()
)
)
)
.build();
Message message = client.messages().create(params);
System.out.println(message);
}
}
Process document batches
Use the Message Batches API for high-volume workflows:
cd "$(mktemp -d)"
curl -s "https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf" | base64 | tr -d '\n' > pdf_base64.txt
# Create a JSON request file using the pdf_base64.txt content
jq -n --rawfile PDF_BASE64 pdf_base64.txt '
{
"requests": [
{
"custom_id": "my-first-request",
"params": {
"model": "claude-opus-4-7",
"max_tokens": 1024,
"messages": [
{
"role": "user",
"content": [
{
"type": "document",
"source": {
"type": "base64",
"media_type": "application/pdf",
"data": $PDF_BASE64
}
},
{
"type": "text",
"text": "Which model has the highest human preference win rates across each use-case?"
}
]
}
]
}
},
{
"custom_id": "my-second-request",
"params": {
"model": "claude-opus-4-7",
"max_tokens": 1024,
"messages": [
{
"role": "user",
"content": [
{
"type": "document",
"source": {
"type": "base64",
"media_type": "application/pdf",
"data": $PDF_BASE64
}
},
{
"type": "text",
"text": "Extract 5 key insights from this document."
}
]
}
]
}
}
]
}
' > request.json
# Then make the API call using the JSON file
curl https://api.anthropic.com/v1/messages/batches \
-H "content-type: application/json" \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-d @request.json
cd "$(mktemp -d)"
curl -sSo document.pdf https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf
ant messages:batches create <<'YAML'
requests:
  - custom_id: my-first-request
    params:
      model: claude-opus-4-7
      max_tokens: 1024
      messages:
        - role: user
          content:
            - type: document
              source:
                type: base64
                media_type: application/pdf
                data: "@./document.pdf"
            - type: text
              text: >-
                Which model has the highest human preference win rates
                across each use-case?
  - custom_id: my-second-request
    params:
      model: claude-opus-4-7
      max_tokens: 1024
      messages:
        - role: user
          content:
            - type: document
              source:
                type: base64
                media_type: application/pdf
                data: "@./document.pdf"
            - type: text
              text: Extract 5 key insights from this document.
YAML
import anthropic
import base64
from pypdf import PdfWriter
import io
client = anthropic.Anthropic()
buf = io.BytesIO()
writer = PdfWriter()
writer.add_blank_page(width=72, height=72)
writer.write(buf)
pdf_data = base64.standard_b64encode(buf.getvalue()).decode("utf-8")
message_batch = client.messages.batches.create(
requests=[
{
"custom_id": "doc1",
"params": {
"model": "claude-opus-4-7",
"max_tokens": 1024,
"messages": [
{
"role": "user",
"content": [
{
"type": "document",
"source": {
"type": "base64",
"media_type": "application/pdf",
"data": pdf_data,
},
},
{"type": "text", "text": "Summarize this document."},
],
}
],
},
}
]
)
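import Anthropic from "@anthropic-ai/sdk";
import { readFile } from "node:fs/promises";
const anthropic = new Anthropic();
// Base64-encode a local PDF, as in the base64 example earlier
const pdfBase64 = (await readFile("document.pdf")).toString("base64");
const response = await anthropic.messages.batches.create({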
requests: [
{
custom_id: "my-first-request",
params: {
max_tokens: 1024,
messages: [
{
content: [
{
type: "document",
source: {
media_type: "application/pdf",
type: "base64",
data: pdfBase64
}
},
{
type: "text",
text: "Which model has the highest human preference win rates across each use-case?"
}
],
role: "user"
}
],
model: "claude-opus-4-7"
}
},
{
custom_id: "my-second-request",
params: {
max_tokens: 1024,
messages: [
{
content: [
{
type: "document",
source: {
media_type: "application/pdf",
type: "base64",
data: pdfBase64
}
},
{
type: "text",
text: "Extract 5 key insights from this document."
}
],
role: "user"
}
],
model: "claude-opus-4-7"
}
}
]
});
console.log(response);
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.messages.*;
import com.anthropic.models.messages.batches.*;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
public class MessagesBatchDocumentExample {
public static void main(String[] args) throws IOException {
AnthropicClient client = AnthropicOkHttpClient.fromEnv();
// Read the already base64-encoded PDF text (pdf_base64.txt from the shell example)
byte[] pdfBytes = Files.readAllBytes(Paths.get("pdf_base64.txt"));
String pdfBase64 = new String(pdfBytes);
BatchCreateParams params = BatchCreateParams.builder()
.addRequest(
BatchCreateParams.Request.builder()
.customId("my-first-request")
.params(
BatchCreateParams.Request.Params.builder()
.model(Model.CLAUDE_OPUS_4_7)
.maxTokens(1024)
.addUserMessageOfBlockParams(
List.of(
ContentBlockParam.ofDocument(
DocumentBlockParam.builder()
.source(Base64PdfSource.builder().data(pdfBase64).build())
.build()
),
ContentBlockParam.ofText(
TextBlockParam.builder()
.text(
"Which model has the highest human preference win rates across each use-case?"
)
.build()
)
)
)
.build()
)
.build()
)
.addRequest(
BatchCreateParams.Request.builder()
.customId("my-second-request")
.params(
BatchCreateParams.Request.Params.builder()
.model(Model.CLAUDE_OPUS_4_7)
.maxTokens(1024)
.addUserMessageOfBlockParams(
List.of(
ContentBlockParam.ofDocument(
DocumentBlockParam.builder()
.source(Base64PdfSource.builder().data(pdfBase64).build())
.build()
),
ContentBlockParam.ofText(
TextBlockParam.builder()
.text("Extract 5 key insights from this document.")
.build()
)
)
)
.build()
)
.build()
)
.build();
MessageBatch batch = client.messages().batches().create(params);
System.out.println(batch);
}
}
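The batch examples above repeat the same document block for each question. As a rough sketch of how that repetition could be generated, the hypothetical helper below builds one batch entry per prompt from a single base64-encoded PDF. The model name and max_tokens defaults are copied from the examples in this guide, and the function itself is an illustration, not part of the SDK.

```python
from typing import Any


def build_batch_requests(
    pdf_base64: str,
    prompts: dict[str, str],
    model: str = "claude-opus-4-7",
    max_tokens: int = 1024,
) -> list[dict[str, Any]]:
    """Build one batch request per (custom_id, prompt) pair,
    all referring to the same base64-encoded PDF."""
    requests: list[dict[str, Any]] = []
    for custom_id, prompt in prompts.items():
        requests.append(
            {
                "custom_id": custom_id,
                "params": {
                    "model": model,
                    "max_tokens": max_tokens,
                    "messages": [
                        {
                            "role": "user",
                            "content": [
                                # The PDF goes before the text prompt.
                                {
                                    "type": "document",
                                    "source": {
                                        "type": "base64",
                                        "media_type": "application/pdf",
                                        "data": pdf_base64,
                                    },
                                },
                                {"type": "text", "text": prompt},
                            ],
                        }
                    ],
                },
            }
        )
    return requests
```

The resulting list has the same shape as the requests argument in the Python batch example, so it could be passed to client.messages.batches.create(requests=...). Keep in mind that every entry carries its own copy of the encoded PDF.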