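Vision
Claude's vision capabilities allow it to understand and analyze images, opening up new possibilities for multimodal interaction.
This guide describes how to work with images in Claude, including best practices, code examples, and limitations to keep in mind.
How to use vision
Use Claude's vision capabilities through:
- claude.ai. Upload an image as you would a file, or drag and drop an image directly into the chat window.
- The Console Workbench. A button to add images appears at the top right of every User message block.
- API requests. See the examples in this guide.
You can include multiple images in a single request, and Claude analyzes them jointly when generating its response. This is helpful for comparing or contrasting images.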
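Before you upload
General limits
The maximum number of images per message or request is:
- 20 per message on claude.ai.
- 100 per API request for models with a 200k-token context window.
- 600 per API request for all other models.
Each image can be at most 8000x8000 pixels. If you submit more than 20 images in one API request, this limit drops to 2000x2000 pixels.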
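The limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any SDK: the function names are hypothetical, and the per-request image cap is passed in because it depends on the model.

```python
from typing import Iterable, List, Tuple

MAX_SIDE_DEFAULT = 8000      # per-side pixel limit for requests with up to 20 images
MAX_SIDE_MANY_IMAGES = 2000  # per-side pixel limit once a request has more than 20 images
MANY_IMAGES_THRESHOLD = 20


def max_side_for_request(num_images: int) -> int:
    """Return the per-side pixel limit that applies to a request with this many images."""
    return MAX_SIDE_MANY_IMAGES if num_images > MANY_IMAGES_THRESHOLD else MAX_SIDE_DEFAULT


def check_request_images(sizes: Iterable[Tuple[int, int]], max_images: int = 600) -> List[str]:
    """Return human-readable problems for a list of (width, height) sizes; empty if none."""
    sizes = list(sizes)
    problems = []
    if len(sizes) > max_images:
        problems.append(f"{len(sizes)} images exceed the limit of {max_images} per request")
    limit = max_side_for_request(len(sizes))
    for index, (width, height) in enumerate(sizes, start=1):
        if width > limit or height > limit:
            problems.append(f"image {index} is {width}x{height}, above {limit}x{limit}")
    return problems
```

Pass `max_images=100` for a model with a 200k-token context window.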
Evaluate image size
An image uses approximately width * height / 750 tokens, where width and height are in pixels.
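As a quick illustration of the estimate, the sketch below applies the formula directly. The function name is hypothetical, and rounding up is an assumption chosen because it reproduces the approximate figures in this guide's tables; the real tokenizer may round differently.

```python
import math


def estimate_image_tokens(width_px: int, height_px: int) -> int:
    """Approximate token count for an image that is not downscaled: width * height / 750."""
    return math.ceil(width_px * height_px / 750)


print(estimate_image_tokens(200, 200))    # 54
print(estimate_image_tokens(1000, 1000))  # 1334
```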
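The maximum native image resolution is:
- Claude Opus 4.7: 4784 tokens, and at most 2576 pixels on the long edge.
- Other models: 1568 tokens, and at most 1568 pixels on the long edge.
If your input image is larger than this native resolution, it is first scaled down to the largest size that fits, preserving its aspect ratio. In addition, images are padded on the bottom and right edges to a multiple of 28 pixels.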
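To preview roughly what size an image ends up at, the resize and padding rules can be approximated as follows. This is a sketch under assumptions: it treats the token limit and the long-edge limit as two independent caps, uses the width * height / 750 estimate, and truncates to whole pixels. The server-side algorithm may differ in its rounding, and both function names are hypothetical.

```python
import math
from typing import Tuple

PIXELS_PER_TOKEN = 750  # from the token estimate in this guide


def fit_to_native_resolution(
    width: int, height: int, max_long_edge: int = 1568, max_tokens: int = 1568
) -> Tuple[int, int]:
    """Shrink (never enlarge) an image so it meets both limits, keeping its aspect ratio."""
    edge_scale = max_long_edge / max(width, height)
    token_scale = math.sqrt(max_tokens * PIXELS_PER_TOKEN / (width * height))
    scale = min(1.0, edge_scale, token_scale)
    # int() truncates, so the result never exceeds either limit.
    return max(1, int(width * scale)), max(1, int(height * scale))


def pad_to_multiple(size_px: int, multiple: int = 28) -> int:
    """Round a dimension up to the next multiple of 28 pixels (bottom/right padding)."""
    return math.ceil(size_px / multiple) * multiple
```

For Claude Opus 4.7, call `fit_to_native_resolution(w, h, max_long_edge=2576, max_tokens=4784)`.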
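When you ask Claude to output coordinates (points, bounding boxes, and so on), they are expressed relative to the scaled and padded image, so the client must rescale them using the original and resized dimensions.
To minimize latency and simplify coordinate-based workflows, resize images before uploading them.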
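The client-side conversion can be sketched as a simple proportional rescale. This assumes pixel coordinates with the origin at the top-left corner; because padding is added only on the bottom and right, the origin does not move, and the scale factor comes from the resized size before padding. The function names are hypothetical.

```python
from typing import Tuple

Size = Tuple[int, int]  # (width, height) in pixels


def to_original_point(x: float, y: float, original: Size, resized: Size) -> Tuple[float, float]:
    """Map a point on the resized image back onto the original image."""
    return x * original[0] / resized[0], y * original[1] / resized[1]


def to_original_box(
    box: Tuple[float, float, float, float], original: Size, resized: Size
) -> Tuple[float, float, float, float]:
    """Map an (x1, y1, x2, y2) bounding box back onto the original image."""
    x1, y1 = to_original_point(box[0], box[1], original, resized)
    x2, y2 = to_original_point(box[2], box[3], original, resized)
    return x1, y1, x2, y2
```

If you resize the image yourself so that it is already within the native resolution, the two sizes match and no conversion is needed.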
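Calculate image costs
Each image you include in a request counts toward your token usage. To estimate the cost, multiply the approximate image token count from the calculation above by the per-token price of the model you're using.
The following table shows approximate token counts and costs for different image sizes within the API's size constraints, based on the Claude Sonnet 4.6 price of $3 per million input tokens:
| Image size | # of tokens | Cost / image | Cost / 1k images |
|---|---|---|---|
| 200x200 px (0.04 megapixels) | ~54 | ~$0.00016 | ~$0.16 |
| 1000x1000 px (1 megapixel) | ~1334 | ~$0.004 | ~$4.00 |
| 1092x1092 px (1.19 megapixels) | ~1568 | ~$0.0047 | ~$4.70 |
| 1920x1080 px (2.07 megapixels) | ~1568 | ~$0.0047 | ~$4.70 |
| 2000x1500 px (3 megapixels) | ~1568 | ~$0.0047 | ~$4.70 |
Note that the last three images are downscaled before processing.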
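The rows in the table can be reproduced with a few lines of arithmetic. The sketch below is an approximation, not a billing tool: the function name is hypothetical, the token count is capped at the model's per-image maximum, and the separate long-edge limit is ignored, so very elongated images will be overestimated.

```python
import math


def estimate_image_cost_usd(
    width_px: int, height_px: int, usd_per_million_tokens: float, max_tokens: int = 1568
) -> float:
    """Approximate input cost of one image: capped token estimate times the per-token price."""
    tokens = min(math.ceil(width_px * height_px / 750), max_tokens)
    return tokens * usd_per_million_tokens / 1_000_000


# 1920x1080 at $3 per million input tokens, capped at 1568 tokens
print(round(estimate_image_cost_usd(1920, 1080, 3.0), 4))  # 0.0047
```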
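High-resolution image support on Claude Opus 4.7
Claude Opus 4.7 is the first Claude model to support high-resolution images. The maximum image resolution is 2576 pixels on the long edge, up from 1568 pixels on earlier models. This improves performance on vision-heavy workloads and is especially valuable for computer use, screenshot understanding, and document analysis.
High-resolution support on Claude Opus 4.7 is automatic and requires no beta header or client-side opt-in.
High-resolution images on Claude Opus 4.7 can use roughly 3x as many image tokens as on earlier models (up to 4784 tokens per image versus 1568). If you don't need the extra fidelity, downsample images before sending them to control token costs.
The following table shows the same image sizes tokenized for Claude Opus 4.7, based on its price of $5 per million input tokens:
| Image size | # of tokens | Cost / image | Cost / 1k images |
|---|---|---|---|
| 200x200 px (0.04 megapixels) | ~54 | ~$0.00027 | ~$0.27 |
| 1000x1000 px (1 megapixel) | ~1334 | ~$0.0067 | ~$6.70 |
| 1092x1092 px (1.19 megapixels) | ~1590 | ~$0.0080 | ~$8.00 |
| 1920x1080 px (2.07 megapixels) | ~2765 | ~$0.014 | ~$14.00 |
| 2000x1500 px (3 megapixels) | ~4000 | ~$0.020 | ~$20.00 |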
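Ensure image quality
When providing images to Claude, keep the following in mind for best results:
- Image format: Use a supported image format: JPEG, PNG, GIF, or WebP. Animation is not supported; only the first frame is used.
- Image clarity: Ensure images are clear and not too blurry or pixelated.
- Text: If the image contains important text, make sure it is legible and not too small. Avoid cropping out key visual context just to enlarge the text.
- Resizing: Images that are too large may be resized (see above), which can make text harder to read. Consider resizing the image yourself, cropping it, or both.
- Image compression: Compressing images with a lossy format such as JPEG or lossy WebP before sending can reduce latency by shrinking the request. However, it can introduce artifacts that hurt model performance, especially when compression is applied multiple times. For example, heavy JPEG compression can make text hard to read. Confirm that your compression settings suit the task by inspecting the actual images sent to the API.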
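One way to confirm that a file is in a supported format before sending it is to inspect its leading bytes rather than trusting the file extension. The sketch below checks the well-known signatures for the four formats; the function name is hypothetical, and it does not validate the rest of the file or detect animation.

```python
from typing import Optional


def detect_media_type(data: bytes) -> Optional[str]:
    """Return the media type for JPEG, PNG, GIF, or WebP bytes, or None if unrecognized."""
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None
```

The returned string can be used as the `media_type` of a base64 image source, which avoids mismatches such as labeling PNG data as `image/jpeg`.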
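Prompt examples
Many of the prompting techniques that work well for text-based interactions with Claude also apply to image-based prompts.
These examples show best-practice prompt structures involving images.
Just as placing long documents before your query improves results for text prompts, Claude works best when images come before text. Images placed after text or interleaved with text still perform well, but if your use case allows it, use an image-then-text structure.
About the prompt examples
The following examples show how to use Claude's vision capabilities in various programming languages and approaches. You can provide images to Claude in three ways:
- As a base64-encoded image in image content blocks
- As a URL reference to an image hosted online
- Using the Files API (upload once, use multiple times)
On Amazon Bedrock and Vertex AI, only base64-encoded sources are currently supported.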
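The three options differ only in the `source` object of the image content block. As a compact summary, the sketch below builds each variant as a plain dictionary in the shape used by the request examples in this guide. The helper names are hypothetical and are not part of the SDK.

```python
import base64
from typing import Any, Dict


def base64_image_block(data: bytes, media_type: str) -> Dict[str, Any]:
    """Image block that embeds the raw image bytes as base64."""
    encoded = base64.standard_b64encode(data).decode("utf-8")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": encoded},
    }


def url_image_block(url: str) -> Dict[str, Any]:
    """Image block that references an image hosted online."""
    return {"type": "image", "source": {"type": "url", "url": url}}


def file_image_block(file_id: str) -> Dict[str, Any]:
    """Image block that references an image previously uploaded through the Files API."""
    return {"type": "image", "source": {"type": "file", "file_id": file_id}}
```

Any of these blocks can be placed in the `content` list of a user message, followed by a text block with the question.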
The base64 example prompts use these variables:
# For URL-based images, you can use the URL directly in your JSON request
# For base64-encoded images, you need to first encode the image
# Example of how to encode an image to base64 in bash:
# tr -d '\n' removes the line wrapping that some base64 implementations add
BASE64_IMAGE_DATA=$(curl -s "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg" | base64 | tr -d '\n')
# The encoded data can now be used in your API calls
import base64
import httpx
# For base64-encoded images
image1_url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
image1_media_type = "image/jpeg"
image1_data = base64.standard_b64encode(httpx.get(image1_url).content).decode("utf-8")
image2_url = "https://upload.wikimedia.org/wikipedia/commons/b/b5/Iridescent.green.sweat.bee1.jpg"
image2_media_type = "image/jpeg"
image2_data = base64.standard_b64encode(httpx.get(image2_url).content).decode("utf-8")
# For URL-based images, you can use the URLs directly in your requests
import axios from "axios";
// For base64-encoded images
async function getBase64Image(url: string): Promise<string> {
const response = await axios.get(url, { responseType: "arraybuffer" });
return Buffer.from(response.data, "binary").toString("base64");
}
// Usage
async function prepareImages() {
const imageData = await getBase64Image(
"https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
);
// Now you can use imageData in your API calls
}
// For URL-based images, you can use the URLs directly in your requests
using System;
using System.Net.Http;
using System.Threading.Tasks;
// For base64-encoded images
async Task<string> DownloadAndEncodeImageAsync(string url)
{
using var client = new HttpClient();
var bytes = await client.GetByteArrayAsync(url);
return Convert.ToBase64String(bytes);
}
// Usage:
// var imageData = await DownloadAndEncodeImageAsync("https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg");
// For URL-based images, you can use the URLs directly in your requests
package main
import (
"encoding/base64"
"fmt"
"io"
"net/http"
)
func downloadAndEncodeImage(url string) (string, error) {
req, err := http.NewRequest("GET", url, nil)
if err != nil {
return "", err
}
req.Header.Set("User-Agent", "AnthropicDocsBot/1.0")
resp, err := http.DefaultClient.Do(req)
if err != nil {
return "", err
}
defer resp.Body.Close()
data, err := io.ReadAll(resp.Body)
if err != nil {
return "", err
}
return base64.StdEncoding.EncodeToString(data), nil
}
func main() {
imageData, err := downloadAndEncodeImage("https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg")
if err != nil {
panic(err)
}
fmt.Println(imageData[:50])
}
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.Base64;
public class ImageHandlingExample {
public static void main(String[] args) throws IOException, InterruptedException {
// For base64-encoded images
String image1Url =
"https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg";
String image1MediaType = "image/jpeg";
String image1Data = downloadAndEncodeImage(image1Url);
String image2Url =
"https://upload.wikimedia.org/wikipedia/commons/b/b5/Iridescent.green.sweat.bee1.jpg";
String image2MediaType = "image/jpeg";
String image2Data = downloadAndEncodeImage(image2Url);
// For URL-based images, you can use the URLs directly in your requests
}
private static String downloadAndEncodeImage(String imageUrl) throws IOException {
try (InputStream inputStream = new URL(imageUrl).openStream()) {
return Base64.getEncoder().encodeToString(inputStream.readAllBytes());
}
}
}
<?php
// For base64-encoded images
function downloadAndEncodeImage($url) {
$imageData = file_get_contents($url);
return base64_encode($imageData);
}
$image1Url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg";
$image1MediaType = "image/jpeg";
$image1Data = downloadAndEncodeImage($image1Url);
// For URL-based images, you can use the URLs directly in your requests
require "base64"
require "net/http"
require "uri"
# For base64-encoded images
def download_and_encode_image(url)
uri = URI.parse(url)
response = Net::HTTP.get_response(uri)
Base64.strict_encode64(response.body)
end
image1_url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
image1_media_type = "image/jpeg"
image1_data = download_and_encode_image(image1_url)
# For URL-based images, you can use the URLs directly in your requests
The following examples show how to include images in a Messages API request using base64-encoded images and URL references:
Base64-encoded image example
BASE64_IMAGE_DATA=$(curl -s "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg" | base64 | tr -d '\n')
curl https://api.anthropic.com/v1/messages \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "content-type: application/json" \
-d @- <<EOF
{
"model": "claude-opus-4-7",
"max_tokens": 1024,
"messages": [
{
"role": "user",
"content": [
{
"type": "image",
"source": {
"type": "base64",
"media_type": "image/jpeg",
"data": "$BASE64_IMAGE_DATA"
}
},
{
"type": "text",
"text": "Describe this image."
}
]
}
]
}
EOF
curl -sSo ./image.jpg \
https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg
ant messages create <<'YAML'
model: claude-opus-4-7
max_tokens: 1024
messages:
- role: user
content:
- type: image
source:
type: base64
media_type: image/jpeg
data: "@./image.jpg"
- type: text
text: Describe this image.
YAML
import anthropic
image1_data = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"
image1_media_type = "image/png"
client = anthropic.Anthropic()
message = client.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
messages=[
{
"role": "user",
"content": [
{
"type": "image",
"source": {
"type": "base64",
"media_type": image1_media_type,
"data": image1_data,
},
},
{"type": "text", "text": "Describe this image."},
],
}
],
)
print(message)
import Anthropic from "@anthropic-ai/sdk";
const anthropic = new Anthropic({
apiKey: process.env.ANTHROPIC_API_KEY
});
const message = await anthropic.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [
{
role: "user",
content: [
{
type: "image",
source: {
type: "base64",
media_type: "image/jpeg",
data: imageData // Base64-encoded image data as string
}
},
{
type: "text",
text: "Describe this image."
}
]
}
]
});
console.log(message);
using System.Collections.Generic;
using Anthropic;
using Anthropic.Models.Messages;
AnthropicClient client = new();
string imageData = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC";
var message = await client.Messages.Create(new MessageCreateParams
{
Model = Model.ClaudeOpus4_7,
MaxTokens = 1024,
Messages =
[
new()
{
Role = Role.User,
Content = new MessageParamContent(new List<ContentBlockParam>
{
new ContentBlockParam(new ImageBlockParam(
new ImageBlockParamSource(new Base64ImageSource()
{
Data = imageData,
MediaType = MediaType.ImagePng,
})
)),
new ContentBlockParam(new TextBlockParam("Describe this image.")),
}),
}
]
});
Console.WriteLine(message);
package main
import (
"context"
"fmt"
"log"
"github.com/anthropics/anthropic-sdk-go"
)
func main() {
client := anthropic.NewClient()
imageData := "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"
message, err := client.Messages.New(context.TODO(), anthropic.MessageNewParams{
Model: anthropic.ModelClaudeOpus4_7,
MaxTokens: 1024,
Messages: []anthropic.MessageParam{
anthropic.NewUserMessage(
anthropic.NewImageBlockBase64("image/png", imageData),
anthropic.NewTextBlock("Describe this image."),
),
},
})
if err != nil {
log.Fatal(err)
}
fmt.Println(message)
}
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.messages.*;
import java.util.List;
public class VisionExample {
public static void main(String[] args) {
AnthropicClient client = AnthropicOkHttpClient.fromEnv();
String imageData = ""; // Base64-encoded image data as string
List<ContentBlockParam> contentBlockParams = List.of(
ContentBlockParam.ofImage(
ImageBlockParam.builder()
.source(
Base64ImageSource.builder()
.mediaType(Base64ImageSource.MediaType.IMAGE_JPEG)
.data(imageData)
.build()
)
.build()
),
ContentBlockParam.ofText(TextBlockParam.builder().text("Describe this image.").build())
);
Message message = client
.messages()
.create(
MessageCreateParams.builder()
.model(Model.CLAUDE_OPUS_4_7)
.maxTokens(1024)
.addUserMessageOfBlockParams(contentBlockParams)
.build()
);
System.out.println(message);
}
}
<?php
use Anthropic\Client;
$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));
$imageData = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC";
$message = $client->messages->create(
maxTokens: 1024,
messages: [
[
'role' => 'user',
'content' => [
[
'type' => 'image',
'source' => [
'type' => 'base64',
'media_type' => 'image/png',
'data' => $imageData,
],
],
['type' => 'text', 'text' => 'Describe this image.'],
],
],
],
model: 'claude-opus-4-7',
);
echo $message->content[0]->text;
require "anthropic"
client = Anthropic::Client.new
image_data = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"
message = client.messages.create(
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [
{
role: "user",
content: [
{
type: "image",
source: {
type: "base64",
media_type: "image/png",
data: image_data
}
},
{ type: "text", text: "Describe this image." }
]
}
]
)
puts message
URL-based image example
curl https://api.anthropic.com/v1/messages \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "content-type: application/json" \
-d '{
"model": "claude-opus-4-7",
"max_tokens": 1024,
"messages": [
{
"role": "user",
"content": [
{
"type": "image",
"source": {
"type": "url",
"url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
}
},
{
"type": "text",
"text": "Describe this image."
}
]
}
]
}'
ant messages create <<'YAML'
model: claude-opus-4-7
max_tokens: 1024
messages:
- role: user
content:
- type: image
source:
type: url
url: https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg
- type: text
text: Describe this image.
YAML
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
messages=[
{
"role": "user",
"content": [
{
"type": "image",
"source": {
"type": "url",
"url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg",
},
},
{"type": "text", "text": "Describe this image."},
],
}
],
)
print(message)
import Anthropic from "@anthropic-ai/sdk";
const anthropic = new Anthropic({
apiKey: process.env.ANTHROPIC_API_KEY
});
const message = await anthropic.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [
{
role: "user",
content: [
{
type: "image",
source: {
type: "url",
url: "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
}
},
{
type: "text",
text: "Describe this image."
}
]
}
]
});
console.log(message);
using System.Collections.Generic;
using Anthropic;
using Anthropic.Models.Messages;
AnthropicClient client = new();
var message = await client.Messages.Create(new MessageCreateParams
{
Model = Model.ClaudeOpus4_7,
MaxTokens = 1024,
Messages =
[
new()
{
Role = Role.User,
Content = new MessageParamContent(new List<ContentBlockParam>
{
new ContentBlockParam(new ImageBlockParam(
new ImageBlockParamSource(new UrlImageSource()
{
Url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg",
})
)),
new ContentBlockParam(new TextBlockParam("Describe this image.")),
}),
}
]
});
Console.WriteLine(message);
package main
import (
"context"
"fmt"
"log"
"github.com/anthropics/anthropic-sdk-go"
)
func main() {
client := anthropic.NewClient()
message, err := client.Messages.New(context.TODO(), anthropic.MessageNewParams{
Model: anthropic.ModelClaudeOpus4_7,
MaxTokens: 1024,
Messages: []anthropic.MessageParam{
anthropic.NewUserMessage(
anthropic.NewImageBlock(anthropic.URLImageSourceParam{
URL: "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg",
}),
anthropic.NewTextBlock("Describe this image."),
),
},
})
if err != nil {
log.Fatal(err)
}
fmt.Println(message)
}
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.messages.*;
import java.io.IOException;
import java.util.List;
public class VisionExample {
public static void main(String[] args) throws IOException, InterruptedException {
AnthropicClient client = AnthropicOkHttpClient.fromEnv();
List<ContentBlockParam> contentBlockParams = List.of(
ContentBlockParam.ofImage(
ImageBlockParam.builder()
.source(
UrlImageSource.builder()
.url(
"https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
)
.build()
)
.build()
),
ContentBlockParam.ofText(TextBlockParam.builder().text("Describe this image.").build())
);
Message message = client
.messages()
.create(
MessageCreateParams.builder()
.model(Model.CLAUDE_OPUS_4_7)
.maxTokens(1024)
.addUserMessageOfBlockParams(contentBlockParams)
.build()
);
System.out.println(message);
}
}
<?php
use Anthropic\Client;
$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));
$message = $client->messages->create(
maxTokens: 1024,
messages: [
[
'role' => 'user',
'content' => [
[
'type' => 'image',
'source' => [
'type' => 'url',
'url' => 'https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg',
],
],
['type' => 'text', 'text' => 'Describe this image.'],
],
],
],
model: 'claude-opus-4-7',
);
echo $message->content[0]->text;
require "anthropic"
client = Anthropic::Client.new
message = client.messages.create(
model: "claude-opus-4-7",
max_tokens: 1024,
messages: [
{
role: "user",
content: [
{
type: "image",
source: {
type: "url",
url: "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
}
},
{ type: "text", text: "Describe this image." }
]
}
]
)
puts message
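Files API image example
For images you use repeatedly, or when you want to avoid encoding overhead, use the Files API. Upload the image once, then reference the returned file_id in subsequent messages instead of resending base64 data.
In multi-turn conversations and agentic workflows, every request resends the full conversation history. If images are base64-encoded, the full image bytes are included in the payload on every turn, which can significantly increase request size and latency as the conversation grows. Uploading images to the Files API and referencing them by file_id keeps request payloads small no matter how many images accumulate in the conversation history.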
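A rough back-of-the-envelope calculation shows how quickly this adds up. The sketch below assumes a conversation that adds one equally sized image per turn and resends the full history each time; the function names and the 32-character reference size are illustrative assumptions, not measured values.

```python
import math


def base64_length(num_bytes: int) -> int:
    """Length in characters of the padded base64 encoding of num_bytes bytes."""
    return 4 * math.ceil(num_bytes / 3)


def total_chars_resent(per_image_chars: int, turns: int) -> int:
    """Total image-related characters sent over a conversation that adds one image per turn.

    Turn t resends all t images seen so far, so the total is per_image * (1 + 2 + ... + turns).
    """
    return per_image_chars * turns * (turns + 1) // 2


inline = total_chars_resent(base64_length(300_000), turns=10)  # 300 kB image sent inline
by_reference = total_chars_resent(32, turns=10)                # assumed ~32-char file_id
print(inline, by_reference)  # 22000000 1760
```

Under these assumptions, ten turns with inline base64 images send about 22 million characters of image data, while file_id references stay under 2,000 characters.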
cd "$(mktemp -d)"
curl -sSo image.jpg https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg
# First, upload your image to the Files API
curl -X POST https://api.anthropic.com/v1/files \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "anthropic-beta: files-api-2025-04-14" \
-F "file=@image.jpg"
# Then use the returned file_id in your message
curl https://api.anthropic.com/v1/messages \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "anthropic-beta: files-api-2025-04-14" \
-H "content-type: application/json" \
-d '{
"model": "claude-opus-4-7",
"max_tokens": 1024,
"messages": [
{
"role": "user",
"content": [
{
"type": "image",
"source": {
"type": "file",
"file_id": "file_abc123"
}
},
{
"type": "text",
"text": "Describe this image."
}
]
}
]
}'
cd "$(mktemp -d)"
curl -sSo image.jpg \
https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg
# First, upload your image to the Files API
FILE_ID=$(ant beta:files upload \
--file ./image.jpg \
--transform id --raw-output)
# Then use the returned file_id in your message
ant beta:messages create \
--beta files-api-2025-04-14 \
--transform content --format yaml <<YAML
model: claude-opus-4-7
max_tokens: 1024
messages:
- role: user
content:
- type: image
source:
type: file
file_id: $FILE_ID
- type: text
text: Describe this image.
YAML
import anthropic
client = anthropic.Anthropic()
# Upload the image file
with open("image.jpg", "rb") as f:
file_upload = client.beta.files.upload(file=("image.jpg", f, "image/jpeg"))
# Use the uploaded file in a message
message = client.beta.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
betas=["files-api-2025-04-14"],
messages=[
{
"role": "user",
"content": [
{
"type": "image",
"source": {"type": "file", "file_id": file_upload.id},
},
{"type": "text", "text": "Describe this image."},
],
}
],
)
print(message.content)
import Anthropic, { toFile } from "@anthropic-ai/sdk";
import fs from "fs";
const anthropic = new Anthropic();
// Upload the image file
const fileUpload = await anthropic.beta.files.upload({
file: await toFile(fs.createReadStream("image.jpg"), undefined, { type: "image/jpeg" })
});
// Use the uploaded file in a message
const response = await anthropic.beta.messages.create({
model: "claude-opus-4-7",
max_tokens: 1024,
betas: ["files-api-2025-04-14"],
messages: [
{
role: "user",
content: [
{
type: "image",
source: {
type: "file",
file_id: fileUpload.id
}
},
{
type: "text",
text: "Describe this image."
}
]
}
]
});
console.log(response);
using Anthropic;
var client = new AnthropicClient();
// Upload the image file
var fileUpload = await client.Beta.Files.Upload(
new FileUploadParams { File = File.OpenRead("image.jpg") });
// Use the uploaded file in a message
var response = await client.Beta.Messages.Create(
new MessageCreateParams
{
Model = "claude-opus-4-7",
MaxTokens = 1024,
Betas = new[] { "files-api-2025-04-14" },
Messages = new[]
{
new BetaMessageParam
{
Role = "user",
Content = new object[]
{
new
{
type = "image",
source = new { type = "file", file_id = fileUpload.Id }
},
new { type = "text", text = "Describe this image." }
}
}
}
});
Console.WriteLine(response);
package main
import (
"context"
"fmt"
"log"
"os"
"github.com/anthropics/anthropic-sdk-go"
)
func main() {
client := anthropic.NewClient()
// Upload the image file
file, err := os.Open("image.jpg")
if err != nil {
log.Fatal(err)
}
defer file.Close()
fileUpload, err := client.Beta.Files.Upload(context.Background(),
anthropic.BetaFileUploadParams{
File: file,
})
if err != nil {
log.Fatal(err)
}
// Use the uploaded file in a message
message, err := client.Beta.Messages.New(context.Background(),
anthropic.BetaMessageNewParams{
Model: anthropic.ModelClaudeOpus4_7,
MaxTokens: 1024,
Betas: []anthropic.AnthropicBeta{anthropic.AnthropicBetaFilesAPI2025_04_14},
Messages: []anthropic.BetaMessageParam{
anthropic.NewBetaUserMessage(
anthropic.NewBetaImageBlock(anthropic.BetaFileImageSourceParam{
FileID: fileUpload.ID,
}),
anthropic.NewBetaTextBlock("Describe this image."),
),
},
})
if err != nil {
log.Fatal(err)
}
fmt.Println(message.Content)
}
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.beta.files.FileMetadata;
import com.anthropic.models.beta.files.FileUploadParams;
import com.anthropic.models.messages.*;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
public class ImageFilesExample {
public static void main(String[] args) throws IOException {
AnthropicClient client = AnthropicOkHttpClient.fromEnv();
// Upload the image file
FileMetadata file = client
.beta()
.files()
.upload(
FileUploadParams.builder().file(Files.newInputStream(Path.of("image.jpg"))).build()
);
// Use the uploaded file in a message
ImageBlockParam imageParam = ImageBlockParam.builder().fileSource(file.id()).build();
MessageCreateParams params = MessageCreateParams.builder()
.model(Model.CLAUDE_OPUS_4_7)
.maxTokens(1024)
.addUserMessageOfBlockParams(
List.of(
ContentBlockParam.ofImage(imageParam),
ContentBlockParam.ofText(
TextBlockParam.builder().text("Describe this image.").build()
)
)
)
.build();
Message message = client.messages().create(params);
System.out.println(message.content());
}
}
<?php
use Anthropic\Client;
$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));
// Upload the image file
$fileUpload = $client->beta->files->upload(
file: fopen('image.jpg', 'r'),
);
// Use the uploaded file in a message
$message = $client->beta->messages->create(
maxTokens: 1024,
messages: [
[
'role' => 'user',
'content' => [
[
'type' => 'image',
'source' => ['type' => 'file', 'file_id' => $fileUpload->id],
],
['type' => 'text', 'text' => 'Describe this image.'],
],
],
],
model: 'claude-opus-4-7',
betas: ['files-api-2025-04-14'],
);
echo $message->content[0]->text;
require "anthropic"
client = Anthropic::Client.new
# Upload the image file
file_upload = client.beta.files.upload(
file: File.open("image.jpg", "rb")
)
# Use the uploaded file in a message
message = client.beta.messages.create(
model: "claude-opus-4-7",
max_tokens: 1024,
betas: ["files-api-2025-04-14"],
messages: [
{
role: "user",
content: [
{
type: "image",
source: { type: "file", file_id: file_upload.id }
},
{ type: "text", text: "Describe this image." }
]
}
]
)
puts message.content
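See the Messages API examples for more example code and parameter details.
Example: One image
It's best to place images earlier in the prompt than questions about them or instructions for tasks that use them.
Ask Claude to describe one image.
| Role | Content |
|---|---|
| User | [Image] Describe this image. |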
import anthropic
client = anthropic.Anthropic()
image1_data = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"
image1_media_type = "image/png"
message = client.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
messages=[
{
"role": "user",
"content": [
{
"type": "image",
"source": {
"type": "base64",
"media_type": image1_media_type,
"data": image1_data,
},
},
{"type": "text", "text": "Describe this image."},
],
}
],
)
message = client.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
messages=[
{
"role": "user",
"content": [
{
"type": "image",
"source": {
"type": "url",
"url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg",
},
},
{"type": "text", "text": "Describe this image."},
],
}
],
)
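Example: Multiple images
When there are multiple images, introduce each one with Image 1:, Image 2:, and so on. You don't need newlines between images or between images and the prompt.
Ask Claude to describe the differences between multiple images.
| Role | Content |
|---|---|
| User | Image 1: [Image 1] Image 2: [Image 2] How are these images different? |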
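The labeling pattern generalizes to any number of images. The helper below is a hypothetical sketch (not an SDK function) that interleaves "Image N:" text blocks with image blocks and appends the question last, so the images come before the text that asks about them.

```python
from typing import Any, Dict, List


def labeled_image_content(image_sources: List[Dict[str, Any]], question: str) -> List[Dict[str, Any]]:
    """Build a user-message content list: "Image N:" label, image, ..., then the question."""
    content: List[Dict[str, Any]] = []
    for index, source in enumerate(image_sources, start=1):
        content.append({"type": "text", "text": f"Image {index}:"})
        content.append({"type": "image", "source": source})
    content.append({"type": "text", "text": question})
    return content
```

Each entry in `image_sources` is a source object such as `{"type": "url", "url": "..."}` or a base64 source, and the returned list can be passed as the `content` of a user message.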
import anthropic
client = anthropic.Anthropic()
image1_data = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"
image1_media_type = "image/png"
image2_data = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"
image2_media_type = "image/png"
message = client.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
messages=[
{
"role": "user",
"content": [
{"type": "text", "text": "Image 1:"},
{
"type": "image",
"source": {
"type": "base64",
"media_type": image1_media_type,
"data": image1_data,
},
},
{"type": "text", "text": "Image 2:"},
{
"type": "image",
"source": {
"type": "base64",
"media_type": image2_media_type,
"data": image2_data,
},
},
{"type": "text", "text": "How are these images different?"},
],
}
],
)
message = client.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
messages=[
{
"role": "user",
"content": [
{"type": "text", "text": "Image 1:"},
{
"type": "image",
"source": {
"type": "url",
"url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg",
},
},
{"type": "text", "text": "Image 2:"},
{
"type": "image",
"source": {
"type": "url",
"url": "https://upload.wikimedia.org/wikipedia/commons/b/b5/Iridescent.green.sweat.bee1.jpg",
},
},
{"type": "text", "text": "How are these images different?"},
],
}
],
)
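Example: Multiple images with a system prompt
Ask Claude to describe the differences between multiple images, while giving it a system prompt for how to respond.
| Role | Content |
|---|---|
| System | Respond only in Spanish. |
| User | Image 1: [Image 1] Image 2: [Image 2] How are these images different? |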
import anthropic
client = anthropic.Anthropic()
image1_data = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"
image1_media_type = "image/png"
image2_data = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"
image2_media_type = "image/png"
message = client.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
system="Respond only in Spanish.",
messages=[
{
"role": "user",
"content": [
{"type": "text", "text": "Image 1:"},
{
"type": "image",
"source": {
"type": "base64",
"media_type": image1_media_type,
"data": image1_data,
},
},
{"type": "text", "text": "Image 2:"},
{
"type": "image",
"source": {
"type": "base64",
"media_type": image2_media_type,
"data": image2_data,
},
},
{"type": "text", "text": "How are these images different?"},
],
}
],
)
message = client.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
system="Respond only in Spanish.",
messages=[
{
"role": "user",
"content": [
{"type": "text", "text": "Image 1:"},
{
"type": "image",
"source": {
"type": "url",
"url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg",
},
},
{"type": "text", "text": "Image 2:"},
{
"type": "image",
"source": {
"type": "url",
"url": "https://upload.wikimedia.org/wikipedia/commons/b/b5/Iridescent.green.sweat.bee1.jpg",
},
},
{"type": "text", "text": "How are these images different?"},
],
}
],
)
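Example: Four images across two conversation turns
Claude's vision capabilities work well in multimodal conversations that mix images and text. You can have extended back-and-forth exchanges with Claude, adding new images or follow-up questions at any point. This enables workflows for iterative image analysis, comparison, or combining visuals with other knowledge.
Ask Claude to contrast two images, then ask a follow-up question comparing the first images to two new images.
| Role | Content |
|---|---|
| User | Image 1: [Image 1] Image 2: [Image 2] How are these images different? |
| Assistant | [Claude's response] |
| User | Image 1: [Image 3] Image 2: [Image 4] Are these images similar to the first two? |
| Assistant | [Claude's response] |
When using the API, insert new images into the array of messages in the user role as part of any standard multi-turn conversation structure.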
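The conversation in the table can be expressed as a `messages` list like the one sketched below. The helper names are hypothetical, the image URLs are supplied by the caller, and the assistant turn is whatever text Claude returned for the first request. The sketch only builds the data structure and makes no API call.

```python
from typing import Any, Dict, List, Tuple


def url_image(url: str) -> Dict[str, Any]:
    """Image content block that references a hosted image."""
    return {"type": "image", "source": {"type": "url", "url": url}}


def image_pair_turn(url_a: str, url_b: str, question: str) -> Dict[str, Any]:
    """One user turn: two labeled images followed by a question."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": "Image 1:"},
            url_image(url_a),
            {"type": "text", "text": "Image 2:"},
            url_image(url_b),
            {"type": "text", "text": question},
        ],
    }


def two_turn_messages(
    first_pair: Tuple[str, str], first_reply: str, second_pair: Tuple[str, str]
) -> List[Dict[str, Any]]:
    """Messages for the second request: first user turn, Claude's reply, then the follow-up."""
    return [
        image_pair_turn(first_pair[0], first_pair[1], "How are these images different?"),
        {"role": "assistant", "content": first_reply},
        image_pair_turn(second_pair[0], second_pair[1], "Are these images similar to the first two?"),
    ]
```

The resulting list can be passed as `messages` to `client.messages.create(...)` in the same way as the single-turn examples.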
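Limitations
While Claude's image understanding capabilities are cutting-edge, there are some limitations to be aware of:
- People identification: Claude cannot be used to identify people in images and will refuse to do so.
- Accuracy: Claude may hallucinate or make mistakes when interpreting low-quality, rotated, or very small images (under 200 pixels).
- Spatial reasoning: Claude's spatial reasoning abilities are limited. It may struggle with tasks that require precise localization or layout, such as reading an analog clock face or describing the exact positions of chess pieces.
- Counting: Claude can give approximate counts of objects in an image but may not always be exact, especially with large numbers of small objects.
- AI-generated images: Claude does not know whether an image is AI-generated and may be incorrect if asked. Do not rely on it to detect fake or synthetic images.
- Inappropriate content: Claude does not process inappropriate or explicit images that violate the Acceptable Use Policy.
- Healthcare applications: While Claude can analyze general medical images, it is not designed to interpret complex diagnostic scans such as CTs or MRIs. Claude's outputs should not be considered a substitute for professional medical advice or diagnosis.
Always carefully review and verify Claude's image interpretations, especially for high-stakes use cases. Do not use Claude for tasks requiring perfect precision or sensitive image analysis without human oversight.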
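FAQ
What image file types does Claude support?
Claude currently supports the JPEG, PNG, GIF, and WebP image formats, specifically:
- image/jpeg
- image/png
- image/gif
- image/webp
Can Claude read image URLs?
Yes, Claude can process images from URLs through URL image source blocks in the API. Use the "url" source type instead of "base64" in your API requests. Example: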
{
"type": "image",
"source": {
"type": "url",
"url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
}
}
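Is there a limit to the image file size I can upload?
Yes, there are limits:
- API: Maximum 5 MB per image
- claude.ai: Maximum 10 MB per image
Images larger than these limits are rejected and return an error when using the API.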
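A client-side size check can catch oversized images before the request is sent. The sketch below works out the decoded size from a base64 string without decoding it. Two assumptions are labeled in the code: "5 MB" is read as 5 * 1024 * 1024 bytes, and the limit is applied to the decoded image bytes rather than the encoded string. The function names are hypothetical.

```python
API_MAX_IMAGE_BYTES = 5 * 1024 * 1024  # assumption: "5 MB" interpreted as 5 MiB


def decoded_size(b64_data: str) -> int:
    """Size in bytes of the data encoded by a padded base64 string with no line breaks."""
    return len(b64_data) * 3 // 4 - b64_data[-2:].count("=")


def fits_api_limit(b64_data: str, max_bytes: int = API_MAX_IMAGE_BYTES) -> bool:
    """True if the decoded image is within the assumed per-image API size limit."""
    return decoded_size(b64_data) <= max_bytes
```

If the check fails, resizing or recompressing the image before encoding is usually simpler than handling the API error.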
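How many images can I include in one request?
The image limits are:
- Messages API: Up to 600 images per request (100 for models with a 200k-token context window)
- claude.ai: Up to 20 images per turn
Requests exceeding these limits are rejected and return an error. Requests with many large images can also fail before reaching these limits; see General limits for details.
Does Claude read image metadata?
No, Claude does not parse or receive any metadata from images passed to it.
Can I delete images I've uploaded?
No. Image uploads are ephemeral and are not stored beyond the duration of the API request. Uploaded images are automatically deleted after they have been processed.
Where can I find details on data privacy for image uploads?
Refer to the Anthropic privacy policy page for information on how uploaded images and other data are handled. Anthropic does not use uploaded images to train models.
What if Claude's image interpretation seems wrong?
If Claude's image interpretation seems incorrect:
- Ensure the image is clear, high-quality, and correctly oriented.
- Try prompt engineering techniques to improve results.
- If the issue persists, flag the output in claude.ai (thumbs up/down) or contact the support team.
Your feedback helps improve Claude!
Can Claude generate or edit images?
No, Claude is an image understanding model only. It can interpret and analyze images, but it cannot generate, produce, edit, manipulate, or create images.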
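Dive deeper into vision
Ready to start building with images using Claude? Here are a few helpful resources:
- Multimodal cookbook: This cookbook has tips on getting started with images and best-practice techniques for getting the highest-quality performance with images. See how to prompt Claude effectively with images to carry out tasks such as interpreting and analyzing charts or extracting content from forms.
- API reference: Documentation for the Messages API, including example API calls involving images.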