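Vision

Claude's vision capabilities allow it to understand and analyze images, opening up new possibilities for multimodal interaction.


This guide describes how to work with images in Claude, including best practices, code examples, and limitations to keep in mind.


How to use vision

Use Claude's vision capabilities through:

  • claude.ai. Upload an image as you would a file, or drag and drop an image directly into the chat window.
  • Console Workbench. A button to add images appears at the top right of every User message block.
  • API request. See the examples in this guide.

You can include multiple images in a single request, and Claude analyzes them jointly when generating its response. This is helpful for comparing or contrasting images.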


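Before you upload

General limits

The maximum number of images per message or per request is:

  • 20 per message on claude.ai.
  • 100 per request on the API (for models with a 200k-token context window).
  • 600 per request on the API (for all other models).

The maximum size of each image is 8000x8000 pixels. If you submit more than 20 images in one API request, this limit drops to 2000x2000 pixels.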
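
The dimension limits above can be checked on the client before sending a request. The following is a minimal sketch, not an official validator: the function name `oversized_images` and the constants are hypothetical, and the thresholds are copied from the limits stated above.

```python
from typing import List, Tuple

MAX_DIM_DEFAULT = 8000       # px per side, from the limits above
MAX_DIM_MANY_IMAGES = 2000   # px per side when a request has more than 20 images
MANY_IMAGES_THRESHOLD = 20


def oversized_images(sizes: List[Tuple[int, int]]) -> List[int]:
    """Return the indexes of images whose width or height exceeds the limit.

    `sizes` holds one (width, height) pair per image in the request.
    """
    many = len(sizes) > MANY_IMAGES_THRESHOLD
    limit = MAX_DIM_MANY_IMAGES if many else MAX_DIM_DEFAULT
    return [i for i, (w, h) in enumerate(sizes) if w > limit or h > limit]


# Two images: the 8000 px limit applies, so only the second is too large.
print(oversized_images([(4000, 3000), (9000, 100)]))  # [1]

# Twenty-one images: the 2000 px limit applies to every image.
print(oversized_images([(4000, 3000)] * 21)[:3])      # [0, 1, 2]
```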

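Note

Although the API supports up to 600 images per request, you may hit the request size limit first (32 MB on standard endpoints; lower on some partner-operated platforms such as Amazon Bedrock and Vertex AI). For large numbers of images, consider uploading them with the Files API and referencing them by file_id to keep request payloads small.

Even with the Files API, requests containing many large images can fail before reaching the 600-image limit. Reduce image dimensions or file size (for example, by downsampling) before uploading. See Evaluate image size.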
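
To see how quickly base64 images approach the request size limit, you can estimate the encoded size up front. Base64 turns every 3 raw bytes into 4 characters, so the payload grows by about a third. The sketch below is illustrative only: the helper names are hypothetical, and treating 32 MB as 32 * 1024 * 1024 bytes is an assumption, since the note above does not say how the limit is measured.

```python
import math
from typing import List

# Assumption: 32 MB interpreted as 32 * 1024 * 1024 bytes.
REQUEST_LIMIT_BYTES = 32 * 1024 * 1024


def base64_size(raw_bytes: int) -> int:
    """Length of the padded base64 encoding of `raw_bytes` bytes."""
    return 4 * math.ceil(raw_bytes / 3)


def fits_in_request(image_sizes: List[int], overhead_bytes: int = 0) -> bool:
    """Rough check that base64-encoded images fit under the request limit.

    `image_sizes` are raw file sizes in bytes; `overhead_bytes` is a
    hypothetical allowance for the rest of the JSON body.
    """
    total = sum(base64_size(n) for n in image_sizes) + overhead_bytes
    return total <= REQUEST_LIMIT_BYTES


print(base64_size(300_000))                # 400000
print(fits_in_request([300_000] * 50))     # True  (about 20 MB encoded)
print(fits_in_request([300_000] * 100))    # False (about 40 MB encoded)
```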

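Evaluate image size

An image uses approximately width * height / 750 tokens, where width and height are in pixels.

The maximum native image resolution is:

  • Claude Opus 4.7: 4784 tokens, at most 2576 pixels on the long edge.
  • Other models: 1568 tokens, at most 1568 pixels on the long edge.

If your input image is larger than this native resolution, it is first scaled down to the largest size that fits, preserving aspect ratio. The image is then padded on the bottom and right edges to a multiple of 28 pixels.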
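
The formula and caps above can be combined into a rough client-side estimate. This is a sketch under stated assumptions, not the service's actual resizing algorithm: the function name is hypothetical, rounding up is an assumption, and the padding to multiples of 28 pixels is ignored.

```python
import math


def estimate_image_tokens(width: int, height: int,
                          max_tokens: int = 1568,
                          max_long_edge: int = 1568) -> int:
    """Approximate image tokens as width * height / 750, capped per model.

    Defaults use the limits given above for models other than
    Claude Opus 4.7; pass max_tokens=4784 and max_long_edge=2576 for
    Claude Opus 4.7.
    """
    w, h = float(width), float(height)
    long_edge = max(w, h)
    if long_edge > max_long_edge:
        # Scale down so the long edge fits, preserving aspect ratio.
        scale = max_long_edge / long_edge
        w, h = w * scale, h * scale
    return min(math.ceil(w * h / 750), max_tokens)


print(estimate_image_tokens(200, 200))      # 54
print(estimate_image_tokens(1000, 1000))    # 1334
print(estimate_image_tokens(1920, 1080))    # 1568 (capped)
print(estimate_image_tokens(1920, 1080, max_tokens=4784, max_long_edge=2576))  # 2765
```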

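Note

When you ask Claude to output coordinates (points, bounding boxes, and so on), they are expressed relative to the resized and padded image. Rescale or transform them on the client side using the original and resized dimensions.

To minimize latency and simplify coordinate-based workflows, resize images before uploading them.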
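
A minimal sketch of that client-side conversion follows. It assumes you know the resized (pre-padding) dimensions, for example because you resized the image yourself before uploading. Because padding is added only on the bottom and right, the top-left origin does not move and a plain scale is enough. The function names are hypothetical.

```python
from typing import Tuple

Size = Tuple[int, int]  # (width, height) in pixels


def to_original_point(x: float, y: float,
                      resized: Size, original: Size) -> Tuple[float, float]:
    """Map a point from the resized image back to the original image."""
    rw, rh = resized
    ow, oh = original
    return x * ow / rw, y * oh / rh


def to_original_box(box: Tuple[float, float, float, float],
                    resized: Size, original: Size) -> Tuple[float, float, float, float]:
    """Map an (x1, y1, x2, y2) bounding box back to the original image."""
    x1, y1 = to_original_point(box[0], box[1], resized, original)
    x2, y2 = to_original_point(box[2], box[3], resized, original)
    return x1, y1, x2, y2


# A 1920x1080 original that was resized to 1568x882 before upload.
print(to_original_point(784, 441, (1568, 882), (1920, 1080)))  # (960.0, 540.0)
```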

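Calculate image costs

Each image you include in a request counts toward your token usage. To calculate the approximate cost, multiply the approximate number of image tokens computed above by the per-token price of the model you're using.

The following examples show approximate tokenization and costs for different image sizes within the API's size constraints, based on the Claude Sonnet 4.6 price of $3 per million input tokens:

Image size                       # of tokens   Cost / image   Cost / 1k images
200x200 px (0.04 megapixels)     ~54           ~$0.00016      ~$0.16
1000x1000 px (1 megapixel)       ~1334         ~$0.004        ~$4.00
1092x1092 px (1.19 megapixels)   ~1568         ~$0.0047       ~$4.70
1920x1080 px (2.07 megapixels)   ~1568         ~$0.0047       ~$4.70
2000x1500 px (3 megapixels)      ~1568         ~$0.0047       ~$4.70

Note that the last three images are scaled down before processing.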
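
The cost arithmetic behind these figures is a single multiplication. The sketch below reproduces one row; the function name is hypothetical, and the price is the $3 per million input tokens quoted above, so substitute the current price for your model.

```python
def image_cost_usd(tokens: int, price_per_million_tokens: float) -> float:
    """Approximate input cost of one image, given its token count."""
    return tokens * price_per_million_tokens / 1_000_000


per_image = image_cost_usd(1568, 3.00)
print(round(per_image, 4))          # 0.0047
print(round(per_image * 1000, 2))   # 4.7 (cost per 1k images)
```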

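High-resolution image support on Claude Opus 4.7

Claude Opus 4.7 is the first Claude model to support high-resolution images. The maximum image resolution is 2576 pixels on the long edge, up from 1568 pixels on previous models. This improves performance on vision-heavy workloads and is especially valuable for computer use, screenshot understanding, and document analysis.

High-resolution support on Claude Opus 4.7 is automatic; no beta header or client-side opt-in is required.

High-resolution images on Claude Opus 4.7 can use roughly 3x more image tokens than on previous models (4784 versus 1568 tokens per image). If you don't need the extra fidelity, downsample images before sending them to control token costs.

The following table shows the same image sizes tokenized for Claude Opus 4.7, based on its price of $5 per million input tokens:

Image size                       # of tokens   Cost / image   Cost / 1k images
200x200 px (0.04 megapixels)     ~54           ~$0.00027      ~$0.27
1000x1000 px (1 megapixel)       ~1334         ~$0.0067       ~$6.70
1092x1092 px (1.19 megapixels)   ~1590         ~$0.0080       ~$8.00
1920x1080 px (2.07 megapixels)   ~2765         ~$0.014        ~$14.00
2000x1500 px (3 megapixels)      ~4000         ~$0.020        ~$20.00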
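
If you want to cap the token cost of an image on Claude Opus 4.7, one option is to pick target dimensions from a token budget before resizing. The sketch below only computes the dimensions, using the width * height / 750 approximation from this guide; the function name is hypothetical, and the actual resize would be done with an image library of your choice.

```python
import math
from typing import Tuple


def dimensions_for_token_budget(width: int, height: int,
                                token_budget: int) -> Tuple[int, int]:
    """Largest same-aspect-ratio size whose estimated tokens fit the budget."""
    tokens = width * height / 750
    if tokens <= token_budget:
        return width, height  # already within budget; leave unchanged
    # Tokens scale with area, so each side scales with the square root.
    scale = math.sqrt(token_budget / tokens)
    return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))


# A 2000x1500 image (~4000 tokens) reduced to fit a 1568-token budget.
w, h = dimensions_for_token_budget(2000, 1500, 1568)
print(w, h)  # 1252 939
```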

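Ensure image quality

When providing images to Claude, keep the following in mind for best results:

  • Image format: Use a supported image format: JPEG, PNG, GIF, or WebP.
    Animations are not supported; only the first frame is used.
  • Image clarity: Make sure images are clear and not too blurry or pixelated.
  • Text: If the image contains important text, make sure it is legible and not too small. Avoid cropping out key visual context just to enlarge the text.
  • Resizing: Images that are too large may be resized (see above), which can make text harder to read. Consider resizing the image in advance, cropping it, or both.
  • Image compression: Compressing images with a lossy format (such as JPEG or lossy WebP) before sending can lower latency by reducing request size. However, it can introduce artifacts that hurt model performance, especially when compression is applied more than once. For example, heavy JPEG compression can make text hard to read. Confirm that your compression settings suit the task by inspecting the actual images sent to the API.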
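
Because only four formats are supported, it can help to confirm a file's format before sending it instead of trusting its extension. The sketch below inspects the well-known file signatures of JPEG, PNG, GIF, and WebP; the function name is hypothetical and the check is a convenience, not part of the API.

```python
import base64
from typing import Optional


def detect_media_type(data: bytes) -> Optional[str]:
    """Return the media type for a supported image format, or None."""
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None


# The 1x1 PNG used as sample data elsewhere in this guide.
sample = base64.b64decode(
    "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"
)
print(detect_media_type(sample))  # image/png
```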

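Prompt examples

Many of the prompting techniques that work well for text-based interactions with Claude also apply to image-based prompts.

These examples demonstrate best-practice prompt structures involving images.

Tip

Just as placing long documents before the query improves results for text prompts, Claude works best when images come before text. Images placed after text or interleaved with text still perform well, but if your use case allows it, prefer an image-then-text structure.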

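About the prompt examples

The following examples demonstrate how to use Claude's vision capabilities with various programming languages and approaches. You can provide images to Claude in three ways:

  1. As a base64-encoded image in an image content block
  2. As a URL reference to an image hosted online
  3. Using the Files API (upload once, use multiple times)

Note

On Amazon Bedrock and Vertex AI, only base64-encoded sources are currently supported.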
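
The three options differ only in the source object inside the image content block. As a compact illustration, the hypothetical helpers below build each variant as a plain dictionary, using the field names that appear in the request examples in this guide. They are a sketch for orientation, not part of any SDK.

```python
def image_block_base64(media_type: str, data: str) -> dict:
    """Image block carrying base64-encoded bytes inline."""
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }


def image_block_url(url: str) -> dict:
    """Image block that references an image hosted online."""
    return {"type": "image", "source": {"type": "url", "url": url}}


def image_block_file(file_id: str) -> dict:
    """Image block that references a file uploaded through the Files API."""
    return {"type": "image", "source": {"type": "file", "file_id": file_id}}


# Image first, then text, as recommended in this guide.
content = [
    image_block_url(
        "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
    ),
    {"type": "text", "text": "Describe this image."},
]
print(content[0]["source"]["type"])  # url
```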

The base64 example prompts use these variables:

    # For URL-based images, you can use the URL directly in your JSON request

    # For base64-encoded images, you need to first encode the image
    # Example of how to encode an image to base64 in bash
    # (newlines are stripped so the value can be embedded in a JSON string):
    BASE64_IMAGE_DATA=$(curl -s "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg" | base64 | tr -d '\n')

    # The encoded data can now be used in your API calls
import base64
import httpx

# For base64-encoded images
image1_url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
image1_media_type = "image/jpeg"
image1_data = base64.standard_b64encode(httpx.get(image1_url).content).decode("utf-8")

image2_url = "https://upload.wikimedia.org/wikipedia/commons/b/b5/Iridescent.green.sweat.bee1.jpg"
image2_media_type = "image/jpeg"
image2_data = base64.standard_b64encode(httpx.get(image2_url).content).decode("utf-8")

# For URL-based images, you can use the URLs directly in your requests
import axios from "axios";

// For base64-encoded images
async function getBase64Image(url: string): Promise<string> {
  const response = await axios.get(url, { responseType: "arraybuffer" });
  return Buffer.from(response.data, "binary").toString("base64");
}

// Usage
async function prepareImages() {
  const imageData = await getBase64Image(
    "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
  );
  // Now you can use imageData in your API calls
}

// For URL-based images, you can use the URLs directly in your requests
using System;
using System.Net.Http;
using System.Threading.Tasks;

// For base64-encoded images
async Task<string> DownloadAndEncodeImageAsync(string url)
{
    using var client = new HttpClient();
    var bytes = await client.GetByteArrayAsync(url);
    return Convert.ToBase64String(bytes);
}

// Usage:
// var imageData = await DownloadAndEncodeImageAsync("https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg");
// For URL-based images, you can use the URLs directly in your requests
package main

import (
	"encoding/base64"
	"fmt"
	"io"
	"net/http"
)

func downloadAndEncodeImage(url string) (string, error) {
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("User-Agent", "AnthropicDocsBot/1.0")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}

	return base64.StdEncoding.EncodeToString(data), nil
}

func main() {
	imageData, err := downloadAndEncodeImage("https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg")
	if err != nil {
		panic(err)
	}
	fmt.Println(imageData[:50])
}
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.Base64;

public class ImageHandlingExample {

  public static void main(String[] args) throws IOException, InterruptedException {
    // For base64-encoded images
    String image1Url =
      "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg";
    String image1MediaType = "image/jpeg";
    String image1Data = downloadAndEncodeImage(image1Url);

    String image2Url =
      "https://upload.wikimedia.org/wikipedia/commons/b/b5/Iridescent.green.sweat.bee1.jpg";
    String image2MediaType = "image/jpeg";
    String image2Data = downloadAndEncodeImage(image2Url);

    // For URL-based images, you can use the URLs directly in your requests
  }

  private static String downloadAndEncodeImage(String imageUrl) throws IOException {
    try (InputStream inputStream = new URL(imageUrl).openStream()) {
      return Base64.getEncoder().encodeToString(inputStream.readAllBytes());
    }
  }
}
<?php
// For base64-encoded images
function downloadAndEncodeImage($url) {
    $imageData = file_get_contents($url);
    return base64_encode($imageData);
}

$image1Url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg";
$image1MediaType = "image/jpeg";
$image1Data = downloadAndEncodeImage($image1Url);

// For URL-based images, you can use the URLs directly in your requests
require "base64"
require "net/http"
require "uri"

# For base64-encoded images
def download_and_encode_image(url)
  uri = URI.parse(url)
  response = Net::HTTP.get_response(uri)
  Base64.strict_encode64(response.body)
end

image1_url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
image1_media_type = "image/jpeg"
image1_data = download_and_encode_image(image1_url)

# For URL-based images, you can use the URLs directly in your requests

The following examples show how to include images in a Messages API request using base64-encoded images and URL references:

Base64-encoded image example

BASE64_IMAGE_DATA=$(curl -s "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg" | base64 | tr -d '\n')

curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d @- <<EOF
{
  "model": "claude-opus-4-7",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "image",
          "source": {
            "type": "base64",
            "media_type": "image/jpeg",
            "data": "$BASE64_IMAGE_DATA"
          }
        },
        {
          "type": "text",
          "text": "Describe this image."
        }
      ]
    }
  ]
}
EOF
curl -sSo ./image.jpg \
  https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg

ant messages create <<'YAML'
model: claude-opus-4-7
max_tokens: 1024
messages:
  - role: user
    content:
      - type: image
        source:
          type: base64
          media_type: image/jpeg
          data: "@./image.jpg"
      - type: text
        text: Describe this image.
YAML
import anthropic

image1_data = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"
image1_media_type = "image/png"

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image1_media_type,
                        "data": image1_data,
                    },
                },
                {"type": "text", "text": "Describe this image."},
            ],
        }
    ],
)
print(message)
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY
});

const message = await anthropic.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: [
        {
          type: "image",
          source: {
            type: "base64",
            media_type: "image/jpeg",
            data: imageData // Base64-encoded image data as string
          }
        },
        {
          type: "text",
          text: "Describe this image."
        }
      ]
    }
  ]
});

console.log(message);
using System.Collections.Generic;
using Anthropic;
using Anthropic.Models.Messages;

AnthropicClient client = new();

string imageData = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC";

var message = await client.Messages.Create(new MessageCreateParams
{
    Model = Model.ClaudeOpus4_7,
    MaxTokens = 1024,
    Messages =
    [
        new()
        {
            Role = Role.User,
            Content = new MessageParamContent(new List<ContentBlockParam>
            {
                new ContentBlockParam(new ImageBlockParam(
                    new ImageBlockParamSource(new Base64ImageSource()
                    {
                        Data = imageData,
                        MediaType = MediaType.ImagePng,
                    })
                )),
                new ContentBlockParam(new TextBlockParam("Describe this image.")),
            }),
        }
    ]
});

Console.WriteLine(message);
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/anthropics/anthropic-sdk-go"
)

func main() {
	client := anthropic.NewClient()

	imageData := "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"

	message, err := client.Messages.New(context.TODO(), anthropic.MessageNewParams{
		Model:     anthropic.ModelClaudeOpus4_7,
		MaxTokens: 1024,
		Messages: []anthropic.MessageParam{
			anthropic.NewUserMessage(
				anthropic.NewImageBlockBase64("image/png", imageData),
				anthropic.NewTextBlock("Describe this image."),
			),
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(message)
}
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.messages.*;
import java.util.List;

public class VisionExample {

  public static void main(String[] args) {
    AnthropicClient client = AnthropicOkHttpClient.fromEnv();
    String imageData = ""; // Base64-encoded image data as string

    List<ContentBlockParam> contentBlockParams = List.of(
      ContentBlockParam.ofImage(
        ImageBlockParam.builder()
          .source(
            Base64ImageSource.builder()
              .mediaType(Base64ImageSource.MediaType.IMAGE_JPEG)
              .data(imageData)
              .build()
          )
          .build()
      ),
      ContentBlockParam.ofText(TextBlockParam.builder().text("Describe this image.").build())
    );
    Message message = client
      .messages()
      .create(
        MessageCreateParams.builder()
          .model(Model.CLAUDE_OPUS_4_7)
          .maxTokens(1024)
          .addUserMessageOfBlockParams(contentBlockParams)
          .build()
      );

    System.out.println(message);
  }
}
<?php

use Anthropic\Client;

$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));

$imageData = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC";

$message = $client->messages->create(
    maxTokens: 1024,
    messages: [
        [
            'role' => 'user',
            'content' => [
                [
                    'type' => 'image',
                    'source' => [
                        'type' => 'base64',
                        'media_type' => 'image/png',
                        'data' => $imageData,
                    ],
                ],
                ['type' => 'text', 'text' => 'Describe this image.'],
            ],
        ],
    ],
    model: 'claude-opus-4-7',
);

echo $message->content[0]->text;
require "anthropic"

client = Anthropic::Client.new

image_data = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"

message = client.messages.create(
  model: "claude-opus-4-7",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: [
        {
          type: "image",
          source: {
            type: "base64",
            media_type: "image/png",
            data: image_data
          }
        },
        { type: "text", text: "Describe this image." }
      ]
    }
  ]
)

puts message

URL-based image example

curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-opus-4-7",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image",
            "source": {
              "type": "url",
              "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
            }
          },
          {
            "type": "text",
            "text": "Describe this image."
          }
        ]
      }
    ]
  }'
ant messages create <<'YAML'
model: claude-opus-4-7
max_tokens: 1024
messages:
  - role: user
    content:
      - type: image
        source:
          type: url
          url: https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg
      - type: text
        text: Describe this image.
YAML
import anthropic

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg",
                    },
                },
                {"type": "text", "text": "Describe this image."},
            ],
        }
    ],
)
print(message)
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY
});

const message = await anthropic.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: [
        {
          type: "image",
          source: {
            type: "url",
            url: "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
          }
        },
        {
          type: "text",
          text: "Describe this image."
        }
      ]
    }
  ]
});

console.log(message);
using System.Collections.Generic;
using Anthropic;
using Anthropic.Models.Messages;

AnthropicClient client = new();

var message = await client.Messages.Create(new MessageCreateParams
{
    Model = Model.ClaudeOpus4_7,
    MaxTokens = 1024,
    Messages =
    [
        new()
        {
            Role = Role.User,
            Content = new MessageParamContent(new List<ContentBlockParam>
            {
                new ContentBlockParam(new ImageBlockParam(
                    new ImageBlockParamSource(new UrlImageSource()
                    {
                        Url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg",
                    })
                )),
                new ContentBlockParam(new TextBlockParam("Describe this image.")),
            }),
        }
    ]
});

Console.WriteLine(message);
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/anthropics/anthropic-sdk-go"
)

func main() {
	client := anthropic.NewClient()

	message, err := client.Messages.New(context.TODO(), anthropic.MessageNewParams{
		Model:     anthropic.ModelClaudeOpus4_7,
		MaxTokens: 1024,
		Messages: []anthropic.MessageParam{
			anthropic.NewUserMessage(
				anthropic.NewImageBlock(anthropic.URLImageSourceParam{
					URL: "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg",
				}),
				anthropic.NewTextBlock("Describe this image."),
			),
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(message)
}
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.messages.*;
import java.io.IOException;
import java.util.List;

public class VisionExample {

  public static void main(String[] args) throws IOException, InterruptedException {
    AnthropicClient client = AnthropicOkHttpClient.fromEnv();

    List<ContentBlockParam> contentBlockParams = List.of(
      ContentBlockParam.ofImage(
        ImageBlockParam.builder()
          .source(
            UrlImageSource.builder()
              .url(
                "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
              )
              .build()
          )
          .build()
      ),
      ContentBlockParam.ofText(TextBlockParam.builder().text("Describe this image.").build())
    );
    Message message = client
      .messages()
      .create(
        MessageCreateParams.builder()
          .model(Model.CLAUDE_OPUS_4_7)
          .maxTokens(1024)
          .addUserMessageOfBlockParams(contentBlockParams)
          .build()
      );
    System.out.println(message);
  }
}
<?php

use Anthropic\Client;

$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));

$message = $client->messages->create(
    maxTokens: 1024,
    messages: [
        [
            'role' => 'user',
            'content' => [
                [
                    'type' => 'image',
                    'source' => [
                        'type' => 'url',
                        'url' => 'https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg',
                    ],
                ],
                ['type' => 'text', 'text' => 'Describe this image.'],
            ],
        ],
    ],
    model: 'claude-opus-4-7',
);

echo $message->content[0]->text;
require "anthropic"

client = Anthropic::Client.new

message = client.messages.create(
  model: "claude-opus-4-7",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: [
        {
          type: "image",
          source: {
            type: "url",
            url: "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
          }
        },
        { type: "text", text: "Describe this image." }
      ]
    }
  ]
)

puts message

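Files API image example

For images you'll use repeatedly, or when you want to avoid encoding overhead, use the Files API. Upload the image once, then reference the returned file_id in subsequent messages instead of resending base64 data.

Tip

In multi-turn conversations and agentic workflows, every request resends the full conversation history. If images are base64-encoded, the full image bytes are included in the payload on every turn, which can significantly increase request size and latency as the conversation grows. Uploading images to the Files API and referencing them by file_id keeps request payloads small no matter how many images accumulate in the conversation history.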
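
To make the size difference concrete, the sketch below serializes a simplified conversation history twice, once with an inline base64 image in each user turn and once with a file reference, and compares the JSON sizes. It makes no API calls. The 300 KB of zero bytes stands in for a real image, assistant turns are omitted for brevity, and file_abc123 is a placeholder ID rather than a real file.

```python
import base64
import json


def user_turn(image_source: dict) -> dict:
    """One user message containing an image followed by a text prompt."""
    return {
        "role": "user",
        "content": [
            {"type": "image", "source": image_source},
            {"type": "text", "text": "Describe this image."},
        ],
    }


def history_size(image_source: dict, turns: int) -> int:
    """Size in characters of the JSON for `turns` image-bearing messages."""
    messages = [user_turn(image_source) for _ in range(turns)]
    return len(json.dumps({"messages": messages}))


fake_image = b"\x00" * 300_000  # stand-in for real image bytes
base64_source = {
    "type": "base64",
    "media_type": "image/jpeg",
    "data": base64.standard_b64encode(fake_image).decode("utf-8"),
}
file_source = {"type": "file", "file_id": "file_abc123"}  # placeholder ID

print(history_size(base64_source, 5))  # roughly 2 MB
print(history_size(file_source, 5))    # under 1 KB
```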

cd "$(mktemp -d)"
curl -sSo image.jpg https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg
# First, upload your image to the Files API
curl -X POST https://api.anthropic.com/v1/files \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: files-api-2025-04-14" \
  -F "file=@image.jpg"

# Then use the returned file_id in your message
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: files-api-2025-04-14" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-opus-4-7",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image",
            "source": {
              "type": "file",
              "file_id": "file_abc123"
            }
          },
          {
            "type": "text",
            "text": "Describe this image."
          }
        ]
      }
    ]
  }'
cd "$(mktemp -d)"
curl -sSo image.jpg \
  https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg

# First, upload your image to the Files API
FILE_ID=$(ant beta:files upload \
  --file ./image.jpg \
  --transform id --raw-output)

# Then use the returned file_id in your message
ant beta:messages create \
  --beta files-api-2025-04-14 \
  --transform content --format yaml <<YAML
model: claude-opus-4-7
max_tokens: 1024
messages:
  - role: user
    content:
      - type: image
        source:
          type: file
          file_id: $FILE_ID
      - type: text
        text: Describe this image.
YAML
import anthropic

client = anthropic.Anthropic()

# Upload the image file
with open("image.jpg", "rb") as f:
    file_upload = client.beta.files.upload(file=("image.jpg", f, "image/jpeg"))

# Use the uploaded file in a message
message = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    betas=["files-api-2025-04-14"],
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {"type": "file", "file_id": file_upload.id},
                },
                {"type": "text", "text": "Describe this image."},
            ],
        }
    ],
)

print(message.content)
import Anthropic, { toFile } from "@anthropic-ai/sdk";
import fs from "fs";

const anthropic = new Anthropic();

// Upload the image file
const fileUpload = await anthropic.beta.files.upload({
  file: await toFile(fs.createReadStream("image.jpg"), undefined, { type: "image/jpeg" })
});

// Use the uploaded file in a message
const response = await anthropic.beta.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 1024,
  betas: ["files-api-2025-04-14"],
  messages: [
    {
      role: "user",
      content: [
        {
          type: "image",
          source: {
            type: "file",
            file_id: fileUpload.id
          }
        },
        {
          type: "text",
          text: "Describe this image."
        }
      ]
    }
  ]
});

console.log(response);
using Anthropic;

var client = new AnthropicClient();

// Upload the image file
var fileUpload = await client.Beta.Files.Upload(
    new FileUploadParams { File = File.OpenRead("image.jpg") });

// Use the uploaded file in a message
var response = await client.Beta.Messages.Create(
    new MessageCreateParams
    {
        Model = "claude-opus-4-7",
        MaxTokens = 1024,
        Betas = new[] { "files-api-2025-04-14" },
        Messages = new[]
        {
            new BetaMessageParam
            {
                Role = "user",
                Content = new object[]
                {
                    new
                    {
                        type = "image",
                        source = new { type = "file", file_id = fileUpload.Id }
                    },
                    new { type = "text", text = "Describe this image." }
                }
            }
        }
    });

Console.WriteLine(response);
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	"github.com/anthropics/anthropic-sdk-go"
)

func main() {
	client := anthropic.NewClient()

	// Upload the image file
	file, err := os.Open("image.jpg")
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	fileUpload, err := client.Beta.Files.Upload(context.Background(),
		anthropic.BetaFileUploadParams{
			File: file,
		})
	if err != nil {
		log.Fatal(err)
	}

	// Use the uploaded file in a message
	message, err := client.Beta.Messages.New(context.Background(),
		anthropic.BetaMessageNewParams{
			Model:     anthropic.ModelClaudeOpus4_7,
			MaxTokens: 1024,
			Betas:     []anthropic.AnthropicBeta{anthropic.AnthropicBetaFilesAPI2025_04_14},
			Messages: []anthropic.BetaMessageParam{
				anthropic.NewBetaUserMessage(
					anthropic.NewBetaImageBlock(anthropic.BetaFileImageSourceParam{
						FileID: fileUpload.ID,
					}),
					anthropic.NewBetaTextBlock("Describe this image."),
				),
			},
		})
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(message.Content)
}
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.beta.files.FileMetadata;
import com.anthropic.models.beta.files.FileUploadParams;
import com.anthropic.models.messages.*;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class ImageFilesExample {

  public static void main(String[] args) throws IOException {
    AnthropicClient client = AnthropicOkHttpClient.fromEnv();

    // Upload the image file
    FileMetadata file = client
      .beta()
      .files()
      .upload(
        FileUploadParams.builder().file(Files.newInputStream(Path.of("image.jpg"))).build()
      );

    // Use the uploaded file in a message
    ImageBlockParam imageParam = ImageBlockParam.builder().fileSource(file.id()).build();

    MessageCreateParams params = MessageCreateParams.builder()
      .model(Model.CLAUDE_OPUS_4_7)
      .maxTokens(1024)
      .addUserMessageOfBlockParams(
        List.of(
          ContentBlockParam.ofImage(imageParam),
          ContentBlockParam.ofText(
            TextBlockParam.builder().text("Describe this image.").build()
          )
        )
      )
      .build();

    Message message = client.messages().create(params);
    System.out.println(message.content());
  }
}
<?php

use Anthropic\Client;

$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));

// Upload the image file
$fileUpload = $client->beta->files->upload(
    file: fopen('image.jpg', 'r'),
);

// Use the uploaded file in a message
$message = $client->beta->messages->create(
    maxTokens: 1024,
    messages: [
        [
            'role' => 'user',
            'content' => [
                [
                    'type' => 'image',
                    'source' => ['type' => 'file', 'file_id' => $fileUpload->id],
                ],
                ['type' => 'text', 'text' => 'Describe this image.'],
            ],
        ],
    ],
    model: 'claude-opus-4-7',
    betas: ['files-api-2025-04-14'],
);

echo $message->content[0]->text;
require "anthropic"

client = Anthropic::Client.new

# Upload the image file
file_upload = client.beta.files.upload(
  file: File.open("image.jpg", "rb")
)

# Use the uploaded file in a message
message = client.beta.messages.create(
  model: "claude-opus-4-7",
  max_tokens: 1024,
  betas: ["files-api-2025-04-14"],
  messages: [
    {
      role: "user",
      content: [
        {
          type: "image",
          source: { type: "file", file_id: file_upload.id }
        },
        { type: "text", text: "Describe this image." }
      ]
    }
  ]
)

puts message.content

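See the Messages API examples for more sample code and parameter details.

Example: One image

It's best to place images before the questions about them or the instructions for tasks that use them.

Ask Claude to describe one image.

Role     Content
User     [Image] Describe this image.
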
import anthropic

client = anthropic.Anthropic()
image1_data = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"
image1_media_type = "image/png"

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image1_media_type,
                        "data": image1_data,
                    },
                },
                {"type": "text", "text": "Describe this image."},
            ],
        }
    ],
)
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg",
                    },
                },
                {"type": "text", "text": "Describe this image."},
            ],
        }
    ],
)

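Example: Multiple images

When there are multiple images, introduce each one with Image 1:, Image 2:, and so on. You don't need newlines between images or between images and the prompt.

Ask Claude to describe the differences between multiple images.

Role     Content
User     Image 1: [Image 1] Image 2: [Image 2] How are these images different?
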
import anthropic

client = anthropic.Anthropic()
image1_data = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"
image1_media_type = "image/png"
image2_data = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"
image2_media_type = "image/png"

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Image 1:"},
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image1_media_type,
                        "data": image1_data,
                    },
                },
                {"type": "text", "text": "Image 2:"},
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image2_media_type,
                        "data": image2_data,
                    },
                },
                {"type": "text", "text": "How are these images different?"},
            ],
        }
    ],
)
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Image 1:"},
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg",
                    },
                },
                {"type": "text", "text": "Image 2:"},
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://upload.wikimedia.org/wikipedia/commons/b/b5/Iridescent.green.sweat.bee1.jpg",
                    },
                },
                {"type": "text", "text": "How are these images different?"},
            ],
        }
    ],
)

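Example: Multiple images with a system prompt

Ask Claude to describe the differences between multiple images, while giving it a system prompt for how to respond.

Role | Content
System | Respond only in Spanish.
User | Image 1: [Image 1] Image 2: [Image 2] How are these images different?
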
import anthropic

client = anthropic.Anthropic()
image1_data = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"
image1_media_type = "image/png"
image2_data = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"
image2_media_type = "image/png"

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    system="Respond only in Spanish.",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Image 1:"},
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image1_media_type,
                        "data": image1_data,
                    },
                },
                {"type": "text", "text": "Image 2:"},
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image2_media_type,
                        "data": image2_data,
                    },
                },
                {"type": "text", "text": "How are these images different?"},
            ],
        }
    ],
)
# Alternative: the same request with both images referenced by URL instead of base64.
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    system="Respond only in Spanish.",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Image 1:"},
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg",
                    },
                },
                {"type": "text", "text": "Image 2:"},
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://upload.wikimedia.org/wikipedia/commons/b/b5/Iridescent.green.sweat.bee1.jpg",
                    },
                },
                {"type": "text", "text": "How are these images different?"},
            ],
        }
    ],
)

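Example: Four images across two conversation turns

Claude's vision capabilities work well in multimodal conversations that mix images and text. You can have an extended back-and-forth exchange with Claude, adding new images or follow-up questions at any point. This enables workflows for iterative image analysis, comparison, or combining visuals with other knowledge.

Ask Claude to contrast two images, then ask a follow-up question comparing the first two images to two new images.

Role | Content
User | Image 1: [Image 1] Image 2: [Image 2] How are these images different?
Assistant | [Claude's response]
User | Image 1: [Image 3] Image 2: [Image 4] Are these images similar to the first two?
Assistant | [Claude's response]

When using the API, insert new images into the messages array as part of a user-role message, following any standard multi-turn conversation structure.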
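
The two-turn structure in the table above can be sketched as a messages array. This is a minimal sketch under assumptions, not a definitive implementation: the helper names (`user_turn`, `assistant_turn`, `send`) are hypothetical, the `example.com` URLs for the third and fourth images are placeholders, the assistant text stands in for the reply returned by the first request, and the second turn labels its images "Image 3" and "Image 4" as a choice made here.

```python
def user_turn(labeled_urls, question):
    """Build one user message: a text label before each image, then the question."""
    content = []
    for label, url in labeled_urls:
        content.append({"type": "text", "text": f"{label}:"})
        content.append({"type": "image", "source": {"type": "url", "url": url}})
    content.append({"type": "text", "text": question})
    return {"role": "user", "content": content}


def assistant_turn(text):
    """Wrap a previous reply so it can be replayed as conversation history."""
    return {"role": "assistant", "content": text}


first_turn = user_turn(
    [
        ("Image 1", "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"),
        ("Image 2", "https://upload.wikimedia.org/wikipedia/commons/b/b5/Iridescent.green.sweat.bee1.jpg"),
    ],
    "How are these images different?",
)

# Placeholder: in a real application this is the text of the first API response.
first_reply = "[Claude's response]"

second_turn = user_turn(
    [
        ("Image 3", "https://example.com/image3.jpg"),  # hypothetical URL
        ("Image 4", "https://example.com/image4.jpg"),  # hypothetical URL
    ],
    "Are these images similar to the first two?",
)

# The full history is sent again on the second request.
messages = [first_turn, assistant_turn(first_reply), second_turn]


def send(messages):
    """Send the conversation; imported lazily so building the payload needs no SDK."""
    import anthropic

    client = anthropic.Anthropic()
    return client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=messages,
    )
```

Calling `send(messages)` issues the second request. The earlier images travel with it because the first user turn is part of the history.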


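Limitations

While Claude's image understanding capabilities are cutting-edge, there are some limitations to be aware of:

  • People identification: Claude cannot be used to identify people in images and will refuse to do so.
  • Accuracy: Claude may hallucinate or make mistakes when interpreting low-quality, rotated, or very small (under 200 pixels) images.
  • Spatial reasoning: Claude's spatial reasoning abilities are limited. It may struggle with tasks that require precise localization or layout, such as reading an analog clock face or describing the exact positions of chess pieces.
  • Counting: Claude can give approximate counts of objects in an image, but may not always be precise, especially with large numbers of small objects.
  • AI-generated images: Claude does not know whether an image is AI-generated and may answer incorrectly if asked. Do not rely on it to detect fake or synthetic images.
  • Inappropriate content: Claude will not process inappropriate or explicit images that violate the Acceptable Use Policy.
  • Healthcare applications: While Claude can analyze general medical images, it is not designed to interpret complex diagnostic scans such as CT or MRI. Claude's outputs should not be treated as a substitute for professional medical advice or diagnosis.

Always carefully review and verify Claude's image interpretations, especially for high-stakes use cases. Do not use Claude for tasks requiring perfect precision or sensitive image analysis without human oversight.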
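
One way to act on the accuracy limitation is to flag very small images before sending them. The sketch below reads the width and height from a PNG header using only the standard library. It is an illustration under assumptions: the function names are hypothetical, only PNG input is handled, and "under 200 pixels" is interpreted here as either side being shorter than 200 pixels.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MIN_SIDE_PX = 200  # assumed reading of "under 200 pixels": the shorter side


def png_dimensions(data: bytes):
    """Return (width, height) from the IHDR chunk of a PNG file."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    # Width and height are two big-endian unsigned 32-bit integers.
    return struct.unpack(">II", data[16:24])


def is_too_small(data: bytes, min_side: int = MIN_SIDE_PX) -> bool:
    """True when the shorter side is below the threshold."""
    width, height = png_dimensions(data)
    return min(width, height) < min_side


# The 1x1 placeholder PNG used in this guide's examples.
sample = base64.b64decode(
    "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"
)
print(png_dimensions(sample), is_too_small(sample))
```

Other formats store their dimensions differently, so a production check would more likely rely on an image library.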


Frequently asked questions

What image file types does Claude support?

Claude currently supports the JPEG, PNG, GIF, and WebP image formats, specifically:

  • image/jpeg
  • image/png
  • image/gif
  • image/webp
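
When the media type of a file is not known in advance, it can be inferred from the first bytes of the file rather than from its extension. The following is a minimal sketch, assuming the well-known file signatures of the four supported formats. `detect_media_type` is a hypothetical helper, not part of the SDK.

```python
import base64


def detect_media_type(data: bytes):
    """Return one of the four supported media types, or None if unrecognized."""
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None


# The placeholder PNG used in this guide's examples.
sample = base64.b64decode(
    "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4z8AAAAMBAQDJ/pLvAAAAAElFTkSuQmCC"
)
media_type = detect_media_type(sample)
if media_type is None:
    raise ValueError("unsupported image format")
```

The returned string can be passed as `media_type` in a base64 image source block.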

Can Claude read image URLs?

Yes, Claude can process images from URLs using the URL image source block in the API. Use the "url" source type instead of "base64" in your API request. Example:

{
  "type": "image",
  "source": {
    "type": "url",
    "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
  }
}

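Is there a limit on the file size of uploaded images?

Yes, there are limits:

  • API: maximum 5 MB per image
  • claude.ai: maximum 10 MB per image

Images larger than these limits are rejected; when using the API, an error is returned.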
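
A client-side check can catch oversized images before a request is made. The sketch below is an assumption-laden illustration: `check_image_size` is a hypothetical helper, 1 MB is taken to mean 1024 * 1024 bytes, and the size is measured on the raw image bytes. How the API actually measures the 5 MB limit may differ.

```python
# Assumed interpretation of the 5 MB per-image API limit.
MAX_API_IMAGE_BYTES = 5 * 1024 * 1024


def check_image_size(data: bytes, limit: int = MAX_API_IMAGE_BYTES) -> int:
    """Return the image size in bytes, or raise ValueError if it exceeds the limit."""
    size = len(data)
    if size > limit:
        raise ValueError(f"image is {size} bytes; the limit is {limit} bytes")
    return size


# Usage: read the file once, validate it, then base64-encode it for the request.
# with open("photo.jpg", "rb") as f:   # hypothetical file name
#     raw = f.read()
# check_image_size(raw)
```

An image that fails this check can be downsampled or recompressed and checked again.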

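How many images can I include in one request?

The image limits are:

  • Messages API: up to 600 images per request (100 for models with a 200k token context window)
  • claude.ai: up to 20 images per turn

Requests that exceed these limits are rejected with an error. Requests containing many large images can also fail before reaching these limits; see General limits for details.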
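
The per-request image count can be checked before sending by counting the image blocks in the messages array. This is a rough sketch with hypothetical helper names. The caller states whether the target model has a 200k token context window, since this guide does not list which models those are.

```python
def count_images(messages) -> int:
    """Count image content blocks across all messages in a request."""
    total = 0
    for message in messages:
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain string content or missing content holds no images
        total += sum(1 for block in content if block.get("type") == "image")
    return total


def max_images_per_request(has_200k_context_window: bool) -> int:
    """Limits quoted in this guide: 100 for 200k-context models, else 600."""
    return 100 if has_200k_context_window else 600


def check_image_count(messages, has_200k_context_window: bool) -> int:
    """Return the image count, or raise ValueError if it exceeds the limit."""
    count = count_images(messages)
    limit = max_images_per_request(has_200k_context_window)
    if count > limit:
        raise ValueError(f"{count} images in request; the limit is {limit}")
    return count
```

Staying under the count does not guarantee success, because the overall request size limit can be reached first.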

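Does Claude read image metadata?

No, Claude does not parse or receive any metadata from the images passed to it.

Can I delete images I have uploaded?

No. Image uploads are ephemeral and are not stored beyond the duration of the API request. Uploaded images are deleted automatically once they have been processed.

Where can I find details on data privacy for image uploads?

See the Anthropic privacy policy page for information on how uploaded images and other data are handled. Anthropic does not use uploaded images to train models.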

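What if Claude's image interpretation seems wrong?

If Claude's image interpretation seems incorrect:

  1. Make sure the image is clear, high quality, and correctly oriented.
  2. Try prompt engineering techniques to improve the results.
  3. If the issue persists, flag the output in claude.ai (thumbs up/down) or contact the support team.

Your feedback helps improve Claude!

Can Claude generate or edit images?

No, Claude is an image understanding model only. It can interpret and analyze images, but it cannot generate, produce, edit, manipulate, or create images.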


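Dive deeper into vision

Ready to start building image applications with Claude? Here are a few helpful resources:

If you have any other questions, contact the support team. You can also join the developer community to connect with other creators and get help from Anthropic experts.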