Moderation | OpenAI API

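Use the moderation endpoint to check whether text or images are potentially harmful. If harmful content is identified, you can take corrective action, such as filtering the content or intervening with the user accounts that created it. The moderation endpoint is free to use. Image files are limited to 20 MB.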

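You can use two models for this endpoint:

omni-moderation-latest: This model and all of its snapshots support more categorization options and multi-modal inputs.
text-moderation-latest (legacy): An older model that supports only text inputs and fewer categories. For new applications, the newer omni-moderation models are the best choice.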

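Quickstart

The examples below show how to moderate text and image inputs using our official SDKs and the omni-moderation-latest model:

Get classification information for a text input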

python

from openai import OpenAI
client = OpenAI()

response = client.moderations.create(
    model="omni-moderation-latest",
    input="...text to classify goes here...",
)

print(response)

javascript

import OpenAI from "openai";
const openai = new OpenAI();

const moderation = await openai.moderations.create({
  model: "omni-moderation-latest",
  input: "...text to classify goes here...",
});

console.log(moderation);

curl

curl https://api.openai.com/v1/moderations \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "omni-moderation-latest",
    "input": "...text to classify goes here..."
  }'

Get classification information for image and text input

python

from openai import OpenAI
client = OpenAI()

response = client.moderations.create(
    model="omni-moderation-latest",
    input=[
        {"type": "text", "text": "...text to classify goes here..."},
        {
            "type": "image_url",
            "image_url": {
                "url": "https://example.com/image.png",
                # can also use base64 encoded image URLs
                # "url": "data:image/jpeg;base64,abcdefg..."
            }
        },
    ],
)

print(response)

javascript

import OpenAI from "openai";
const openai = new OpenAI();

const moderation = await openai.moderations.create({
  model: "omni-moderation-latest",
  input: [
    { type: "text", text: "...text to classify goes here..." },
    {
      type: "image_url",
      image_url: {
        url: "https://example.com/image.png"
        // can also use base64 encoded image URLs
        // url: "data:image/jpeg;base64,abcdefg..."
      }
    }
  ],
});

console.log(moderation);

curl

curl https://api.openai.com/v1/moderations \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "omni-moderation-latest",
    "input": [
      { "type": "text", "text": "...text to classify goes here..." },
      {
        "type": "image_url",
        "image_url": {
          "url": "https://example.com/image.png"
        }
      }
    ]
  }'
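The image examples mention that a base64-encoded data URL can be passed in place of a hosted image URL. The following is a minimal sketch of how such a URL might be built from raw image bytes before calling the endpoint. The helper names `image_bytes_to_data_url` and `build_moderation_input` are hypothetical, and reading the 20 MB limit as 20,000,000 bytes of raw image data is an assumption (the conservative interpretation), not something the guide specifies.

```python
from __future__ import annotations

import base64

# Assumption: "20 MB" is read as 20,000,000 bytes of raw image data.
MAX_IMAGE_BYTES = 20 * 1000 * 1000


def image_bytes_to_data_url(data: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL (hypothetical helper)."""
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError(
            f"image is {len(data)} bytes; assumed limit is {MAX_IMAGE_BYTES}"
        )
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_moderation_input(text: str, image_url: str) -> list[dict]:
    """Build the mixed text-and-image `input` list used in the examples above."""
    return [
        {"type": "text", "text": text},
        {"type": "image_url", "image_url": {"url": image_url}},
    ]
```

The resulting list could then be passed as `input=` to `client.moderations.create(...)` in the same way as the hosted-URL example.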

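Here is a full example output, where the input is an image taken from a single frame of a war movie. The model correctly predicts indicators of violence in the image, with a violence category score greater than 0.8.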

{
  "id": "modr-970d409ef3bef3b70c73d8232df86e7d",
  "model": "omni-moderation-latest",
  "results": [
    {
      "flagged": true,
      "categories": {
        "sexual": false,
        "sexual/minors": false,
        "harassment": false,
        "harassment/threatening": false,
        "hate": false,
        "hate/threatening": false,
        "illicit": false,
        "illicit/violent": false,
        "self-harm": false,
        "self-harm/intent": false,
        "self-harm/instructions": false,
        "violence": true,
        "violence/graphic": false
      },
      "category_scores": {
        "sexual": 2.34135824776394e-7,
        "sexual/minors": 1.6346470245419304e-7,
        "harassment": 0.0011643905680426018,
        "harassment/threatening": 0.0022121340080906377,
        "hate": 3.1999824407395835e-7,
        "hate/threatening": 2.4923252458203563e-7,
        "illicit": 0.0005227032493135171,
        "illicit/violent": 3.682979260160596e-7,
        "self-harm": 0.0011175734280627694,
        "self-harm/intent": 0.0006264858507989037,
        "self-harm/instructions": 7.368592981140821e-8,
        "violence": 0.8599265510337075,
        "violence/graphic": 0.37701736389561064
      },
      "category_applied_input_types": {
        "sexual": ["image"],
        "sexual/minors": [],
        "harassment": [],
        "harassment/threatening": [],
        "hate": [],
        "hate/threatening": [],
        "illicit": [],
        "illicit/violent": [],
        "self-harm": ["image"],
        "self-harm/intent": ["image"],
        "self-harm/instructions": ["image"],
        "violence": ["image"],
        "violence/graphic": ["image"]
      }
    }
  ]
}

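The output in the JSON response has several categories, which tell you which (if any) categories of content are present in the inputs, and to what degree the model believes them to be present.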

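Output category	Description
`flagged`	Set to `true` if the model classifies the content as potentially harmful, `false` otherwise.
`categories`	Contains a dictionary of per-category violation flags. For each category, the value is `true` if the model flags the corresponding category as violated, `false` otherwise.
`category_scores`	Contains a dictionary of per-category scores output by the model, denoting the model's confidence that the input violates OpenAI's policy for the category. The value is between 0 and 1, where higher values denote higher confidence.
`category_applied_input_types`	This property contains information on which input types were flagged in the response, for each category. For example, if both the image and text inputs to the model are flagged for "violence/graphic", the `violence/graphic` property will be set to `["image", "text"]`. This is only available on omni models.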
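As an illustration of how the four output fields relate, the sketch below walks a parsed JSON response and reports which categories were flagged and which input types they applied to. It works on a plain dictionary trimmed from the example output above rather than on an SDK object, and `summarize_result` is a hypothetical helper name.

```python
from __future__ import annotations

# Trimmed copy of the example response above (three categories only).
SAMPLE_RESPONSE = {
    "results": [
        {
            "flagged": True,
            "categories": {
                "harassment": False,
                "violence": True,
                "violence/graphic": False,
            },
            "category_scores": {
                "harassment": 0.0011643905680426018,
                "violence": 0.8599265510337075,
                "violence/graphic": 0.37701736389561064,
            },
            "category_applied_input_types": {
                "harassment": [],
                "violence": ["image"],
                "violence/graphic": ["image"],
            },
        }
    ]
}


def summarize_result(result: dict) -> dict:
    """Collect the flagged categories and the input types they applied to."""
    flagged = sorted(name for name, hit in result["categories"].items() if hit)
    # category_applied_input_types is only present for omni models.
    applied = result.get("category_applied_input_types", {})
    return {
        "flagged": result["flagged"],
        "categories": flagged,
        "input_types": {name: applied.get(name, []) for name in flagged},
    }


for result in SAMPLE_RESPONSE["results"]:
    print(summarize_result(result))
```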

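We plan to continuously upgrade the moderation endpoint's underlying model. As a result, custom policies that rely on category_scores may need to be recalibrated over time.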
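A custom policy built on `category_scores` might look like the sketch below: per-category thresholds are kept in one place so they can be recalibrated when the underlying model changes. The threshold values and the names `CUSTOM_THRESHOLDS`, `DEFAULT_THRESHOLD`, and `categories_over_threshold` are illustrative assumptions, not recommended settings.

```python
from __future__ import annotations

# Hypothetical thresholds; tune and re-validate these against your own data.
CUSTOM_THRESHOLDS = {"violence": 0.5, "violence/graphic": 0.3}
DEFAULT_THRESHOLD = 0.9  # applied to any category not listed above


def categories_over_threshold(
    category_scores: dict[str, float],
    thresholds: dict[str, float] = CUSTOM_THRESHOLDS,
    default: float = DEFAULT_THRESHOLD,
) -> list[str]:
    """Return the categories whose score meets or exceeds its threshold."""
    return sorted(
        name
        for name, score in category_scores.items()
        if score >= thresholds.get(name, default)
    )


# Scores taken from the example response above.
scores = {
    "harassment": 0.0011643905680426018,
    "violence": 0.8599265510337075,
    "violence/graphic": 0.37701736389561064,
}
print(categories_over_threshold(scores))
```

With these assumed thresholds, `violence/graphic` is treated as a violation even though the model's own `categories` flag for it is `false`, which is the kind of stricter policy that would need rechecking after a model upgrade.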

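Content classifications

The table below describes the types of content that can be detected by the moderation API, along with which models and input types are supported for each category.

Categories marked as "Text only" do not support image inputs. If you send only images (without accompanying text) to the omni-moderation-latest model, it will return a score of 0 for these unsupported categories. Image files are limited to 20 MB.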

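Category	Description	Models	Inputs
`harassment`	Content that expresses, incites, or promotes harassing language towards any target.	All	Text only
`harassment/threatening`	Harassment content that also includes violence or serious harm towards any target.	All	Text only
`hate`	Content that expresses, incites, or promotes hate based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste. Hateful content aimed at non-protected groups (e.g., chess players) is treated as harassment.	All	Text only
`hate/threatening`	Hateful content that also includes violence or serious harm towards the targeted group based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste.	All	Text only
`illicit`	Content that gives advice or instruction on how to commit illicit acts. A phrase like "how to shoplift" would fit this category.	Omni only	Text only
`illicit/violent`	The same types of content flagged by the `illicit` category, but that also includes references to violence or procuring a weapon.	Omni only	Text only
`self-harm`	Content that promotes, encourages, or depicts acts of self-harm, such as suicide, cutting, and eating disorders.	All	Text and images
`self-harm/intent`	Content where the speaker expresses that they are engaging or intend to engage in acts of self-harm, such as suicide, cutting, and eating disorders.	All	Text and images
`self-harm/instructions`	Content that encourages performing acts of self-harm, such as suicide, cutting, and eating disorders, or that gives instructions or advice on how to commit such acts.	All	Text and images
`sexual`	Content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness).	All	Text and images
`sexual/minors`	Sexual content that includes an individual who is under 18 years old.	All	Text only
`violence`	Content that depicts death, violence, or physical injury.	All	Text and images
`violence/graphic`	Content that depicts death, violence, or physical injury in graphic detail.	All	Text and images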
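Because text-only categories return a score of 0 for image-only requests, a zero there says nothing about the image itself. The sketch below encodes the "Text only" rows of the table as a set and drops those scores when the request contained only images. The names `TEXT_ONLY_CATEGORIES` and `image_relevant_scores` are hypothetical, and the set reflects the table as written here, so it would need updating if category support changes.

```python
from __future__ import annotations

# Categories the table above marks as "Text only".
TEXT_ONLY_CATEGORIES = frozenset(
    {
        "harassment",
        "harassment/threatening",
        "hate",
        "hate/threatening",
        "illicit",
        "illicit/violent",
        "sexual/minors",
    }
)


def image_relevant_scores(category_scores: dict[str, float]) -> dict[str, float]:
    """Keep only the scores that are meaningful for an image-only request."""
    return {
        name: score
        for name, score in category_scores.items()
        if name not in TEXT_ONLY_CATEGORIES
    }


# Illustrative scores for an image-only request.
image_only_scores = {"hate": 0.0, "sexual/minors": 0.0, "violence": 0.86}
print(image_relevant_scores(image_only_scores))
```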
