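Running Codex as an MCP server
You can run Codex as an MCP server and connect to it from other MCP clients (for example, an agent built with the OpenAI Agents SDK MCP integration).
To start Codex as an MCP server, use the following command: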
codex mcp-server
You can launch a Codex MCP server with the Model Context Protocol Inspector:
npx @modelcontextprotocol/inspector codex mcp-server
Send a tools/list request to see two tools:
codex: Runs a Codex session. It accepts configuration parameters that match the Codex Config struct. The codex tool takes the following properties:
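| Property | Type | Description |
|---|---|---|
| prompt (required) | string | The initial user prompt that starts the Codex conversation. |
| approval-policy | string | Approval policy for shell commands generated by the model: untrusted, on-request, and never. |
| base-instructions | string | The set of instructions to use instead of the default ones. |
| config | object | Overrides for settings in $CODEX_HOME/config.toml. |
| cwd | string | Working directory for the session. If relative, it is resolved against the server process's current directory. |
| include-plan-tool | boolean | Whether to include the plan tool in the conversation. |
| model | string | Optional override for the model name (for example, o3, o4-mini). |
| profile | string | Configuration profile name; Codex loads $CODEX_HOME/profile-name.config.toml to specify default options. |
| sandbox | string | Sandbox mode: read-only, workspace-write, or danger-full-access. |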
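codex-reply: Continues a Codex session by providing the thread ID and a prompt. The codex-reply tool takes the following properties:
| Property | Type | Description |
|---|---|---|
| prompt (required) | string | The next user prompt to continue the Codex conversation. |
| threadId (required) | string | The ID of the thread to continue. |
| conversationId (deprecated) | string | Deprecated alias for threadId (kept for compatibility). |
Use the threadId from structuredContent.threadId in the tools/call response. Approval prompts (exec/patch) also include threadId in their params payload.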
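To make the two property tables concrete, the sketch below builds tools/call requests for both tools. The helper names (build_codex_call, build_codex_reply_call) and the request id values are hypothetical, and the allowed values are simply copied from the tables above, so the server remains the source of truth. An MCP client library would normally build this JSON-RPC envelope for you.

```python
import json

# Values copied from the property table above; the server may accept others.
APPROVAL_POLICIES = {"untrusted", "on-request", "never"}
SANDBOX_MODES = {"read-only", "workspace-write", "danger-full-access"}


def _tools_call(name, arguments, request_id):
    """Wrap tool arguments in a JSON-RPC tools/call envelope."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def build_codex_call(prompt, options=None, request_id=1):
    """Build a request that starts a Codex session (hypothetical helper)."""
    if not prompt:
        raise ValueError("prompt is required")
    # The explicit prompt always wins over any "prompt" key in options.
    arguments = {**(options or {}), "prompt": prompt}
    policy = arguments.get("approval-policy")
    if policy is not None and policy not in APPROVAL_POLICIES:
        raise ValueError(f"unknown approval-policy: {policy}")
    sandbox = arguments.get("sandbox")
    if sandbox is not None and sandbox not in SANDBOX_MODES:
        raise ValueError(f"unknown sandbox mode: {sandbox}")
    return _tools_call("codex", arguments, request_id)


def build_codex_reply_call(thread_id, prompt, request_id=2):
    """Build a request that continues an existing thread (hypothetical helper)."""
    if not thread_id or not prompt:
        raise ValueError("threadId and prompt are both required")
    arguments = {"threadId": thread_id, "prompt": prompt}
    return _tools_call("codex-reply", arguments, request_id)


if __name__ == "__main__":
    first = build_codex_call(
        "Summarize this repository.",
        {"approval-policy": "never", "sandbox": "read-only"},
    )
    print(json.dumps(first, indent=2))
```

The second helper takes the thread ID as a plain string, so it can be fed directly with the value read from structuredContent.threadId.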
Example response payload:
{
"structuredContent": {
"threadId": "019bbb20-bff6-7130-83aa-bf45ab33250e",
"content": "`ls -lah` (or `ls -alh`) — long listing, includes dotfiles, human-readable sizes."
},
"content": [
{
"type": "text",
"text": "`ls -lah` (or `ls -alh`) — long listing, includes dotfiles, human-readable sizes."
}
]
}
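Note that modern MCP clients generally report only "structuredContent" as the result of a tool call when it is present, though the Codex MCP server also returns "content" for compatibility with older MCP clients.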
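A client that has to cope with both result shapes could read them as follows. This is a minimal sketch: the helper names are hypothetical, the sample dictionary reuses the thread ID from the payload above with a shortened text, and it assumes the tool result has already been decoded into a Python dict.

```python
def extract_thread_id(result):
    """Return structuredContent.threadId, or None if it is absent."""
    structured = result.get("structuredContent") or {}
    return structured.get("threadId")


def extract_text(result):
    """Prefer structuredContent.content; fall back to legacy text items."""
    structured = result.get("structuredContent") or {}
    if "content" in structured:
        return structured["content"]
    # Older shape: a list of typed content items; keep only the text ones.
    parts = [
        item.get("text", "")
        for item in result.get("content", [])
        if item.get("type") == "text"
    ]
    return "\n".join(parts)


if __name__ == "__main__":
    sample = {
        "structuredContent": {
            "threadId": "019bbb20-bff6-7130-83aa-bf45ab33250e",
            "content": "Use ls -lah.",
        },
        "content": [{"type": "text", "text": "Use ls -lah."}],
    }
    print(extract_thread_id(sample))
    print(extract_text(sample))
```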
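Creating multi-agent workflows
Codex CLI can do far more than run ad-hoc tasks. By exposing the CLI as a Model Context Protocol (MCP) server and orchestrating it with the OpenAI Agents SDK, you can create deterministic, reviewable workflows that scale from a single agent to a complete software delivery pipeline.
This guide walks through the same workflow showcased in the OpenAI Cookbook. You will:
- launch Codex CLI as a long-running MCP server,
- build a focused single-agent workflow that produces a playable browser game, and
- orchestrate a multi-agent team with handoffs, guardrails, and full traces you can review afterwards.
Before starting, make sure you have: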
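- Codex CLI installed locally so that the codex command is available.
- Python 3.10+ and pip.
- Node.js 18+ (if you want to run the MCP Inspector example above).
- An OpenAI API key stored locally. You can create or manage keys in the OpenAI dashboard.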
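If you want to confirm the tooling before continuing, a small preflight script along these lines can help. It is only a sketch: the function name check_prerequisites is hypothetical, it looks for executables on PATH with the standard library, and it does not verify the Node.js version or the API key.

```python
import shutil
import sys


def check_prerequisites(which=shutil.which):
    """Report which prerequisites from the list above appear to be met.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return {
        "python>=3.10": sys.version_info >= (3, 10),
        "codex on PATH": which("codex") is not None,
        # npx ships with Node.js and is only needed for the MCP Inspector.
        "npx on PATH": which("npx") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'ok     ' if ok else 'missing'}  {name}")
```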
Create a working directory for the guide and add your API key to a .env file:
mkdir codex-workflows
cd codex-workflows
printf "OPENAI_API_KEY=sk-..." > .env
Install dependencies
The Agents SDK handles orchestration across Codex, handoffs, and traces. Install the latest SDK packages:
python -m venv .venv
source .venv/bin/activate
pip install --upgrade openai openai-agents python-dotenv
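Activating a virtual environment keeps the SDK dependencies isolated from the rest of your system.
Initialize Codex CLI as an MCP server
Start by turning Codex CLI into an MCP server that the Agents SDK can call. The server exposes two tools (codex() to start a conversation and codex-reply() to continue one) and keeps Codex alive across multiple agent turns.
Create a file called codex_mcp.py and add the following: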
import asyncio

from agents import Agent, Runner
from agents.mcp import MCPServerStdio


async def main() -> None:
    # Launch `codex mcp-server` as a subprocess and talk to it over stdio.
    async with MCPServerStdio(
        name="Codex CLI",
        params={
            "command": "codex",
            "args": ["mcp-server"],
        },
        # Generous timeout so long-running Codex turns are not cut off.
        client_session_timeout_seconds=360000,
    ) as codex_mcp_server:
        print("Codex MCP server started.")
        # More logic coming in the next sections.
        return


if __name__ == "__main__":
    asyncio.run(main())
Run the script once to verify that Codex launches successfully:
python codex_mcp.py
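The script exits after printing Codex MCP server started. In the next sections, you will reuse the same MCP server inside richer workflows.
Build a single-agent workflow
Let's start with a scoped example that uses Codex MCP to ship a small browser game. The workflow relies on two agents:
- Game Designer: writes a brief for the game.
- Game Developer: implements the game by calling Codex MCP.
Update codex_mcp.py with the following code. It keeps the MCP server setup from above and adds both agents.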
import asyncio
import os

from dotenv import load_dotenv

from agents import Agent, Runner, set_default_openai_key
from agents.mcp import MCPServerStdio

# Read OPENAI_API_KEY from .env and hand it to the Agents SDK.
load_dotenv(override=True)
set_default_openai_key(os.getenv("OPENAI_API_KEY"))


async def main() -> None:
    async with MCPServerStdio(
        name="Codex CLI",
        params={
            "command": "codex",
            "args": ["mcp-server"],
        },
        client_session_timeout_seconds=360000,
    ) as codex_mcp_server:
        # The developer is the only agent with access to the Codex MCP tools.
        developer_agent = Agent(
            name="Game Developer",
            instructions=(
                "You are an expert in building simple games using basic html + css + javascript with no dependencies. "
                "Save your work in a file called index.html in the current directory. "
                "Always call codex with \"approval-policy\": \"never\" and \"sandbox\": \"workspace-write\"."
            ),
            mcp_servers=[codex_mcp_server],
        )

        # The designer writes the brief and hands off to the developer.
        designer_agent = Agent(
            name="Game Designer",
            instructions=(
                "You are an indie game connoisseur. Come up with an idea for a single page html + css + javascript game that a developer could build in about 50 lines of code. "
                "Format your request as a 3 sentence design brief for a game developer and call the Game Developer coder with your idea."
            ),
            model="gpt-5",
            handoffs=[developer_agent],
        )

        await Runner.run(designer_agent, "Implement a fun new game!")


if __name__ == "__main__":
    asyncio.run(main())
Execute the script:
python codex_mcp.py
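Codex reads the designer's brief, creates an index.html file, and writes the full game to disk. Open the generated file in a browser to play the result. Every run produces a different design with its own gameplay twists and polish.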
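One way to open the result from Python is sketched below. The helper name open_game is hypothetical, and the sketch assumes the game was written to index.html in the current working directory, as the developer agent's instructions request.

```python
import webbrowser
from pathlib import Path


def open_game(path="index.html", opener=webbrowser.open):
    """Open the generated game in the default browser.

    Returns False without opening anything if the file does not exist.
    `opener` is injectable so the browser call can be replaced in tests.
    """
    game = Path(path)
    if not game.is_file():
        return False
    # Browsers expect a file:// URL rather than a bare filesystem path.
    return bool(opener(game.resolve().as_uri()))


if __name__ == "__main__":
    if not open_game():
        print("index.html not found; run codex_mcp.py first.")
```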
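Expand to a multi-agent workflow
Now turn the single-agent setup into an orchestrated, traceable workflow. The system adds:
- Project Manager: creates shared requirements, coordinates handoffs, and enforces guardrails.
- Designer, Frontend Developer, Backend Developer, and Tester: each agent has scoped instructions and its own output folder.
Create a new file called multi_agent_workflow.py: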
import asyncio
import os

from dotenv import load_dotenv

from agents import (
    Agent,
    ModelSettings,
    Runner,
    WebSearchTool,
    set_default_openai_key,
)
from agents.extensions.handoff_prompt import RECOMMENDED_PROMPT_PREFIX
from agents.mcp import MCPServerStdio
from openai.types.shared import Reasoning

# Read OPENAI_API_KEY from .env and hand it to the Agents SDK.
load_dotenv(override=True)
set_default_openai_key(os.getenv("OPENAI_API_KEY"))


async def main() -> None:
    async with MCPServerStdio(
        name="Codex CLI",
        params={"command": "codex", "args": ["mcp-server"]},
        client_session_timeout_seconds=360000,
    ) as codex_mcp_server:
        # Specialist agents: each one writes to its own folder via Codex MCP.
        designer_agent = Agent(
            name="Designer",
            instructions=(
                f"""{RECOMMENDED_PROMPT_PREFIX}"""
                "You are the Designer.\n"
                "Your only source of truth is AGENT_TASKS.md and REQUIREMENTS.md from the Project Manager.\n"
                "Do not assume anything that is not written there.\n\n"
                "You may use the internet for additional guidance or research.\n\n"
                "Deliverables (write to /design):\n"
                "- design_spec.md: a single page describing the UI/UX layout, main screens, and key visual notes as requested in AGENT_TASKS.md.\n"
                "- wireframe.md: a simple text or ASCII wireframe if specified.\n\n"
                "Keep the output short and implementation-friendly.\n"
                "When complete, handoff to the Project Manager with transfer_to_project_manager.\n"
                "When creating files, call Codex MCP with {\"approval-policy\":\"never\",\"sandbox\":\"workspace-write\"}."
            ),
            model="gpt-5",
            tools=[WebSearchTool()],
            mcp_servers=[codex_mcp_server],
        )

        frontend_developer_agent = Agent(
            name="Frontend Developer",
            instructions=(
                f"""{RECOMMENDED_PROMPT_PREFIX}"""
                "You are the Frontend Developer.\n"
                "Read AGENT_TASKS.md and design_spec.md. Implement exactly what is described there.\n\n"
                "Deliverables (write to /frontend):\n"
                "- index.html: main page structure\n"
                "- styles.css or inline styles if specified\n"
                "- main.js or game.js if specified\n\n"
                "Follow the Designer's DOM structure and any integration points given by the Project Manager.\n"
                "Do not add features or branding beyond the provided documents.\n\n"
                "When complete, handoff to the Project Manager with transfer_to_project_manager.\n"
                "When creating files, call Codex MCP with {\"approval-policy\":\"never\",\"sandbox\":\"workspace-write\"}."
            ),
            model="gpt-5",
            mcp_servers=[codex_mcp_server],
        )

        backend_developer_agent = Agent(
            name="Backend Developer",
            instructions=(
                f"""{RECOMMENDED_PROMPT_PREFIX}"""
                "You are the Backend Developer.\n"
                "Read AGENT_TASKS.md and REQUIREMENTS.md. Implement the backend endpoints described there.\n\n"
                "Deliverables (write to /backend):\n"
                "- package.json: include a start script if requested\n"
                "- server.js: implement the API endpoints and logic exactly as specified\n\n"
                "Keep the code as simple and readable as possible. No external database.\n\n"
                "When complete, handoff to the Project Manager with transfer_to_project_manager.\n"
                "When creating files, call Codex MCP with {\"approval-policy\":\"never\",\"sandbox\":\"workspace-write\"}."
            ),
            model="gpt-5",
            mcp_servers=[codex_mcp_server],
        )

        tester_agent = Agent(
            name="Tester",
            instructions=(
                f"""{RECOMMENDED_PROMPT_PREFIX}"""
                "You are the Tester.\n"
                "Read AGENT_TASKS.md and TEST.md. Verify that the outputs of the other roles meet the acceptance criteria.\n\n"
                "Deliverables (write to /tests):\n"
                "- TEST_PLAN.md: bullet list of manual checks or automated steps as requested\n"
                "- test.sh or a simple automated script if specified\n\n"
                "Keep it minimal and easy to run.\n\n"
                "When complete, handoff to the Project Manager with transfer_to_project_manager.\n"
                "When creating files, call Codex MCP with {\"approval-policy\":\"never\",\"sandbox\":\"workspace-write\"}."
            ),
            model="gpt-5",
            mcp_servers=[codex_mcp_server],
        )

        # The Project Manager owns the plan and gates every handoff on files.
        project_manager_agent = Agent(
            name="Project Manager",
            instructions=(
                f"""{RECOMMENDED_PROMPT_PREFIX}"""
                """
                You are the Project Manager.

                Objective:
                Convert the input task list into three project-root files the team will execute against.

                Deliverables (write in project root):
                - REQUIREMENTS.md: concise summary of product goals, target users, key features, and constraints.
                - TEST.md: tasks with [Owner] tags (Designer, Frontend, Backend, Tester) and clear acceptance criteria.
                - AGENT_TASKS.md: one section per role containing:
                  - Project name
                  - Required deliverables (exact file names and purpose)
                  - Key technical notes and constraints

                Process:
                - Resolve ambiguities with minimal, reasonable assumptions. Be specific so each role can act without guessing.
                - Create files using Codex MCP with {"approval-policy":"never","sandbox":"workspace-write"}.
                - Do not create folders. Only create REQUIREMENTS.md, TEST.md, AGENT_TASKS.md.

                Handoffs (gated by required files):
                1) After the three files above are created, hand off to the Designer with transfer_to_designer_agent and include REQUIREMENTS.md and AGENT_TASKS.md.
                2) Wait for the Designer to produce /design/design_spec.md. Verify that file exists before proceeding.
                3) When design_spec.md exists, hand off in parallel to both:
                  - Frontend Developer with transfer_to_frontend_developer_agent (provide design_spec.md, REQUIREMENTS.md, AGENT_TASKS.md).
                  - Backend Developer with transfer_to_backend_developer_agent (provide REQUIREMENTS.md, AGENT_TASKS.md).
                4) Wait for Frontend to produce /frontend/index.html and Backend to produce /backend/server.js. Verify both files exist.
                5) When both exist, hand off to the Tester with transfer_to_tester_agent and provide all prior artifacts and outputs.
                6) Do not advance to the next handoff until the required files for that step are present. If something is missing, request the owning agent to supply it and re-check.

                PM Responsibilities:
                - Coordinate all roles, track file completion, and enforce the above gating checks.
                - Do NOT respond with status updates. Just handoff to the next agent until the project is complete.
                """
            ),
            model="gpt-5",
            model_settings=ModelSettings(
                reasoning=Reasoning(effort="medium"),
            ),
            handoffs=[designer_agent, frontend_developer_agent, backend_developer_agent, tester_agent],
            mcp_servers=[codex_mcp_server],
        )

        # Every specialist hands control back to the Project Manager.
        designer_agent.handoffs = [project_manager_agent]
        frontend_developer_agent.handoffs = [project_manager_agent]
        backend_developer_agent.handoffs = [project_manager_agent]
        tester_agent.handoffs = [project_manager_agent]

        task_list = """
        Goal: Build a tiny browser game to showcase a multi-agent workflow.

        High-level requirements:
        - Single-screen game called "Bug Busters".
        - Player clicks a moving bug to earn points.
        - Game ends after 20 seconds and shows final score.
        - Optional: submit score to a simple backend and display a top-10 leaderboard.

        Roles:
        - Designer: create a one-page UI/UX spec and basic wireframe.
        - Frontend Developer: implement the page and game logic.
        - Backend Developer: implement a minimal API (GET /health, GET/POST /scores).
        - Tester: write a quick test plan and a simple script to verify core routes.

        Constraints:
        - No external database; memory storage is fine.
        - Keep everything readable for beginners; no frameworks required.
        - All outputs should be small files saved in clearly named folders.
        """

        # max_turns caps the total number of agent turns across all handoffs.
        result = await Runner.run(project_manager_agent, task_list, max_turns=30)
        print(result.final_output)


if __name__ == "__main__":
    asyncio.run(main())
Run the script and inspect the generated files:
python multi_agent_workflow.py
ls -R
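The Project Manager agent writes REQUIREMENTS.md, TEST.md, and AGENT_TASKS.md, then coordinates handoffs across the Designer, Frontend, Backend, and Tester agents. Each agent writes its scoped artifacts in its own folder before handing control back to the Project Manager.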
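Beyond scanning the ls -R output by eye, you could check the gating files programmatically. The sketch below is one possible approach: the helper name missing_artifacts is hypothetical, and it assumes the folders named in the agent instructions (/design, /frontend, /backend, /tests) are created relative to the directory where the script ran.

```python
from pathlib import Path

# Files that the Project Manager's instructions use as handoff gates,
# plus the Tester's plan. Paths are assumed relative to the project root.
EXPECTED_ARTIFACTS = {
    "Project Manager": ["REQUIREMENTS.md", "TEST.md", "AGENT_TASKS.md"],
    "Designer": ["design/design_spec.md"],
    "Frontend Developer": ["frontend/index.html"],
    "Backend Developer": ["backend/server.js"],
    "Tester": ["tests/TEST_PLAN.md"],
}


def missing_artifacts(root="."):
    """Return {owner: [missing files]} for every role with gaps."""
    base = Path(root)
    missing = {}
    for owner, files in EXPECTED_ARTIFACTS.items():
        absent = [name for name in files if not (base / name).is_file()]
        if absent:
            missing[owner] = absent
    return missing


if __name__ == "__main__":
    gaps = missing_artifacts()
    if not gaps:
        print("All expected artifacts are present.")
    for owner, files in gaps.items():
        print(f"{owner}: missing {', '.join(files)}")
```

An empty result means every gate described in the Project Manager's instructions was satisfied on disk; it says nothing about the quality of the files themselves.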
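Trace the workflow
Codex automatically records traces that capture every prompt, tool call, and handoff. After the multi-agent run completes, open the Traces dashboard to inspect the execution timeline.
The high-level trace shows how the Project Manager verifies handoffs before moving forward. Click into individual steps to see prompts, Codex MCP calls, files written, and execution durations. These details make it easy to review every handoff and understand how the workflow evolved turn by turn. The same traces let you debug workflow problems, audit agent behavior, and measure performance over time without any additional instrumentation.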