Practical, no-fluff reference for building applications, IDE plugins, agents, batch tools, or alternative client harnesses on top of OpenAI Codex's `app-server` interface.
This guide distills:
- What `codex app-server` actually is on the wire (transports, handshake, framing).
- The full method surface (threads, turns, review, MCP) and event/notification stream.
- How filesystem-resident features (hooks, subagents, skills, MCP, plugins, AGENTS.md) interact with the protocol.
- Where state lives on disk (Codex core state vs. companion state).
- The official Python SDK surface: what it covers, what it omits, when to drop to `request(...)`.
- Third-party clients across Python and Elixir.
- A reference architecture: how the official Claude Code "Codex companion" plugin (`codex-companion.mjs`) drives the protocol, including the broker session-reuse trick.
- A minimum-viable Python client to copy-paste.
- Practical recipes (CI reviewers, multi-agent fan-out, approval UIs).
All sources linked inline and again at the end.
codex app-server is a stateful, long-lived process that hosts Codex agent threads and exposes them over JSON-RPC 2.0. It's the same backend that powers OpenAI's Codex VS Code extension and JetBrains plugin, and that custom integrations (the openai/codex Python SDK, the JS Claude Code companion, third-party clients) talk to.
Think of it as Codex's IPC interface: you spawn one process per workspace (or reuse a long-lived one), then drive it with structured RPC instead of scraping codex exec output.
Use it when you need any of:
- Streaming reasoning / message / command-output deltas.
- Persistent threads (`thread/resume`, `thread/fork`).
- Bidirectional approvals (server asks your client to allow/deny a tool action).
- Structured output (JSON schema-conforming results).
- Multiple turns against the same context.
- Custom UI/UX (IDE plugin, web dashboard, Slack bot, CI reviewer).
If you just want "run prompt → exit", use codex exec non-interactive mode instead — app-server is overkill for one-shots.
References:
- App Server – Codex | OpenAI Developers
- codex/codex-rs/app-server/README.md (openai/codex)
- Unlocking the Codex harness: how we built the App Server | OpenAI
- Non-interactive mode (`codex exec`) – Codex
| Transport | How to start | Status | Framing |
|---|---|---|---|
| stdio (default) | `codex app-server` | Stable | Newline-delimited JSON (one message per line) over stdin/stdout |
| WebSocket | `codex app-server --listen ws://127.0.0.1:4500` | Experimental, unsupported | One JSON message per WS frame; bounded queues; busy server returns RPC error code `-32001` |
Stderr carries human-readable diagnostics; you can mirror it to a log.
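JSON-RPC 2.0 semantics, except that the `"jsonrpc":"2.0"` member is omitted on the wire. Three message kinds:
- Request: carries both `id` and `method` and expects a response. Either side can send one.
- Response: carries `id` plus `result` or `error`, and no `method`.
- Notification: carries `method` and no `id`; no response is expected.
One message per `\n`-terminated line on stdio. JSON-RPC error code `-32001` means the server (or a broker in front of it) is busy; back off and retry.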
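The framing and the three message kinds can be sketched in a few lines of Python. This is a minimal illustration rather than the server's own logic; the helper names `encode_message`, `decode_lines`, and `classify` are made up for this example.

```python
import json


def encode_message(message: dict) -> str:
    """Serialize one message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the payload stays on one line.
    return json.dumps(message, separators=(",", ":")) + "\n"


def decode_lines(chunk: str):
    """Yield one parsed message per non-empty line of stdio output."""
    for line in chunk.split("\n"):
        line = line.strip()
        if line:
            yield json.loads(line)


def classify(message: dict) -> str:
    """Tell requests, responses, and notifications apart by their keys."""
    has_id = "id" in message
    has_method = "method" in message
    if has_id and has_method:
        return "request"       # expects a reply; may come from either side
    if has_id:
        return "response"      # carries "result" or "error"
    if has_method:
        return "notification"  # fire-and-forget
    raise ValueError(f"not a protocol message: {message!r}")
```

A reader loop would call `classify` on every decoded line and route requests to an approval handler, responses to the pending-call table, and notifications to an event queue.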
Mandatory first exchange (the `initialize` handshake):
// Client →
{"id": 1, "method": "initialize", "params": {
  "clientInfo": {"name": "your-app", "title": "Your App", "version": "1.0.0"},
  "capabilities": {
    "experimentalApi": false,
    "optOutNotificationMethods": [
      "item/agentMessage/delta",
      "item/reasoning/summaryTextDelta",
      "item/reasoning/summaryPartAdded",
      "item/reasoning/textDelta"
    ]
  }
}}
// Server → response with server capabilities + version
// Then client →
{"method": "initialized", "params": {}}

Until the handshake completes the server rejects every other request. `optOutNotificationMethods` lets you suppress noisy delta channels, which is useful for batch jobs that only want final results.
experimentalApi: true enables the gated surfaces (dynamic tools, ChatGPT token management, future-flagged features). Don't flip it unless you need them; experimental APIs may break between Codex versions.
The exact parameter and result shapes for the version of Codex you have installed are reproducible:
codex app-server generate-ts            # TypeScript types
codex app-server generate-json-schema   # JSON Schema bundle

Use these to codegen typed clients. Schemas drift between CLI versions, so regenerate when you upgrade `codex`.
| Method | Purpose |
|---|---|
| `thread/start` | Begin a new agent session. Configure model, reasoning effort, sandbox policy, output schema, dynamic tools, base instructions, personality, working dir, MCP overrides. Returns a `threadId`. |
| `thread/resume` | Reopen a stored thread by id. Full history is restored server-side from the rollout JSONL (see §6). Optionally override params for the resumed session. |
| `thread/fork` | Branch an existing thread to explore a different path without disturbing the original. |
| `thread/list` | Paginated history; filter by model provider, source kind, archive state, working dir, free-text search. |
| `thread/rollback` | Drop the last N turns from in-memory context. A rollback marker is appended to the thread's persisted JSONL log. |
| `thread/name/set` | Rename a thread. |
| `thread/archive` | Archive (or unarchive) a thread; filtered out of the default `thread/list` view. |
| `thread/backgroundTerminals/*` | Manage long-lived shell sessions inside the sandbox. Experimental. |
| `thread/compact` | Compact history (used by the official Python SDK's `thread.compact()`). |
| Method | Purpose |
|---|---|
| `turn/start` | Submit user input (text, images, file paths) and run the agent. Per-turn overrides: model, reasoning effort, personality, sandbox policy, `output_schema` (structured output), `dynamicTools` (experimental), skill inputs. Returns a `turnId`. |
| `turn/steer` | Append additional input to an in-flight turn without cancelling. |
| `turn/interrupt` | Abort an in-flight turn cleanly. The companion's `/codex:cancel` is built on this plus tree-killing the broker process. |
Sandbox policy values seen in clients: read-only, workspace-write, danger-full-access. Reasoning effort: none, minimal, low, medium, high, xhigh (verified from codex-companion.mjs's VALID_REASONING_EFFORTS).
| Method | Purpose |
|---|---|
| `review/start` | Invoke Codex's built-in code reviewer. Targets: uncommitted working tree (`{type:"uncommittedChanges"}`), base branch diff (`{type:"baseBranch", branch:"main"}`), specific commit ranges, or custom targets. Optionally fork into a detached review thread. |
This is what /codex:review in the JS companion calls; the more flexible /codex:adversarial-review instead runs a turn/start against a structured-output schema (schemas/review-output.schema.json in the plugin) so you get parseable JSON.
| Method | Purpose |
|---|---|
| `mcpServer/tool/call` | Invoke a tool exposed by an MCP server configured for the thread. |
| Method | Purpose |
|---|---|
| `account/read` | Current ChatGPT/API auth state, refresh token controls. |
| `config/read` | Effective resolved configuration (model defaults, sandbox defaults, MCP servers, paths). Useful for diagnostics. |
The server can send requests to your client. Most importantly, approvals:
- `execCommandApproval`: server wants to run a shell command and asks for permission.
- `applyPatchApproval`: server wants to write or modify a file.
- (and others under the same approval umbrella)
Your client responds with accept / decline / cancel. This is how Codex enforces sandboxing with a human-in-the-loop. If you set ApprovalsReviewer to auto_review or guardian, Codex resolves these automatically; set it to user and you'll receive every approval request and must answer.
This is also why the protocol is bidirectional: every client implementation must handle inbound id+method messages, not just inbound responses.
The server streams notifications during turn execution. Subscribe by not opting them out at initialize. Common methods:
| Notification | Meaning |
|---|---|
| `turn/started` | New turn begins. |
| `turn/completed` | Turn finished (success or failure). |
| `item/started` | A new content item (message, reasoning, tool call) starts. |
| `item/completed` | A content item finished. |
| `item/agentMessage/delta` | Streaming agent message text. |
| `item/reasoning/textDelta` | Streaming reasoning text. |
| `item/reasoning/summaryTextDelta` | Streaming reasoning summary. |
| `item/reasoning/summaryPartAdded` | New reasoning summary part. |
| (command output deltas, tool call events, etc.) | Various per-tool streams. |
Opt out of high-volume deltas (the four above) for batch / CI jobs. Keep the lifecycle events (turn/*, item/started, item/completed) — those are how you know when the turn is done and what was produced.
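For a batch job, the lifecycle events alone are enough to assemble a result. The sketch below consumes notifications until `turn/completed` and keeps whatever `item/completed` delivered. It assumes the finished item sits under `params["item"]`, which is a guess to check against your generated schema; `collect_turn` is a hypothetical helper name.

```python
from typing import Iterable


def collect_turn(notifications: Iterable[dict]) -> dict:
    """Gather completed items from a notification stream until the turn ends."""
    items = []
    for note in notifications:
        method = note.get("method")
        params = note.get("params") or {}
        if method == "item/completed":
            # Assumption: the item payload is nested under "item".
            # Fall back to the whole params dict if it is not.
            items.append(params.get("item", params))
        elif method == "turn/completed":
            return {"completed": True, "items": items, "turn": params}
    # The stream ended (process exit, timeout) before the turn finished.
    return {"completed": False, "items": items, "turn": None}
```

Feeding it a blocking iterator over the reader thread's queue gives a simple "run turn, wait, return items" call without touching the delta channels.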
These all live as files Codex reads rather than RPC methods you call. They apply to every client of the app-server (TUI, codex exec, your custom Python script) automatically.
Six lifecycle events, configured in config.toml (inline [hooks] table) or sibling hooks.json files at any config layer.
| Event | Tool matcher | Can block? |
|---|---|---|
| `SessionStart` | `source` (startup/resume/clear) | Yes |
| `PreToolUse` | `tool_name` (Bash, apply_patch, MCP) | Yes |
| `PermissionRequest` | `tool_name` | Yes (allow / deny / abstain) |
| `PostToolUse` | `tool_name` | No |
| `UserPromptSubmit` | – | Yes |
| `Stop` | – | No |
JSON payload to your hook script: session_id, transcript_path, cwd, hook_event_name, model (+ turn_id for turn-scoped events). Hook output keys: continue, stopReason, systemMessage, suppressOutput. Default timeout: 600s. Multiple matching hooks run concurrently.
Older simpler mechanism: notify = ["/bin/bash", "/path/to/notify.sh"] in config.toml — fires on agent-turn-complete only.
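A hook is just a program that receives that JSON payload and emits the output keys listed above. The following sketch blocks tool use when the session's `cwd` is outside an allowed root. It relies only on the documented payload keys `hook_event_name` and `cwd`; delivery on stdin, the reply on stdout, `continue: false` acting as the block signal for `PreToolUse`, and the `ALLOWED_ROOT` path are all assumptions to verify against the Hooks documentation.

```python
import json
import os
import sys

ALLOWED_ROOT = "/srv/repos"  # hypothetical workspace root, adjust to taste


def decide(payload: dict, allowed_root: str = ALLOWED_ROOT) -> dict:
    """Return the hook output for one event payload."""
    if payload.get("hook_event_name") != "PreToolUse":
        return {"continue": True}
    root = os.path.normpath(allowed_root)
    cwd = os.path.normpath(payload.get("cwd", ""))
    try:
        inside = os.path.commonpath([root, cwd]) == root
    except ValueError:
        # Raised when absolute and relative paths are mixed (e.g. missing cwd).
        inside = False
    if inside:
        return {"continue": True}
    return {
        "continue": False,
        "stopReason": f"cwd {cwd!r} is outside {root!r}",
    }


def main() -> None:
    """Entry point for the hook script: payload on stdin, decision on stdout (assumed)."""
    json.dump(decide(json.load(sys.stdin)), sys.stdout)
```

Call `main()` at the bottom of the script file you register as the hook command; keeping `decide` separate makes the policy unit-testable without spawning Codex.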
TOML files at ~/.codex/agents/*.toml (personal) or .codex/agents/*.toml (project).
name = "tester"
description = "Runs and explains the test suite for any change"
developer_instructions = "..."
model = "gpt-5.4-codex"
model_reasoning_effort = "high"
sandbox_mode = "workspace-write"
# mcp_servers = [...] / skills.config = {...}

Built-in templates (`default`, `worker`, `explorer`) can be overridden by name. Codex spawns subagents only on explicit user request (e.g. "spawn one agent per point") or via the `/agent` slash command in the TUI. Each subagent runs as its own thread; approvals from inactive agent threads bubble up to the active UI, labelled by source thread.
Not exposed via dedicated app-server RPC. To replicate the UX from a custom client, start additional thread/start calls yourself with the right config layer pointed at the agent file.
Open standard for packaging reusable agent workflows. Discovery scopes (lowest → highest priority):
- System (bundled with Codex by OpenAI).
- Admin: `/etc/codex/skills`.
- User: `$HOME/.agents/skills`.
- Repo: `.agents/skills` in CWD or repo root.
Directory layout:
my-skill/
├── SKILL.md # required, YAML frontmatter (name, description) + instructions
├── scripts/ # optional executable helpers
├── references/ # optional docs the skill can read
├── assets/ # optional binary/static
└── agents/openai.yaml # optional Codex-specific UI/invocation/tool-deps
Progressive disclosure: Codex injects only metadata (name, description, path) at session start, capped at ~8KB total. Full SKILL.md is read only when the skill is selected. Invoked explicitly via /skills or $skill-name mention, or implicitly when prompts match the skill description.
Over the wire: per-turn skill activation is exposed as skill_inputs on turn/start. Skill discovery is filesystem-only.
Configured globally in config.toml:
[mcp_servers.linear]
command = "npx"
args = ["@modelcontextprotocol/server-linear"]
env = { LINEAR_API_KEY = "${env:LINEAR_API_KEY}" }
startup_timeout_sec = 30
tool_timeout_sec = 120
enabled = true
required = false
enabled_tools = ["list_issues", "get_issue"]
# disabled_tools = [...]
[mcp_servers.streamable_example]
url = "https://example.com/mcp"
bearer_token_env_var = "EXAMPLE_TOKEN"
# OAuth: codex mcp login <server-name>

Two server kinds: STDIO (`command`, `args`, `env`, `cwd`) and Streamable HTTP (`url`, `bearer_token_env_var` or OAuth via `codex mcp login`).
App-server methods that touch MCP:
- `mcpServer/tool/call`: invoke a tool against the thread's configured MCP server.
- `dynamicTools` parameter on `thread/start`: runtime tool registration (gated behind `capabilities.experimentalApi=true`). Persisted in the rollout metadata; restored on `thread/resume` if not re-supplied.
References:
- Model Context Protocol – Codex
- How to use MCPs with Codex – Composio
- Codex CLI Plugin System (skills + MCP + connectors)
Distribution wrapper that bundles: skills + MCP server configs + app mappings + presentation assets. Skills are the authoring format; plugins are the packaging. The Claude Code "Codex companion" you're integrating with is itself a plugin (lives under ~/.claude/plugins/cache/openai-codex/codex/<version>/).
Codex reads AGENTS.md (or AGENTS.override.md) before doing any work, layering global → project guidance. Same role as CLAUDE.md for Claude Code, or a README-for-agents.
Two completely separate state stores: Codex core state (everything the agent knows) and companion / harness state (the JS plugin's per-job tracking, optional).
~/.codex/
auth.json # ChatGPT/API credentials
config.toml # user config (model, sandbox, MCP servers, hooks, notify)
history.jsonl # global prompt-history log (every user message ever sent)
session_index.jsonl # index over the rollouts below
sessions/ # one JSONL "rollout" per thread — the durable thread state
YYYY/MM/DD/
rollout-<ISO>-<thread-id>.jsonl
archived_sessions/ # archived threads, same format
state_5.sqlite # newer indexes/metadata (sqlite alongside the JSONL)
logs_2.sqlite # internal telemetry/logs
memories/, prompts/, rules/, skills/, plugins/, …
cache/, models_cache.json, generated_images/, shell_snapshots/
.codex-global-state.json # global runtime state
Rollout files are newline-delimited JSON, one event per line. The first line is `session_meta` (id, cwd, model, base_instructions, originator, cli_version), followed by a stream of turn/item events in the same shape you'd see over the wire. Example first line:
{"timestamp":"2026-04-24T12:35:48.797Z","type":"session_meta",
 "payload":{"id":"019dbf7d-...","cwd":"/path/to/repo",
            "originator":"codex-tui","cli_version":"0.124.0",
            "model_provider":"openai", ...}}

So `~/.codex/sessions/.../rollout-*.jsonl` is the durable, replayable representation of a thread. `thread/resume` reads from there. `thread/list` paginates `session_index.jsonl`. To grep all your past sessions: `rg <term> ~/.codex/sessions/`.
The Claude Code Codex companion uses its own per-workspace state:
$CLAUDE_PLUGIN_DATA/state/<slug>-<sha256(realpath)[:16]>/
state.json # { version, config:{stopReviewGate}, jobs:[…] } — index, capped at 50
broker.json # { endpoint, pidFile, logFile, sessionDir, pid } for the UDS broker
jobs/
<job-id>.json # full per-job record (request, payload, threadId, turnId, status…)
<job-id>.log # plain-text progress log (NOT JSONL)
Workspace-keyed by <basename>-<sha256(realpath)[:16]>. 50-job cap; older job files garbage-collected on saveState. Locate it with:
echo "${CLAUDE_PLUGIN_DATA:-$TMPDIR/codex-companion}/state"

Your own client can ignore all of this; only the JS companion uses it. The rich agent transcripts you actually want are always in `~/.codex/sessions/`.
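If you want the same per-workspace keying for your own harness state, the `<basename>-<sha256(realpath)[:16]>` scheme is easy to reproduce. The sketch below is one reading of that description, not the companion's actual code: it assumes the digest is taken over the UTF-8 bytes of the resolved path and that the slug is the unmodified basename, and `workspace_state_dir` is a hypothetical name.

```python
import hashlib
import os


def workspace_state_dir(workspace: str, data_root: str) -> str:
    """Derive a stable per-workspace state directory under data_root/state."""
    real = os.path.realpath(workspace)  # resolve symlinks so aliases share state
    digest = hashlib.sha256(real.encode("utf-8")).hexdigest()[:16]
    slug = os.path.basename(real) or "workspace"  # "/" has an empty basename
    return os.path.join(data_root, "state", f"{slug}-{digest}")
```

Hashing the resolved path keeps two checkouts with the same directory name from colliding, while the readable slug prefix keeps the state directory easy to find by eye.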
Package: openai-codex-app-server-sdk (in openai/codex repo at sdk/python).
pip install openai-codex-app-server-sdk
# bundles `openai-codex-cli-bin` matching the SDK version

from codex_app_server import Codex

with Codex() as codex:
    thread = codex.thread_start(model="gpt-5")
    result = thread.run("Say hello in one sentence.")
    print(result.final_response)
    print(result.items)

Configure the binary location explicitly when running against a non-bundled CLI:

from codex_app_server import Codex, AppServerConfig

with Codex(config=AppServerConfig(codex_bin="/usr/local/bin/codex")) as codex:
    ...

High-level API exposed:
- `Codex()` / `AsyncCodex()`: process lifecycle, initialize handshake, context-managed shutdown.
- `Thread`: `start`, `resume`, `fork`, `run`, `compact`.
- `TurnHandle`: streaming events, `steer`, `interrupt`.
- Per-turn config: `model`, `effort` (low/medium/high), `output_schema` (structured output), image inputs (`ImageInput`, `LocalImageInput`), `SandboxPolicy`, `ApprovalsReviewer` (user/auto_review/guardian), `skill_inputs`, custom personality, `developer_instructions`.
- Typed Pydantic notification models with snake_case ↔ camelCase translation.
- Retry helper: `codex_app_server.retry.retry_on_overload` for `-32001` busy responses.
- Low-level `request(...)` escape hatch.
Not (yet) exposed in the high-level wrappers:
- WebSocket transport — stdio only.
- `thread/list`, `thread/rollback`, `thread/name/set`, `thread/archive`.
- `mcpServer/tool/call`.
- `dynamicTools` runtime tool registration.
- `thread/backgroundTerminals/*`.
- `account/read`, `config/read` introspection.
- `review/start` as first-class; run reviews as turns instead.
- Per-action approval callbacks (you can configure the `ApprovalsReviewer` mode but not intercept individual `execCommandApproval` / `applyPatchApproval` requests with custom logic).
For everything missing, drop to the request(...) low-level method — the connection and Pydantic models are still useful.
Disclaimers from the README:
- "Experimental Python SDK for `codex app-server` JSON-RPC v2 over stdio."
- `result.final_response` is `None` if a turn ends without a final-answer message.
- Schema is bundled per CLI version; a mismatched binary will surface as Pydantic validation errors.
- The repo no longer ships codex binaries inside `sdk/python`; set `codex_bin` or rely on the pinned `openai-codex-cli-bin` wheel.
References:
- SDK – Codex | OpenAI Developers
- codex/sdk/python on GitHub
- Python SDK – DeepWiki for openai/codex
- Codex app-server python SDK (experimental) – community thread
Async-only Python client by Mariusz Woloszyn. Supports both stdio and WebSocket. Requires Python ≥ 3.12.
uv add codex-app-server-sdk
# or
pip install codex-app-server-sdk

import asyncio
from codex_app_server_sdk import CodexClient

async def main() -> None:
    async with CodexClient.connect_stdio() as client:
        result = await client.chat_once("Hello from Python")
        print(result.final_text)

asyncio.run(main())

Helpers: `chat_once(...)` (one-shot), `chat(...)` (step-streaming), thread/turn lifecycle handling, plus low-level `request(...)`.
`codex-sdk-python` (on PyPI) supports both the `codex exec` JSONL path and the persistent app-server JSON-RPC path, and exposes typed structured results for each.
- `codex_sdk` by `nshkrdotcom`: full Elixir SDK. (GitHub)
- `ExMCP.ACP.Adapters.Codex`: exposes Codex app-server as an ACP adapter inside the `ex_mcp` ecosystem.
The Claude Code Codex plugin's codex-companion.mjs is a complete, real-world JSON-RPC client written in plain Node (no SDK dependency). Worth reading as a worked example. See §10 for a tour.
This shows the wire format. Use the official SDK or codex-app-server-sdk for production — they handle retries, typed models, async streaming, and broker reuse.
import itertools
import json
import queue
import subprocess
import threading


class CodexClient:
    def __init__(self, cwd="."):
        # stderr is inherited rather than piped: an undrained pipe can fill up
        # and stall the server, and the diagnostics stay visible this way.
        self.proc = subprocess.Popen(
            ["codex", "app-server"],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered
            cwd=cwd,
        )
        self._ids = itertools.count(1)
        self._pending = {}
        self._notifications = queue.Queue()
        # The reader thread and callers both write to stdin.
        self._write_lock = threading.Lock()
        self._reader = threading.Thread(target=self._read_loop, daemon=True)
        self._reader.start()

    def _read_loop(self):
        for line in self.proc.stdout:
            line = line.strip()
            if not line:
                continue
            msg = json.loads(line)
            if "id" in msg and "method" not in msg:
                # Response to one of our requests
                fut = self._pending.pop(msg["id"], None)
                if fut is not None:
                    fut.put(msg)
            elif "id" in msg and "method" in msg:
                # Server → client request (e.g. approval).
                # Reply with method-not-found until you wire approvals.
                self._send({
                    "id": msg["id"],
                    "error": {"code": -32601,
                              "message": f"Unsupported server request: {msg['method']}"},
                })
            elif "method" in msg:
                # Notification (deltas, lifecycle events)
                self._notifications.put(msg)

    def _send(self, message):
        line = json.dumps(message) + "\n"
        with self._write_lock:
            self.proc.stdin.write(line)
            self.proc.stdin.flush()

    def request(self, method, params=None, timeout=120):
        i = next(self._ids)
        fut = queue.Queue(maxsize=1)
        self._pending[i] = fut
        self._send({"id": i, "method": method, "params": params or {}})
        try:
            msg = fut.get(timeout=timeout)
        except queue.Empty:
            self._pending.pop(i, None)  # don't leak the slot on timeout
            raise TimeoutError(f"{method} timed out after {timeout}s") from None
        if "error" in msg:
            raise RuntimeError(f"{method} failed: {msg['error']}")
        return msg.get("result", {})

    def notify(self, method, params=None):
        self._send({"method": method, "params": params or {}})

    def initialize(self, name="my-client", version="0.0.1"):
        self.request("initialize", {
            "clientInfo": {"name": name, "title": name, "version": version},
            "capabilities": {
                "experimentalApi": False,
                "optOutNotificationMethods": [
                    "item/agentMessage/delta",
                    "item/reasoning/summaryTextDelta",
                    "item/reasoning/summaryPartAdded",
                    "item/reasoning/textDelta",
                ],
            },
        })
        self.notify("initialized", {})

    def wait_for(self, method, timeout=600):
        # Yield notifications as they arrive, stopping after `method` is seen.
        while True:
            note = self._notifications.get(timeout=timeout)
            yield note
            if note["method"] == method:
                return

    def drain_notifications(self):
        while not self._notifications.empty():
            yield self._notifications.get_nowait()

    def close(self):
        try:
            self.proc.stdin.close()
        finally:
            self.proc.terminate()
            self.proc.wait(timeout=5)


if __name__ == "__main__":
    client = CodexClient()
    try:
        client.initialize()
        thread = client.request("thread/start", {"cwd": ".", "model": "gpt-5"})
        thread_id = thread["threadId"]
        client.request("turn/start", {
            "threadId": thread_id,
            "input": [{"type": "text", "text": "Say hello in one sentence."}],
            "sandbox": "read-only",
        })
        # turn/start only hands back the turn id; the content arrives as
        # notifications, so block until the turn reports completion.
        for note in client.wait_for("turn/completed"):
            print("Notif:", note["method"])
            if note["method"] == "item/completed":
                print("Item:", note.get("params"))
    finally:
        client.close()

Real param/result shapes are versioned; generate them with `codex app-server generate-json-schema` for accuracy. Treat the snippet above as a wire-format demonstration.
For real streaming (token-by-token reasoning, multiple parallel threads, or WebSocket), use asyncio + asyncio.subprocess, or jump straight to codex-app-server-sdk (already async).
Don't reject everything with -32601. Implement at minimum:
    def _handle_server_request(self, msg):
        method = msg["method"]
        if method in {"execCommandApproval", "applyPatchApproval"}:
            # Show prompt to user / consult policy / log to dashboard
            decision = "accept"  # or "decline" / "cancel"
            self._send({"id": msg["id"], "result": {"decision": decision}})
        else:
            self._send({
                "id": msg["id"],
                "error": {"code": -32601, "message": f"Unsupported: {method}"},
            })

Or set `ApprovalsReviewer = "auto_review"` / `"guardian"` on `thread/start` to let Codex auto-resolve them based on its built-in heuristics, and only intercept when you want custom UX.
The Claude Code Codex plugin's scripts/codex-companion.mjs (under ~/.claude/plugins/cache/openai-codex/codex/<version>/) is a battle-tested example client. Worth understanding when designing your own.
Two transports inside one client (scripts/lib/app-server.mjs):
- Direct: spawn `codex app-server` per command and communicate over stdio JSONL. Simple, but pays full startup cost on every invocation.
- Broker: a long-lived daemon (`app-server-broker.mjs`) is spawned once per workspace. It hosts a persistent `codex app-server` and listens on a Unix domain socket. Subsequent companion invocations connect to the socket via `net.createConnection({ path })` instead of spawning a new server. Endpoint, pid, log file, and session dir are tracked in `<companion-state>/broker.json`.
Why this matters: starting codex app-server involves loading the Codex binary, reading config, attaching MCP servers, and warming caches. The broker amortizes that cost across all subsequent foreground/background companion calls.
Grep result from scripts/lib/codex.mjs:
client.request("thread/start", ...)
client.request("thread/resume", ...)
client.request("thread/name/set", ...)
client.request("thread/list", ...)
client.request("turn/start", ...)
client.request("turn/interrupt", ...)
client.request("review/start", ...)
client.request("account/read", ...)
client.request("config/read", ...)
So the companion is a comprehensive consumer of the protocol — broader than the official Python SDK's high-level surface — but doesn't touch mcpServer/tool/call, dynamicTools, thread/fork, thread/rollback, or thread/archive.
/codex:task in foreground vs background:
- Foreground: opens client → runs turn synchronously → renders result → exits. State is persisted to `jobs/<job-id>.json` and `.log` for `/codex:status` retrospection.
- Background: the parent spawns a detached `task-worker` subprocess (same script, different subcommand), records `queued` state, and returns immediately. The worker reads the job record, runs the same turn, and updates state. `/codex:cancel` then sends `turn/interrupt` to the broker and tree-kills the worker pid.
This pattern is straightforward to port: detached subprocess + shared JSON state file + turn/interrupt for cancellation = a complete background-task system over the protocol.
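The shared-state half of that pattern can be sketched as a small job store. This is an illustrative layout modelled on the `jobs/<job-id>.json` files described earlier, not the companion's implementation; the function names and the `updatedAt` field are made up. Writes go through a temp file and `os.replace` so a status reader never sees a half-written record.

```python
import json
import os
import tempfile
import time


def save_job(state_dir: str, job: dict) -> str:
    """Atomically write one job record to <state_dir>/jobs/<id>.json."""
    jobs_dir = os.path.join(state_dir, "jobs")
    os.makedirs(jobs_dir, exist_ok=True)
    target = os.path.join(jobs_dir, f"{job['id']}.json")
    fd, tmp = tempfile.mkstemp(dir=jobs_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(job, fh)
    os.replace(tmp, target)  # atomic rename within the same directory
    return target


def load_job(state_dir: str, job_id: str) -> dict:
    """Read a job record back from disk."""
    path = os.path.join(state_dir, "jobs", f"{job_id}.json")
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def transition(state_dir: str, job_id: str, status: str, **fields) -> dict:
    """Move a job to a new status and persist any extra fields."""
    job = load_job(state_dir, job_id)
    job.update(fields)
    job["status"] = status
    job["updatedAt"] = time.time()
    save_job(state_dir, job)
    return job
```

The parent would `save_job` with status `queued` before spawning the worker; the worker then calls `transition` with `running` (recording `threadId` and `turnId` so a canceller can send `turn/interrupt`) and finally `completed` or `failed`.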
Optional config stopReviewGate: true makes Codex review your turn before letting Claude Code "stop". Implemented as a turn/start against the adversarial-review prompt + structured-output schema. A practical demonstration of how to layer custom workflows on top of the protocol with zero protocol changes.
Mapping:
| Companion behavior | Python implementation |
|---|---|
| Hooks, MCP, skills, AGENTS.md, subagents, notify | Files in `~/.codex/` / `.codex/` / `.agents/`. Nothing to do; applies automatically. |
| `/codex:task` foreground | `Codex.thread_start(...)` → `thread.run("...")` |
| `/codex:task --background` | Detached `multiprocessing.Process` running `thread.run(...)`; write status JSON to your own state dir. |
| `/codex:task --resume-last` | Track the latest `thread_id` in your state file, then `Codex.thread_resume(thread_id)`. |
| `/codex:cancel` | Drop to `client.request("turn/interrupt", {threadId, turnId})`. |
| `/codex:review` | Use the official SDK's `output_schema` parameter against an adversarial-review prompt, or drop to `client.request("review/start", {...})`. |
| `/codex:status` (across sessions) | `client.request("thread/list", {...})` (drop to low-level). |
| Broker daemon (session reuse across CLI invocations) | Either (a) keep one Python process alive (FastAPI / asyncio app), (b) skip and pay the cold-start cost, or (c) build your own UDS daemon mirroring `app-server-broker.mjs`. |
from codex_app_server import Codex
import json, sys

with open("review-output.schema.json") as fh:
    OUTPUT_SCHEMA = json.load(fh)

with Codex() as codex:
    thread = codex.thread_start(
        model="gpt-5",
        sandbox_policy={"mode": "read-only"},
        approvals_reviewer="auto_review",
    )
    result = thread.run(
        "Review the staged diff. Flag bugs, security issues, missing tests. "
        "Respond against the provided schema.",
        output_schema=OUTPUT_SCHEMA,
    )
    # final_response is None when the turn ends without a final-answer message.
    if result.final_response is None:
        sys.exit("review turn produced no final response")
    parsed = json.loads(result.final_response)
    if parsed["severity"] in {"high", "critical"}:
        sys.exit(1)

Wire this to your CI's PR diff and post the structured output as a review comment.
Fork the parent thread into N children, run them in parallel, compare outputs:
import asyncio
from codex_app_server import AsyncCodex

async def variant(codex, parent_id, prompt, model, effort):
    child = await codex.thread_fork(parent_id)
    return await child.run(prompt, model=model, effort=effort)

async def main():
    async with AsyncCodex() as codex:
        parent = await codex.thread_start(...)
        results = await asyncio.gather(
            variant(codex, parent.thread_id, "Solve this", "gpt-5", "low"),
            variant(codex, parent.thread_id, "Solve this", "gpt-5", "high"),
            variant(codex, parent.thread_id, "Solve this", "gpt-5.4-codex", "medium"),
        )
        # rank by self-grading or external rubric

This is how you'd build a "best-of-N" agent or replicate the subagent UX without invoking the TUI's `/agent` machinery.
Set approvals_reviewer="user" so every action requires explicit approval. Intercept the bidirectional execCommandApproval / applyPatchApproval server-to-client requests, route them to your UI, return the user's decision. The official Python SDK doesn't surface this directly — drop to the third-party codex-app-server-sdk or write your own client (§9).
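One way to keep the UI decoupled from the transport is a pure function that turns an inbound server request into the reply message, delegating the decision to a callback. The sketch below reuses the `{"decision": ...}` reply shape from the handler in §9; `answer_server_request` and the `ask_user` callback signature are hypothetical, and the request `params` are passed through untouched because their shape is version-specific.

```python
from typing import Callable

APPROVAL_METHODS = {"execCommandApproval", "applyPatchApproval"}
VALID_DECISIONS = {"accept", "decline", "cancel"}


def answer_server_request(msg: dict, ask_user: Callable[[str, dict], str]) -> dict:
    """Build the reply for one server-to-client request."""
    method = msg["method"]
    if method not in APPROVAL_METHODS:
        return {
            "id": msg["id"],
            "error": {"code": -32601, "message": f"Unsupported: {method}"},
        }
    try:
        decision = ask_user(method, msg.get("params") or {})
    except Exception:
        decision = "decline"  # fail closed if the UI layer breaks
    if decision not in VALID_DECISIONS:
        decision = "decline"  # never forward an unknown answer
    return {"id": msg["id"], "result": {"decision": decision}}
```

In a web dashboard, `ask_user` would block on a per-request future that the HTTP handler resolves when the user clicks a button; failing closed means a crashed or timed-out UI declines rather than approves.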
thread/resume works across days/weeks. Persist thread_id to your DB; pick up where you left off:
with Codex() as codex:
    thread = codex.thread_resume(stored_thread_id)
    result = thread.run("Pick up where we left off. Next step: …")

The full rollout JSONL is on disk under `~/.codex/sessions/`, so you can also grep, replay, or export it.
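Reading a rollout back takes only the standard library. The sketch below lists rollout files under a sessions root and parses the `session_meta` first line described in §6. The `YYYY/MM/DD/rollout-*.jsonl` glob follows the directory layout shown there, the helper names are made up, and the payload fields may differ between CLI versions.

```python
import json
from pathlib import Path


def iter_rollouts(sessions_root: str) -> list:
    """Return rollout files under <root>/YYYY/MM/DD/, sorted by path."""
    root = Path(sessions_root).expanduser()
    return sorted(root.glob("*/*/*/rollout-*.jsonl"))


def read_session_meta(rollout_path) -> dict:
    """Parse the first line of a rollout and return its session_meta payload."""
    with open(rollout_path, encoding="utf-8") as fh:
        first = fh.readline()
    record = json.loads(first)
    if record.get("type") != "session_meta":
        raise ValueError(f"{rollout_path}: first record is {record.get('type')!r}")
    return record["payload"]
```

Mapping `read_session_meta(p)["id"]` over `iter_rollouts("~/.codex/sessions")` gives a quick thread-id index for export scripts without going through `thread/list`.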
- Bundle skills under `.agents/skills/<my-skill>/SKILL.md` in your repo.
- Wire MCP servers in `.codex/config.toml` (`[mcp_servers.<name>]`).
- Optionally bundle as a Codex plugin for distribution.
- Drive turns from your Python service; skills + MCP tools are picked up automatically.
- Schema versioning: `app-server` schemas drift with CLI versions. Pin `openai-codex-cli-bin` (or whatever your distro mechanism is) and regenerate types via `generate-ts` / `generate-json-schema` on upgrade.
- Experimental surfaces: WebSocket transport, dynamic tools, ChatGPT token management, and background terminals are gated behind `experimentalApi` or explicitly marked unstable. Do not depend on these for production unless you accept regular breaking changes.
- Busy responses: RPC error code `-32001` means the server (or broker) is overloaded; back off and retry. The official Python SDK ships `codex_app_server.retry.retry_on_overload` for exactly this.
- Bidirectional protocol: every client must handle inbound `id`+`method` messages. Replying `-32601` to everything will work for trivial cases but fails the moment Codex needs an approval.
- Stateful, not REST: one process per session (or per workspace if you broker). Don't try to serve multi-tenant workloads with a single shared `app-server` instance; that's not its model.
- `final_response` may be null: a turn that ends without an agent message (tool-call-only, errored, etc.) returns `final_response = None`. Inspect `result.items` for the actual content.
- Approvals reviewer modes: `user` (everything asks), `auto_review` (Codex decides based on built-in heuristics), `guardian` (third option, stricter auto). Pick deliberately; defaults are conservative for a reason.
- Sandbox + workspace: sandbox policy applies per turn, but the underlying file changes are real. Use `read-only` for review/triage, `workspace-write` for edits, and `danger-full-access` only when you've thought hard about it.
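If you are on a hand-rolled client rather than the official SDK, the busy-response handling can be sketched as a small wrapper with exponential backoff and jitter. This is a generic illustration, not the SDK's `retry_on_overload`; `RpcError` and `call_with_backoff` are hypothetical names, and the delays are arbitrary defaults.

```python
import random
import time

BUSY_CODE = -32001  # "server or broker is busy"


class RpcError(Exception):
    """Raised by a client when a response carries an error object."""

    def __init__(self, code: int, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code


def call_with_backoff(send, attempts=5, base_delay=0.25, max_delay=8.0, sleep=time.sleep):
    """Call send() and retry only on busy errors, doubling the delay each time."""
    for attempt in range(attempts):
        try:
            return send()
        except RpcError as exc:
            last_try = attempt == attempts - 1
            if exc.code != BUSY_CODE or last_try:
                raise  # other errors, or retries exhausted
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter keeps parallel workers from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

Wrap individual requests, for example `call_with_backoff(lambda: client.request("turn/start", params))`, with a client whose `request` raises `RpcError`; wrapping a whole multi-step workflow would repeat steps that already succeeded.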
- App Server – Codex
- SDK – Codex
- CLI – Codex
- Command line options – Codex CLI reference
- Non-interactive mode (`codex exec`)
- Hooks – Codex
- Subagents – Codex
- Agent Skills – Codex
- Model Context Protocol – Codex
- Customization – Codex
- Configuration Reference
- Advanced Configuration
- Custom instructions with AGENTS.md
- Use Codex with the Agents SDK
- Best practices – Codex
- Changelog – Codex
- Features – Codex CLI
- openai/codex (monorepo)
- codex/codex-rs/app-server (Rust impl)
- codex/codex-rs/app-server/README.md
- codex/sdk/python (official Python SDK)
- codex/docs/config.md
- codex/docs/agents_md.md
- Unlocking the Codex harness: how we built the App Server (OpenAI blog)
- Python SDK – DeepWiki for openai/codex
- Hook would be a great feature (discussion #2150)
- Codex app-server python SDK (experimental) – community
- `codex-app-server-sdk` on PyPI (Python, async, stdio + ws)
- `codex-app-server-sdk` 0.3.2 on Libraries.io
- `codex-sdk-python` on PyPI
- `codex_sdk` (Elixir)
- Architecture Guide — Codex SDK v0.14.0 (Elixir)
- `ExMCP.ACP.Adapters.Codex` (Elixir, ACP adapter)
- Codex CLI Plugin System: Bundling Skills, MCP Servers, and App Connectors
- The Codex Python SDK: Embedding Agents in Python Applications
- How to use MCPs with Codex (CLI, IDE, App) – Composio
- How to integrate Make MCP with Codex – Composio
- Awesome Codex Skills – ComposioHQ
- File-based sub-agents for Codex CLI (Hacker News)
- `betterup/codex-cli-subagents`
- `shakacode/claude-code-with-codex.md` (interop notes)
- agents.md spec
- Setup Codex CLI notifications on macOS – samwize
- Play sound on macOS when Codex completes a task – Roger Lee
- Mintlify Codex CLI overview
Distilled from a working session reading codex-companion.mjs + searching the public docs and ecosystem. Verify shapes against your installed CLI version with codex app-server generate-json-schema.