Building on `codex app-server`: a developer's guide to OpenAI Codex's JSON-RPC interface

Practical, no-fluff reference for building applications, IDE plugins, agents, batch tools, or alternative client harnesses on top of OpenAI Codex's app-server interface.

This guide distills:

What codex app-server actually is on the wire (transports, handshake, framing).
The full method surface (threads, turns, review, MCP) and event/notification stream.
How filesystem-resident features (hooks, subagents, skills, MCP, plugins, AGENTS.md) interact with the protocol.
Where state lives on disk (Codex core state vs. companion state).
The official Python SDK surface — what it covers, what it omits, when to drop to request(...).
Third-party clients in Python and Elixir.
A reference architecture: how the official Claude Code "Codex companion" plugin (codex-companion.mjs) drives the protocol, including the broker session-reuse trick.
A minimum-viable Python client to copy-paste.
Practical recipes (CI reviewers, multi-agent fan-out, approval UIs).

All sources linked inline and again at the end.

1. What is `codex app-server`?

codex app-server is a stateful, long-lived process that hosts Codex agent threads and exposes them over JSON-RPC 2.0. It's the same backend that powers OpenAI's Codex VS Code extension and JetBrains plugin, and that custom integrations (the openai/codex Python SDK, the JS Claude Code companion, third-party clients) talk to.

Think of it as Codex's IPC interface: you spawn one process per workspace (or reuse a long-lived one), then drive it with structured RPC instead of scraping codex exec output.

Use it when you need any of:

Streaming reasoning / message / command-output deltas.
Persistent threads (thread/resume, thread/fork).
Bidirectional approvals (server asks your client to allow/deny a tool action).
Structured output (JSON schema-conforming results).
Multiple turns against the same context.
Custom UI/UX (IDE plugin, web dashboard, Slack bot, CI reviewer).

If you just want "run prompt → exit", use codex exec non-interactive mode instead — app-server is overkill for one-shots.

2. Wire protocol

2.1 Transports

Transport	How to start	Status	Framing
stdio (default)	`codex app-server`	Stable	Newline-delimited JSON (one message per line) over stdin/stdout
WebSocket	`codex app-server --listen ws://127.0.0.1:4500`	Experimental, unsupported	One JSON message per WS frame; bounded queues; busy server returns RPC error code `-32001`

Stderr carries human-readable diagnostics; you can mirror it to a log.

2.2 JSON-RPC framing

Standard JSON-RPC 2.0 with the "jsonrpc":"2.0" member omitted on the wire. There are three message kinds (request, response, notification), and requests can flow in either direction:

// Client → Server request
{"id": 1, "method": "thread/start", "params": {...}}

// Server → Client response (success or error)
{"id": 1, "result": {...}}
{"id": 1, "error": {"code": -32601, "message": "Method not found"}}

// Notification (either direction, no id)
{"method": "item/agentMessage/delta", "params": {...}}

// Server → Client request (bidirectional, e.g. for approvals)
{"id": 42, "method": "execCommandApproval", "params": {...}}

One message per \n-terminated line on stdio. JSON-RPC error code -32001 means the broker/server is busy — back off and retry.
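The framing rules above reduce to a check on which of `id` and `method` are present. A minimal sketch of that dispatch decision follows; the function names `classify` and `is_busy` and the kind labels are illustrative, not part of any SDK.

```python
import json


def classify(line: str) -> tuple[str, dict]:
    """Parse one newline-delimited frame and name its kind."""
    msg = json.loads(line)
    has_id = "id" in msg
    has_method = "method" in msg
    if has_id and has_method:
        # Inbound request (e.g. an approval): must be answered with the same id.
        return "server_request", msg
    if has_id:
        # Reply to one of our requests: carries "result" or "error".
        return "response", msg
    if has_method:
        # Fire-and-forget event: no reply expected.
        return "notification", msg
    raise ValueError(f"not a JSON-RPC frame: {line!r}")


def is_busy(msg: dict) -> bool:
    """True when a response carries the busy code described above."""
    return msg.get("error", {}).get("code") == -32001
```

Checking for `id` plus `method` first matters: a client that treats every frame with an `id` as a response will silently drop approval requests.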

2.3 Handshake

Mandatory first exchange:

// Client →
{"id": 1, "method": "initialize", "params": {
  "clientInfo": {"name": "your-app", "title": "Your App", "version": "1.0.0"},
  "capabilities": {
    "experimentalApi": false,
    "optOutNotificationMethods": [
      "item/agentMessage/delta",
      "item/reasoning/summaryTextDelta",
      "item/reasoning/summaryPartAdded",
      "item/reasoning/textDelta"
    ]
  }
}}

// Server → response with server capabilities + version
// Then client →
{"method": "initialized", "params": {}}

Until the handshake completes the server rejects every other request. optOutNotificationMethods lets you suppress noisy delta channels — useful for batch jobs that only want final results.

experimentalApi: true enables the gated surfaces (dynamic tools, ChatGPT token management, future-flagged features). Don't flip it unless you need them; experimental APIs may break between Codex versions.
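As a rough sketch, the two client-side handshake frames can be built by one helper that toggles the delta opt-out. The helper name `handshake_frames` and its `want_deltas` flag are made up for illustration; the payload mirrors the example above.

```python
import json

# The four high-volume delta channels from the initialize example above.
DELTA_METHODS = [
    "item/agentMessage/delta",
    "item/reasoning/summaryTextDelta",
    "item/reasoning/summaryPartAdded",
    "item/reasoning/textDelta",
]


def handshake_frames(name: str, version: str, *, want_deltas: bool = False,
                     request_id: int = 1) -> list[str]:
    """Return the `initialize` request and `initialized` notification as stdio lines."""
    initialize = {
        "id": request_id,
        "method": "initialize",
        "params": {
            "clientInfo": {"name": name, "title": name, "version": version},
            "capabilities": {
                "experimentalApi": False,
                # An empty list means "send me everything".
                "optOutNotificationMethods": [] if want_deltas else list(DELTA_METHODS),
            },
        },
    }
    initialized = {"method": "initialized", "params": {}}
    return [json.dumps(frame) + "\n" for frame in (initialize, initialized)]
```

Write the first line, wait for the response carrying the same `id`, and only then write the second line.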

2.4 Schema generation

The exact parameter and result shapes for the version of Codex you have installed are reproducible:

codex app-server generate-ts            # TypeScript types
codex app-server generate-json-schema   # JSON Schema bundle

Use these to codegen typed clients. Schemas drift between CLI versions — regenerate when you upgrade codex.

3. Method surface

3.1 Threads — conversations

Method	Purpose
`thread/start`	Begin a new agent session. Configure model, reasoning effort, sandbox policy, output schema, dynamic tools, base instructions, personality, working dir, MCP overrides. Returns a `threadId`.
`thread/resume`	Reopen a stored thread by id. Full history is restored server-side from the rollout JSONL (see §6). Optionally override params for the resumed session.
`thread/fork`	Branch an existing thread to explore a different path without disturbing the original.
`thread/list`	Paginated history; filter by model provider, source kind, archive state, working dir, free-text search.
`thread/rollback`	Drop the last N turns from in-memory context. A rollback marker is appended to the thread's persisted JSONL log.
`thread/name/set`	Rename a thread.
`thread/archive`	Archive (or unarchive) a thread; filtered out of the default `thread/list` view.
`thread/backgroundTerminals/*`	Manage long-lived shell sessions inside the sandbox. Experimental.
`thread/compact`	Compact history (used by the official Python SDK's `thread.compact()`).

3.2 Turns — driving the agent

Method	Purpose
`turn/start`	Submit user input (text, images, file paths) and run the agent. Per-turn overrides: `model`, reasoning `effort`, `personality`, `sandbox` policy, `output_schema` (structured output), `dynamicTools` (experimental), skill inputs. Returns a `turnId`.
`turn/steer`	Append additional input to an in-flight turn without cancelling.
`turn/interrupt`	Abort an in-flight turn cleanly. The companion's `/codex:cancel` is built on this plus tree-killing the broker process.

Sandbox policy values seen in clients: read-only, workspace-write, danger-full-access. Reasoning effort: none, minimal, low, medium, high, xhigh (verified from codex-companion.mjs's VALID_REASONING_EFFORTS).

3.3 Review

Method	Purpose
`review/start`	Invoke Codex's built-in code reviewer. Targets: uncommitted working tree (`{type:"uncommittedChanges"}`), base branch diff (`{type:"baseBranch", branch:"main"}`), specific commit ranges, or custom targets. Optionally fork into a detached review thread.

This is what /codex:review in the JS companion calls; the more flexible /codex:adversarial-review instead runs a turn/start against a structured-output schema (schemas/review-output.schema.json in the plugin) so you get parseable JSON.

3.4 MCP tools

Method	Purpose
`mcpServer/tool/call`	Invoke a tool exposed by an MCP server configured for the thread.

3.5 Auth, config, account

Method	Purpose
`account/read`	Current ChatGPT/API auth state, refresh token controls.
`config/read`	Effective resolved configuration (model defaults, sandbox defaults, MCP servers, paths). Useful for diagnostics.

3.6 Server → Client requests (bidirectional)

The server can send requests to your client. Most importantly, approvals:

execCommandApproval — server wants to run a shell command, asks for permission.
applyPatchApproval — server wants to write/modify a file.
(and others under the same approval umbrella)

Your client responds with accept / decline / cancel. This is how Codex enforces sandboxing with a human-in-the-loop. If you set ApprovalsReviewer to auto_review or guardian, Codex resolves these automatically; set it to user and you'll receive every approval request and must answer.

This is also why the protocol is bidirectional: every client implementation must handle inbound id+method messages, not just inbound responses.

4. Notifications / streaming events

The server streams notifications during turn execution. You receive every notification by default; a method is suppressed only if you listed it in optOutNotificationMethods at initialize. Common methods:

Notification	Meaning
`turn/started`	New turn begins.
`turn/completed`	Turn finished (success or failure).
`item/started`	A new content item (message, reasoning, tool call) starts.
`item/completed`	A content item finished.
`item/agentMessage/delta`	Streaming agent message text.
`item/reasoning/textDelta`	Streaming reasoning text.
`item/reasoning/summaryTextDelta`	Streaming reasoning summary.
`item/reasoning/summaryPartAdded`	New reasoning summary part.
(command output deltas, tool call events, etc.)	Various per-tool streams.

Opt out of high-volume deltas (the four above) for batch / CI jobs. Keep the lifecycle events (turn/*, item/started, item/completed) — those are how you know when the turn is done and what was produced.
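One way to consume the lifecycle events is to fold the stream into a single result that ends at `turn/completed`. This is only a sketch: the `delta` and `item` field names inside `params` are assumptions, so check the generated schema before relying on them.

```python
from typing import Iterable


def collect_turn(notifications: Iterable[dict]) -> dict:
    """Consume notifications until `turn/completed` and summarize the turn."""
    items: list[dict] = []
    streamed: list[str] = []
    for note in notifications:
        method = note.get("method")
        params = note.get("params", {})
        if method == "item/agentMessage/delta":
            streamed.append(params.get("delta", ""))  # assumed field name
        elif method == "item/completed":
            items.append(params.get("item", params))  # assumed field name
        elif method == "turn/completed":
            return {"completed": True, "items": items,
                    "streamed_text": "".join(streamed), "turn": params}
    # The stream ended (for example, the process exited) before the turn finished.
    return {"completed": False, "items": items,
            "streamed_text": "".join(streamed), "turn": None}
```

Returning `completed: False` instead of raising lets the caller decide whether a truncated stream is an error or a cancellation.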

5. Filesystem-resident features (hooks, subagents, skills, MCP, plugins)

These all live as files Codex reads rather than RPC methods you call. They apply to every client of the app-server (TUI, codex exec, your custom Python script) automatically.

5.1 Hooks

Six lifecycle events, configured in config.toml (inline [hooks] table) or sibling hooks.json files at any config layer.

Event	Matcher field	Can block?
`SessionStart`	`source` (startup/resume/clear)	Yes
`PreToolUse`	`tool_name` (Bash, apply_patch, MCP)	Yes
`PermissionRequest`	`tool_name`	Yes (allow / deny / abstain)
`PostToolUse`	`tool_name`	No
`UserPromptSubmit`	–	Yes
`Stop`	–	No

JSON payload to your hook script: session_id, transcript_path, cwd, hook_event_name, model (+ turn_id for turn-scoped events). Hook output keys: continue, stopReason, systemMessage, suppressOutput. Default timeout: 600s. Multiple matching hooks run concurrently.
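A hook script along these lines reads the JSON payload and writes a decision using the output keys listed above. Treat it as a sketch under assumptions: that the payload arrives on stdin, that `tool_name` is present in it for tool events, and that `continue: false` is how a block is expressed. `BLOCKED_TOOLS` is a made-up local policy.

```python
import json
import sys

BLOCKED_TOOLS = {"apply_patch"}  # hypothetical policy: forbid file edits


def decide(payload: dict) -> dict:
    """Map one hook payload to a hook output object."""
    event = payload.get("hook_event_name")
    tool = payload.get("tool_name")  # assumed to be present for tool events
    if event == "PreToolUse" and tool in BLOCKED_TOOLS:
        return {"continue": False,
                "stopReason": f"{tool} is disabled by local policy"}
    return {"continue": True}


def main(stdin=sys.stdin, stdout=sys.stdout) -> None:
    """Read the payload, write the decision as a single JSON object."""
    json.dump(decide(json.load(stdin)), stdout)
```

To use it as an actual hook command, end the file with a call to `main()`. Keeping `decide` pure makes the policy unit-testable without running Codex.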

Older simpler mechanism: notify = ["/bin/bash", "/path/to/notify.sh"] in config.toml — fires on agent-turn-complete only.

5.2 Subagents

TOML files at ~/.codex/agents/*.toml (personal) or .codex/agents/*.toml (project).

name = "tester"
description = "Runs and explains the test suite for any change"
developer_instructions = "..."
model = "gpt-5.4-codex"
model_reasoning_effort = "high"
sandbox_mode = "workspace-write"
# mcp_servers = [...] / skills.config = {...}

Built-in templates (default, worker, explorer) can be overridden by name. Codex spawns subagents only on explicit user request (e.g. "spawn one agent per point") or via /agent slash command in the TUI. Each subagent runs as its own thread; approvals from inactive agent threads bubble up to the active UI labelled by source thread.

Subagents are not exposed through a dedicated app-server RPC. To replicate the UX from a custom client, issue additional thread/start calls yourself, with the right config layer pointed at the agent file.

5.3 Agent Skills

Open standard for packaging reusable agent workflows. Discovery scopes (lowest → highest priority):

System (bundled with Codex by OpenAI).
Admin: /etc/codex/skills.
User: $HOME/.agents/skills.
Repo: .agents/skills in CWD or repo root.

Directory layout:

my-skill/
├── SKILL.md           # required, YAML frontmatter (name, description) + instructions
├── scripts/           # optional executable helpers
├── references/        # optional docs the skill can read
├── assets/            # optional binary/static
└── agents/openai.yaml # optional Codex-specific UI/invocation/tool-deps

Progressive disclosure: Codex injects only metadata (name, description, path) at session start, capped at ~8KB total. Full SKILL.md is read only when the skill is selected. Invoked explicitly via /skills or $skill-name mention, or implicitly when prompts match the skill description.

Over the wire: per-turn skill activation is exposed as skill_inputs on turn/start. Skill discovery is filesystem-only.
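Because discovery is filesystem-only, a client that wants to list available skills has to scan the same directories itself. The sketch below assumes that a higher-priority scope replaces a lower one on a name clash, and it parses only flat `key: value` frontmatter rather than full YAML. It does not enforce the metadata size cap.

```python
from pathlib import Path


def read_frontmatter(skill_md: Path) -> dict:
    """Extract flat `key: value` pairs from the leading `---` block."""
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip("\"'")
    return meta


def discover_skills(roots: list[Path]) -> dict[str, dict]:
    """Scan roots ordered lowest to highest priority; later roots win."""
    found: dict[str, dict] = {}
    for root in roots:
        if not root.is_dir():
            continue
        for skill_md in sorted(root.glob("*/SKILL.md")):
            meta = read_frontmatter(skill_md)
            name = meta.get("name", skill_md.parent.name)
            found[name] = {"description": meta.get("description", ""),
                           "path": str(skill_md)}
    return found
```

A typical call would pass something like `[Path("/etc/codex/skills"), Path.home() / ".agents/skills", Path(".agents/skills")]`.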

5.4 MCP servers

Configured globally in config.toml:

[mcp_servers.linear]
command = "npx"
args = ["@modelcontextprotocol/server-linear"]
env = { LINEAR_API_KEY = "${env:LINEAR_API_KEY}" }
startup_timeout_sec = 30
tool_timeout_sec = 120
enabled = true
required = false
enabled_tools = ["list_issues", "get_issue"]
# disabled_tools = [...]

[mcp_servers.streamable_example]
url = "https://example.com/mcp"
bearer_token_env_var = "EXAMPLE_TOKEN"
# OAuth: codex mcp login <server-name>

Two server kinds: STDIO (command, args, env, cwd) and Streamable HTTP (url, bearer_token_env_var or OAuth via codex mcp login).

App-server methods that touch MCP:

mcpServer/tool/call — invoke a tool against the thread's configured MCP server.
dynamicTools parameter on thread/start — runtime tool registration (gated behind capabilities.experimentalApi=true). Persisted in the rollout metadata; restored on thread/resume if not re-supplied.

5.5 Plugins

Distribution wrapper that bundles: skills + MCP server configs + app mappings + presentation assets. Skills are the authoring format; plugins are the packaging. The Claude Code "Codex companion" you're integrating with is itself a plugin (lives under ~/.claude/plugins/cache/openai-codex/codex/<version>/).

5.6 AGENTS.md

Codex reads AGENTS.md (or AGENTS.override.md) before doing any work, layering global → project guidance. Same role as CLAUDE.md for Claude Code, or a README-for-agents.

6. Where state lives on disk

Two completely separate state stores: Codex core state (everything the agent knows) and companion / harness state (the JS plugin's per-job tracking, optional).

6.1 Codex core state — `~/.codex/`

~/.codex/
  auth.json                     # ChatGPT/API credentials
  config.toml                   # user config (model, sandbox, MCP servers, hooks, notify)
  history.jsonl                 # global prompt-history log (every user message ever sent)
  session_index.jsonl           # index over the rollouts below
  sessions/                     # one JSONL "rollout" per thread — the durable thread state
    YYYY/MM/DD/
      rollout-<ISO>-<thread-id>.jsonl
  archived_sessions/            # archived threads, same format
  state_5.sqlite                # newer indexes/metadata (sqlite alongside the JSONL)
  logs_2.sqlite                 # internal telemetry/logs
  memories/, prompts/, rules/, skills/, plugins/, …
  cache/, models_cache.json, generated_images/, shell_snapshots/
  .codex-global-state.json      # global runtime state

Rollout files are newline-delimited JSON, one event per line. First line is session_meta (id, cwd, model, base_instructions, originator, cli_version), then a stream of turn/item events — same shape you'd see over the wire. Example first line:

{"timestamp":"2026-04-24T12:35:48.797Z","type":"session_meta",
 "payload":{"id":"019dbf7d-...","cwd":"/path/to/repo",
            "originator":"codex-tui","cli_version":"0.124.0",
            "model_provider":"openai", ...}}

So ~/.codex/sessions/.../rollout-*.jsonl is the durable, replayable representation of a thread. thread/resume reads from there. thread/list paginates session_index.jsonl. To grep all your past sessions: rg <term> ~/.codex/sessions/.
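Since each rollout begins with a `session_meta` line, finding past threads for a workspace only requires reading the first line of each file. The following is a sketch based on the layout and example line shown above; `sessions_for_cwd` is a hypothetical helper, and the on-disk format may differ between CLI versions.

```python
import json
from pathlib import Path


def read_session_meta(rollout: Path) -> dict:
    """Return the payload of the leading `session_meta` line."""
    with rollout.open(encoding="utf-8") as fh:
        first = json.loads(fh.readline())
    if first.get("type") != "session_meta":
        raise ValueError(f"{rollout} does not start with session_meta")
    return first["payload"]


def sessions_for_cwd(sessions_root: Path, cwd: str) -> list[Path]:
    """List rollout files whose recorded working directory equals `cwd`."""
    matches = []
    for rollout in sorted(sessions_root.rglob("rollout-*.jsonl")):
        try:
            meta = read_session_meta(rollout)
        except (ValueError, KeyError):  # JSONDecodeError is a ValueError
            continue  # skip empty, truncated, or foreign files
        if meta.get("cwd") == cwd:
            matches.append(rollout)
    return matches
```

Typical usage would be `sessions_for_cwd(Path.home() / ".codex" / "sessions", "/path/to/repo")`.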

6.2 Companion / harness state (the JS plugin)

The Claude Code Codex companion uses its own per-workspace state:

$CLAUDE_PLUGIN_DATA/state/<slug>-<sha256(realpath)[:16]>/
  state.json        # { version, config:{stopReviewGate}, jobs:[…] } — index, capped at 50
  broker.json       # { endpoint, pidFile, logFile, sessionDir, pid } for the UDS broker
  jobs/
    <job-id>.json   # full per-job record (request, payload, threadId, turnId, status…)
    <job-id>.log    # plain-text progress log (NOT JSONL)

Workspace-keyed by <basename>-<sha256(realpath)[:16]>. 50-job cap; older job files garbage-collected on saveState. Locate it with:

echo "${CLAUDE_PLUGIN_DATA:-$TMPDIR/codex-companion}/state"

Your own client can ignore all of this; only the JS companion uses it. The rich agent transcripts you actually want are always in ~/.codex/sessions/.
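If you do want to locate a workspace's companion directory programmatically, the key format described above can be approximated as below. This is a guess at the details: it assumes a hex SHA-256 digest of the UTF-8 real path and an unsanitized basename as the slug, neither of which has been checked against the plugin source.

```python
import hashlib
import os


def workspace_key(path: str) -> str:
    """Approximate `<basename>-<sha256(realpath)[:16]>` for a workspace path."""
    real = os.path.realpath(path)
    # Assumption: hex digest over the UTF-8 encoded real path.
    digest = hashlib.sha256(real.encode("utf-8")).hexdigest()[:16]
    return f"{os.path.basename(real)}-{digest}"
```

Resolving the real path first means symlinked checkouts of the same directory map to the same key.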

7. The official Python SDK — what it covers

Package: openai-codex-app-server-sdk (in openai/codex repo at sdk/python).

7.1 Install & minimal example

pip install openai-codex-app-server-sdk
# bundles `openai-codex-cli-bin` matching the SDK version

from codex_app_server import Codex

with Codex() as codex:
    thread = codex.thread_start(model="gpt-5")
    result = thread.run("Say hello in one sentence.")
    print(result.final_response)
    print(result.items)

Configure binary location explicitly when running against a non-bundled CLI:

from codex_app_server import Codex, AppServerConfig
with Codex(config=AppServerConfig(codex_bin="/usr/local/bin/codex")) as codex:
    ...

7.2 Surface area

High-level API exposed:

Codex() / AsyncCodex() — process lifecycle, initialize handshake, context-managed shutdown.
Thread: start, resume, fork, run, compact.
TurnHandle: streaming events, steer, interrupt.
Per-turn config: model, effort (low/medium/high), output_schema (structured output), image inputs (ImageInput, LocalImageInput), SandboxPolicy, ApprovalsReviewer (user / auto_review / guardian), skill_inputs, custom personality, developer_instructions.
Typed Pydantic notification models with snake_case ↔ camelCase translation.
Retry helper: codex_app_server.retry.retry_on_overload for -32001 busy responses.
Low-level request(...) escape hatch.

Not (yet) exposed in the high-level wrappers:

WebSocket transport — stdio only.
thread/list, thread/rollback, thread/name/set, thread/archive.
mcpServer/tool/call.
dynamicTools runtime tool registration.
thread/backgroundTerminals/*.
account/read, config/read introspection.
review/start as first-class — run reviews as turns instead.
Per-action approval callbacks (you can configure ApprovalsReviewer mode but not intercept individual execCommandApproval / applyPatchApproval requests with custom logic).

For everything missing, drop to the request(...) low-level method — the connection and Pydantic models are still useful.

Disclaimers from the README:

"Experimental Python SDK for codex app-server JSON-RPC v2 over stdio."
result.final_response is None if a turn ends without a final-answer message.
Schema is bundled per CLI version; mismatched binary will surface as Pydantic validation errors.
Repo no longer ships codex binaries inside sdk/python — set codex_bin or rely on the pinned openai-codex-cli-bin wheel.

8. Third-party clients

8.1 `codex-app-server-sdk` (Python, third-party)

Async-only Python client by Mariusz Woloszyn. Supports both stdio and WebSocket. Requires Python ≥ 3.12.

uv add codex-app-server-sdk
# or
pip install codex-app-server-sdk

import asyncio
from codex_app_server_sdk import CodexClient

async def main() -> None:
    async with CodexClient.connect_stdio() as client:
        result = await client.chat_once("Hello from Python")
        print(result.final_text)

asyncio.run(main())

Helpers: chat_once(...) (one-shot), chat(...) (step-streaming), thread/turn lifecycle handling, plus low-level request(...).

8.2 `codex-sdk-python` (Python, third-party)

Supports both the codex exec JSONL path and the persistent app-server JSON-RPC path; exposes typed structured results for each.

References:

codex-sdk-python on PyPI

8.3 Elixir

codex_sdk by nshkrdotcom, on GitHub: full Elixir SDK.
ExMCP.ACP.Adapters.Codex — exposes Codex app-server as an ACP adapter inside the ex_mcp ecosystem.

8.4 JavaScript / Node — the reference companion

The Claude Code Codex plugin's codex-companion.mjs is a complete, real-world JSON-RPC client written in plain Node (no SDK dependency). Worth reading as a worked example. See §10 for a tour.

9. Building your own client

9.1 Minimum viable Python client (no dependencies)

This shows the wire format. Use the official SDK or codex-app-server-sdk for production — they handle retries, typed models, async streaming, and broker reuse.

import json
import subprocess
import threading
import itertools
import queue


class CodexClient:
    def __init__(self, cwd="."):
        self.proc = subprocess.Popen(
            ["codex", "app-server"],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=None,  # inherit stderr; an unread PIPE can fill up and stall the server
            text=True,
            bufsize=1,
            cwd=cwd,
        )
        self._ids = itertools.count(1)
        self._pending = {}
        self._notifications = queue.Queue()
        # The reader thread and the caller both write to stdin; serialize them.
        self._write_lock = threading.Lock()
        self._reader = threading.Thread(target=self._read_loop, daemon=True)
        self._reader.start()

    def _read_loop(self):
        for line in self.proc.stdout:
            line = line.strip()
            if not line:
                continue
            try:
                msg = json.loads(line)
            except json.JSONDecodeError:
                continue  # ignore anything that is not a JSON frame
            if "id" in msg and "method" not in msg:
                # Response to one of our requests
                fut = self._pending.pop(msg["id"], None)
                if fut is not None:
                    fut.put(msg)
            elif "id" in msg and "method" in msg:
                # Server → Client request (e.g. approval)
                # Reply with method-not-found until you wire approvals
                self._send({
                    "id": msg["id"],
                    "error": {"code": -32601,
                              "message": f"Unsupported server request: {msg['method']}"},
                })
            elif "method" in msg:
                # Notification (deltas, lifecycle events)
                self._notifications.put(msg)

    def _send(self, message):
        line = json.dumps(message) + "\n"
        with self._write_lock:
            self.proc.stdin.write(line)
            self.proc.stdin.flush()

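    def request(self, method, params=None, timeout=120):
        i = next(self._ids)
        fut = queue.Queue(maxsize=1)
        self._pending[i] = fut
        self._send({"id": i, "method": method, "params": params or {}})
        try:
            msg = fut.get(timeout=timeout)
        except queue.Empty:
            # Forget the request so a late reply is dropped instead of leaking.
            self._pending.pop(i, None)
            raise TimeoutError(f"{method} timed out after {timeout}s") from None
        if "error" in msg:
            raise RuntimeError(f"{method} failed: {msg['error']}")
        return msg.get("result", {})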

    def notify(self, method, params=None):
        self._send({"method": method, "params": params or {}})

    def initialize(self, name="my-client", version="0.0.1"):
        self.request("initialize", {
            "clientInfo": {"name": name, "title": name, "version": version},
            "capabilities": {
                "experimentalApi": False,
                "optOutNotificationMethods": [
                    "item/agentMessage/delta",
                    "item/reasoning/summaryTextDelta",
                    "item/reasoning/summaryPartAdded",
                    "item/reasoning/textDelta",
                ],
            },
        })
        self.notify("initialized", {})

    def drain_notifications(self):
        while not self._notifications.empty():
            yield self._notifications.get_nowait()

    def close(self):
        try:
            self.proc.stdin.close()
        finally:
            self.proc.terminate()
            self.proc.wait(timeout=5)


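if __name__ == "__main__":
    client = CodexClient()
    try:
        client.initialize()
        thread = client.request("thread/start", {"cwd": ".", "model": "gpt-5"})
        thread_id = thread["threadId"]
        # turn/start only acknowledges the turn; its output arrives as notifications.
        client.request("turn/start", {
            "threadId": thread_id,
            "input": [{"type": "text", "text": "Say hello in one sentence."}],
            "sandbox": "read-only",
        })
        # Block until the turn finishes, printing each lifecycle event on the way.
        while True:
            note = client._notifications.get(timeout=300)
            print("Notif:", note["method"])
            if note["method"] == "item/completed":
                print("Item:", json.dumps(note.get("params", {})))
            if note["method"] == "turn/completed":
                break
    finally:
        client.close()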

Real param/result shapes are versioned — generate them with codex app-server generate-json-schema for accuracy. Treat the snippet above as wire-format demonstration.

9.2 Async / streaming

For real streaming (token-by-token reasoning, multiple parallel threads, or WebSocket), use asyncio + asyncio.subprocess, or jump straight to codex-app-server-sdk (already async).
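A minimal asyncio counterpart to the threaded client might look like the sketch below. It covers only the transport layer: spawning the process, writing one frame, and iterating parsed frames. The helper names are illustrative; only the `asyncio` calls are standard library.

```python
import asyncio
import json


async def read_frames(stream: asyncio.StreamReader):
    """Yield one parsed JSON message per newline-terminated line until EOF."""
    while True:
        raw = await stream.readline()
        if not raw:
            break  # EOF: the server closed stdout
        raw = raw.strip()
        if raw:
            yield json.loads(raw)


async def spawn_server(cwd: str = "."):
    """Start `codex app-server` with piped stdin/stdout; stderr is inherited."""
    return await asyncio.create_subprocess_exec(
        "codex", "app-server",
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        cwd=cwd,
    )


async def send(proc, message: dict) -> None:
    """Write one frame and wait for the pipe to accept it."""
    proc.stdin.write((json.dumps(message) + "\n").encode("utf-8"))
    await proc.stdin.drain()
```

Usage is `async for msg in read_frames(proc.stdout): ...` inside a task, with request futures resolved by `id`. `StreamReader.readline` enforces a per-line buffer limit (64 KiB by default), so pass a larger `limit=` to `create_subprocess_exec` if you expect very long frames.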

9.3 Handling server → client approval requests

Don't reject everything with -32601. Implement at minimum:

def _handle_server_request(self, msg):
    method = msg["method"]
    if method in {"execCommandApproval", "applyPatchApproval"}:
        # Show prompt to user / consult policy / log to dashboard
        decision = "accept"  # or "decline" / "cancel"
        self._send({"id": msg["id"], "result": {"decision": decision}})
    else:
        self._send({
            "id": msg["id"],
            "error": {"code": -32601, "message": f"Unsupported: {method}"},
        })

Or set ApprovalsReviewer = "auto_review" / "guardian" on thread/start to let Codex auto-resolve them based on its built-in heuristics, and only intercept when you want custom UX.

10. Reference architecture: the JS Codex companion

The Claude Code Codex plugin's scripts/codex-companion.mjs (under ~/.claude/plugins/cache/openai-codex/codex/<version>/) is a battle-tested example client. Worth understanding when designing your own.

10.1 Process model — direct vs broker

Two transports inside one client (scripts/lib/app-server.mjs):

Direct: spawn codex app-server per command, communicate over stdio JSONL. Simple, but pays full startup cost every invocation.
Broker: a long-lived daemon (app-server-broker.mjs) is spawned once per workspace. It hosts a persistent codex app-server and listens on a Unix domain socket. Subsequent companion invocations connect to the socket via net.createConnection({ path }) instead of spawning a new server. Endpoint, pid, log file, and session dir are tracked in <companion-state>/broker.json.

Why this matters: starting codex app-server involves loading the Codex binary, reading config, attaching MCP servers, and warming caches. The broker amortizes that cost across all subsequent foreground/background companion calls.

10.2 Methods exercised

Grep result from scripts/lib/codex.mjs:

client.request("thread/start", ...)
client.request("thread/resume", ...)
client.request("thread/name/set", ...)
client.request("thread/list", ...)
client.request("turn/start", ...)
client.request("turn/interrupt", ...)
client.request("review/start", ...)
client.request("account/read", ...)
client.request("config/read", ...)

So the companion is a comprehensive consumer of the protocol — broader than the official Python SDK's high-level surface — but doesn't touch mcpServer/tool/call, dynamicTools, thread/fork, thread/rollback, or thread/archive.

10.3 Job model

/codex:task in foreground vs background:

Foreground: opens client → runs turn synchronously → renders result → exits. State persisted to jobs/<job-id>.json and .log for /codex:status retrospection.
Background: parent spawns a detached task-worker subprocess (same script, different subcommand), records queued state, returns immediately. Worker reads job record, runs the same turn, updates state. /codex:cancel then sends turn/interrupt to the broker and tree-kills the worker pid.

This pattern is straightforward to port: detached subprocess + shared JSON state file + turn/interrupt for cancellation = a complete background-task system over the protocol.
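The shared state file is the piece most worth getting right when porting. Here is a sketch of a capped, atomically written job index, loosely modelled on the `state.json` description in §6.2. The record fields and the `version` value are placeholders, not the companion's actual schema.

```python
import json
import os
import tempfile
from pathlib import Path

JOB_CAP = 50  # same cap the companion is described as using


def load_state(state_file: Path) -> dict:
    """Read the index, or start an empty one if the file does not exist yet."""
    if state_file.exists():
        return json.loads(state_file.read_text(encoding="utf-8"))
    return {"version": 1, "jobs": []}  # placeholder version number


def upsert_job(state: dict, job: dict) -> dict:
    """Insert or update a job by id, newest first, dropping entries past the cap."""
    jobs = [j for j in state["jobs"] if j["id"] != job["id"]]
    jobs.insert(0, job)
    state["jobs"] = jobs[:JOB_CAP]
    return state


def save_state(state_file: Path, state: dict) -> None:
    """Write via a temp file plus os.replace so readers never see a partial file."""
    state_file.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=state_file.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(state, fh)
    os.replace(tmp, state_file)
```

The atomic replace matters because the foreground command and the detached worker touch the same file. It does not prevent lost updates when both write at once, so add a file lock if that can happen in your design.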

10.4 Stop-gate review

Optional config stopReviewGate: true makes Codex review your turn before letting Claude Code "stop". Implemented as a turn/start against the adversarial-review prompt + structured-output schema. A practical demonstration of how to layer custom workflows on top of the protocol with zero protocol changes.

11. Practical recipes

11.1 Replace the JS companion in Python

Mapping:

Companion behavior	Python implementation
Hooks, MCP, skills, AGENTS.md, subagents, notify	Files in `~/.codex/` / `.codex/` / `.agents/` — nothing to do, applies automatically.
`/codex:task` foreground	`Codex.thread_start(...)` → `thread.run("...")`
`/codex:task --background`	Detached `multiprocessing.Process` running `thread.run(...)`, write status JSON to your own state dir.
`/codex:task --resume-last`	Track latest `thread_id` in your state file, then `Codex.thread_resume(thread_id)`.
`/codex:cancel`	Drop to `client.request("turn/interrupt", {threadId, turnId})`.
`/codex:review`	Use the official SDK's `output_schema` parameter against an adversarial-review prompt, or drop to `client.request("review/start", {...})`.
`/codex:status` (across sessions)	`client.request("thread/list", {...})` — drop to low-level.
Broker daemon (session reuse across CLI invocations)	Either (a) keep one Python process alive (FastAPI / asyncio app), (b) skip and pay the cold-start cost, or (c) build your own UDS daemon mirroring `app-server-broker.mjs`.

11.2 CI reviewer

from codex_app_server import Codex
import json, sys

with open("review-output.schema.json", encoding="utf-8") as fh:
    OUTPUT_SCHEMA = json.load(fh)

with Codex() as codex:
    thread = codex.thread_start(
        model="gpt-5",
        sandbox_policy={"mode": "read-only"},
        approvals_reviewer="auto_review",
    )
    result = thread.run(
        "Review the staged diff. Flag bugs, security issues, missing tests. "
        "Respond against the provided schema.",
        output_schema=OUTPUT_SCHEMA,
    )
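    if result.final_response is None:
        # A turn can end without a final message; fail closed rather than pass silently.
        sys.exit("review produced no final response")
    parsed = json.loads(result.final_response)
    if parsed["severity"] in {"high", "critical"}:
        sys.exit(1)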

Wire this to your CI's PR diff + post structured output as a review comment.

11.3 Multi-agent fan-out

Fork the parent thread into N children, run them in parallel, compare outputs:

import asyncio
from codex_app_server import AsyncCodex

async def variant(codex, parent_id, prompt, model, effort):
    child = await codex.thread_fork(parent_id)
    return await child.run(prompt, model=model, effort=effort)

async def main():
    async with AsyncCodex() as codex:
        parent = await codex.thread_start(...)
        results = await asyncio.gather(
            variant(codex, parent.thread_id, "Solve this", "gpt-5", "low"),
            variant(codex, parent.thread_id, "Solve this", "gpt-5", "high"),
            variant(codex, parent.thread_id, "Solve this", "gpt-5.4-codex", "medium"),
        )
        # rank by self-grading or external rubric
        return results

asyncio.run(main())

This is how you'd build a "best-of-N" agent or replicate the subagent UX without invoking the TUI's /agent machinery.

11.4 Custom approval UI (Slack bot, web dashboard)

Set approvals_reviewer="user" so every action requires explicit approval. Intercept the bidirectional execCommandApproval / applyPatchApproval server-to-client requests, route them to your UI, return the user's decision. The official Python SDK doesn't surface this directly — drop to the third-party codex-app-server-sdk or write your own client (§9).
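When the decision comes from a human in Slack or a browser, the reply cannot be produced inside the read loop, so the request has to be parked until the UI answers. A sketch of that bookkeeping is below; `ApprovalRouter` is a hypothetical class, and the `{"decision": ...}` result shape is taken from the handler in §9.3.

```python
import threading


class ApprovalRouter:
    """Park approval requests until a UI resolves them."""

    APPROVAL_METHODS = {"execCommandApproval", "applyPatchApproval"}
    DECISIONS = {"accept", "decline", "cancel"}

    def __init__(self):
        self._pending: dict = {}
        self._lock = threading.Lock()

    def on_server_request(self, msg: dict):
        """Return an immediate reply frame, or None if parked for the UI."""
        if msg["method"] not in self.APPROVAL_METHODS:
            return {"id": msg["id"],
                    "error": {"code": -32601,
                              "message": f"Unsupported: {msg['method']}"}}
        with self._lock:
            self._pending[msg["id"]] = msg
        return None

    def pending(self) -> list[dict]:
        """Snapshot of unanswered approvals, for rendering in the UI."""
        with self._lock:
            return list(self._pending.values())

    def resolve(self, request_id, decision: str) -> dict:
        """Build the reply frame; raises KeyError if already answered or unknown."""
        if decision not in self.DECISIONS:
            raise ValueError(f"unknown decision: {decision!r}")
        with self._lock:
            self._pending.pop(request_id)
        return {"id": request_id, "result": {"decision": decision}}
```

The read loop calls `on_server_request` and sends any non-None frame straight away. The UI handler calls `resolve` and sends the returned frame over the same connection. Popping under the lock means a double-click cannot answer the same approval twice.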

11.5 Long-running research agents

thread/resume works across days/weeks. Persist thread_id to your DB; pick up where you left off:

with Codex() as codex:
    thread = codex.thread_resume(stored_thread_id)
    result = thread.run("Pick up where we left off. Next step: …")

The full rollout JSONL is on disk under ~/.codex/sessions/, so you can also grep, replay, or export it.

11.6 Embedding skills + MCP for an internal product

Bundle skills under .agents/skills/<my-skill>/SKILL.md in your repo.
Wire MCP servers in .codex/config.toml ([mcp_servers.<name>]).
Optionally bundle as a Codex plugin for distribution.
Drive turns from your Python service — skills + MCP tools are picked up automatically.

12. Caveats & operational notes

Schema versioning: app-server schemas drift with CLI versions. Pin openai-codex-cli-bin (or whatever your distro mechanism is) and regenerate types via generate-ts / generate-json-schema on upgrade.
Experimental surfaces: WebSocket transport, dynamic tools, ChatGPT token management, background terminals — gated behind experimentalApi or explicitly marked unstable. Do not depend on these for production unless you accept regular breaking changes.
Busy responses: RPC error code -32001 means the server (or broker) is overloaded; back off and retry. The official Python SDK ships codex_app_server.retry.retry_on_overload for exactly this.
Bidirectional protocol: every client must handle inbound id+method messages. Replying -32601 to everything will work for trivial cases but fails the moment Codex needs an approval.
Stateful, not REST: one process per session (or per workspace if you broker). Don't try to serve multi-tenant workloads with a single shared app-server instance — that's not its model.
final_response may be null: a turn that ends without an agent message (tool-call-only, errored, etc.) returns final_response = None. Inspect result.items for the actual content.
Approvals reviewer modes: user (everything asks), auto_review (Codex decides based on built-in heuristics), guardian (third option, stricter auto). Pick deliberately; defaults are conservative for a reason.
Sandbox + workspace: sandbox policy applies per turn, but the underlying file changes are real. read-only for review/triage, workspace-write for edits, danger-full-access only when you've thought hard about it.
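For clients that do not use the official SDK, the busy-response caveat above can be handled with a small backoff wrapper. This is a generic sketch, not the SDK's `retry_on_overload` implementation; `RpcError` and `call_with_backoff` are hypothetical names, and the delays are arbitrary starting points.

```python
import random
import time


class RpcError(Exception):
    """Raised by your client when a response carries an `error` member."""

    def __init__(self, code: int, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code


def call_with_backoff(call, *, attempts: int = 5, base_delay: float = 0.25,
                      sleep=time.sleep):
    """Retry `call()` on -32001 with exponential backoff plus jitter."""
    for attempt in range(attempts):
        try:
            return call()
        except RpcError as exc:
            # Only the busy code is retryable; give up on the last attempt.
            if exc.code != -32001 or attempt == attempts - 1:
                raise
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))
```

Wrap individual requests, for example `call_with_backoff(lambda: client.request("thread/start", params))`, after adapting your client to raise `RpcError`. The jitter keeps several workers that hit a busy broker at once from retrying in lockstep.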

13. Sources & further reading

Distilled from a working session reading codex-companion.mjs + searching the public docs and ecosystem. Verify shapes against your installed CLI version with codex app-server generate-json-schema.

oneryalcin/codex-app-server-guide.md
