Planned Feature - This component is not yet implemented. See GitHub issues for usage context and design.

Codex Component

Wraps OpenAI Codex CLI for code-focused AI agent capabilities with structured output, tool use, and Smithers task lifecycle integration. Provides alternative to Claude with OpenAI’s code-specialized models.

Planned API

interface CodexProps<TSchema extends z.ZodType = z.ZodType> {
  /**
   * Prompt/task for the agent.
   */
  children: ReactNode

  /**
   * Model selection.
   * @default 'gpt-4-turbo'
   */
  model?: 'gpt-4' | 'gpt-4-turbo' | 'gpt-3.5-turbo'

  /**
   * Temperature for randomness (0-2).
   * @default 1
   */
  temperature?: number

  /**
   * System prompt for agent behavior.
   */
  systemPrompt?: string

  /**
   * Working directory for code operations.
   */
  cwd?: string

  /**
   * Environment variables.
   */
  env?: Record<string, string>

  /**
   * Timeout in milliseconds.
   * @default 300000 (5 minutes)
   */
  timeout?: number

  /**
   * Zod schema for structured output.
   */
  schema?: TSchema

  /**
   * Callback on successful completion.
   */
  onFinished?: (result: AgentResult<TSchema>) => void

  /**
   * Callback on error.
   */
  onError?: (error: Error) => void
}

interface AgentResult<TSchema extends z.ZodType = z.ZodType> {
  output: string
  parsedOutput?: z.infer<TSchema>
  toolCalls?: ToolCall[]
  durationMs: number
}

export function Codex<TSchema extends z.ZodType = z.ZodType>(
  props: CodexProps<TSchema>
): JSX.Element

Proposed Usage

Basic Code Generation

import { Codex, Phase, Step } from 'smithers-orchestrator'

export function ImplementFeature() {
  return (
    <Phase name="Implementation">
      <Step name="implement">
        <Codex model="gpt-4-turbo">
          Implement user authentication with JWT tokens.
          Include login, logout, and token refresh endpoints.
        </Codex>
      </Step>
    </Phase>
  )
}

With Structured Output

import { z } from 'zod'

const reviewSchema = z.object({
  decision: z.enum(['approve', 'request_changes']),
  issues: z.array(z.object({
    severity: z.enum(['critical', 'major', 'minor']),
    file: z.string(),
    message: z.string()
  })),
  summary: z.string()
})

<Codex
  model="gpt-4"
  schema={reviewSchema}
  onFinished={(result) => {
    const review = result.parsedOutput
    console.log(`Decision: ${review.decision}`)
    console.log(`Issues: ${review.issues.length}`)
  }}
>
  Review this pull request and provide structured feedback
</Codex>

Custom System Prompt

<Codex
  model="gpt-4-turbo"
  systemPrompt="You are a senior software engineer specializing in React and TypeScript. Focus on performance, type safety, and maintainability."
>
  Refactor this component to use modern React patterns
</Codex>

In Fallback Chain

import { FallbackAgent, Codex, Claude } from 'smithers-orchestrator'

<FallbackAgent>
  <Codex model="gpt-4-turbo">{prompt}</Codex>
  <Claude model="sonnet">{prompt}</Claude>
</FallbackAgent>

Props (Planned)

children

ReactNode

required

Prompt or task for the agent.Can be string or JSX elements. Converted to string for CLI.

<Codex>Implement feature X</Codex>

<Codex>
  Analyze this code:
  {codeSnippet}

  Suggest improvements for performance.
</Codex>

model

'gpt-4' | 'gpt-4-turbo' | 'gpt-3.5-turbo'

default:"gpt-4-turbo"

OpenAI model selection.Options:

gpt-4 - Original GPT-4 (high quality, slower, more expensive)
gpt-4-turbo - Faster GPT-4 variant with extended context
gpt-3.5-turbo - Faster, cheaper, less capable

Use cases:

Complex code: gpt-4
Balanced: gpt-4-turbo
Simple tasks: gpt-3.5-turbo

temperature

number

default:"1"

Controls randomness in responses (0-2).Lower (0-0.5): More deterministic, focused

Code generation
Refactoring
Bug fixing

Medium (0.5-1): Balanced creativity

General tasks
Documentation

Higher (1-2): More creative, diverse

Brainstorming
Exploration

systemPrompt

string

System-level instructions defining agent behavior and role.Examples:

systemPrompt="You are an expert in database optimization and SQL performance tuning."

systemPrompt="Focus on security best practices. Flag any potential vulnerabilities."

Overrides default Codex system prompt.

cwd

string

Working directory for code operations.Respects worktree context if inside <Worktree>.Priority: Explicit cwd > Worktree context > process.cwd()

env

Record<string, string>

Environment variables for CLI execution.Merged with process.env.

<Codex env={{ DEBUG: "true", LOG_LEVEL: "verbose" }}>
  Debug this issue
</Codex>

timeout

number

default:"300000"

Timeout in milliseconds (default 5 minutes).Agent killed if exceeds timeout.

<Codex timeout={600000}>  {/* 10 minutes for complex task */}
  Implement large refactoring
</Codex>

schema

z.ZodType

Zod schema for structured output validation.Agent output parsed and validated against schema.

const schema = z.object({
  files: z.array(z.string()),
  changes: z.number(),
  summary: z.string()
})

<Codex schema={schema}>Analyze changes</Codex>

Benefits:

Type-safe output
Validation errors caught early
Better agent prompting (schema in prompt)

onFinished

(result: AgentResult<TSchema>) => void

Callback on successful completion.Receives parsed output if schema provided.

onFinished={(result) => {
  console.log(`Agent completed in ${result.durationMs}ms`)
  if (result.parsedOutput) {
    // Type-safe access
    processResults(result.parsedOutput)
  }
}}

onError

(error: Error) => void

Callback on error (timeout, API error, validation failure).

onError={(error) => {
  console.error(`Codex failed: ${error.message}`)
  metrics.recordAgentFailure('codex', error)
}}

Note: Prevents error propagation if provided. Omit to throw.

Implementation Status

Design Phase

Component designed for multi-provider failover in review system. View on GitHub

CLI Wrapper (Pending)

Implement Codex CLI wrapper similar to ClaudeCodeCLI.ts pattern.

Structured Output (Pending)

JSON schema generation from Zod, response validation.

Task Integration (Pending)

Smithers task lifecycle, database logging, observability.

Testing (Future)

Unit tests with mocked API, integration tests with real Codex.

Design Rationale

Why Codex Component?

Provider diversity: Reduce dependency on single AI provider Cost optimization: OpenAI pricing may be more favorable for certain workloads Model capabilities: GPT-4 excels at certain code tasks Fallback resilience: Alternative when Claude unavailable

Implementation Pattern

Similar to Claude component:

export function Codex<TSchema extends z.ZodType = z.ZodType>(
  props: CodexProps<TSchema>
): ReactNode {
  const { db } = useSmithers()
  const worktree = useWorktree()
  const taskIdRef = useRef<string | null>(null)

  useMount(() => {
    ;(async () => {
      taskIdRef.current = db.tasks.start('codex_agent', extractPrompt(props.children))

      try {
        const result = await executeCodexCLI({
          prompt: extractPrompt(props.children),
          model: props.model ?? 'gpt-4-turbo',
          temperature: props.temperature,
          systemPrompt: props.systemPrompt,
          cwd: props.cwd ?? worktree?.cwd,
          env: props.env,
          timeout: props.timeout ?? 300000,
          schema: props.schema
        })

        props.onFinished?.(result)
      } catch (error) {
        props.onError?.(error)
        throw error
      } finally {
        db.tasks.complete(taskIdRef.current)
      }
    })()
  })

  return <codex-agent model={props.model} status="running" />
}

Authentication

Uses OPENAI_API_KEY environment variable:

export OPENAI_API_KEY="sk-..."

Required in GitHub Actions secrets for CI integration.

Examples of Use Cases

Use Case 1: Code Review

const reviewSchema = z.object({
  decision: z.enum(['approve', 'request_changes']),
  blocking: z.boolean(),
  issues: z.array(z.object({
    severity: z.enum(['critical', 'major', 'minor']),
    message: z.string(),
    file: z.string().optional()
  })),
  summary: z.string()
})

<Codex
  model="gpt-4"
  schema={reviewSchema}
  systemPrompt="You are a senior code reviewer. Focus on correctness, security, and maintainability."
  onFinished={async (result) => {
    const review = result.parsedOutput
    await db.state.set('lastReview', review)

    if (review.decision === 'request_changes') {
      await postReviewComment(review.issues)
    }
  }}
>
  Review this PR:

  {prDiff}

  Check for:
  - Security vulnerabilities
  - Performance issues
  - Type safety
  - Test coverage
</Codex>

Use Case 2: Refactoring

<Codex
  model="gpt-4-turbo"
  temperature={0.3}  // Lower temp for focused refactoring
  systemPrompt="Expert in React hooks and modern patterns. Preserve functionality while improving code quality."
>
  Refactor this component to:
  1. Use modern React hooks (no class components)
  2. Extract custom hooks for reusable logic
  3. Add proper TypeScript types
  4. Improve performance with memoization

  {componentCode}
</Codex>

Use Case 3: Bug Analysis

const bugAnalysisSchema = z.object({
  rootCause: z.string(),
  affectedFiles: z.array(z.string()),
  suggestedFix: z.string(),
  testCases: z.array(z.string())
})

<Codex
  model="gpt-4"
  schema={bugAnalysisSchema}
  systemPrompt="Expert debugger. Provide root cause analysis and concrete fixes."
>
  Analyze this bug:

  Error: {errorMessage}
  Stack trace: {stackTrace}
  Recent changes: {gitDiff}

  Provide root cause, affected files, and suggested fix.
</Codex>

GitHub Actions Review Loop

Primary use case - multi-provider review system

FallbackAgent Component

Multi-provider failover - Codex used as fallback option

Claude Component

Primary agent - Codex provides alternative/fallback

Gemini Component

Google Gemini agent - another fallback option

Alternatives Considered

Direct OpenAI API calls: No CLI abstraction, harder to test
Generic LLM component: Less specialized, more configuration needed
Wrap existing tools: Codex CLI not widely available, need custom wrapper
Only support Claude: Single point of failure, vendor lock-in

Migration Path

Current pattern (manual OpenAI API):

// Before (manual API calls)
const completion = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [
    { role: 'system', content: systemPrompt },
    { role: 'user', content: prompt }
  ],
  temperature: 0.7
})
const result = completion.choices[0].message.content

With Codex component:

// After (declarative)
<Codex
  model="gpt-4"
  temperature={0.7}
  systemPrompt={systemPrompt}
>
  {prompt}
</Codex>

Benefits: Task tracking, worktree integration, error handling, observability.

Feedback

If you have feedback on this planned component, please open an issue.

Get Started

Core Concepts

Components

Guides

Codex Component

Codex Component

Planned API

Proposed Usage

Basic Code Generation

With Structured Output

Custom System Prompt

In Fallback Chain

Props (Planned)

Implementation Status

Design Rationale

Why Codex Component?

Implementation Pattern

Authentication

Examples of Use Cases

Use Case 1: Code Review

Use Case 2: Refactoring

Use Case 3: Bug Analysis

GitHub Actions Review Loop

FallbackAgent Component

Claude Component

Gemini Component

Alternatives Considered

Migration Path

Feedback

Get Started

Core Concepts

Components

Guides

​Codex Component

​Planned API

​Proposed Usage

​Basic Code Generation

​With Structured Output

​Custom System Prompt

​In Fallback Chain

​Props (Planned)

​Implementation Status

​Design Rationale

​Why Codex Component?

​Implementation Pattern

​Authentication

​Examples of Use Cases

​Use Case 1: Code Review

​Use Case 2: Refactoring

​Use Case 3: Bug Analysis

​Related

GitHub Actions Review Loop

FallbackAgent Component

Claude Component

Gemini Component

​Alternatives Considered

​Migration Path

​Feedback

Codex Component

Planned API

Proposed Usage

Basic Code Generation

With Structured Output

Custom System Prompt

In Fallback Chain

Props (Planned)

Implementation Status

Design Rationale

Why Codex Component?

Implementation Pattern

Authentication

Examples of Use Cases

Use Case 1: Code Review

Use Case 2: Refactoring

Use Case 3: Bug Analysis

Related

Alternatives Considered

Migration Path

Feedback