Planned Feature - This component is not yet implemented.
See GitHub issues for usage context and design.
Codex Component
Wraps the OpenAI Codex CLI for code-focused AI agent capabilities with structured output, tool use, and Smithers task lifecycle integration. Provides an alternative to Claude, using OpenAI's code-specialized models.
Planned API
interface CodexProps<TSchema extends z.ZodType = z.ZodType> {
  /**
   * Prompt/task for the agent.
   */
  children: ReactNode
  /**
   * Model selection.
   * @default 'gpt-4-turbo'
   */
  model?: 'gpt-4' | 'gpt-4-turbo' | 'gpt-3.5-turbo'
  /**
   * Temperature for randomness (0-2).
   * @default 1
   */
  temperature?: number
  /**
   * System prompt for agent behavior.
   */
  systemPrompt?: string
  /**
   * Working directory for code operations.
   */
  cwd?: string
  /**
   * Environment variables.
   */
  env?: Record<string, string>
  /**
   * Timeout in milliseconds.
   * @default 300000 (5 minutes)
   */
  timeout?: number
  /**
   * Zod schema for structured output.
   */
  schema?: TSchema
  /**
   * Callback on successful completion.
   */
  onFinished?: (result: AgentResult<TSchema>) => void
  /**
   * Callback on error.
   */
  onError?: (error: Error) => void
}

interface AgentResult<TSchema extends z.ZodType = z.ZodType> {
  output: string
  parsedOutput?: z.infer<TSchema>
  toolCalls?: ToolCall[]
  durationMs: number
}

export function Codex<TSchema extends z.ZodType = z.ZodType>(
  props: CodexProps<TSchema>
): JSX.Element
Proposed Usage
Basic Code Generation
import { Codex, Phase, Step } from 'smithers-orchestrator'
export function ImplementFeature() {
  return (
    <Phase name="Implementation">
      <Step name="implement">
        <Codex model="gpt-4-turbo">
          Implement user authentication with JWT tokens.
          Include login, logout, and token refresh endpoints.
        </Codex>
      </Step>
    </Phase>
  )
}
With Structured Output
import { z } from 'zod'
const reviewSchema = z.object({
  decision: z.enum(['approve', 'request_changes']),
  issues: z.array(z.object({
    severity: z.enum(['critical', 'major', 'minor']),
    file: z.string(),
    message: z.string()
  })),
  summary: z.string()
})

<Codex
  model="gpt-4"
  schema={reviewSchema}
  onFinished={(result) => {
    const review = result.parsedOutput
    if (!review) return // parsedOutput is optional; guard before use
    console.log(`Decision: ${review.decision}`)
    console.log(`Issues: ${review.issues.length}`)
  }}
>
  Review this pull request and provide structured feedback
</Codex>
Custom System Prompt
<Codex
  model="gpt-4-turbo"
  systemPrompt="You are a senior software engineer specializing in React and TypeScript. Focus on performance, type safety, and maintainability."
>
  Refactor this component to use modern React patterns
</Codex>
In Fallback Chain
import { FallbackAgent, Codex, Claude } from 'smithers-orchestrator'
<FallbackAgent>
  <Codex model="gpt-4-turbo">{prompt}</Codex>
  <Claude model="sonnet">{prompt}</Claude>
</FallbackAgent>
Props (Planned)
children
ReactNode
Prompt or task for the agent. Can be a string or JSX elements; converted to a string for the CLI.
<Codex>Implement feature X</Codex>
<Codex>
  Analyze this code:
  {codeSnippet}
  Suggest improvements for performance.
</Codex>
model
'gpt-4' | 'gpt-4-turbo' | 'gpt-3.5-turbo'
default: "gpt-4-turbo"
OpenAI model selection. Options:
gpt-4 - Original GPT-4 (high quality, slower, more expensive)
gpt-4-turbo - Faster GPT-4 variant with extended context
gpt-3.5-turbo - Faster, cheaper, less capable
Use cases:
- Complex code: gpt-4
- Balanced: gpt-4-turbo
- Simple tasks: gpt-3.5-turbo
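The model guidance above could be captured in a small helper. This is an illustrative sketch only: `pickModel` and its complexity labels are hypothetical, not part of the planned API.

```typescript
// Hypothetical helper mapping task complexity to a model choice.
// Names and labels are illustrative, not part of the planned Codex API.
type CodexModel = 'gpt-4' | 'gpt-4-turbo' | 'gpt-3.5-turbo'

function pickModel(complexity: 'simple' | 'balanced' | 'complex'): CodexModel {
  switch (complexity) {
    case 'complex':
      return 'gpt-4' // highest quality, slower, more expensive
    case 'simple':
      return 'gpt-3.5-turbo' // fastest and cheapest
    default:
      return 'gpt-4-turbo' // good balance; also the component default
  }
}
```

A caller would then write `<Codex model={pickModel('complex')}>…</Codex>` rather than hard-coding the model string.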
temperature
number
default: 1
Controls randomness in responses (0-2).
Lower (0-0.5): More deterministic, focused
- Code generation
- Refactoring
- Bug fixing
Medium (0.5-1): Balanced creativity
- General tasks
- Documentation
Higher (1-2): More creative, diverse
- Brainstorming
- Exploration
systemPrompt
string
System-level instructions defining agent behavior and role. Examples:
systemPrompt="You are an expert in database optimization and SQL performance tuning."
systemPrompt="Focus on security best practices. Flag any potential vulnerabilities."
Overrides the default Codex system prompt.
cwd
string
Working directory for code operations. Respects the worktree context if rendered inside <Worktree>.
Priority: explicit cwd > Worktree context > process.cwd()
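The documented priority chain is a simple nullish-coalescing fallback. `resolveCwd` below is a hypothetical helper sketching that order, not part of the planned API.

```typescript
// Sketch of the documented cwd priority:
// explicit prop > worktree context > process fallback.
// resolveCwd is a hypothetical helper, not part of the planned API.
function resolveCwd(
  explicit: string | undefined,
  worktreeCwd: string | undefined,
  fallback: string
): string {
  return explicit ?? worktreeCwd ?? fallback
}
```

Inside the component this would be called as something like `resolveCwd(props.cwd, worktree?.cwd, process.cwd())`.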
env
Record<string, string>
Environment variables for CLI execution. Merged with process.env.
<Codex env={{ DEBUG: "true", LOG_LEVEL: "verbose" }}>
  Debug this issue
</Codex>
timeout
number
default: 300000 (5 minutes)
Timeout in milliseconds. The agent is killed if it exceeds the timeout.
<Codex timeout={600000}> {/* 10 minutes for a complex task */}
  Implement large refactoring
</Codex>
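One plausible way to implement the kill-on-timeout behavior is an `AbortController`-based wrapper. `withTimeout` is a sketch under the assumption that the underlying CLI call observes an abort signal; it is not part of the planned API.

```typescript
// Hypothetical timeout wrapper: abort the underlying work if it does not
// settle within timeoutMs. Assumes `run` observes the AbortSignal (e.g. by
// killing the spawned CLI process and rejecting when the signal fires).
function withTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number
): Promise<T> {
  const controller = new AbortController()
  const timer = setTimeout(() => controller.abort(), timeoutMs)
  // Clear the timer whether the work resolves or rejects.
  return run(controller.signal).finally(() => clearTimeout(timer))
}
```

The component would then call something like `withTimeout((signal) => executeCodexCLI({ ...opts, signal }), props.timeout ?? 300000)`.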
schema
TSchema (a Zod schema)
Zod schema for structured output validation. Agent output is parsed and validated against the schema.
const schema = z.object({
  files: z.array(z.string()),
  changes: z.number(),
  summary: z.string()
})
<Codex schema={schema}>Analyze changes</Codex>
Benefits:
- Type-safe output
- Validation errors caught early
- Better agent prompting (schema in prompt)
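Before Zod validation can run, the raw agent output has to yield a JSON payload. The sketch below shows one plausible pre-parsing step for output that may arrive wrapped in a markdown code fence; `extractJson` is hypothetical, and the parsed value would then be handed to `schema.parse()` or `schema.safeParse()`.

```typescript
// Hypothetical pre-parsing step: pull a JSON payload out of agent output,
// which may wrap it in a fenced code block. The returned value would then
// be validated with schema.parse() / schema.safeParse().
function extractJson(output: string): unknown {
  // Match a fenced block (optionally tagged "json"); fall back to raw output.
  const fenced = output.match(/`{3}(?:json)?\s*([\s\S]*?)`{3}/)
  const candidate = fenced ? fenced[1] : output
  return JSON.parse(candidate.trim())
}
```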
onFinished
(result: AgentResult<TSchema>) => void
Callback on successful completion. Receives parsed output if a schema is provided.
onFinished={(result) => {
  console.log(`Agent completed in ${result.durationMs}ms`)
  if (result.parsedOutput) {
    // Type-safe access
    processResults(result.parsedOutput)
  }
}}
onError
(error: Error) => void
Callback on error (timeout, API error, validation failure).
onError={(error) => {
  console.error(`Codex failed: ${error.message}`)
  metrics.recordAgentFailure('codex', error)
}}
Note: Prevents error propagation if provided. Omit to throw.
Implementation Status
Design Phase
Component designed for multi-provider failover in review system.
CLI Wrapper (Pending)
Implement Codex CLI wrapper similar to ClaudeCodeCLI.ts pattern.
Structured Output (Pending)
JSON schema generation from Zod, response validation.
Task Integration (Pending)
Smithers task lifecycle, database logging, observability.
Testing (Future)
Unit tests with mocked API, integration tests with real Codex.
Design Rationale
Why Codex Component?
- Provider diversity: reduces dependency on a single AI provider
- Cost optimization: OpenAI pricing may be more favorable for certain workloads
- Model capabilities: GPT-4 excels at certain code tasks
- Fallback resilience: alternative when Claude is unavailable
Implementation Pattern
Similar to Claude component:
export function Codex<TSchema extends z.ZodType = z.ZodType>(
  props: CodexProps<TSchema>
): ReactNode {
  const { db } = useSmithers()
  const worktree = useWorktree()
  const taskIdRef = useRef<string | null>(null)

  useMount(() => {
    ;(async () => {
      taskIdRef.current = db.tasks.start('codex_agent', extractPrompt(props.children))
      try {
        const result = await executeCodexCLI({
          prompt: extractPrompt(props.children),
          model: props.model ?? 'gpt-4-turbo',
          temperature: props.temperature,
          systemPrompt: props.systemPrompt,
          cwd: props.cwd ?? worktree?.cwd,
          env: props.env,
          timeout: props.timeout ?? 300000,
          schema: props.schema
        })
        props.onFinished?.(result)
      } catch (error) {
        // Per the onError contract: swallow if a handler is provided, otherwise rethrow.
        if (props.onError) {
          props.onError(error as Error)
        } else {
          throw error
        }
      } finally {
        db.tasks.complete(taskIdRef.current)
      }
    })()
  })

  return <codex-agent model={props.model} status="running" />
}
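The pattern above leans on `extractPrompt` to flatten `children` into a CLI prompt string. Here is a minimal sketch, assuming children are strings, numbers, or nested arrays of those (a subset of `ReactNode`); a real implementation would also need to handle React elements.

```typescript
// Sketch of extractPrompt from the pattern above. PromptNode is a stand-in
// for the subset of ReactNode handled here; React elements are out of scope.
type PromptNode = string | number | null | undefined | boolean | PromptNode[]

function extractPrompt(children: PromptNode): string {
  // React renders null/undefined/booleans as nothing; mirror that here.
  if (children == null || typeof children === 'boolean') return ''
  // Flatten nested arrays of children in order.
  if (Array.isArray(children)) return children.map(extractPrompt).join('')
  return String(children)
}
```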
Authentication
Uses OPENAI_API_KEY environment variable:
export OPENAI_API_KEY="sk-..."
Required in GitHub Actions secrets for CI integration.
Examples of Use Cases
Use Case 1: Code Review
const reviewSchema = z.object({
  decision: z.enum(['approve', 'request_changes']),
  blocking: z.boolean(),
  issues: z.array(z.object({
    severity: z.enum(['critical', 'major', 'minor']),
    message: z.string(),
    file: z.string().optional()
  })),
  summary: z.string()
})

<Codex
  model="gpt-4"
  schema={reviewSchema}
  systemPrompt="You are a senior code reviewer. Focus on correctness, security, and maintainability."
  onFinished={async (result) => {
    const review = result.parsedOutput
    if (!review) return // parsedOutput is optional; guard before use
    await db.state.set('lastReview', review)
    if (review.decision === 'request_changes') {
      await postReviewComment(review.issues)
    }
  }}
>
  Review this PR:
  {prDiff}
  Check for:
  - Security vulnerabilities
  - Performance issues
  - Type safety
  - Test coverage
</Codex>
Use Case 2: Refactoring
<Codex
  model="gpt-4-turbo"
  temperature={0.3} // Lower temp for focused refactoring
  systemPrompt="Expert in React hooks and modern patterns. Preserve functionality while improving code quality."
>
  Refactor this component to:
  1. Use modern React hooks (no class components)
  2. Extract custom hooks for reusable logic
  3. Add proper TypeScript types
  4. Improve performance with memoization
  {componentCode}
</Codex>
Use Case 3: Bug Analysis
const bugAnalysisSchema = z.object({
  rootCause: z.string(),
  affectedFiles: z.array(z.string()),
  suggestedFix: z.string(),
  testCases: z.array(z.string())
})

<Codex
  model="gpt-4"
  schema={bugAnalysisSchema}
  systemPrompt="Expert debugger. Provide root cause analysis and concrete fixes."
>
  Analyze this bug:
  Error: {errorMessage}
  Stack trace: {stackTrace}
  Recent changes: {gitDiff}
  Provide root cause, affected files, and suggested fix.
</Codex>
Alternatives Considered
- Direct OpenAI API calls: No CLI abstraction, harder to test
- Generic LLM component: Less specialized, more configuration needed
- Wrap existing tools: Codex CLI not widely available, need custom wrapper
- Only support Claude: Single point of failure, vendor lock-in
Migration Path
Current pattern (manual OpenAI API):
// Before (manual API calls)
const completion = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [
    { role: 'system', content: systemPrompt },
    { role: 'user', content: prompt }
  ],
  temperature: 0.7
})
const result = completion.choices[0].message.content
With Codex component:
// After (declarative)
<Codex
  model="gpt-4"
  temperature={0.7}
  systemPrompt={systemPrompt}
>
  {prompt}
</Codex>
Benefits: Task tracking, worktree integration, error handling, observability.
Feedback
If you have feedback on this planned component, please open an issue.