Prerequisites
- A working Smithers workflow (see Tutorial: Build a Workflow)
- At least one <Task> with an agent
Step 1: Import Scorers
Step 2: Attach Scorers to a Task
Add the scorers prop to any <Task>:
Step 3: Add LLM-based Scoring (Optional)
For LLM-as-judge evaluation, pass an agent to the scorer factory:
Step 4: Run Your Workflow
If your workflow is in .smithers/workflows, use smithers workflow run <name> instead.
Scorers run asynchronously after each task finishes. They never slow down your workflow.
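One way to picture scoring that does not block the workflow is the fire-and-forget pattern sketched below. This is an illustration of the general idea only, not Smithers' implementation: BackgroundScorer, recordedScores, scoreInBackground, and finishTask are hypothetical names invented for this example.

```typescript
// Hypothetical shapes for illustration; these are not Smithers types.
type BackgroundScorer = {
  name: string;
  score: (output: string) => Promise<number>;
};

// Where finished scores end up in this sketch (a real system would persist them).
const recordedScores: { scorer: string; value: number }[] = [];

// Runs every scorer concurrently and records each result when it resolves.
// A failing scorer is logged and skipped so it cannot affect the others.
function scoreInBackground(
  output: string,
  scorers: BackgroundScorer[],
): Promise<void> {
  return Promise.all(
    scorers.map(async (scorer) => {
      try {
        const value = await scorer.score(output);
        recordedScores.push({ scorer: scorer.name, value });
      } catch (err) {
        console.error(`scorer ${scorer.name} failed`, err);
      }
    }),
  ).then(() => undefined);
}

// Returns the task output immediately. Scoring is started but not awaited,
// so the caller can move on to the next task without waiting for scores.
function finishTask(output: string, scorers: BackgroundScorer[]): string {
  void scoreInBackground(output, scorers);
  return output;
}
```

Because finishTask never awaits the scoring promise, the caller gets the output back before any score has been recorded.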
Step 5: View Scores
CLI
TUI
Open the TUI with smithers tui, navigate to a task, and switch to the Scores tab to see per-task scoring results.
Step 6: Custom Scorers
Build your own scorer with createScorer:
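As a rough illustration of what a custom scorer computes, the sketch below scores an output against a word budget. It deliberately does not call createScorer, whose exact signature is not shown here; the ScoreResult and SimpleScorer shapes and the makeWordBudgetScorer helper are hypothetical names chosen for this example.

```typescript
// Hypothetical shapes for illustration; not Smithers' actual types.
interface ScoreResult {
  score: number; // normalized to the range 0..1
  reason: string; // human-readable explanation of the score
}

interface SimpleScorer {
  id: string;
  score(input: string, output: string): ScoreResult;
}

// Builds a scorer that gives full marks to outputs within maxWords words
// and a proportionally lower score to longer outputs.
function makeWordBudgetScorer(maxWords: number): SimpleScorer {
  return {
    id: `word-budget-${maxWords}`,
    score(_input: string, output: string): ScoreResult {
      // Split on whitespace and drop empty pieces so "" counts as 0 words.
      const words = output.trim().split(/\s+/).filter(Boolean).length;
      const score = words <= maxWords ? 1 : maxWords / words;
      return {
        score,
        reason: `${words} words against a budget of ${maxWords}`,
      };
    },
  };
}
```

The core logic of a real scorer would follow the same pattern: take the task's input and output, and return a normalized score with a short reason.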
Step 7: LLM-as-Judge Custom Scorers
Use llmJudge to build custom LLM-based scorers:
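The LLM-as-judge pattern in general has three parts: build a grading prompt, send it to a model, and parse a numeric score from the reply. The sketch below shows those parts in isolation and does not use llmJudge itself. JudgeModel, buildJudgePrompt, parseJudgeReply, and judgeOutput are hypothetical names, and the model is any function you supply that maps a prompt to a reply.

```typescript
// Hypothetical: any function that sends a prompt to a model and returns its reply.
type JudgeModel = (prompt: string) => Promise<string>;

// Builds the grading prompt from the criteria and the output under evaluation.
function buildJudgePrompt(criteria: string, output: string): string {
  return [
    "You are grading the output of an automated task.",
    `Criteria: ${criteria}`,
    "Reply with a single number between 0 and 1.",
    "Output to grade:",
    output,
  ].join("\n");
}

// Extracts the first number from the reply. Returns null when no number is
// found or when it falls outside 0..1, so callers can treat the reply as unusable.
function parseJudgeReply(reply: string): number | null {
  const match = reply.match(/\d+(?:\.\d+)?/);
  if (match === null) return null;
  const value = Number(match[0]);
  return value >= 0 && value <= 1 ? value : null;
}

// Ties the pieces together: prompt the model, then parse its reply.
async function judgeOutput(
  model: JudgeModel,
  criteria: string,
  output: string,
): Promise<number | null> {
  const reply = await model(buildJudgePrompt(criteria, output));
  return parseJudgeReply(reply);
}
```

Keeping the parsing step separate makes it easy to test against malformed replies without calling a model.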
Batch Evaluation
For testing and offline evaluation, use runScorersBatch directly:
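Conceptually, batch evaluation applies each scorer to every recorded sample and aggregates the results. The sketch below shows that idea with a mean per scorer. It is a stand-in written for illustration and does not reflect the actual signature or return shape of runScorersBatch; Sample, BatchScorer, and scoreBatch are hypothetical names.

```typescript
// Hypothetical shapes for illustration; not Smithers' actual types.
type Sample = { input: string; output: string };
type BatchScorer = { id: string; score: (sample: Sample) => number };

// Applies every scorer to every sample and returns the mean score per scorer id.
// An empty sample list yields a mean of 0 rather than dividing by zero.
function scoreBatch(
  samples: Sample[],
  scorers: BatchScorer[],
): Record<string, number> {
  const means: Record<string, number> = {};
  for (const scorer of scorers) {
    const total = samples.reduce((sum, s) => sum + scorer.score(s), 0);
    means[scorer.id] = samples.length === 0 ? 0 : total / samples.length;
  }
  return means;
}
```

In a test suite, a fixed list of samples and an assertion on the aggregated means can serve as a regression check on output quality.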