Scenarios
Generic vs. Specific Steps
How you write customer step nodes directly affects how closely the simulated customer follows the script versus improvising naturally.
First-person (specific) — Writing steps like a transcript tells the simulation agent to stick closely to that exact phrasing:
Scenario Structure and Organization
Roark scenarios are graph-based — each scenario is a tree of customer and agent nodes that models the customer’s journey through a conversation. Each unique path from root to leaf in the tree is treated as a distinct scenario path.
Start with a happy path
Begin with a single expected customer path — the “happy path” where everything goes as planned. This is a high-level example — if you’re testing specific flows, you can expand on each step to add more detail:
Add branches for edge cases
Once your happy path works, add branches at points where the conversation can diverge. The graph structure means you only define the shared steps once — branches inherit everything above them. Focus on branching at points where:
- The agent asks the customer a question (customers can respond in unexpected ways)
- The agent could go on a tangent or lose track of the conversation
- Tool calls or lookups might fail or return unexpected results
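The branching structure described above can be sketched as a small tree in which every root-to-leaf walk is one scenario path. This is an illustrative Python sketch only — the node structure and field names are hypothetical, not Roark’s API:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One step in a scenario graph (hypothetical structure for illustration)."""
    speaker: str                  # "customer" or "agent"
    text: str
    children: list["Node"] = field(default_factory=list)

def paths(node, prefix=()):
    """Enumerate every root-to-leaf path; each one is a distinct scenario path."""
    current = prefix + (node.text,)
    if not node.children:
        yield current
    else:
        for child in node.children:
            yield from paths(child, current)

# Happy path plus one branch where the customer answers unexpectedly.
root = Node("agent", "Hi, how can I help?", [
    Node("customer", "I'd like to book an appointment.", [
        Node("agent", "Sure, what day works for you?", [
            Node("customer", "Tuesday at 3pm."),                  # happy path
            Node("customer", "Actually, can I cancel instead?"),  # edge-case branch
        ]),
    ]),
])

print(len(list(paths(root))))  # → 2
```

Note how the shared steps above the branch point are defined once and inherited by both leaves — the same property the graph structure gives you in the scenario builder.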
Use Links for reusable flows
If you need to model hundreds or thousands of scenarios that share a common flow — such as an IVR menu — use Scenario Link nodes instead of duplicating branches. A Scenario Link node references another scenario graph and embeds it inline. This is particularly useful for IVR trees, authentication flows, or any shared entry sequence: keep your shared IVR tree in a single scenario, and reference it from other scenarios via Link nodes. This way, if the IVR menu changes, you update it in one place.
Templating with Variables
Scenarios support template variables using {{variableName}} syntax. Variables are replaced at runtime, making it easy to reuse a single scenario across different contexts.
Persona attributes are available through the {{persona.*}} prefix. These are automatically injected based on the selected persona at runtime:
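As a rough illustration of how {{variableName}} substitution behaves, here is a minimal renderer sketch. The `render` helper and the variable names are hypothetical — Roark performs the real replacement at runtime:

```python
import re

def render(template, variables):
    """Replace {{variableName}} placeholders; unknown names are left intact."""
    def sub(match):
        name = match.group(1).strip()
        return str(variables.get(name, match.group(0)))
    return re.sub(r"\{\{(.*?)\}\}", sub, template)

# A templated customer step reused across contexts by swapping variables.
step = "Hi, this is {{persona.name}} calling about order {{orderId}}."
print(render(step, {"persona.name": "James", "orderId": "A-1042"}))
# → Hi, this is James calling about order A-1042.
```

Because the same template renders differently per variable set, one scenario graph can serve many concrete test contexts.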
Generating Scenarios with the AI Assistant
Roark’s AI assistant can generate all of the above for you. Our recommended workflow:
- Start from production calls — Generate scenarios from real calls to get representative conversation flows. This gives you a realistic baseline that reflects how customers actually interact with your agent.
- Extend with branches — Once you have a production-based scenario, add branches to cover paths that didn’t occur in the original call but could happen in production.
- Generate across multiple calls — Select several calls and let the assistant identify the unique and custom paths across them, automatically building a branching scenario graph.
Generating scenarios from real calls is the fastest way to get high-quality, representative test coverage. The AI assistant is available in the scenario builder — look for the Generate options when creating a new scenario.
Personas
Personas model the simulated customer throughout the call. A good persona strategy tests your agent across a range of realistic caller profiles.
Diversify Voice and Speech
Include personas that cover:
- Different accents and languages — Battle-test your transcriber’s accuracy across accents (US, British, Indian, Spanish, etc.) and languages
- Background noise — Verify your endpointing model works well in non-ideal audio conditions (office noise, etc.)
- Different response times — Use varied speech paces (slow, normal, fast) to ensure your agent doesn’t interrupt the customer or time out prematurely
Test Difficult Customer Types
Set up personas that represent challenging interactions your agent needs to handle gracefully:
AI Skeptic
A customer who is suspicious they’re talking to an AI and tests the agent with trick questions
Offensive Caller
A rude or hostile customer — verify your agent always responds politely and professionally
Sensitive Situation
A customer going through a difficult time (bereavement, financial hardship) — ensure your agent is empathetic and considerate
Rapid Switcher
A customer who changes topics frequently and tests your agent’s ability to stay on track
Use Backstories
The backstory field is where personas come to life. It’s a prompt injected into the simulation agent that shapes the customer’s entire behavior. Good backstories provide context that drives realistic, nuanced interactions. Example backstories:
James — Bereaved Customer
Run Plan Configuration
A run plan creates a matrix of simulations — one for each combination of agent endpoint, scenario, and persona. How you configure that matrix depends on what you’re testing.
Common Patterns
Load Testing
Battle-test your agent against thousands of concurrent calls.
- Pick any persona and a scenario matching your desired call duration
- Set iterations to your target volume (e.g., 1000)
- Set concurrency to the same value so all simulations hit the agent simultaneously
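A load-test run plan along these lines can be sketched as the endpoint × scenario × persona matrix repeated to the target volume, with a semaphore standing in for the concurrency setting. All names below are hypothetical placeholders, not Roark configuration:

```python
import asyncio
from itertools import product

# Hypothetical run-plan inputs: one endpoint, one scenario, one persona.
endpoints = ["agent-prod"]
scenarios = ["happy-path"]
personas  = ["neutral-us-english"]
iterations, concurrency = 1000, 1000

# One simulation per (endpoint, scenario, persona) combination, per iteration.
matrix = list(product(endpoints, scenarios, personas)) * iterations

async def run_all(matrix, concurrency):
    sem = asyncio.Semaphore(concurrency)   # caps in-flight simulations
    async def simulate(combo):
        async with sem:
            await asyncio.sleep(0)         # placeholder for one simulated call
    await asyncio.gather(*(simulate(c) for c in matrix))

asyncio.run(run_all(matrix, concurrency))
print(len(matrix))  # → 1000
```

With concurrency equal to the iteration count, every simulation is admitted at once — the behavior you want for a load test, and exactly what you should avoid in ordinary runs.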
Adaptability Across Personas
Test how your agent handles different voices, accents, and personalities on the same flow.
- Select a single scenario (typically your happy path)
- Choose multiple personas with a wide variety of accents, languages, speech paces, and personality types
Instruction Following
Agents are non-deterministic — it’s critical to verify they don’t go off-script or hit loopholes.
- Create scenarios with multiple branches covering edge cases and recovery paths
- Focus on points where the agent might go on a tangent or fail to recover
Red Teaming
Test your agent’s resilience against adversarial inputs.
- Set up scenarios covering prompt injection attempts, PII extraction, and social engineering
- Use personas with adversarial backstories
Keeping Simulations Under Control
| Setting | Recommendation |
|---|---|
| Max simulation duration | Set to ~110% of your average call duration. Prevents runaway calls if the agent goes on a tangent. |
| Silence timeout | Ensures calls hang up after a set period of silence, catching cases where either side stops responding. |
| End phrases | Specific phrases that indicate a call has gone off track. When matched, the simulation ends immediately. |
| End reasons | When you don’t have specific phrases, define end reasons instead. An LLM evaluates each turn and ends the call if the reason is met. |
| Concurrency | For non-load-testing runs, keep concurrency at 5 or below to avoid unnecessary load on your agent and manage costs. Reserve high concurrency for dedicated load tests. |
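The duration, silence, and end-phrase guardrails in the table amount to a simple per-turn check. A hedged sketch, with hypothetical function and parameter names (the LLM-evaluated end reasons are omitted, since those depend on a model call):

```python
def should_end(turn_text, elapsed_s, silence_s, *,
               max_duration_s, silence_timeout_s, end_phrases):
    """Return an end reason if any guardrail fires, else None (illustrative only)."""
    if elapsed_s >= max_duration_s:
        return "max_duration"          # runaway call, e.g. agent on a tangent
    if silence_s >= silence_timeout_s:
        return "silence_timeout"       # either side stopped responding
    lowered = turn_text.lower()
    if any(p.lower() in lowered for p in end_phrases):
        return "end_phrase"            # matched a phrase that signals off-track
    return None

reason = should_end(
    "I'm sorry, I can't help with that. Goodbye!",
    elapsed_s=95, silence_s=0,
    max_duration_s=120, silence_timeout_s=15,
    end_phrases=["goodbye"],
)
print(reason)  # → end_phrase
```

The check order matters: hard limits (duration, silence) fire regardless of content, while phrase matching only applies to the latest turn.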
Next Steps
Scenarios
Build and manage scenario graphs
Personas
Create diverse customer profiles
Variables
Templatize scenarios with dynamic values
Run Plans
Configure and execute simulation matrices

