The Simulation & Evaluation Engine for Voice AI

Roark is the evaluation layer for voice AI agents. Monitor every production call in real-time, run metrics powered by Roark Prism — our purpose-built evaluation model — and stress-test agents with simulations before they reach customers.

Monitor

Every call is transcribed, analyzed for sentiment, emotions, and speech patterns, and made searchable in real-time.

Evaluate

Collect metrics automatically — from response time to custom LLM evaluations — with pass/fail thresholds.

Simulate

Test agents with synthetic callers across personas, accents, and edge cases before deploying.

How It Works

1. Get calls into Roark

Connect a voice platform (Vapi, Retell, ElevenLabs, LiveKit, others) and calls sync automatically. Or upload recordings directly via the SDK.

2. Calls are analyzed automatically

Every call is transcribed, then run through Roark’s voice analysis models — detecting sentiment, 64+ emotions, interruptions, speech pauses, and vocal cues. Active metric policies collect any additional metrics you’ve configured.

3. Evaluate with metrics and thresholds

Define what good looks like. Use built-in metrics like response time or create custom evaluations with Roark Prism (e.g., “Did the agent verify the caller’s identity?”). Add thresholds to turn values into pass/fail outcomes.
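The threshold step can be pictured with a minimal TypeScript sketch. This is conceptual only, not the Roark SDK: the `Threshold` type and `evaluate` function are illustrative assumptions showing how a numeric metric value plus a threshold yields a pass/fail outcome.

```typescript
// Conceptual sketch (not the Roark SDK): a threshold turns a raw
// metric value into a boolean pass/fail outcome.
type Threshold =
  | { kind: 'max'; limit: number }  // value must stay at or below the limit
  | { kind: 'min'; limit: number }; // value must stay at or above the limit

interface MetricResult {
  name: string;
  value: number;
  passed: boolean;
}

function evaluate(name: string, value: number, t: Threshold): MetricResult {
  const passed = t.kind === 'max' ? value <= t.limit : value >= t.limit;
  return { name, value, passed };
}

// Example: average response time must not exceed 1.5 seconds.
const result = evaluate('avg_response_time_s', 1.2, { kind: 'max', limit: 1.5 });
console.log(result.passed); // true
```

The same shape works for custom LLM evaluations: a yes/no question like identity verification is just a metric whose value is 0 or 1 with a `min` threshold of 1.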

4. Test before you ship

Run simulations against your agent with synthetic callers. Attach metrics with thresholds to your run plan — if the agent doesn’t meet your bar, you know before customers do.
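The gating idea above can be sketched in a few lines of TypeScript. Again this is a conceptual illustration, not the Roark SDK: `SimResult` and `runPlanPasses` are hypothetical names showing how a run plan passes only when every attached metric meets its threshold across all simulated scenarios.

```typescript
// Conceptual sketch (not the Roark SDK): gate a release on simulation
// results. The run plan passes only if every metric met its bar.
interface SimResult {
  scenario: string; // synthetic-caller scenario that was simulated
  metric: string;   // metric attached to the run plan
  passed: boolean;  // did the value meet its threshold?
}

function runPlanPasses(results: SimResult[]): boolean {
  return results.every((r) => r.passed);
}

const results: SimResult[] = [
  { scenario: 'impatient caller, heavy accent', metric: 'identity_verified', passed: true },
  { scenario: 'noisy line, long pauses', metric: 'avg_response_time_s', passed: false },
];

if (!runPlanPasses(results)) {
  // One scenario missed its bar, so the agent is not ready to ship.
  console.log('Run plan failed: hold the deploy');
}
```

Wiring a check like this into CI is what turns simulations into a pre-deploy gate: the build fails before a real customer ever hits the regression.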

Quick Start

Send a call recording and let Roark handle the rest:
import Roark from '@roarkanalytics/sdk'

const client = new Roark({
  bearerToken: process.env.ROARK_API_BEARER_TOKEN,
})

// Upload a recording; transcription and analysis run automatically.
const call = await client.call.create({
  recordingUrl: 'https://example.com/recording.wav',
  startedAt: '2025-01-15T10:00:00Z', // ISO 8601 timestamp of the call start
  interfaceType: 'PHONE',
  callDirection: 'INBOUND',
  agent: { name: 'Support Agent' },
  customer: { phoneNumberE164: '+15551234567' }, // E.164-formatted number
})
The call appears in Call History with full transcription, sentiment, and emotion analysis. Any active metric policies run automatically.

Explore the Docs

Observability

Call history, reports, and dashboards

Metrics

Definitions, policies, thresholds, and the playground

Simulations

Personas, scenarios, run plans, and schedules

Integrations

Vapi, Retell, ElevenLabs, Leaping, LiveKit, Pipecat, and custom

SDKs

Node.js, Python, and MCP Server

API Reference

REST API documentation