Overview

Metrics are how you measure what happens in every voice AI conversation. Roark collects some metrics automatically — like response time, talk time, and sentiment — and lets you define your own using LLM prompts for things like compliance checks, task completion, or custom business KPIs.

What’s a Metric?

A metric is a single measurement collected from a call. It has:

An output type — boolean, numeric, scale, text, classification, count, or offset
A scope — global (one value per call) or per-participant (separate values for agent and customer)
A context — call-level, segment-level (single utterance), or segment-range (span of conversation)

For example, response_time is a numeric, per-participant, segment-range metric that measures how long each speaker takes to respond. identity_verified might be a boolean, global, call-level metric powered by an LLM prompt.

Types of Metrics

System Metrics (Built-in)

Roark automatically collects deterministic and voice-analysis metrics for every call — no configuration needed. Voice-analysis metrics are powered by Roark’s custom voice analysis models, purpose-built to extract signal from conversational audio:

Performance — Response time, talk time, silence duration, overlap/interruptions, latency
Emotion & Sentiment — Sentiment tracking, 64+ emotion detection, vocal cues (raised voice, frustration), stress indicators
Speech — Interruption detection, pause analysis, repetition detection

These are always available in Call History and can be used in reports.

Custom Metrics (LLM-Powered)

Define your own metrics using natural language prompts. Custom metrics are evaluated by Roark Prism — our purpose-built evaluation model optimized for scoring voice AI conversations. You write a prompt describing what to measure, and Prism evaluates each call against it to return a typed result:

“Did the agent verify the caller’s identity?” → Boolean
“Rate the agent’s empathy on a scale of 1-10” → Scale
“What was the primary reason for the call?” → Classification
“How many times did the agent attempt to upsell?” → Count

Custom metrics are created in the dashboard or via the SDK.

How It All Fits Together

Here’s the typical sequence from defining a metric to seeing results:

Define or Choose a Metric

Use a built-in system metric, enable one from the marketplace, or create your own custom metric with an LLM prompt.

Test in the Playground

Run your metric against a real call in the Playground to validate it produces the results you expect. Iterate on the prompt until you’re satisfied.

Add Thresholds (Optional)

Set pass/fail criteria on your metric — for example, Customer Satisfaction >= 7 or Response Time < 1000ms. Thresholds turn raw values into actionable outcomes.

Decide When to Collect

Choose how and when your metric runs:

Metric Policies — Automate collection on incoming calls (monitoring). Set conditions to target specific agents, sources, or call properties.
Simulation Run Plans — Attach metrics with thresholds to simulation tests to validate agent behavior before deployment.
Collection Jobs — Run metrics on demand against existing calls via the SDK, useful for backfilling or re-processing.

Analyze Results

View metric values per call in Call History, aggregate them in Reports, and organize everything in Dashboards.

Quick Start Examples

Use a System Metric
Create a Custom Metric
Use Metrics in Simulations

System metrics like response_time are already collected for every call. To set a quality bar, create a metric policy with a threshold:

const policy = await client.metricPolicy.create({
  name: 'Agent Response Time SLA',
  status: 'ACTIVE',
  conditions: [
    {
      conditions: [
        {
          conditionType: 'AGENT',
          conditionKey: 'your-agent-id',
        },
      ],
    },
  ],
  metrics: [
    {
      id: 'response-time-metric-id',
      threshold: {
        operator: 'LESS_THAN',
        value: 1000,
        aggregationMode: 'P95',
        participantRole: 'AGENT',
      },
    },
  ],
})

Calls where the agent’s P95 response time exceeds 1 second are automatically flagged as failures.

Define a business-specific metric and add it to a policy. Roark Prism evaluates each call against your prompt:

// 1. Create the metric definition
const metric = await client.metric.createDefinition({
  name: 'Appointment Booked',
  outputType: 'BOOLEAN',
  analysisPackageId: 'your-package-id',
  llmPrompt: 'Did the agent successfully book an appointment for the caller?',
  booleanTrueLabel: 'Booked',
  booleanFalseLabel: 'Not Booked',
})

// 2. Add it to a policy so it runs on every call
const policy = await client.metricPolicy.create({
  name: 'Appointment Booking Check',
  status: 'ACTIVE',
  metrics: [{ id: metric.data.id }],
})

Test your prompt in the Playground first to validate it produces the results you expect.

Attach metrics with thresholds to a simulation run plan to validate agent behavior before deployment. Run the simulation, and each call is scored against your pass/fail criteria:

// Run metrics against simulation calls after they complete
const job = await client.metricCollectionJob.create({
  callIds: ['sim-call-1', 'sim-call-2', 'sim-call-3'],
  metrics: [
    { id: 'task-completion-metric-id' },
    { id: 'compliance-check-metric-id' },
  ],
})

// Check results
const result = await client.metricCollectionJob.getByID(job.data.id)
console.log(`${result.data.completedItems}/${result.data.totalItems} processed`)

Or configure metrics with thresholds directly in a simulation run plan from the dashboard.

Sections

Metric Definitions

Types, scopes, output formats, and how to create custom metrics

Playground

Test metrics interactively against real calls

Metric Policies

Automate metric collection with conditions-based rules

Thresholds

Define pass/fail criteria for your metrics

Collection Jobs

Run metrics on demand for existing calls via the SDK

Getting Started

Observability

Metrics

Simulations

Integrations

SDKs & Libraries

Resources

What’s a Metric?

Types of Metrics

System Metrics (Built-in)

Custom Metrics (LLM-Powered)

How It All Fits Together

Quick Start Examples

Sections

Metric Definitions

Playground

Metric Policies

Thresholds

Collection Jobs

Getting Started

Observability

Metrics

Simulations

Integrations

SDKs & Libraries

Resources

​What’s a Metric?

​Types of Metrics

​System Metrics (Built-in)

​Custom Metrics (LLM-Powered)

​How It All Fits Together

​Quick Start Examples

​Sections

Metric Definitions

Playground

Metric Policies

Thresholds

Collection Jobs

What’s a Metric?

Types of Metrics

System Metrics (Built-in)

Custom Metrics (LLM-Powered)

How It All Fits Together

Quick Start Examples

Sections