Metric definitions describe what to measure and how. Roark comes with built-in system metrics that work out of the box, and you can create your own custom metrics tailored to your business needs.

Creating Custom Metrics

Custom metrics let you measure anything specific to your use case — task completion, compliance checks, quality scoring, or business KPIs.

Custom Metric Types

Write a natural-language prompt describing what to measure. Roark Prism — our evaluation model optimized for voice AI — scores each call against your prompt and returns a typed result.
"Did the agent verify the caller's identity?"           →  Boolean
"Rate the agent's empathy on a 1-5 scale"               →  Scale
"What was the primary reason for the call?"              →  Classification
"How many times did the agent attempt to upsell?"        →  Count
Best for subjective assessments, business logic, and anything that requires understanding conversational context.

Configuration Steps

1. Define the Metric: Name your metric, choose an output type, and describe what it measures. For LLM as Judge metrics, write the evaluation prompt. For formulas, build the expression from existing metrics.

2. Test in the Playground: Run your metric against a real call in the Playground to validate that it produces the results you expect. Iterate until you're satisfied.

3. Add to a Policy or Run Plan: Attach the metric to a metric policy for automated collection on incoming calls, or to a simulation run plan for testing.

SDK Reference

All API endpoints require authentication. Generate an API key to get started.

Create a Metric Definition

Create a new custom metric definition using the SDK.

Parameters:
Field              Type      Required     Description
name               string    Yes          Name of the metric (1-100 characters)
outputType         string    Yes          One of: BOOLEAN, NUMERIC, TEXT, SCALE, CLASSIFICATION, COUNT, OFFSET
analysisPackageId  string    Yes          UUID of the analysis package to add this metric to
metricId           string    No           Unique identifier (auto-generated from name if omitted)
scope              string    No           GLOBAL (default) or PER_PARTICIPANT
participantRole    string    Conditional  Required when scope is PER_PARTICIPANT
supportedContexts  string[]  No           Defaults to ["CALL"]
llmPrompt          string    No           The LLM prompt used to evaluate this metric (max 2000 chars)
Type-specific fields:
Field                                 Applies To      Description
booleanTrueLabel / booleanFalseLabel  BOOLEAN         Custom labels for true/false values
scaleMin / scaleMax                   SCALE           Range boundaries (0-100)
scaleLabels                           SCALE           Array of label objects with rangeMin, rangeMax, label, displayOrder
classificationOptions                 CLASSIFICATION  Array of options with label, description, displayOrder
maxClassifications                    CLASSIFICATION  Maximum number of classifications to select
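Before sending a SCALE definition, you may want to sanity-check that the scaleLabels tile the full range with no gaps or overlaps. A minimal sketch of that check (the helper name and logic are ours, not part of the SDK):

```typescript
interface ScaleLabel {
  rangeMin: number
  rangeMax: number
  label: string
  displayOrder: number
}

// Returns true when the labels, taken in displayOrder, cover
// scaleMin..scaleMax contiguously with no gaps or overlaps.
function labelsCoverScale(scaleMin: number, scaleMax: number, labels: ScaleLabel[]): boolean {
  const sorted = [...labels].sort((a, b) => a.displayOrder - b.displayOrder)
  let expected = scaleMin
  for (const l of sorted) {
    if (l.rangeMin !== expected || l.rangeMax < l.rangeMin) return false
    expected = l.rangeMax + 1
  }
  return expected === scaleMax + 1
}
```

Running a check like this locally before calling the API catches label-range mistakes early, instead of at evaluation time.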
Example: Create a BOOLEAN metric
const metric = await client.metric.createDefinition({
  name: 'Identity Verified',
  outputType: 'BOOLEAN',
  analysisPackageId: 'your-package-id',
  llmPrompt: 'Did the agent successfully verify the caller identity before proceeding with the request?',
  booleanTrueLabel: 'Verified',
  booleanFalseLabel: 'Not Verified',
})
Example: Create a SCALE metric
const metric = await client.metric.createDefinition({
  name: 'Customer Satisfaction',
  outputType: 'SCALE',
  analysisPackageId: 'your-package-id',
  llmPrompt: 'Rate the overall customer satisfaction based on the conversation tone, resolution, and agent helpfulness.',
  scaleMin: 1,
  scaleMax: 10,
  scaleLabels: [
    { rangeMin: 1, rangeMax: 3, label: 'Poor', displayOrder: 1 },
    { rangeMin: 4, rangeMax: 6, label: 'Average', displayOrder: 2 },
    { rangeMin: 7, rangeMax: 10, label: 'Excellent', displayOrder: 3 },
  ],
})

List Metric Definitions

Retrieve all available metric definitions for your project:
const definitions = await client.metric.listDefinitions()

// definitions.data[0]
{
  id: 'uuid',
  metricId: 'response_time',
  name: 'Response Time',
  description: 'Time taken to respond to a question',
  type: 'OFFSET',
  scope: 'PER_PARTICIPANT',
  supportedContexts: ['SEGMENT_RANGE', 'CALL'],
  unit: { name: 'milliseconds', symbol: 'ms' },
}
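If you attach metrics programmatically, it is often handy to pick definitions out of this list by type and scope. A small sketch against the response shape above (the helper name is ours, not part of the SDK):

```typescript
interface MetricDefinition {
  id: string
  metricId: string
  name: string
  type: string
  scope: string
}

// Select the per-participant definitions of a given type.
function perParticipantOfType(defs: MetricDefinition[], type: string): MetricDefinition[] {
  return defs.filter((d) => d.type === type && d.scope === 'PER_PARTICIPANT')
}
```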

Get Call Metrics

Retrieve all metrics for a specific call:
const metrics = await client.call.listMetrics('call-id')

// Or flatten to get a simple list instead of one grouped by definition
const flat = await client.call.listMetrics('call-id', { flatten: 'true' })
The response groups metrics by definition, with each metric containing an array of values:
// metrics.data[0]
{
  metricDefinitionId: 'uuid',
  metricId: 'response_time',
  name: 'Response Time',
  type: 'OFFSET',
  scope: 'PER_PARTICIPANT',
  unit: { name: 'milliseconds', symbol: 'ms' },
  values: [
    {
      value: 2500,
      context: 'SEGMENT_RANGE',
      participantRole: 'agent',
      confidence: 1.0,
      computedAt: '2024-01-15T10:30:00Z',
      fromSegment: { id: 'uuid', text: 'How can I help you today?', startOffsetMs: 1000, endOffsetMs: 2000 },
      toSegment: { id: 'uuid', text: 'I have a question about my bill', startOffsetMs: 4500, endOffsetMs: 6000 },
    },
  ],
}
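If you have already fetched the grouped response, the same flattening can be done client-side. A sketch against the response shape above (the output row shape is our choice, not necessarily what flatten: 'true' returns):

```typescript
interface MetricValue {
  value: number | string | boolean
  context: string
  participantRole?: string
  confidence: number
  computedAt: string
}

interface GroupedMetric {
  metricDefinitionId: string
  metricId: string
  name: string
  values: MetricValue[]
}

// Expand each grouped metric into one row per value, carrying
// the definition fields alongside the value fields.
function flattenMetrics(groups: GroupedMetric[]) {
  return groups.flatMap((g) =>
    g.values.map((v) => ({ metricId: g.metricId, name: g.name, ...v })),
  )
}
```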

Understanding Metric Values

Confidence Scores:
  • All metrics include a confidence field (0-1)
  • Deterministic metrics (like word count, duration) have confidence = 1.0
  • AI-powered metrics include the model’s confidence level
Value Reasoning:
  • For AI-computed metrics, the valueReasoning field provides explanation
  • Useful for understanding why a metric was scored a certain way
  • Example: “The agent verified identity using two-factor authentication as mentioned in segment 3”
Segment Context:
  • When context is SEGMENT, the segment field contains the specific utterance
  • When context is SEGMENT_RANGE, both fromSegment and toSegment are included
  • All segment objects include the full text and timing information
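One practical use of the confidence field is discarding low-confidence AI-computed values before aggregating. A minimal sketch (the threshold and helper are ours, not an SDK feature):

```typescript
interface ScoredValue {
  value: number
  confidence: number
}

// Average only the values whose confidence meets the threshold;
// returns null when nothing qualifies.
function confidentAverage(values: ScoredValue[], minConfidence = 0.7): number | null {
  const kept = values.filter((v) => v.confidence >= minConfidence)
  if (kept.length === 0) return null
  return kept.reduce((sum, v) => sum + v.value, 0) / kept.length
}
```

Deterministic metrics (confidence = 1.0) always survive such a filter; only uncertain AI judgments are dropped.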

Best Practices

Use the built-in system metrics first — they cover performance, sentiment, interruptions, compliance, and more with no setup. Add custom metrics for business-specific needs.
Always validate custom metrics in the Playground on representative calls before attaching them to policies.
Instead of creating one complex LLM prompt that tries to measure everything, break it into focused metrics and combine them with a formula.
Track both agent and customer metrics for complete conversation understanding.
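The "focused metrics plus formula" advice above can be sketched: rather than one prompt scoring overall quality, define several boolean checks and combine them. A minimal example of the combining step (a weighted pass rate; the weights and function are illustrative, not part of Roark's formula builder):

```typescript
// Each focused boolean metric contributes its weight when it passes;
// the composite score is the fraction of total weight earned.
function compositeScore(checks: { passed: boolean; weight: number }[]): number {
  const total = checks.reduce((s, c) => s + c.weight, 0)
  if (total === 0) return 0
  const earned = checks.filter((c) => c.passed).reduce((s, c) => s + c.weight, 0)
  return earned / total
}
```

Each input here would be one focused LLM as Judge boolean metric (identity verified, issue resolved, upsell attempted), which keeps every prompt simple and auditable.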

What’s Next

System Metrics Reference

Browse all 65+ built-in metrics powered by specialized models

Playground

Test metrics interactively before deploying

Thresholds

Define pass/fail criteria for your metrics

Policies

Automate metric collection with conditions-based rules