Interpretation: speaker and listener

For live interpretation / translation, two roles share the same endpointId but use different session modes:

  • Speaker — a normal WebRTC voice call (voice_conversation) with role: 'speaker' in browserContext.
  • Listener — a listen session that subscribes to the TelPhi interpretation stream; cached items replay, then live updates flow in.

This mirrors the InterpretationDemo.tsx example in the SDK's React examples app.

Speaker side

A standard voice call, with browser context tagging the leg as the speaker:

await delphi.startCall({
  endpointId: '24599c70-1e79-4e52-9819-e2d97acf45a5',
  autoDial: true,
  browserContext: {
    identifier: 'booth-1',
    role: 'speaker',
    sourceLanguage: 'de',
    source: 'interpretation_speaker',
    metadata: {
      interpretationScope: '24599c70-1e79-4e52-9819-e2d97acf45a5',
    },
  },
});

interpretationScope is the stream key the listener attaches to. It defaults to the endpointId; override it if your application uses a different convention.
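As a sketch of how the two sides meet on one stream key: the speaker publishes under interpretationScope and the listener attaches to the same value. The custom key 'conference-main' and the listener-side scope option are assumptions for illustration (the listener example below hints at a scope option via its "scope defaults to endpointId" comment); verify the exact option name against the SDK's typings.

```typescript
// Hypothetical custom stream key shared by both legs (assumption: the
// listener accepts a `scope` option; check the SDK's typings).
const scope = 'conference-main';

// Speaker leg: publish under the custom scope via browserContext metadata.
const speakerOptions = {
  endpointId: '24599c70-1e79-4e52-9819-e2d97acf45a5',
  autoDial: true,
  browserContext: {
    identifier: 'booth-1',
    role: 'speaker',
    sourceLanguage: 'de',
    source: 'interpretation_speaker',
    metadata: { interpretationScope: scope },
  },
};

// Listener leg: attach to the same key by passing the scope explicitly.
const listenerOptions = {
  endpointId: '24599c70-1e79-4e52-9819-e2d97acf45a5',
  identifier: 'booth-1',
  targetLanguage: 'en',
  scope,
};
```

The two legs only find each other when these values agree, so deriving both from one constant (as above) avoids a silent mismatch.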

Listener side (convenience API)

const session = await delphi.listen({
  endpointId: '24599c70-1e79-4e52-9819-e2d97acf45a5',
  identifier: 'booth-1',
  targetLanguage: 'en',
  // scope defaults to endpointId
});

The convenience method opens a listen session, sets the appropriate browser context (role: 'listener'), sends the browser.action.listen BOA, and resolves once the stream is attached. Replay starts immediately; live updates follow.

Listener side (lower level)

const session = await delphi.openSession({
  endpointId,
  mode: 'listen',
});

session.setBrowserContext({
  identifier: 'booth-1',
  role: 'listener',
  targetLanguage: 'en',
  source: 'interpretation_listener',
});

await session.listen({
  targetLanguage: 'en',
  sinceStreamId: '0-0', // replay from the beginning
  startMode: 'auto',
  captions: {enabled: true},
});

ListenOptions includes targetLanguage, sinceStreamId (Redis Stream entry id for replay), startMode, and caption configuration. See Channel protocol for the message envelope.
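Pieced together from the fields used above, the options shape looks roughly like this. This is an illustrative sketch, not the SDK's authoritative typing; which fields are optional, and the set of startMode values beyond 'auto', are assumptions to verify against the published types.

```typescript
// Sketch of ListenOptions, inferred from the fields mentioned in this doc.
interface CaptionOptions {
  enabled: boolean;
}

interface ListenOptions {
  targetLanguage: string;
  // Redis Stream entry id to replay from; '0-0' replays from the beginning.
  // Assumed optional: omitting it would mean live updates only.
  sinceStreamId?: string;
  // 'auto' appears in the example above; other values are assumptions.
  startMode?: string;
  captions?: CaptionOptions;
}

// The lower-level example above, expressed against this sketch:
const options: ListenOptions = {
  targetLanguage: 'en',
  sinceStreamId: '0-0',
  startMode: 'auto',
  captions: {enabled: true},
};
```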

React

function Listener({endpointId, identifier, targetLanguage}: Props) {
  const session = useDelphiSession({endpointId, mode: 'listen'});

  useEffect(() => {
    if (!session.serverReady) return;
    session.listen({targetLanguage});
  }, [session.serverReady, targetLanguage]);

  // render captions from session.getState().lastMessage
}

Multiple modes on one endpoint

Speaker (voice_conversation) and listener (listen) run side by side on the same endpointId because sessions are keyed by (endpointId, mode) (SDK 0.1.3+). Closing one mode leaves the other intact:

await delphi.endSession(endpointId, 'listen');
// the voice_conversation session continues

endSession(endpointId) without a mode closes every mode for the endpoint.
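The per-mode keying described above can be pictured as a map keyed by (endpointId, mode). This is a toy model of the bookkeeping, not the SDK's internals:

```typescript
type SessionMode = 'voice_conversation' | 'listen';

// Toy model: one entry per (endpointId, mode) pair.
const sessions = new Map<string, {mode: SessionMode}>();
const key = (endpointId: string, mode: SessionMode) => `${endpointId}:${mode}`;

const endpointId = '24599c70-1e79-4e52-9819-e2d97acf45a5';
sessions.set(key(endpointId, 'voice_conversation'), {mode: 'voice_conversation'});
sessions.set(key(endpointId, 'listen'), {mode: 'listen'});

// endSession(endpointId, 'listen'): only the listen entry is removed.
sessions.delete(key(endpointId, 'listen'));
console.log(sessions.has(key(endpointId, 'voice_conversation'))); // true

// endSession(endpointId) with no mode: drop every mode for the endpoint.
for (const k of [...sessions.keys()]) {
  if (k.startsWith(`${endpointId}:`)) sessions.delete(k);
}
console.log(sessions.size); // 0
```

Keying by the pair rather than by endpointId alone is what lets the speaker and listener legs coexist and close independently.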