Interpretation: speaker and listener

For live interpretation / translation, two roles share the same endpointId but use different session modes:

  • Speaker — a normal WebRTC voice call (voice_conversation) with role: 'speaker' in browserContext.
  • Listener — a listen session that subscribes to the TelPhi interpretation stream; cached items replay, then live updates flow in.

This mirrors the InterpretationDemo.tsx example in the SDK's React examples app.

Speaker side

A standard voice call, with browser context tagging the leg as the speaker:

await delphi.startCall({
  endpointId: '24599c70-1e79-4e52-9819-e2d97acf45a5',
  autoDial: true,
  browserContext: {
    identifier: 'booth-1',
    role: 'speaker',
    sourceLanguage: 'de',
    source: 'interpretation_speaker',
    metadata: {
      interpretationScope: '24599c70-1e79-4e52-9819-e2d97acf45a5',
    },
  },
});

interpretationScope is the stream key the listener attaches to. It defaults to the endpointId; override it if your application uses a different convention.
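As a sketch of how the two sides meet on one stream key: the speaker publishes under interpretationScope and the listener attaches to the same value. The custom key 'conference-main' and the listener-side scope option are assumptions for illustration (the listener example below hints at a scope option via its "scope defaults to endpointId" comment); verify the exact option name against the SDK's typings.

```typescript
// Hypothetical custom stream key shared by both legs (assumption: the
// listener accepts a `scope` option; check the SDK's typings).
const scope = 'conference-main';

// Speaker leg: publish under the custom scope via browserContext metadata.
const speakerOptions = {
  endpointId: '24599c70-1e79-4e52-9819-e2d97acf45a5',
  autoDial: true,
  browserContext: {
    identifier: 'booth-1',
    role: 'speaker',
    sourceLanguage: 'de',
    source: 'interpretation_speaker',
    metadata: { interpretationScope: scope },
  },
};

// Listener leg: attach to the same key by passing the scope explicitly.
const listenerOptions = {
  endpointId: '24599c70-1e79-4e52-9819-e2d97acf45a5',
  identifier: 'booth-1',
  targetLanguage: 'en',
  scope,
};
```

The two legs only find each other when these values agree, so deriving both from one constant (as above) avoids a silent mismatch.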

Listener side (convenience API)

const session = await delphi.listen({
  endpointId: '24599c70-1e79-4e52-9819-e2d97acf45a5',
  identifier: 'booth-1',
  targetLanguage: 'en',
  // scope defaults to endpointId
});

The convenience method opens a listen session, sets the appropriate browser context (role: 'listener'), sends the browser.action.listen BOA, and resolves once the stream is attached. Replay starts immediately; live updates follow.

Listener side (lower level)

const session = await delphi.openSession({
  endpointId,
  mode: 'listen',
});

session.setBrowserContext({
  identifier: 'booth-1',
  role: 'listener',
  targetLanguage: 'en',
  source: 'interpretation_listener',
});

await session.listen({
  targetLanguage: 'en',
  sinceStreamId: '0-0', // replay from the beginning
  startMode: 'auto',
  captions: {enabled: true},
});

ListenOptions includes targetLanguage, sinceStreamId (Redis Stream entry id for replay), startMode, and caption configuration. See Channel protocol for the message envelope.
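Pieced together from the fields used above, the options shape looks roughly like this. This is an illustrative sketch, not the SDK's authoritative typing; which fields are optional, and the set of startMode values beyond 'auto', are assumptions to verify against the published types.

```typescript
// Sketch of ListenOptions, inferred from the fields mentioned in this doc.
interface CaptionOptions {
  enabled: boolean;
}

interface ListenOptions {
  targetLanguage: string;
  // Redis Stream entry id to replay from; '0-0' replays from the beginning.
  // Assumed optional: omitting it would mean live updates only.
  sinceStreamId?: string;
  // 'auto' appears in the example above; other values are assumptions.
  startMode?: string;
  captions?: CaptionOptions;
}

// The lower-level example above, expressed against this sketch:
const options: ListenOptions = {
  targetLanguage: 'en',
  sinceStreamId: '0-0',
  startMode: 'auto',
  captions: {enabled: true},
};
```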

React

function Listener({endpointId, identifier, targetLanguage}: Props) {
  const session = useDelphiSession({endpointId, mode: 'listen'});

  useEffect(() => {
    if (!session.serverReady) return;
    session.listen({targetLanguage});
  }, [session.serverReady, targetLanguage]);

  // render captions from session.getState().lastMessage
}

Multiple modes on one endpoint

Speaker (voice_conversation) and listener (listen) run side by side on the same endpointId because sessions are keyed by (endpointId, mode) (SDK 0.1.3+). Closing one mode leaves the other intact:

await delphi.endSession(endpointId, 'listen');
// the voice_conversation session continues

endSession(endpointId) without a mode closes every mode for the endpoint.
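The per-mode keying described above can be pictured as a map keyed by (endpointId, mode). This is a toy model of the bookkeeping, not the SDK's internals:

```typescript
type SessionMode = 'voice_conversation' | 'listen';

// Toy model: one entry per (endpointId, mode) pair.
const sessions = new Map<string, {mode: SessionMode}>();
const key = (endpointId: string, mode: SessionMode) => `${endpointId}:${mode}`;

const endpointId = '24599c70-1e79-4e52-9819-e2d97acf45a5';
sessions.set(key(endpointId, 'voice_conversation'), {mode: 'voice_conversation'});
sessions.set(key(endpointId, 'listen'), {mode: 'listen'});

// endSession(endpointId, 'listen'): only the listen entry is removed.
sessions.delete(key(endpointId, 'listen'));
console.log(sessions.has(key(endpointId, 'voice_conversation'))); // true

// endSession(endpointId) with no mode: drop every mode for the endpoint.
for (const k of [...sessions.keys()]) {
  if (k.startsWith(`${endpointId}:`)) sessions.delete(k);
}
console.log(sessions.size); // 0
```

Keying by the pair rather than by endpointId alone is what lets the speaker and listener legs coexist and close independently.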