Read aloud

The highest-level use of the SDK: send a string, hear the AI speak it. Under the hood, the SDK opens an audio_playback session, streams the TTS audio over the channel WebSocket, plays it via new Audio(), and resolves the returned promise when playback finishes.

This mirrors the ReadAloudDemo.tsx example in the SDK's React examples app.

Prerequisites

The @ki-kombinat/delphi-client-js-sdk package installed, plus an API key and the ID of an endpoint with a read-aloud browser action (BOA) configured.

Minimum viable example

```ts
import {DelphiClient} from '@ki-kombinat/delphi-client-js-sdk';

const delphi = new DelphiClient({
  apiDomain: 'api.delphi.ki-kombinat.com',
  apiKey: 'sk_live_…',
});

await delphi.readAloud('Hello, world!', {
  endpointId: '24599c70-1e79-4e52-9819-e2d97acf45a5',
});
```

That's it. The server picks the right read-aloud browser action; you only supply the text. Repeated calls reuse the same session and conversation thread.
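Because each call resolves only when its audio has finished playing, sequential reads can be serialised by chaining the returned promises. A minimal sketch of that pattern; the makeQueue helper is hypothetical, not part of the SDK:

```ts
// Hypothetical helper, not part of the SDK: run async tasks strictly one
// after another by chaining each onto the previous task's promise.
function makeQueue(): (task: () => Promise<void>) => Promise<void> {
  let tail: Promise<void> = Promise.resolve();
  return (task) => {
    // Start the next task whether or not the previous one failed.
    tail = tail.then(task, task);
    return tail;
  };
}

const enqueue = makeQueue();
// e.g. enqueue(() => delphi.readAloud(paragraph, {endpointId}));
```

Reads queued this way play back-to-back in the order they were triggered, even if the triggering UI events fire while earlier audio is still playing.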

Options

```ts
await delphi.readAloud('Some text', {
  endpointId: '24599c70-1e79-4e52-9819-e2d97acf45a5',

  metadata: {from: 'highlight'},

  // Disambiguate when an endpoint has multiple read-aloud BOAs.
  capabilityId: 'cap_123',
  identifier: 'tts-fast',

  // Override the BOA message-type if your runtime uses a custom one.
  messageType: 'browser.action.readAloudFast',

  // Cancel mid-flight.
  signal: abortController.signal,

  // Skip the SDK's built-in audio playback (handle audio yourself).
  disableAutoPlay: true,
  onAudio: (event) => myAudioPlayer.play(event.dataUrl),
});
```
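The signal option takes a standard AbortSignal, so cancellation works exactly like fetch's. A sketch of the controller side, assuming a hypothetical "stop" button handler; the controller's signal is what gets passed as the signal option:

```ts
// A standard AbortController: pass its signal as the `signal` option of
// readAloud, and call abort() to cancel the call mid-flight.
const abortController = new AbortController();

// Hypothetical UI wiring: a stop handler that cancels playback.
const onStopClicked = () => abortController.abort();

onStopClicked();
console.log(abortController.signal.aborted); // true
```

Once aborted, the signal stays aborted, so create a fresh AbortController for each read you may want to cancel independently.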

Inspect the audio

The promise resolves with the assembled BrowserAudioEvent:

```ts
const audio = await delphi.readAloud('Welcome back!', {endpointId});

console.log(audio.dataUrl);  // data: URL ready to feed into <audio>
console.log(audio.mimeType); // e.g. 'audio/wav'
console.log(audio.metadata); // BOA metadata from the server
```
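If you handle audio yourself via disableAutoPlay, the dataUrl can be unpacked directly. A sketch with a hypothetical parseDataUrl helper (not part of the SDK), assuming the base64-encoded data: URL shape shown above:

```ts
// Hypothetical helper, not part of the SDK: split a base64 data: URL
// like 'data:audio/wav;base64,<payload>' into MIME type and payload.
function parseDataUrl(dataUrl: string): {mimeType: string; base64: string} {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(dataUrl);
  if (!match) throw new Error('not a base64 data: URL');
  return {mimeType: match[1], base64: match[2]};
}

const {mimeType, base64} = parseDataUrl('data:audio/wav;base64,UklGRg==');
console.log(mimeType); // 'audio/wav'
console.log(Buffer.from(base64, 'base64').byteLength); // 4 ('RIFF' header bytes)
```

From there the decoded bytes can go to the Web Audio API, a MediaSource, or straight into an audio element's src as the data: URL itself.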

Closing the session

```ts
// Close just audio playback for this endpoint.
await delphi.endSession(endpointId, 'audio_playback');

// Or close every mode for this endpoint at once.
await delphi.endSession(endpointId);
```

Power-user variant

If you want explicit control over the session lifecycle and the ability to push browser context the AI can reference:

```ts
const session = await delphi.openSession({
  endpointId,
  mode: 'audio_playback',
});

session.setBrowserContext({
  text: document.querySelector('article')?.innerText ?? '',
  source: 'page',
  url: window.location.href,
});

session.sendBrowserAction({
  messageType: 'browser.action.readAloud',
});
await session.audioDone();

session.sendBrowserAction({
  messageType: 'browser.action.transformAndRead',
  text: 'Summarise the article.',
});
await session.audioDone();

await session.close();
```
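Since playback can fail mid-flight (network drop, abort), it is worth wrapping the explicit lifecycle in try/finally so close() always runs. A sketch against a minimal stand-in interface; the real session object from openSession is assumed to expose at least these methods:

```ts
// Minimal stand-in for the session returned by openSession; the real
// object is assumed to expose at least these methods.
interface AudioSession {
  sendBrowserAction(action: {messageType: string; text?: string}): void;
  audioDone(): Promise<void>;
  close(): Promise<void>;
}

async function readArticleAloud(session: AudioSession): Promise<void> {
  try {
    session.sendBrowserAction({messageType: 'browser.action.readAloud'});
    await session.audioDone();
  } finally {
    // Runs even if audioDone() rejects, so the session is never leaked.
    await session.close();
  }
}
```

The same pattern applies to the two-action sequence above: put both sendBrowserAction/audioDone pairs inside the try block and leave close() in finally.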

Next step

Try the voice call quick start, or jump straight to the React guide for useDelphiSession.