Channel protocol
Each session owns a single WebSocket. Every message in either direction is wrapped in a ChannelMessage envelope. The envelope shape is stable; the variant payloads evolve with new capabilities.
Envelope
interface ChannelMessage {
  type: ChannelMessageType;
  // 'chat' | 'browser_action' | 'action' | 'audio' | 'status' | 'control' | 'reconnect' | 'error'
  sessionId: string;
  messageId: string;
  /** Redis Stream entry id for replayable streams (interpretation, etc.). */
  streamId?: string;
  timestamp: number;
  /** Routing direction. */
  direction: 'to_browser' | 'to_ari';

  // One of (depending on `type`):
  chat?: ChatPayload;
  browserAction?: BrowserActionPayload;
  action?: ActionPayload;
  actionResult?: ActionResultPayload;
  audio?: AudioPayload;
  status?: StatusPayload;
  control?: ControlPayload;
  reconnect?: ReconnectPayload;
  error?: ErrorPayload;
}
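As a rough illustration of how a receiver might read the envelope, the sketch below picks the payload field that corresponds to `type`. `Envelope` is a trimmed local stand-in for ChannelMessage with payloads typed as `unknown`, and `FIELD_BY_TYPE` and `payloadOf` are hypothetical helpers, not SDK exports. It covers only the eight values listed in the `type` comment, paired with the fields named in the Payload variants table.

```typescript
// Trimmed local stand-in for ChannelMessage. Payloads stay `unknown`
// because their exact shapes are not spelled out in this section.
type EnvelopeType =
  | 'chat' | 'browser_action' | 'action' | 'audio'
  | 'status' | 'control' | 'reconnect' | 'error';

type PayloadField =
  | 'chat' | 'browserAction' | 'action' | 'audio'
  | 'status' | 'control' | 'reconnect' | 'error';

type Envelope = {
  type: EnvelopeType;
  sessionId: string;
  messageId: string;
  streamId?: string;
  timestamp: number;
  direction: 'to_browser' | 'to_ari';
} & Partial<Record<PayloadField, unknown>>;

// Hypothetical lookup: which envelope field holds the payload for each type.
const FIELD_BY_TYPE: Record<EnvelopeType, PayloadField> = {
  chat: 'chat',
  browser_action: 'browserAction',
  action: 'action',
  audio: 'audio',
  status: 'status',
  control: 'control',
  reconnect: 'reconnect',
  error: 'error',
};

// Returns the payload matching `msg.type`, or throws if it is missing.
function payloadOf(msg: Envelope): unknown {
  const payload = msg[FIELD_BY_TYPE[msg.type]];
  if (payload === undefined) {
    throw new Error(`Message ${msg.messageId} has type "${msg.type}" but no matching payload`);
  }
  return payload;
}

const example: Envelope = {
  type: 'chat',
  sessionId: 's-1',
  messageId: 'm-1',
  timestamp: 1700000000000,
  direction: 'to_ari',
  chat: { content: 'Hello' },
};
```

Keeping the lookup in one table means a new variant needs only a new entry, which fits the idea of a stable envelope with evolving payloads.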
Direction
direction makes routing explicit and is set by the SDK on the way out:
- to_browser: server → SDK (chat, audio, browser-action requests, status).
- to_ari: SDK → server (chat, action results, control, raw send).
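A receiver could also use `direction` to drop frames travelling the wrong way. The following is a minimal sketch, assuming the browser side should accept only `to_browser` frames and the server side only `to_ari` frames. `Side`, `INBOUND`, and `isInbound` are hypothetical local names.

```typescript
type Direction = 'to_browser' | 'to_ari';
type Side = 'browser' | 'server';

// The direction each side expects on frames it receives (assumption).
const INBOUND: Record<Side, Direction> = {
  browser: 'to_browser',
  server: 'to_ari',
};

// True when `msg` is addressed to the given side.
function isInbound(side: Side, msg: { direction: Direction }): boolean {
  return msg.direction === INBOUND[side];
}

const frames: { messageId: string; direction: Direction }[] = [
  { messageId: 'a', direction: 'to_browser' },
  { messageId: 'b', direction: 'to_ari' },
  { messageId: 'c', direction: 'to_browser' },
];

// Keep only the frames the browser side should process.
const forBrowser = frames
  .filter((f) => isInbound('browser', f))
  .map((f) => f.messageId);
```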
Payload variants
| type | Field | Purpose |
|---|---|---|
| chat | chat | User → AI text, AI → user text. Carries content, responseMode, mediaType. |
| browser_action | browserAction | SDK → server BOA invocation. Carries messageType, params, metadata. |
| action | action | Server → SDK browser-action request (the AI wants the browser to do something). Carries name, actionId, params. |
| actionResult | actionResult | SDK → server result of an action. Carries actionId, success, data?, error?. |
| audio | audio | Server → SDK TTS audio chunk or assembled buffer. |
| status | status | Bi-directional status updates (idle, in-progress, response started/ended). |
| control | control | Session-level controls (response mode, text-chat enable, manual touch). |
| reconnect | reconnect | Server-initiated reconnect hints. |
| error | error | Typed error from the server. Mapped to SDK error classes on receive. |
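The action and actionResult rows describe a request/response pair. The sketch below shows one way a browser-side integration might turn an incoming action into a result. The field names come from the table, but the field types, the `handlers` registry, and `runAction` are assumptions made for illustration.

```typescript
// Assumed shapes: field names are from the table above, field types are guesses.
interface ActionPayload {
  name: string;
  actionId: string;
  params: Record<string, unknown>;
}

interface ActionResultPayload {
  actionId: string;
  success: boolean;
  data?: unknown;
  error?: string;
}

type ActionHandler = (params: Record<string, unknown>) => unknown;

// Hypothetical registry of browser-side handlers keyed by action name.
const handlers = new Map<string, ActionHandler>([
  ['echo', (params) => params],
  ['fail', () => { throw new Error('boom'); }],
]);

// Runs the handler for `action` and reports the outcome under the same actionId.
function runAction(action: ActionPayload): ActionResultPayload {
  const handler = handlers.get(action.name);
  if (!handler) {
    return { actionId: action.actionId, success: false, error: `Unknown action: ${action.name}` };
  }
  try {
    return { actionId: action.actionId, success: true, data: handler(action.params) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { actionId: action.actionId, success: false, error: message };
  }
}
```

The returned object would travel back in the `actionResult` field of an envelope. Echoing the `actionId` is presumably what lets the server match a result to its request.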
Builders
For tests and custom integrations, message builders are exported from the package root:
import {
  createChatMessage,
  createBrowserActionMessage,
  createActionResultMessage,
  createControlMessage,
  // ...
} from '@ki-kombinat/delphi-client-js-sdk';

const msg = createChatMessage({
  sessionId,
  content: 'Hello',
  responseMode: 'voice',
});
session.sendMessage(msg);
The high-level SessionClient methods (sendChat, sendTextChat, sendReadAloud, sendBrowserAction, etc.) all use these builders internally.
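For tests that should not depend on the SDK at all, a local stand-in can produce an envelope of the same general shape. This is only a sketch: `makeChatEnvelope` is hypothetical, and the real createChatMessage may generate `messageId`, `timestamp`, and `direction` differently. The `to_ari` direction follows the Direction section, which lists chat among the SDK → server messages.

```typescript
interface ChatEnvelope {
  type: 'chat';
  sessionId: string;
  messageId: string;
  timestamp: number;
  direction: 'to_ari';
  chat: { content: string; responseMode: string };
}

let counter = 0;

// Hypothetical stand-in for a chat builder. The clock is injectable so
// tests can assert on a fixed timestamp.
function makeChatEnvelope(
  input: { sessionId: string; content: string; responseMode: string },
  now: () => number = Date.now,
): ChatEnvelope {
  counter += 1;
  return {
    type: 'chat',
    sessionId: input.sessionId,
    messageId: `local-${counter}`, // assumption: any id unique per message is acceptable
    timestamp: now(),
    direction: 'to_ari',
    chat: { content: input.content, responseMode: input.responseMode },
  };
}

const first = makeChatEnvelope(
  { sessionId: 's-1', content: 'Hello', responseMode: 'voice' },
  () => 1700000000000,
);
const second = makeChatEnvelope({ sessionId: 's-1', content: 'Again', responseMode: 'voice' });
```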
Stream replay
Messages that carry a streamId are replayable through Redis Streams. The most common consumer is the interpretation listener, which can pass sinceStreamId to listen() to replay from a specific point. See Interpretation.
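To resume after a disconnect, a consumer needs to remember the newest `streamId` it has seen. Redis Stream entry ids have the form `<milliseconds>-<sequence>`, so they should be compared numerically part by part rather than as plain strings. The sketch below assumes both parts fit safely in a JavaScript number. `StreamCursor` is a hypothetical helper, and whether `sinceStreamId` is treated as inclusive or exclusive is not specified here.

```typescript
// Splits a Redis Stream entry id ("<ms>-<seq>") into its numeric parts.
function parseStreamId(id: string): [number, number] {
  const match = /^(\d+)-(\d+)$/.exec(id);
  if (!match) {
    throw new Error(`Not a Redis Stream entry id: ${id}`);
  }
  return [Number(match[1]), Number(match[2])];
}

// Negative if a is older than b, positive if newer, zero if equal.
function compareStreamIds(a: string, b: string): number {
  const [aMs, aSeq] = parseStreamId(a);
  const [bMs, bSeq] = parseStreamId(b);
  return aMs !== bMs ? aMs - bMs : aSeq - bSeq;
}

// Hypothetical cursor that tracks the newest streamId observed so far.
class StreamCursor {
  private latest: string | undefined;

  observe(msg: { streamId?: string }): void {
    if (msg.streamId === undefined) return; // message is not replayable
    if (this.latest === undefined || compareStreamIds(msg.streamId, this.latest) > 0) {
      this.latest = msg.streamId;
    }
  }

  // Candidate value to pass as sinceStreamId on the next listen() call.
  current(): string | undefined {
    return this.latest;
  }
}

const cursor = new StreamCursor();
cursor.observe({ streamId: '1700000000000-0' });
cursor.observe({ streamId: '1700000000000-2' });
cursor.observe({ streamId: '1699999999999-9' }); // older entry, ignored
cursor.observe({}); // no streamId, ignored
```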
Going lower level
SessionClient can be constructed directly when you already have a sessionId and wsToken from elsewhere (e.g. a server-issued session forwarded over WebSocket from a separate channel). The orchestrator-level DelphiClient is optional: it adds capability discovery, the session map, the WebRTC stack for voice, and React-friendly aggregated state.