Channel protocol

Each session owns a single WebSocket. Every message in either direction is wrapped in a ChannelMessage envelope. The envelope shape is stable; the variant payloads evolve with new capabilities.

Envelope

interface ChannelMessage {
  type: ChannelMessageType;
  // 'chat' | 'browser_action' | 'action' | 'audio' | 'status' | 'control' | 'reconnect' | 'error'

  sessionId: string;
  messageId: string;

  /** Redis Stream entry id for replayable streams (interpretation, etc.). */
  streamId?: string;

  timestamp: number;

  /** Routing direction. */
  direction: 'to_browser' | 'to_ari';

  // One of (depending on `type`):
  chat?: ChatPayload;
  browserAction?: BrowserActionPayload;
  action?: ActionPayload;
  actionResult?: ActionResultPayload;
  audio?: AudioPayload;
  status?: StatusPayload;
  control?: ControlPayload;
  reconnect?: ReconnectPayload;
  error?: ErrorPayload;
}

Direction

direction makes routing explicit and is set by the SDK on the way out:

  • to_browser — server → SDK (chat, audio, browser-action requests, status).
  • to_ari — SDK → server (chat, action results, control, raw send).
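As a minimal sketch of how a consumer might use direction, the snippet below filters a mixed list of envelopes down to those addressed to the browser side. The `RoutedMessage` type and the `isInbound` helper are hypothetical local stand-ins, not SDK exports.

```typescript
// Local stand-in for the envelope's routing fields; not imported from the SDK.
type Direction = 'to_browser' | 'to_ari';

interface RoutedMessage {
  type: string;
  direction: Direction;
}

// Hypothetical helper: true for messages travelling server -> SDK.
function isInbound(msg: RoutedMessage): boolean {
  return msg.direction === 'to_browser';
}

const received: RoutedMessage[] = [
  { type: 'chat', direction: 'to_browser' },
  { type: 'actionResult', direction: 'to_ari' },
  { type: 'audio', direction: 'to_browser' },
];

// Keep only the types of messages the browser side should handle.
const inboundTypes = received.filter(isInbound).map((m) => m.type);
```

A guard like this can be useful in tests that record traffic in both directions on one log.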

Payload variants

| type | Field | Purpose |
| --- | --- | --- |
| chat | chat | User → AI text, AI → user text. Carries content, responseMode, mediaType. |
| browser_action | browserAction | SDK → server BOA invocation. Carries messageType, params, metadata. |
| action | action | Server → SDK browser-action request (the AI wants the browser to do something). Carries name, actionId, params. |
| actionResult | actionResult | SDK → server result of an action. Carries actionId, success, data?, error?. |
| audio | audio | Server → SDK TTS audio chunk or assembled buffer. |
| status | status | Bi-directional status updates (idle, in-progress, response started/ended). |
| control | control | Session-level controls (response mode, text-chat enable, manual touch). |
| reconnect | reconnect | Server-initiated reconnect hints. |
| error | error | Typed error from the server. Mapped to SDK error classes on receive. |
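The type-to-field mapping in the table can be expressed as a lookup, which gives one place to pull the payload out of an envelope. This is a sketch with local types only: `PAYLOAD_FIELD`, `Envelope` and `payloadOf` are hypothetical names, payload shapes are left as `unknown`, and the `actionResult` type string follows the table (the envelope's type comment does not list it, so treat that entry as an assumption).

```typescript
type ChannelMessageType =
  | 'chat' | 'browser_action' | 'action' | 'actionResult'
  | 'audio' | 'status' | 'control' | 'reconnect' | 'error';

type PayloadField =
  | 'chat' | 'browserAction' | 'action' | 'actionResult'
  | 'audio' | 'status' | 'control' | 'reconnect' | 'error';

// Which envelope field carries the payload for each message type.
const PAYLOAD_FIELD: Record<ChannelMessageType, PayloadField> = {
  chat: 'chat',
  browser_action: 'browserAction',
  action: 'action',
  actionResult: 'actionResult',
  audio: 'audio',
  status: 'status',
  control: 'control',
  reconnect: 'reconnect',
  error: 'error',
};

// Minimal local envelope; the real payload interfaces live in the SDK.
type Envelope = { type: ChannelMessageType } & Partial<Record<PayloadField, unknown>>;

// Return the payload matching `type`, or throw if the envelope lacks it.
function payloadOf(msg: Envelope): unknown {
  const payload = msg[PAYLOAD_FIELD[msg.type]];
  if (payload === undefined) {
    throw new Error(`Envelope of type "${msg.type}" is missing its payload`);
  }
  return payload;
}

const result: Envelope = {
  type: 'actionResult',
  actionResult: { actionId: 'a-1', success: true },
};
const extracted = payloadOf(result);
```

Throwing on a missing payload is one possible policy; a lenient consumer might log and drop the message instead.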

Builders

For tests and custom integrations, message builders are exported from the package root:

import {
  createChatMessage,
  createBrowserActionMessage,
  createActionResultMessage,
  createControlMessage,
  // ...
} from '@ki-kombinat/delphi-client-js-sdk';

const msg = createChatMessage({
  sessionId,
  content: 'Hello',
  responseMode: 'voice',
});

session.sendMessage(msg);

The high-level SessionClient methods (sendChat, sendTextChat, sendReadAloud, sendBrowserAction, etc.) all internally use these builders.
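To make the builders' role concrete, here is a rough sketch of what a chat builder has to fill in beyond the caller's input: the type, a message id, a timestamp, and the outbound direction. It is not the SDK's implementation. `sketchChatMessage`, the counter-based id, and the millisecond timestamp are all assumptions for illustration.

```typescript
// Local mirrors of the relevant envelope fields; not SDK imports.
interface ChatPayload {
  content: string;
  responseMode: string;
}

interface ChatEnvelope {
  type: 'chat';
  sessionId: string;
  messageId: string;
  timestamp: number;
  direction: 'to_browser' | 'to_ari';
  chat: ChatPayload;
}

let nextLocalId = 0;

// Hypothetical builder: wraps caller input in a complete outbound envelope.
function sketchChatMessage(input: {
  sessionId: string;
  content: string;
  responseMode: string;
}): ChatEnvelope {
  nextLocalId += 1;
  return {
    type: 'chat',
    sessionId: input.sessionId,
    messageId: `local-${nextLocalId}`, // assumption: any id unique per session
    timestamp: Date.now(), // assumption: epoch milliseconds
    direction: 'to_ari', // outbound from the SDK
    chat: { content: input.content, responseMode: input.responseMode },
  };
}

const sketch = sketchChatMessage({
  sessionId: 's-1',
  content: 'Hello',
  responseMode: 'voice',
});
```

In real code, prefer the exported builders so that ids and timestamps match what the server expects.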

Stream replay

Messages that carry a streamId are replayable through Redis Streams. The most common consumer is the interpretation listener, which can pass sinceStreamId to listen() to replay from a specific point. See Interpretation.
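One way to prepare for replay is to remember the streamId of the last message you processed, so it can be supplied as the resume point later. The `ReplayCursor` class below is a hypothetical helper, not part of the SDK. Whether the replay starts at or after the given id is not stated here, so check that before relying on it.

```typescript
// Local stand-in for the envelope fields this sketch needs.
interface ReplayableMessage {
  messageId: string;
  streamId?: string;
}

// Hypothetical helper: tracks the most recently processed stream entry id.
class ReplayCursor {
  private last: string | undefined;

  // Record a processed message; messages without a streamId leave the cursor unchanged.
  record(msg: ReplayableMessage): void {
    if (msg.streamId !== undefined) {
      this.last = msg.streamId;
    }
  }

  // The id to pass as the replay starting point, if any message carried one.
  resumePoint(): string | undefined {
    return this.last;
  }
}

const cursor = new ReplayCursor();
cursor.record({ messageId: 'm1', streamId: '1718000000000-0' });
cursor.record({ messageId: 'm2' }); // not replayable, ignored by the cursor
cursor.record({ messageId: 'm3', streamId: '1718000000001-0' });
```

The cursor assumes messages are recorded in arrival order; it does not compare ids.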

Going lower level

SessionClient can be constructed directly when you already have a sessionId and wsToken from elsewhere (e.g. a server-issued session forwarded over WebSocket from a separate channel). The orchestrator-level DelphiClient is optional: it adds capability discovery, the session map, the WebRTC stack for voice, and React-friendly aggregated state.