Voice service operations

platform v0.9.11 · verified 2026-05-14

The Voice service performs the voice-AI processing. TelSys (Asterisk 22.9.0) provides PBX and media handling, TelPhi (Node 24) orchestrates AI conversations via realtime and modular speech providers, and AudioProc optionally applies voice activity detection (VAD) and noise reduction. SIP calls arrive from TelPro and are processed here. Multiple Voice instances run in parallel for horizontal scaling.

Containers

| Container | Base | Port(s) | Purpose |
| --- | --- | --- | --- |
| voiceai-telsys | Asterisk 22.9.0 (built from source on debian:bookworm-slim) | 5080 SIP · 8088 ARI · 10000–11999 RTP | Asterisk PBX / media |
| voiceai-telphi | Node 24-alpine | 12001 media WS | AI conversation engine |
| voiceai-audioproc | python:3.11-slim (CPU) or nvidia/cuda:12.4.1-runtime-ubuntu22.04 (GPU) | 8790 WS | Audio preprocessing (optional) |
| log-to-span | alpine:3.19 + Go binary | | Converts TelSys structured logs to OTLP spans |
| voiceai-otel-collector | otel/opentelemetry-collector-contrib:0.150.1 | | Telemetry collector |

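
As a rough operational aid, the TCP listeners in the table can be probed from a host that is expected to reach the Voice containers. The following is a minimal sketch and not part of the platform: the `TCP_PORTS` mapping copies port numbers from the table, while `probe` and `rtp_stream_capacity` are hypothetical helper names. It assumes the ARI and WebSocket listeners accept plain TCP connections. SIP on 5080 and the RTP range are left out of the probe because they may run over UDP, which a TCP connect cannot verify. The capacity figure assumes each RTP stream takes one port with the next port reserved for RTCP, so treat it as an upper bound only.

```python
import socket

# TCP listeners copied from the Containers table (hypothetical mapping name).
TCP_PORTS = {
    "voiceai-telsys ARI": 8088,
    "voiceai-telphi media WS": 12001,
    "voiceai-audioproc WS": 8790,
}

RTP_RANGE = (10000, 11999)  # inclusive RTP port range from the table


def probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def rtp_stream_capacity(first: int, last: int) -> int:
    """Upper bound on concurrent RTP streams, assuming one RTP/RTCP port pair each."""
    return (last - first + 1) // 2


if __name__ == "__main__":
    for name, port in TCP_PORTS.items():
        state = "open" if probe("127.0.0.1", port) else "unreachable"
        print(f"{name:<26} {port:>5}  {state}")
    print("RTP stream upper bound per instance:", rtp_stream_capacity(*RTP_RANGE))
```

Under the pairing assumption, the 2000-port range allows at most 1000 concurrent RTP streams per TelSys instance. The real limit depends on how many RTP legs each call uses and on CPU and provider limits.
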
v0.9.11 changes

Asterisk upgraded to 22.9.0 with missing slin RTP codec entries fixed. chan_websocket enables WebSocket-oriented voice infrastructure. TelPhi media-session handling improved; WebRTC start-call no longer requires two clicks.

Call flow

  1. TelPro routes an INVITE to TelSys (VAIPROXIESPIPED:SIPPORTTELPRO).
  2. TelSys answers and bridges to TelPhi via Asterisk ExternalMedia over the TelPhi media WebSocket (TELPHI_MEDIA_PORT, default 12001).
  3. TelPhi opens a realtime session against the configured AI provider (OpenAI Realtime, Pythia, Vodafone TOBi, Azure STT/TTS). Outbound traffic to the provider goes through Squid.
  4. AudioProc (when AUDIOPROC_ENABLED=true) preprocesses inbound audio frames.
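
The two settings named in the steps above, `TELPHI_MEDIA_PORT` and `AUDIOPROC_ENABLED`, can be read and validated as sketched below. This is an illustration only: TelPhi itself runs on Node, and `VoiceSettings` and `load_settings` are hypothetical names. The default port of 12001 comes from the call flow. Two points are assumptions: that AudioProc is off when the variable is unset, and that only the value `true` enables it.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class VoiceSettings:
    telphi_media_port: int
    audioproc_enabled: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> VoiceSettings:
    """Read the two call-flow settings from an environment mapping."""
    env = os.environ if env is None else env

    # Default of 12001 is the documented TelPhi media WebSocket port.
    port = int(env.get("TELPHI_MEDIA_PORT", "12001"))
    if not 1 <= port <= 65535:
        raise ValueError(f"TELPHI_MEDIA_PORT out of range: {port}")

    # Assumption: only "true" (any letter case) enables AudioProc,
    # and the stage is disabled when the variable is unset.
    enabled = env.get("AUDIOPROC_ENABLED", "").strip().lower() == "true"

    return VoiceSettings(telphi_media_port=port, audioproc_enabled=enabled)
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to exercise, for example `load_settings({"AUDIOPROC_ENABLED": "true"})`.
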

See also