# EVS Bridge (Home Assistant + MQTT + WebSocket)

This service is the audio bridge between your ESP32 client and your Home Assistant stack.

It provides:
- WebSocket endpoint for raw PCM audio (`/audio`)
- MQTT status/events (`evs/<device_id>/status`)
- MQTT playback input (`evs/<device_id>/play_pcm16le`)
- Optional Home Assistant webhook callbacks (`connected`, `start`, `stop`, `disconnected`)
- VAD auto-segmentation (`vad_segment`) with pre-roll/post-roll
- Optional STT worker (`vad_segment` -> `transcript`) via MQTT

## 1) Start the bridge

The image already contains sane default `ENV` values. A custom `.env` is optional.

1. Copy env template:
   ```bash
   cp .env.example .env
   ```
2. Edit `.env`:
   - `MQTT_HOST`, `MQTT_USER`, `MQTT_PASSWORD`
   - `HA_WEBHOOK_URL` (optional)
3. Start:
   ```bash
   docker compose up -d --build
   ```

## 1.1) Registry image naming

Recommended image path in your Gitea registry:
- `git.khnm-zimmerling.de/kai/evs-bridge:latest`
- `git.khnm-zimmerling.de/kai/evs-bridge:v0.1.0`

Recommended tags:
- `latest` for current default deployment
- `vX.Y.Z` for stable releases

## 2) Configure ESP32

In `src/main.cpp`:
- do not edit environment-specific values here

In `include/secrets.h`:
- create the file by copying `include/secrets.example.h`
- set WiFi credentials
- set bridge host
- set WS port/path
- set a unique `EVS_DEVICE_ID`

Then upload the firmware.

## 3) Test flow

1. Flash ESP32
2. Open serial monitor
3. Wait for WS connect (client switches to stream mode automatically)
4. In bridge logs, you should see the device connection
5. If `ECHO_ENABLED=true`, incoming audio is returned to ESP32 speaker

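If the ESP32 never reaches the "WS connect" step, it can help to first confirm that the bridge port is reachable from another machine on the same network. The following is a minimal sketch using only the Python standard library. It checks TCP reachability only and does not perform a WebSocket handshake. The helper name `port_is_open` is made up for this example, and the host in the usage comment is a placeholder; `8765` is the port used in the Portainer stack example in section 8.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and connects within the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return False


# Usage (hypothetical host, replace with your bridge address):
#   port_is_open("192.0.2.10", 8765)
```

A `False` result points at networking (firewall, wrong host, container port mapping) rather than at the firmware.
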
## 4) MQTT topics

- Status/events published by bridge:
  - `evs/<device_id>/status` (connection/start/stop/disconnect)
  - `evs/<device_id>/mic_level` (mic telemetry)
  - `evs/<device_id>/vad_segment` (finalized speech segments)
  - `evs/<device_id>/transcript` (text from stt-worker)
  - `evs/<device_id>/stt_error` (stt-worker errors)
- Playback input to device:
  - `evs/<device_id>/play_pcm16le`
  - payload options:
    - raw binary PCM16LE
    - JSON `{ "pcm16le_b64": "<base64>" }`

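The two payload options for `play_pcm16le` can be illustrated with a short standard-library sketch. It generates a test tone in the documented audio format (PCM16LE, mono, 16 kHz), wraps it in the JSON form, and shows one way a receiver could tell the two options apart. The function names are made up for this example, and `decode_payload` is an assumption about how such a check could look, not the bridge's actual implementation.

```python
import base64
import json
import math
import struct

SAMPLE_RATE = 16000  # Hz, mono PCM16LE as documented in the notes


def sine_pcm16le(freq_hz: float, duration_s: float, amplitude: float = 0.3) -> bytes:
    """Generate a mono sine tone as little-endian signed 16-bit PCM."""
    n = int(SAMPLE_RATE * duration_s)
    peak = int(32767 * amplitude)
    return b"".join(
        struct.pack("<h", int(peak * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)))
        for i in range(n)
    )


def to_json_payload(pcm: bytes) -> bytes:
    """Wrap raw PCM in the JSON payload form: {"pcm16le_b64": "<base64>"}."""
    encoded = base64.b64encode(pcm).decode("ascii")
    return json.dumps({"pcm16le_b64": encoded}).encode("utf-8")


def decode_payload(payload: bytes) -> bytes:
    """Accept either payload option and return raw PCM bytes (illustrative)."""
    try:
        obj = json.loads(payload)
    except ValueError:
        # Not valid JSON (or not decodable as text): treat as raw PCM.
        return payload
    if isinstance(obj, dict) and "pcm16le_b64" in obj:
        return base64.b64decode(obj["pcm16le_b64"])
    return payload
```

The output of `to_json_payload(sine_pcm16le(440.0, 0.5))` could then be published to `evs/<device_id>/play_pcm16le` with any MQTT client. The raw form is smaller on the wire, while the JSON form is easier to produce from tools that only handle text payloads.
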
## 5) Home Assistant integration

Use webhook for event hooks:
- Configure `HA_WEBHOOK_URL` in `.env`
- Bridge sends JSON with event and metadata on:
  - `connected`
  - `start`
  - `stop`
  - `disconnected`

You can build automations on these events (for STT/TTS pipelines or Node-RED handoff).

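For consumers other than Home Assistant (for example a small test receiver), the webhook body can be split into the event name and its metadata. This README does not specify the JSON layout, so the `event` key used here and the `device_id` field in the usage note are assumptions; check an actual request body before relying on them. The function name is also made up for this sketch.

```python
import json

# The four event names listed in this README.
KNOWN_EVENTS = {"connected", "start", "stop", "disconnected"}


def parse_bridge_event(body: bytes) -> tuple[str, dict]:
    """Split a webhook body into (event name, remaining metadata).

    Assumes the event name is stored under an "event" key (hypothetical).
    """
    data = json.loads(body)
    event = data.get("event")
    if event not in KNOWN_EVENTS:
        raise ValueError(f"unexpected event: {event!r}")
    metadata = {key: value for key, value in data.items() if key != "event"}
    return event, metadata
```

With a body such as `{"event": "start", "device_id": "esp32-1"}` this returns the event name `start` and the remaining fields as a dictionary. Rejecting unknown event names early keeps malformed requests out of downstream automations.
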
## 6) Notes

- Audio format: PCM16LE, mono, 16 kHz
- `SAVE_SESSIONS=true` stores `.wav` files in `bridge/data/sessions`
- Recording is buffered in RAM during `start`..`stop` and rotated automatically:
  - PCM data is collected in memory and written as one WAV file when the segment limit is reached
  - this reduces how often the disk is written to
  - `WAV_SEGMENT_MAX_BYTES`: max size per `.wav` file (default: `20971520` = 20 MiB)
  - `WAV_KEEP_FILES`: max number of `.wav` files to keep (default: `10`)
  - `MAX_SESSION_BYTES` is only used if session file saving is disabled
- Voice activity detection (VAD):
  - `VAD_ENABLED=true` enables automatic speech segment detection
  - `VAD_PREROLL_MS=1000` keeps 1 s of audio before speech start
  - `VAD_POSTROLL_MS=1000` keeps 1 s of audio after speech end
  - `VAD_START_THRESHOLD` / `VAD_STOP_THRESHOLD` tune sensitivity
  - `VAD_DIR` stores per-utterance WAV files
  - `VAD_KEEP_FILES=200` limits the number of stored VAD WAV files
  - `VAD_MAX_AGE_DAYS=7` deletes VAD WAV files older than 7 days
- MQTT is recommended for control/events, WebSocket for streaming audio
- STT worker:
  - subscribes to `evs/<device_id>/vad_segment`
  - reads `wav_path` from the event JSON
  - transcribes with `faster-whisper`
  - publishes the transcript to `evs/<device_id>/transcript`
  - `STT_TRANSCRIPT_RETAIN=true` keeps the latest transcript visible in MQTT UIs

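The byte and millisecond settings above are easier to tune once they are translated into each other. The arithmetic below follows directly from the stated audio format (16 kHz, mono, 16-bit samples); the helper names are made up for this example.

```python
SAMPLE_RATE_HZ = 16000   # 16 kHz
BYTES_PER_SAMPLE = 2     # PCM16LE: 16-bit samples
CHANNELS = 1             # mono

BYTES_PER_SECOND = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS  # 32000


def ms_to_bytes(ms: int) -> int:
    """Size in bytes of `ms` milliseconds of audio."""
    return BYTES_PER_SECOND * ms // 1000


def bytes_to_seconds(num_bytes: int) -> float:
    """Duration in seconds of `num_bytes` bytes of PCM data."""
    return num_bytes / BYTES_PER_SECOND


print(ms_to_bytes(1000))           # 32000 bytes for a 1 s pre-roll or post-roll
print(bytes_to_seconds(20971520))  # 655.36 s per WAV segment at the default limit
```

At the default `WAV_SEGMENT_MAX_BYTES`, one segment therefore holds roughly 10.9 minutes of audio, which is also approximately the amount of PCM data held in RAM before a file is written.
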
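The interplay of start threshold, stop threshold, pre-roll and post-roll can be sketched as a small hysteresis loop. This is an illustration of the general technique, not the bridge's actual VAD code. It assumes that the thresholds are compared against the RMS amplitude of each audio frame, and it expresses pre-roll and post-roll in frames rather than milliseconds. All function and parameter names are made up for this example.

```python
import math
import struct


def frame_rms(frame: bytes) -> float:
    """RMS amplitude of one PCM16LE frame."""
    n = len(frame) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack(f"<{n}h", frame[: n * 2])
    return math.sqrt(sum(s * s for s in samples) / n)


def segment_frames(levels, start_thr=900.0, stop_thr=600.0, preroll=2, postroll=2):
    """Return (first_frame, last_frame) index pairs for detected speech.

    Speech starts when a level reaches start_thr and ends once the level has
    stayed below stop_thr for `postroll` consecutive frames.
    """
    segments = []
    start = None  # first frame of the open segment, pre-roll included
    quiet = 0     # consecutive frames below stop_thr while in speech
    for i, level in enumerate(levels):
        if start is None:
            if level >= start_thr:
                start = max(0, i - preroll)
                quiet = 0
        else:
            # Levels between the two thresholds keep the segment open.
            quiet = quiet + 1 if level < stop_thr else 0
            if quiet >= postroll:
                segments.append((start, i))
                start = None
    if start is not None:
        # Input ended while speech was still active.
        segments.append((start, len(levels) - 1))
    return segments
```

Using a lower stop threshold than start threshold prevents a segment from being cut whenever the level briefly dips during speech. Raising `VAD_START_THRESHOLD` makes detection less sensitive to background noise at the cost of missing quiet speech.
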
## 6.1) STT Worker Config

Use these env values (in `.env` or Portainer):
- `STT_MODEL` (`tiny`, `base`, `small`, `medium`, `large-v3`)
- `STT_DEVICE` (`cpu` or `cuda`)
- `STT_COMPUTE_TYPE` (`int8`, `float16`, ...)
- `STT_LANGUAGE` (`de` or empty for auto-detect)
- `MQTT_VAD_TOPIC`, `MQTT_TRANSCRIPT_TOPIC_TEMPLATE`, `MQTT_STT_ERROR_TOPIC_TEMPLATE`

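The relationship between the subscribed VAD topic and the templated output topics can be sketched as follows. The example assumes that the `{device_id}` placeholder is filled in the style of Python's `str.format`, and that the device ID is the middle level of a three-level topic. The function names and the device ID `kitchen` in the usage note are hypothetical; `wav_path` is the field named in the notes above.

```python
import json


def device_id_from_topic(topic: str, base: str = "evs", suffix: str = "vad_segment") -> str:
    """Extract <device_id> from a topic shaped like evs/<device_id>/vad_segment."""
    parts = topic.split("/")
    if len(parts) != 3 or parts[0] != base or parts[2] != suffix or not parts[1]:
        raise ValueError(f"unexpected topic: {topic!r}")
    return parts[1]


def render_topic(template: str, device_id: str) -> str:
    """Fill the {device_id} placeholder of a topic template."""
    return template.format(device_id=device_id)


def wav_path_from_event(payload: bytes) -> str:
    """Read the wav_path field from a vad_segment event payload."""
    return json.loads(payload)["wav_path"]
```

For a message on `evs/kitchen/vad_segment`, `device_id_from_topic` yields `kitchen`, and `render_topic("evs/{device_id}/transcript", "kitchen")` yields `evs/kitchen/transcript`. Because the worker reads the WAV file from `wav_path`, it needs access to the same data volume as the bridge.
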
## 7) Build and push to Gitea registry

From repository root:

```bash
docker login git.khnm-zimmerling.de
docker build -f bridge/Dockerfile -t git.khnm-zimmerling.de/kai/evs-bridge:latest bridge
docker push git.khnm-zimmerling.de/kai/evs-bridge:latest
```

Optional release tag:

```bash
docker tag git.khnm-zimmerling.de/kai/evs-bridge:latest git.khnm-zimmerling.de/kai/evs-bridge:v0.1.0
docker push git.khnm-zimmerling.de/kai/evs-bridge:v0.1.0
```

## 8) Portainer stack with registry image

```yaml
services:
  evs-bridge:
    image: git.khnm-zimmerling.de/kai/evs-bridge:latest
    container_name: evs-bridge
    restart: unless-stopped
    ports:
      - "8765:8765"
    environment:
      WS_HOST: "0.0.0.0"
      WS_PORT: "8765"
      WS_PATH: "/audio"
      ECHO_ENABLED: "true"
      LOG_LEVEL: "INFO"
      MQTT_ENABLED: "true"
      MQTT_HOST: "10.100.3.247"
      MQTT_PORT: "1883"
      MQTT_USER: ""
      MQTT_PASSWORD: ""
      MQTT_BASE_TOPIC: "evs"
      MQTT_TTS_TOPIC: "evs/+/play_pcm16le"
      MQTT_STATUS_RETAIN: "true"
      HA_WEBHOOK_URL: ""
      SAVE_SESSIONS: "true"
      SESSIONS_DIR: "/data/sessions"
      PCM_SAMPLE_RATE: "16000"
      WAV_SEGMENT_MAX_BYTES: "20971520"
      WAV_KEEP_FILES: "10"
      VAD_ENABLED: "true"
      VAD_DIR: "/data/vad"
      VAD_KEEP_FILES: "200"
      VAD_MAX_AGE_DAYS: "7"
      VAD_PREROLL_MS: "1000"
      VAD_POSTROLL_MS: "1000"
      VAD_START_THRESHOLD: "900"
      VAD_STOP_THRESHOLD: "600"
      VAD_MIN_SPEECH_MS: "300"
    volumes:
      - evs_bridge_data:/data

  evs-stt-worker:
    image: git.khnm-zimmerling.de/kai/evs-stt-worker:latest
    container_name: evs-stt-worker
    restart: unless-stopped
    environment:
      LOG_LEVEL: "INFO"
      MQTT_HOST: "10.100.3.247"
      MQTT_PORT: "1883"
      MQTT_USER: ""
      MQTT_PASSWORD: ""
      MQTT_BASE_TOPIC: "evs"
      MQTT_VAD_TOPIC: "evs/+/vad_segment"
      MQTT_TRANSCRIPT_TOPIC_TEMPLATE: "evs/{device_id}/transcript"
      MQTT_STT_ERROR_TOPIC_TEMPLATE: "evs/{device_id}/stt_error"
      STT_MODEL: "small"
      STT_DEVICE: "cpu"
      STT_COMPUTE_TYPE: "int8"
      STT_BEAM_SIZE: "1"
      STT_LANGUAGE: "de"
      STT_CONDITION_ON_PREV_TEXT: "false"
    volumes:
      - evs_bridge_data:/data

volumes:
  evs_bridge_data:
```

## 9) Optional: auto-push via Gitea Actions

Workflow file:
- `.gitea/workflows/bridge-image.yml`

Required repository secrets:
- `REGISTRY_USERNAME`
- `REGISTRY_TOKEN`

The workflow builds `bridge/Dockerfile` and pushes:
- `git.khnm-zimmerling.de/kai/evs-bridge:latest`