Add MQTT-based STT worker for VAD segments

Kai
2026-02-13 17:49:26 +01:00
parent fd1bfb4786
commit 5294c24b08
7 changed files with 292 additions and 2 deletions


@@ -8,6 +8,7 @@ It provides:
- MQTT playback input (`evs/<device_id>/play_pcm16le`)
- Optional Home Assistant webhook callbacks (`connected`, `start`, `stop`, `disconnected`)
- VAD auto-segmentation (`vad_segment`) with pre-roll/post-roll
- Optional STT worker (`vad_segment` -> `transcript`) via MQTT
## 1) Start the bridge
@@ -53,7 +54,7 @@ Then upload firmware.
1. Flash ESP32
2. Open serial monitor
3. Send `s` (stream mode)
3. Wait for WS connect (client switches to stream mode automatically)
4. In bridge logs, you should see the device connection
5. If `ECHO_ENABLED=true`, incoming audio is returned to ESP32 speaker
@@ -63,7 +64,8 @@ Then upload firmware.
- `evs/<device_id>/status` (connection/start/stop/disconnect)
- `evs/<device_id>/mic_level` (mic telemetry)
- `evs/<device_id>/vad_segment` (finalized speech segments)
- reserved for next steps: `evs/<device_id>/transcript`, `evs/<device_id>/stt_error`
- `evs/<device_id>/transcript` (text from stt-worker)
- `evs/<device_id>/stt_error` (stt-worker errors)
- Playback input to device:
- `evs/<device_id>/play_pcm16le`
- payload options:
@@ -101,6 +103,20 @@ You can build automations on these events (for STT/TTS pipelines or Node-RED han
- `VAD_KEEP_FILES=200` limits number of stored VAD WAV files
- `VAD_MAX_AGE_DAYS=7` deletes VAD WAV files older than 7 days
- MQTT is recommended for control/events, WebSocket for streaming audio
- STT worker:
- subscribes: `evs/<device_id>/vad_segment`
- reads `wav_path` from event JSON
- transcribes with `faster-whisper`
- publishes transcript to `evs/<device_id>/transcript`
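
The worker flow listed above can be sketched roughly as follows. This is a minimal illustration, not the worker's actual code: `handle_vad_event`, the injected `transcribe` and `publish` callables, and the `text` / `error` payload fields are hypothetical. Only the topic names and the `wav_path` field are taken from this README.

```python
import json
from typing import Callable, Union


def device_id_from_topic(topic: str) -> str:
    """Extract <device_id> from a topic shaped like evs/<device_id>/vad_segment."""
    parts = topic.split("/")
    if len(parts) != 3 or parts[2] != "vad_segment":
        raise ValueError(f"unexpected topic: {topic}")
    return parts[1]


def handle_vad_event(
    topic: str,
    payload: Union[str, bytes],
    transcribe: Callable[[str], str],      # hypothetical: WAV path -> text
    publish: Callable[[str, str], None],   # hypothetical: (topic, message) -> None
) -> None:
    """Handle one vad_segment event: read wav_path, transcribe, publish the result."""
    device_id = device_id_from_topic(topic)
    try:
        event = json.loads(payload)
        text = transcribe(event["wav_path"])
    except Exception as exc:
        # Assumed error payload shape; the real worker may use different fields.
        publish(f"evs/{device_id}/stt_error", json.dumps({"error": str(exc)}))
        return
    # Assumed transcript payload shape.
    publish(f"evs/{device_id}/transcript", json.dumps({"text": text}))
```

Passing `transcribe` and `publish` in as callables keeps the handler independent of any particular MQTT client or STT backend, which also makes it easy to exercise without a broker.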
## 6.1) STT Worker Config
Set these environment variables (in `.env` or Portainer):
- `STT_MODEL` (`tiny`, `base`, `small`, `medium`, `large-v3`)
- `STT_DEVICE` (`cpu` or `cuda`)
- `STT_COMPUTE_TYPE` (`int8`, `float16`, ...)
- `STT_LANGUAGE` (`de` or empty for auto-detect)
- `MQTT_VAD_TOPIC`, `MQTT_TRANSCRIPT_TOPIC_TEMPLATE`, `MQTT_STT_ERROR_TOPIC_TEMPLATE`
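
As an illustration of how these variables might be read, here is a small sketch. The `SttConfig` class and `load_config` function are hypothetical names, and the fallback defaults are simply borrowed from the compose example in this README. They are assumptions, not documented worker defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SttConfig:
    model: str
    device: str
    compute_type: str
    language: Optional[str]  # None means auto-detect
    vad_topic: str
    transcript_topic_template: str
    stt_error_topic_template: str


def load_config(env: Mapping[str, str] = os.environ) -> SttConfig:
    """Build the config from environment variables (defaults are assumptions)."""
    return SttConfig(
        model=env.get("STT_MODEL", "small"),
        device=env.get("STT_DEVICE", "cpu"),
        compute_type=env.get("STT_COMPUTE_TYPE", "int8"),
        # An empty STT_LANGUAGE is treated as "auto-detect".
        language=env.get("STT_LANGUAGE", "").strip() or None,
        vad_topic=env.get("MQTT_VAD_TOPIC", "evs/+/vad_segment"),
        transcript_topic_template=env.get(
            "MQTT_TRANSCRIPT_TOPIC_TEMPLATE", "evs/{device_id}/transcript"
        ),
        stt_error_topic_template=env.get(
            "MQTT_STT_ERROR_TOPIC_TEMPLATE", "evs/{device_id}/stt_error"
        ),
    )
```

A topic for a specific device can then be rendered with `cfg.transcript_topic_template.format(device_id="...")`.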
## 7) Build and push to Gitea registry
@@ -161,6 +177,29 @@ services:
volumes:
- evs_bridge_data:/data
evs-stt-worker:
image: git.khnm-zimmerling.de/kai/evs-stt-worker:latest
container_name: evs-stt-worker
restart: unless-stopped
environment:
LOG_LEVEL: "INFO"
MQTT_HOST: "10.100.3.247"
MQTT_PORT: "1883"
MQTT_USER: ""
MQTT_PASSWORD: ""
MQTT_BASE_TOPIC: "evs"
MQTT_VAD_TOPIC: "evs/+/vad_segment"
MQTT_TRANSCRIPT_TOPIC_TEMPLATE: "evs/{device_id}/transcript"
MQTT_STT_ERROR_TOPIC_TEMPLATE: "evs/{device_id}/stt_error"
STT_MODEL: "small"
STT_DEVICE: "cpu"
STT_COMPUTE_TYPE: "int8"
STT_BEAM_SIZE: "1"
STT_LANGUAGE: "de"
STT_CONDITION_ON_PREV_TEXT: "false"
volumes:
- evs_bridge_data:/data
volumes:
evs_bridge_data:
```
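
For orientation, the `STT_*` values in the compose example map onto `faster-whisper` roughly as sketched below. This is an assumption about how the worker might call the library, not its actual code: `transcribe_wav` and `join_segments` are hypothetical helpers, and the parameter defaults mirror the compose example.

```python
from typing import Iterable, Optional


def join_segments(segments: Iterable) -> str:
    """Concatenate the .text of each segment into one stripped transcript string."""
    return " ".join(seg.text.strip() for seg in segments).strip()


def transcribe_wav(
    wav_path: str,
    model_name: str = "small",                 # STT_MODEL
    device: str = "cpu",                       # STT_DEVICE
    compute_type: str = "int8",                # STT_COMPUTE_TYPE
    beam_size: int = 1,                        # STT_BEAM_SIZE
    language: Optional[str] = "de",            # STT_LANGUAGE (None = auto-detect)
    condition_on_previous_text: bool = False,  # STT_CONDITION_ON_PREV_TEXT
) -> str:
    # Imported lazily so the helper above works without faster-whisper installed.
    from faster_whisper import WhisperModel

    # A long-running worker would load the model once and reuse it;
    # it is created per call here only to keep the sketch short.
    model = WhisperModel(model_name, device=device, compute_type=compute_type)
    segments, _info = model.transcribe(
        wav_path,
        beam_size=beam_size,
        language=language,
        condition_on_previous_text=condition_on_previous_text,
    )
    # `segments` is a generator; decoding happens while it is consumed.
    return join_segments(segments)
```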