Add MQTT-based STT worker for VAD segments
Some checks failed
Build and Push EVS Bridge Image / docker (push) Has been cancelled
@@ -33,3 +33,14 @@ VAD_POSTROLL_MS=1000
 VAD_START_THRESHOLD=900
 VAD_STOP_THRESHOLD=600
 VAD_MIN_SPEECH_MS=300
+
+# STT worker settings (faster-whisper)
+MQTT_VAD_TOPIC=evs/+/vad_segment
+MQTT_TRANSCRIPT_TOPIC_TEMPLATE=evs/{device_id}/transcript
+MQTT_STT_ERROR_TOPIC_TEMPLATE=evs/{device_id}/stt_error
+STT_MODEL=small
+STT_DEVICE=cpu
+STT_COMPUTE_TYPE=int8
+STT_BEAM_SIZE=1
+STT_LANGUAGE=de
+STT_CONDITION_ON_PREV_TEXT=false
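
Reviewer note: the settings added in this hunk are all plain strings, so the worker has to convert them itself. The following is a minimal sketch of how that could look, not the code from this commit. The names `SttConfig`, `load_stt_config` and `_as_bool` are hypothetical, and the fallback values simply mirror the defaults shown in the hunk. An empty `STT_LANGUAGE` is mapped to `None`, assuming that is how auto-detect is meant to be requested.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; everything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class SttConfig:
    # Hypothetical container; field names are illustrative only.
    vad_topic: str
    transcript_topic_template: str
    error_topic_template: str
    model: str
    device: str
    compute_type: str
    beam_size: int
    language: Optional[str]
    condition_on_prev_text: bool


def load_stt_config(env: Mapping[str, str] = os.environ) -> SttConfig:
    """Build the config from an environment mapping, using the hunk's defaults."""
    language = env.get("STT_LANGUAGE", "de").strip()
    return SttConfig(
        vad_topic=env.get("MQTT_VAD_TOPIC", "evs/+/vad_segment"),
        transcript_topic_template=env.get(
            "MQTT_TRANSCRIPT_TOPIC_TEMPLATE", "evs/{device_id}/transcript"
        ),
        error_topic_template=env.get(
            "MQTT_STT_ERROR_TOPIC_TEMPLATE", "evs/{device_id}/stt_error"
        ),
        model=env.get("STT_MODEL", "small"),
        device=env.get("STT_DEVICE", "cpu"),
        compute_type=env.get("STT_COMPUTE_TYPE", "int8"),
        beam_size=int(env.get("STT_BEAM_SIZE", "1")),
        # Empty string means "no fixed language" (assumed to trigger auto-detect).
        language=language or None,
        condition_on_prev_text=_as_bool(env.get("STT_CONDITION_ON_PREV_TEXT", "false")),
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test, for example `load_stt_config({"STT_LANGUAGE": ""})` to check the auto-detect case.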
@@ -8,6 +8,7 @@ It provides:
 - MQTT playback input (`evs/<device_id>/play_pcm16le`)
 - Optional Home Assistant webhook callbacks (`connected`, `start`, `stop`, `disconnected`)
 - VAD auto-segmentation (`vad_segment`) with pre-roll/post-roll
+- Optional STT worker (`vad_segment` -> `transcript`) via MQTT
 
 ## 1) Start the bridge
 
@@ -53,7 +54,7 @@ Then upload firmware.
 
 1. Flash ESP32
 2. Open serial monitor
-3. Send `s` (stream mode)
+3. Wait for WS connect (client switches to stream mode automatically)
 4. In bridge logs, you should see the device connection
 5. If `ECHO_ENABLED=true`, incoming audio is returned to ESP32 speaker
 
@@ -63,7 +64,8 @@ Then upload firmware.
 - `evs/<device_id>/status` (connection/start/stop/disconnect)
 - `evs/<device_id>/mic_level` (mic telemetry)
 - `evs/<device_id>/vad_segment` (finalized speech segments)
-- reserved for next steps: `evs/<device_id>/transcript`, `evs/<device_id>/stt_error`
+- `evs/<device_id>/transcript` (text from stt-worker)
+- `evs/<device_id>/stt_error` (stt-worker errors)
 - Playback input to device:
 - `evs/<device_id>/play_pcm16le`
 - payload options:
@@ -101,6 +103,20 @@ You can build automations on these events (for STT/TTS pipelines or Node-RED han
 - `VAD_KEEP_FILES=200` limits number of stored VAD WAV files
 - `VAD_MAX_AGE_DAYS=7` deletes VAD WAV files older than 7 days
 - MQTT is recommended for control/events, WebSocket for streaming audio
+- STT worker:
+  - subscribes: `evs/<device_id>/vad_segment`
+  - reads `wav_path` from event JSON
+  - transcribes with `faster-whisper`
+  - publishes transcript to `evs/<device_id>/transcript`
+
+## 6.1) STT Worker Config
+
+Use these env values (in `.env` or Portainer):
+- `STT_MODEL` (`tiny`, `base`, `small`, `medium`, `large-v3`)
+- `STT_DEVICE` (`cpu` or `cuda`)
+- `STT_COMPUTE_TYPE` (`int8`, `float16`, ...)
+- `STT_LANGUAGE` (`de` or empty for auto-detect)
+- `MQTT_VAD_TOPIC`, `MQTT_TRANSCRIPT_TOPIC_TEMPLATE`, `MQTT_STT_ERROR_TOPIC_TEMPLATE`
 
 ## 7) Build and push to Gitea registry
 
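
Reviewer note: the four STT worker steps listed in this hunk (subscribe, read `wav_path`, transcribe, publish) can be sketched independently of any MQTT or Whisper library. This is an illustration under assumptions, not the worker from this commit. `device_id_from_topic` and `handle_vad_segment` are hypothetical names, the `transcribe` and `publish` callables are injected stand-ins, and the JSON shapes of the transcript and error payloads are assumed because the diff does not show them.

```python
import json
from typing import Callable, Optional


def device_id_from_topic(topic: str, base: str = "evs") -> Optional[str]:
    """Return the device id from `<base>/<device_id>/vad_segment`, else None."""
    parts = topic.split("/")
    if len(parts) == 3 and parts[0] == base and parts[2] == "vad_segment" and parts[1]:
        return parts[1]
    return None


def handle_vad_segment(
    topic: str,
    payload: bytes,
    transcribe: Callable[[str], str],
    publish: Callable[[str, str], None],
    transcript_template: str = "evs/{device_id}/transcript",
    error_template: str = "evs/{device_id}/stt_error",
) -> None:
    """Process one vad_segment event and publish a transcript or an error."""
    device_id = device_id_from_topic(topic)
    if device_id is None:
        return  # Not a vad_segment topic; ignore it.
    try:
        event = json.loads(payload)
        wav_path = event["wav_path"]  # Field name taken from the README text.
        text = transcribe(wav_path)
    except Exception as exc:
        # Assumed error payload shape: {"error": "<message>"}.
        publish(error_template.format(device_id=device_id), json.dumps({"error": str(exc)}))
        return
    # Assumed transcript payload shape: {"text": ..., "wav_path": ...}.
    publish(
        transcript_template.format(device_id=device_id),
        json.dumps({"text": text, "wav_path": wav_path}),
    )
```

In a real worker, `publish` would presumably wrap the MQTT client's publish call, and `transcribe` would wrap a `faster_whisper.WhisperModel` instance, joining the text of the segments returned by its `transcribe` method. Keeping both as parameters lets the routing and error handling be tested without a broker or a model.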
@@ -161,6 +177,29 @@ services:
     volumes:
       - evs_bridge_data:/data
 
+  evs-stt-worker:
+    image: git.khnm-zimmerling.de/kai/evs-stt-worker:latest
+    container_name: evs-stt-worker
+    restart: unless-stopped
+    environment:
+      LOG_LEVEL: "INFO"
+      MQTT_HOST: "10.100.3.247"
+      MQTT_PORT: "1883"
+      MQTT_USER: ""
+      MQTT_PASSWORD: ""
+      MQTT_BASE_TOPIC: "evs"
+      MQTT_VAD_TOPIC: "evs/+/vad_segment"
+      MQTT_TRANSCRIPT_TOPIC_TEMPLATE: "evs/{device_id}/transcript"
+      MQTT_STT_ERROR_TOPIC_TEMPLATE: "evs/{device_id}/stt_error"
+      STT_MODEL: "small"
+      STT_DEVICE: "cpu"
+      STT_COMPUTE_TYPE: "int8"
+      STT_BEAM_SIZE: "1"
+      STT_LANGUAGE: "de"
+      STT_CONDITION_ON_PREV_TEXT: "false"
+    volumes:
+      - evs_bridge_data:/data
+
 volumes:
   evs_bridge_data:
 ```
@@ -9,3 +9,26 @@ services:
       - "${WS_PORT:-8765}:${WS_PORT:-8765}"
     volumes:
       - ./data:/data
+
+  evs-stt-worker:
+    build: ../stt-worker
+    container_name: evs-stt-worker
+    restart: unless-stopped
+    environment:
+      LOG_LEVEL: "INFO"
+      MQTT_HOST: "${MQTT_HOST:-localhost}"
+      MQTT_PORT: "${MQTT_PORT:-1883}"
+      MQTT_USER: "${MQTT_USER:-}"
+      MQTT_PASSWORD: "${MQTT_PASSWORD:-}"
+      MQTT_BASE_TOPIC: "${MQTT_BASE_TOPIC:-evs}"
+      MQTT_VAD_TOPIC: "${MQTT_VAD_TOPIC:-evs/+/vad_segment}"
+      MQTT_TRANSCRIPT_TOPIC_TEMPLATE: "${MQTT_TRANSCRIPT_TOPIC_TEMPLATE:-evs/{device_id}/transcript}"
+      MQTT_STT_ERROR_TOPIC_TEMPLATE: "${MQTT_STT_ERROR_TOPIC_TEMPLATE:-evs/{device_id}/stt_error}"
+      STT_MODEL: "${STT_MODEL:-small}"
+      STT_DEVICE: "${STT_DEVICE:-cpu}"
+      STT_COMPUTE_TYPE: "${STT_COMPUTE_TYPE:-int8}"
+      STT_BEAM_SIZE: "${STT_BEAM_SIZE:-1}"
+      STT_LANGUAGE: "${STT_LANGUAGE:-de}"
+      STT_CONDITION_ON_PREV_TEXT: "${STT_CONDITION_ON_PREV_TEXT:-false}"
+    volumes:
+      - ./data:/data