Add VAD segmentation and Docker ENV defaults

Kai
2026-02-13 16:47:54 +01:00
parent 9dc1ac3099
commit d4d4c7224b
4 changed files with 201 additions and 6 deletions

@@ -7,9 +7,12 @@ It provides:
- MQTT status/events (`evs/<device_id>/status`)
- MQTT playback input (`evs/<device_id>/play_pcm16le`)
- Optional Home Assistant webhook callbacks (`connected`, `start`, `stop`, `disconnected`)
- VAD auto-segmentation (`vad_segment`) with pre-roll/post-roll
## 1) Start the bridge
The image already contains sane default `ENV` values. A custom `.env` is optional.
1. Copy env template:
```bash
cp .env.example .env
@@ -58,6 +61,7 @@ Then upload firmware.
- Status/events published by bridge:
- `evs/<device_id>/status` (JSON)
- includes `type: "vad_segment"` when a speech segment is finalized
- Playback input to device:
- `evs/<device_id>/play_pcm16le`
- payload options:
@@ -86,6 +90,13 @@ You can build automations on these events (for STT/TTS pipelines or Node-RED han
- `WAV_SEGMENT_MAX_BYTES` max size per `.wav` file (default: `20971520` = 20 MB)
- `WAV_KEEP_FILES` max number of `.wav` files to keep (default: `10`)
- `MAX_SESSION_BYTES` is only used if session file saving is disabled
- Voice activity detection (VAD):
- `VAD_ENABLED=true` enables automatic speech segment detection
- `VAD_PREROLL_MS=1000` keeps 1 s of audio before speech start
- `VAD_POSTROLL_MS=1000` keeps 1 s of audio after speech end
- `VAD_START_THRESHOLD` / `VAD_STOP_THRESHOLD` tune sensitivity: the level that starts a segment and the level that ends it
- `VAD_DIR` stores per-utterance WAV files
- `VAD_KEEP_FILES` limits stored VAD WAV files
- MQTT is recommended for control/events, WebSocket for streaming audio
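
To illustrate how the VAD settings above could interact, here is a minimal Python sketch of threshold-based segmentation with pre-roll, post-roll, and a minimum speech duration. It is not the bridge's actual implementation. The 20 ms frame size, the use of RMS amplitude as the level compared against the thresholds, and the names `frame_rms` and `segment_speech` are assumptions made for this example. The 16 kHz rate and the default values are taken from the compose example.

```python
import math
import sys
from array import array
from collections import deque

SAMPLE_RATE = 16000   # matches PCM_SAMPLE_RATE in the compose example
FRAME_MS = 20         # assumed frame length, not taken from the bridge
FRAME_BYTES = SAMPLE_RATE * FRAME_MS // 1000 * 2  # 16-bit mono samples


def frame_rms(frame: bytes) -> float:
    """Return the RMS amplitude of one PCM16LE frame."""
    samples = array("h")
    samples.frombytes(frame)
    if sys.byteorder == "big":
        samples.byteswap()  # the payload is little-endian
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def segment_speech(frames, start_threshold=900, stop_threshold=600,
                   preroll_ms=1000, postroll_ms=1000, min_speech_ms=300,
                   frame_ms=FRAME_MS):
    """Split an iterable of PCM frames into speech segments (bytes)."""
    postroll_frames = postroll_ms // frame_ms
    history = deque(maxlen=preroll_ms // frame_ms)  # rolling pre-roll buffer
    current = None   # frames of the open segment, or None while idle
    speech_ms = 0    # time spent at or above the stop threshold
    quiet = 0        # consecutive frames below the stop threshold
    segments = []

    for frame in frames:
        level = frame_rms(frame)
        if current is None:
            if level >= start_threshold:
                # Speech starts: prepend the buffered pre-roll audio.
                current = list(history) + [frame]
                history.clear()
                speech_ms, quiet = frame_ms, 0
            else:
                history.append(frame)
            continue

        current.append(frame)
        if level >= stop_threshold:
            speech_ms += frame_ms
            quiet = 0
        else:
            quiet += 1
            if quiet >= postroll_frames:
                # Post-roll elapsed: keep the segment only if long enough.
                if speech_ms >= min_speech_ms:
                    segments.append(b"".join(current))
                current = None

    if current is not None and speech_ms >= min_speech_ms:
        segments.append(b"".join(current))
    return segments
```

Using a lower stop threshold than start threshold gives hysteresis, so a segment is not cut short when the level briefly dips during speech. In this sketch each returned `bytes` object would correspond to one per-utterance WAV file and one `vad_segment` event.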
## 7) Build and push to Gitea registry
@@ -135,6 +146,14 @@ services:
PCM_SAMPLE_RATE: "16000"
WAV_SEGMENT_MAX_BYTES: "20971520"
WAV_KEEP_FILES: "10"
VAD_ENABLED: "true"
VAD_DIR: "/data/vad"
VAD_KEEP_FILES: "100"
VAD_PREROLL_MS: "1000"
VAD_POSTROLL_MS: "1000"
VAD_START_THRESHOLD: "900"
VAD_STOP_THRESHOLD: "600"
VAD_MIN_SPEECH_MS: "300"
volumes:
- evs_bridge_data:/data
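
As a rough sketch of how a service could read the VAD variables above and fall back to defaults when none are set, the following Python snippet loads them into a typed config object. The names `VadConfig` and `load_vad_config`, the accepted boolean spellings, and the threshold sanity check are assumptions for illustration and may differ from what the bridge actually does. The default values mirror the compose example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class VadConfig:
    enabled: bool
    directory: str
    keep_files: int
    preroll_ms: int
    postroll_ms: int
    start_threshold: int
    stop_threshold: int
    min_speech_ms: int


def load_vad_config(env=None) -> VadConfig:
    """Build a VadConfig from environment variables, using defaults if unset."""
    env = os.environ if env is None else env

    def as_int(name: str, default: str) -> int:
        return int(env.get(name, default))

    # Assumed set of truthy spellings; the bridge may accept others.
    enabled = env.get("VAD_ENABLED", "true").strip().lower() in {"1", "true", "yes", "on"}

    cfg = VadConfig(
        enabled=enabled,
        directory=env.get("VAD_DIR", "/data/vad"),
        keep_files=as_int("VAD_KEEP_FILES", "100"),
        preroll_ms=as_int("VAD_PREROLL_MS", "1000"),
        postroll_ms=as_int("VAD_POSTROLL_MS", "1000"),
        start_threshold=as_int("VAD_START_THRESHOLD", "900"),
        stop_threshold=as_int("VAD_STOP_THRESHOLD", "600"),
        min_speech_ms=as_int("VAD_MIN_SPEECH_MS", "300"),
    )
    # Assumed sanity check: the stop level should not exceed the start level.
    if cfg.stop_threshold > cfg.start_threshold:
        raise ValueError("VAD_STOP_THRESHOLD must not exceed VAD_START_THRESHOLD")
    return cfg
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests, for example `load_vad_config({"VAD_ENABLED": "false"})`.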