[add] Speech to text feature.

2026-04-05 16:07:03 +02:00
parent 049d4f2a18
commit 81e9760e5c
7 changed files with 213 additions and 4 deletions
+59
@@ -31,6 +31,65 @@ You can also run it directly as a module without installing:
uv run python -m h2g2.main
```
## Audio features
### Text-to-speech (TTS)
The game can read all output aloud using [Piper](https://github.com/rhasspy/piper), a fast offline TTS engine. A British English voice model is included in the repo.
```bash
uv run h2g2 --audio
```
Pass `--voice /path/to/model.onnx` to select a different Piper voice model.
### Speech-to-text (STT)
You can play the game hands-free using voice input powered by [Vosk](https://alphacephei.com/vosk/), a lightweight offline speech recognition engine. The Vosk model (~50 MB) downloads automatically on first use.
```bash
uv run h2g2 --stt
```
Pass `--stt-model /path/to/vosk-model/` to select a different Vosk model.
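For orientation, Vosk recognizers return each result as a JSON string with a `text` field. The sketch below shows one way such a result could be turned into a game command. The helper name `extract_command` and the normalization steps are assumptions for illustration, not the code this project ships.

```python
import json


def extract_command(result_json: str) -> str:
    """Return the recognized text from a Vosk-style JSON result.

    Hypothetical helper: Vosk emits strings such as '{"text": "go north"}'.
    An empty string means nothing was recognized.
    """
    payload = json.loads(result_json)
    # Normalize whitespace and case so the game parser sees a plain command.
    return " ".join(payload.get("text", "").split()).lower()


print(extract_command('{"text": "Go  North"}'))  # → go north
```

An empty return value can be treated as "nothing heard" so the game simply keeps listening.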
Combine both flags for full voice interaction:
```bash
uv run h2g2 --audio --stt
```
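The four flags described above could be declared roughly as follows. This is a minimal sketch using `argparse` from the standard library; the function name `build_parser` and the help strings are assumptions, and the real `h2g2` entry point may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags documented in this README."""
    parser = argparse.ArgumentParser(prog="h2g2")
    # Text-to-speech: read game output aloud with Piper.
    parser.add_argument("--audio", action="store_true",
                        help="read all output aloud")
    parser.add_argument("--voice", metavar="MODEL", default=None,
                        help="path to a Piper .onnx voice model")
    # Speech-to-text: take commands from the microphone with Vosk.
    parser.add_argument("--stt", action="store_true",
                        help="accept voice input")
    parser.add_argument("--stt-model", metavar="DIR", default=None,
                        help="path to a Vosk model directory")
    return parser


args = build_parser().parse_args(["--audio", "--stt"])
print(args.audio, args.stt, args.stt_model)  # → True True None
```

Leaving `--voice` and `--stt-model` as `None` by default lets the program fall back to the bundled voice and the auto-downloaded model.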
### Audio prerequisites
STT requires PortAudio for microphone access. Install the system library for your platform before running with `--stt`:
**Linux (Debian/Ubuntu):**
```bash
sudo apt install libportaudio2 portaudio19-dev
```
**macOS:**
```bash
brew install portaudio
```
**Windows:**
PyAudio ships with PortAudio bundled on Windows, so no extra system package is needed. If you run into build issues, install a prebuilt wheel:
```bash
uv pip install pipwin
pipwin install pyaudio
```
After installing the system dependency, sync the Python packages:
```bash
uv sync
```
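To confirm that the system library is discoverable before launching with `--stt`, a quick check along these lines may help. It is a sketch that relies only on `ctypes.util.find_library` from the standard library; the function name `portaudio_available` is an assumption, and a negative result on Windows is not conclusive because the library can be bundled inside the Python package instead.

```python
from ctypes.util import find_library


def portaudio_available() -> bool:
    """Return True if a PortAudio shared library can be located.

    Hypothetical helper: find_library searches the platform's usual
    library locations and returns None when nothing is found.
    """
    return find_library("portaudio") is not None


if portaudio_available():
    print("PortAudio found")
else:
    print("PortAudio not found; install it as described above")
```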
## What's playable
The Earth opening sequence: wake up in your bedroom, find your gown and aspirin, make your way downstairs, head to the pub, and meet Ford Prefect.