AetherChat is a lightweight voice-chat API that converts user speech into text using VOSK, processes the request through an Ollama-hosted LLM, and responds using Kokoro TTS. A FastAPI backend (served via Uvicorn) powers the API, and the repository includes a browser-ready web UI.
- Speech-to-Text (STT) via VOSK
- Locally hosted LLM through Ollama
- Text-to-Speech (TTS) synthesis using Kokoro
- FastAPI backend with automatic streaming responses
- Caddy reverse proxy with optional HTTPS
- Fully containerised with Docker Compose
- Customisable voices and model selection
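The speech-in, LLM, speech-out flow described above can be pictured as three pluggable stages. The following is a minimal sketch, not the project's actual code: `handle_turn` and the stage type aliases are hypothetical names, and the stand-in lambdas take the place of the real VOSK, Ollama, and Kokoro calls.

```python
from typing import Callable

# Hypothetical stage signatures; the real backend would wire VOSK, Ollama, and Kokoro here.
SpeechToText = Callable[[bytes], str]
Generate = Callable[[str], str]
TextToSpeech = Callable[[str], bytes]


def handle_turn(audio: bytes, stt: SpeechToText, llm: Generate, tts: TextToSpeech) -> bytes:
    """Run one voice-chat turn: user audio in, synthesized reply audio out."""
    transcript = stt(audio)      # 1. speech -> text
    if not transcript.strip():
        return b""               # nothing recognised, so there is nothing to answer
    reply = llm(transcript)      # 2. text -> LLM reply
    return tts(reply)            # 3. reply text -> audio


if __name__ == "__main__":
    # Stand-in stages so the sketch runs without any models installed.
    fake_stt = lambda audio: audio.decode("utf-8")
    fake_llm = lambda text: f"You said: {text}"
    fake_tts = lambda text: text.encode("utf-8")
    print(handle_turn(b"hello", fake_stt, fake_llm, fake_tts))
```

Keeping the stages as plain callables makes it straightforward to swap a model or voice without touching the turn logic.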
docker-compose.yml defines two core services:
- voicechat
  - Builds the Python backend
  - Mounts models/vosk and kokoro
  - Runs FastAPI on port 8000
  - Exposes it externally as 8888 (avoiding conflicts)
- Caddy
  - Fronts the API with HTTP/HTTPS
  - Serves static UI files
  - Proxies inbound traffic to the backend
  - Generates internal TLS certificates automatically
  - Listens on ports 8080 (HTTP) and 8443 (HTTPS)
Shared settings like OLLAMA_HOST, LLM_MODEL, and VOSK_MODEL_PATH are passed to voicechat.
The DOMAIN environment variable controls Caddy’s routing (default: voicechat.local).
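As an illustration of how a backend might consume these shared settings, here is a small sketch that reads them from the environment. The `Settings` class, the `load_settings` function, and every fallback value are assumptions made for the example; the real app.py may read and default them differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    ollama_host: str
    llm_model: str
    vosk_model_path: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the shared settings from the environment (or a supplied mapping).

    The fallback values are placeholders for illustration, not the project's defaults.
    """
    env = os.environ if env is None else env
    return Settings(
        ollama_host=env.get("OLLAMA_HOST", "http://localhost:11434"),
        llm_model=env.get("LLM_MODEL", "example-model"),
        vosk_model_path=env.get("VOSK_MODEL_PATH", "./models/vosk"),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests.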
Place the required models in the following folders:
./models/vosk/
./kokoro/
(Include Kokoro binaries if you want TTS enabled.)
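Before starting the stack, it can help to confirm that the folders listed above exist and are not empty. This is a hedged sketch of such a check; the helper name `missing_model_dirs` is hypothetical, and it does not validate the model files themselves.

```python
from pathlib import Path


def missing_model_dirs(root: str = ".") -> list[str]:
    """Return the expected model folders that are absent or empty under `root`."""
    expected = ["models/vosk", "kokoro"]  # folders named in this README
    missing = []
    for rel in expected:
        folder = Path(root) / rel
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    for rel in missing_model_dirs():
        print(f"missing or empty: ./{rel}/")
```

A missing ./kokoro/ folder only matters if you want TTS enabled.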
If you're using the default internal domain, make sure it resolves to your local machine (for example, via a hosts-file entry):
voicechat.local → 127.0.0.1
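A name-to-address mapping like this is commonly made with a hosts-file entry. The sketch below checks hosts-file text for such an entry; `hosts_maps` is a hypothetical helper, and the sample text stands in for the contents of your real hosts file (typically /etc/hosts on macOS and Linux).

```python
def hosts_maps(hosts_text: str, name: str, address: str = "127.0.0.1") -> bool:
    """Return True if the hosts-file text maps `name` to `address`."""
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into the address and its host names.
        entry = line.split("#", 1)[0].split()
        if len(entry) >= 2 and entry[0] == address and name in entry[1:]:
            return True
    return False


if __name__ == "__main__":
    sample = "127.0.0.1 localhost\n127.0.0.1 voicechat.local\n"
    print(hosts_maps(sample, "voicechat.local"))
```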
docker compose up --build
Caddy will automatically generate and trust a local certificate (via tls internal).
Open:
- https://voicechat.local:8443 – secure UI/API
- http://voicechat.local:8888 – plain HTTP
When using tls internal, Caddy acts as a local CA.
You must install its certificate on clients before Safari/iOS/macOS will allow microphone access.
Extract it with:
docker cp voice-chat-caddy:/data/caddy/pki/authorities/local/root.crt ./caddy-root.crt
On macOS:
- Double-click the certificate
- Add it to the “System” or “Login” keychain
- Set Always Trust
On iOS:
- AirDrop or email the .crt file to the device
- Tap to install
- Enable trust under Settings → General → About → Certificate Trust Settings
Once trusted, microphone access will work without warnings.
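The extracted root certificate can also be used from scripts that call the HTTPS endpoint, without touching the system trust store. This is a rough sketch using only the Python standard library; `fetch_with_local_ca` is a hypothetical helper, and it assumes the stack is running and that `voicechat.local` resolves to your machine.

```python
import ssl
import urllib.request


def fetch_with_local_ca(url: str, ca_file: str, timeout: float = 5.0) -> int:
    """Request `url`, trusting only the CA certificate in `ca_file`; return the HTTP status."""
    # Passing cafile makes the context verify against that file instead of the system store.
    context = ssl.create_default_context(cafile=ca_file)
    with urllib.request.urlopen(url, context=context, timeout=timeout) as resp:
        return resp.status
```

For example, `fetch_with_local_ca("https://voicechat.local:8443", "./caddy-root.crt")` should return a status code when the certificate chain verifies, and raise an error otherwise.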
The default Kokoro pack exposes several selectable voices:
| Voice ID | Name | Notes |
|---|---|---|
| af_sarah | Sarah (EN Female) | Neutral, natural |
| af_bella | Bella (EN Female) | Bright, friendly |
| af_sky | Sky (EN Female) | Soft, airy |
| bf_emma | Emma (Anime EN) | Energetic, expressive |
| bf_isabella | Isabella (Anime EN) | Bright anime style |
| bf_lily | Lily (Shy EN) | Soft, shy |
| ff_siwis | French (Siwis) | Native French |
Add more voices by extending the Kokoro .bin file and registering them in app.py.
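How voices are registered in app.py is not shown here, so the following is only a sketch of one possible shape for such a registry. The `VOICES` dictionary, the `DEFAULT_VOICE` choice, and `resolve_voice` are all hypothetical; only the voice IDs and names come from the table above.

```python
from typing import Optional

# Hypothetical registry; the real structure in app.py may differ.
VOICES = {
    "af_sarah": "Sarah (EN Female)",
    "af_bella": "Bella (EN Female)",
    "af_sky": "Sky (EN Female)",
    "bf_emma": "Emma (Anime EN)",
    "bf_isabella": "Isabella (Anime EN)",
    "bf_lily": "Lily (Shy EN)",
    "ff_siwis": "French (Siwis)",
}

DEFAULT_VOICE = "af_sarah"  # assumption: any registered ID could serve as the fallback


def resolve_voice(requested: Optional[str]) -> str:
    """Return a registered voice ID, falling back to the default for unknown input."""
    if requested in VOICES:
        return requested
    return DEFAULT_VOICE


if __name__ == "__main__":
    print(resolve_voice("ff_siwis"))
    print(resolve_voice("not_a_voice"))
```

Under this sketch, registering a new voice would mean adding its ID and display name to the dictionary once the voice data is available to Kokoro.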
