
Lightweight local voice-chat API using VOSK STT, Ollama LLM, and Kokoro TTS. Includes FastAPI backend, web UI, and optional Caddy HTTPS.


Anishrkhadka/AetherChat



AetherChat is a lightweight voice-chat API that converts user speech into text using VOSK, processes the request through an Ollama-hosted LLM, and responds using Kokoro TTS. A FastAPI backend (served via Uvicorn) powers the API, and the repository includes a browser-ready web UI.
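The three-stage flow described above can be sketched as a small composition of callables. This is only an illustration of the STT → LLM → TTS ordering, not the repository's actual code: `VoicePipeline`, `respond`, and the three stage names are hypothetical, and the real VOSK, Ollama, and Kokoro calls would be wrapped behind them.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class VoicePipeline:
    """Chains the three stages: speech-to-text, LLM reply, text-to-speech."""

    transcribe: Callable[[bytes], str]   # hypothetical wrapper around VOSK
    generate: Callable[[str], str]       # hypothetical wrapper around an Ollama call
    synthesize: Callable[[str], bytes]   # hypothetical wrapper around Kokoro

    def respond(self, audio: bytes) -> Tuple[str, str, bytes]:
        """Return (transcript, reply_text, reply_audio) for one utterance."""
        transcript = self.transcribe(audio).strip()
        if not transcript:
            # Nothing was recognised, so skip the LLM and TTS stages.
            return "", "", b""
        reply = self.generate(transcript)
        return transcript, reply, self.synthesize(reply)


if __name__ == "__main__":
    # Stand-in stages so the sketch runs without any models installed.
    demo = VoicePipeline(
        transcribe=lambda audio: "hello",
        generate=lambda text: f"You said: {text}",
        synthesize=lambda text: text.encode("utf-8"),
    )
    print(demo.respond(b"\x00\x01"))
```

Keeping each stage behind a plain callable makes it straightforward to swap a model or to test the flow without loading VOSK or Kokoro.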



Features

  • Speech-to-Text (STT) via VOSK
  • Locally hosted LLM through Ollama
  • Text-to-Speech (TTS) synthesis using Kokoro
  • FastAPI backend with automatic streaming responses
  • Caddy reverse proxy with optional HTTPS
  • Fully containerised with Docker Compose
  • Customisable voices and model selection
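The repository's streaming code is not shown here. As a minimal sketch, assuming the backend reads Ollama's `/api/generate` endpoint with streaming enabled (where each line of the response body is a JSON object carrying a `response` fragment and a `done` flag), the text fragments could be collected as follows. The function name `iter_reply_chunks` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_reply_chunks(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from an Ollama-style newline-delimited JSON stream."""
    for raw in lines:
        if not raw.strip():
            continue  # ignore blank lines between events
        event = json.loads(raw)
        chunk = event.get("response", "")
        if chunk:
            yield chunk
        if event.get("done"):
            break  # the final event marks the end of the reply


if __name__ == "__main__":
    # Hand-written sample events standing in for a live HTTP response body.
    sample = [
        b'{"response": "Hel", "done": false}\n',
        b'{"response": "lo", "done": false}\n',
        b'{"response": "", "done": true}\n',
    ]
    print("".join(iter_reply_chunks(sample)))
```

Yielding fragments as they arrive lets a caller forward partial text to the client, or hand complete sentences to TTS, before the whole reply has been generated.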

Docker Setup

docker-compose.yml defines two core services:

1. voicechat

  • Builds the Python backend
  • Mounts models/vosk and kokoro
  • Runs FastAPI on port 8000
  • Publishes it on host port 8888 (to avoid conflicts with other services already using 8000)

2. caddy

  • Fronts the API with HTTP/HTTPS

  • Serves static UI files

  • Proxies inbound traffic to the backend

  • Generates internal TLS certificates automatically

  • Listens on ports:

    • 8080 (HTTP)
    • 8443 (HTTPS)

Shared settings like OLLAMA_HOST, LLM_MODEL, and VOSK_MODEL_PATH are passed to voicechat. The DOMAIN environment variable controls Caddy’s routing (default: voicechat.local).
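How the backend reads these variables is not shown in this README. One plausible approach, sketched below, loads them once at startup into a small settings object. The `Settings` class, `load_settings`, and the fallback values are assumptions for illustration; only the variable names come from the text above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    ollama_host: str
    llm_model: str
    vosk_model_path: str


# Fallback values are illustrative assumptions, not the project's real defaults.
_DEFAULTS = {
    "OLLAMA_HOST": "http://localhost:11434",  # Ollama's standard local port
    "LLM_MODEL": "llama3",                    # placeholder model name
    "VOSK_MODEL_PATH": "./models/vosk",       # folder named in this README
}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the shared settings, falling back to the assumed defaults."""

    def get(name: str) -> str:
        # Treat unset and empty variables the same way.
        return env.get(name) or _DEFAULTS[name]

    return Settings(
        ollama_host=get("OLLAMA_HOST").rstrip("/"),
        llm_model=get("LLM_MODEL"),
        vosk_model_path=get("VOSK_MODEL_PATH"),
    )
```

Passing the environment as a parameter keeps the function easy to test: a plain dictionary can stand in for `os.environ`.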


Running the Project

1. Ensure models are available

Place the required models in the following folders:

./models/vosk/
./kokoro/

(Include Kokoro binaries if you want TTS enabled.)

2. (Optional) Add /etc/hosts entry

If you're using the default internal domain:

127.0.0.1   voicechat.local
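A hosts file lists the address first, followed by one or more hostnames. To confirm the entry is in place, a small check such as the following could be run against the file's contents. The helper name `hosts_maps` is hypothetical, and the sketch only handles the plain `address hostname...` line format with `#` comments.

```python
def hosts_maps(hosts_text: str, hostname: str, address: str = "127.0.0.1") -> bool:
    """Return True if hosts_text contains a line mapping hostname to address."""
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split the entry on whitespace.
        entry = line.split("#", 1)[0].split()
        if len(entry) >= 2 and entry[0] == address and hostname in entry[1:]:
            return True
    return False


if __name__ == "__main__":
    # Inline sample; in practice the text would be read from /etc/hosts.
    sample = "127.0.0.1 localhost\n127.0.0.1 voicechat.local  # AetherChat\n"
    print(hosts_maps(sample, "voicechat.local"))
```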

3. Start the stack

docker compose up --build

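Caddy will automatically generate a local certificate (via tls internal); client devices still need to trust it, as described under Installing Caddy’s Local Certificate. Once the stack is running, open the web UI over HTTPS at the configured domain (voicechat.local by default), using Caddy's HTTPS port (8443).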


Installing Caddy’s Local Certificate

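When using tls internal, Caddy acts as its own local certificate authority (CA). Browsers only grant microphone access to pages served from a secure context, so you must install Caddy's root certificate on each client before Safari on macOS, iOS, or iPadOS will allow microphone access.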

Extract it with:

docker cp voice-chat-caddy:/data/caddy/pki/authorities/local/root.crt ./caddy-root.crt
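Before trusting the extracted file on a device, it can help to compute its fingerprint and compare it with what the device displays. The sketch below assumes `caddy-root.crt` holds a single PEM-encoded certificate; the function name `pem_sha256_fingerprint` is hypothetical and uses only the Python standard library.

```python
import hashlib
import ssl


def pem_sha256_fingerprint(pem_text: str) -> str:
    """Return the SHA-256 fingerprint of one PEM certificate as colon-separated hex."""
    # Convert the PEM text to raw DER bytes, which is what fingerprints are computed over.
    der = ssl.PEM_cert_to_DER_cert(pem_text.strip())
    digest = hashlib.sha256(der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


if __name__ == "__main__":
    import sys

    # Usage: python fingerprint.py ./caddy-root.crt
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="ascii") as fh:
            print(pem_sha256_fingerprint(fh.read()))
```

The printed value can then be compared with the SHA-256 fingerprint shown in the certificate details on the client, where the platform displays one.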

macOS

  1. Double-click the certificate
  2. Add to “System” or “Login” keychain
  3. Set Always Trust

iOS / iPadOS

  1. AirDrop or email the .crt file to the device
  2. Tap to install
  3. Enable trust under Settings → General → About → Certificate Trust Settings

Once trusted, microphone access will work without warnings.


Available Voices

The default Kokoro pack exposes several selectable voices:

Voice ID       Name                   Notes
af_sarah       Sarah (EN Female)      Neutral, natural
af_bella       Bella (EN Female)      Bright, friendly
af_sky         Sky (EN Female)        Soft, airy
bf_emma        Emma (Anime EN)        Energetic, expressive
bf_isabella    Isabella (Anime EN)    Bright anime style
bf_lily        Lily (Shy EN)          Soft, shy
ff_siwis       French (Siwis)         Native French

Add more voices by extending the Kokoro .bin file and registering them in app.py.
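How voices are registered in app.py is not shown in this README. As a rough sketch of what such a registry might look like, the snippet below maps the voice IDs listed above to display names and validates a requested voice. `VOICES`, `DEFAULT_VOICE`, and `resolve_voice` are hypothetical names, and the choice of default is an assumption.

```python
from typing import Optional

# Voice IDs and names come from the list above; the registry layout is hypothetical.
VOICES = {
    "af_sarah": "Sarah (EN Female)",
    "af_bella": "Bella (EN Female)",
    "af_sky": "Sky (EN Female)",
    "bf_emma": "Emma (Anime EN)",
    "bf_isabella": "Isabella (Anime EN)",
    "bf_lily": "Lily (Shy EN)",
    "ff_siwis": "French (Siwis)",
}

DEFAULT_VOICE = "af_sarah"  # assumed default, chosen only for illustration


def resolve_voice(requested: Optional[str]) -> str:
    """Return a registered voice ID, falling back to the default for unknown input."""
    if requested and requested in VOICES:
        return requested
    return DEFAULT_VOICE
```

Under this layout, registering a new voice would mean adding one more entry to the dictionary once the corresponding voice data is available to Kokoro.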
