Architecture Overview
Aikeya is a client-side application that combines 3D avatar rendering, LLM chat, text-to-speech, and a relationship simulation engine. Everything runs locally on the user’s device — in a browser or the desktop app — with no backend required.
System Diagram
┌─────────────────────────────────────────────────────────────────┐
│                   Client (Browser or Desktop)                   │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                       SvelteKit App                       │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐│  │
│  │  │   Chat UI   │  │  3D Scene   │  │   Settings Panel    ││  │
│  │  └──────┬──────┘  └──────┬──────┘  └─────────────────────┘│  │
│  │         │                │                                │  │
│  │  ┌──────▼──────┐  ┌──────▼──────┐                         │  │
│  │  │ LLM Client  │  │  Three.js   │                         │  │
│  │  │ xsAI/fetch  │  │  + Threlte  │                         │  │
│  │  └──────┬──────┘  └──────┬──────┘                         │  │
│  │         │                │                                │  │
│  │  ┌──────▼──────┐  ┌──────▼──────┐  ┌─────────────────────┐│  │
│  │  │  Companion  │  │  VRM Model  │  │    TTS Pipeline     ││  │
│  │  │   Engine    │  │ @pixiv/vrm  │  │     + Lip-sync      ││  │
│  │  └──────┬──────┘  └─────────────┘  └──────────┬──────────┘│  │
│  │         │                                     │           │  │
│  │  ┌──────▼─────────────────────────────────────▼──────────┐│  │
│  │  │                 Svelte 5 Runes Stores                 ││  │
│  │  │ character.svelte.ts, vrm.svelte.ts, settings.svelte.ts││  │
│  │  └──────────────────────────┬────────────────────────────┘│  │
│  │                             │                             │  │
│  │  ┌──────────────────────────▼────────────────────────────┐│  │
│  │  │                 IndexedDB (Dexie.js)                  ││  │
│  │  │         Character state, facts, turns, events         ││  │
│  │  └───────────────────────────────────────────────────────┘│  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
                 ┌───────────────────────────────┐
                 │         External APIs         │
                 │  LLM: OpenAI / Anthropic / etc│
                 │  TTS: ElevenLabs / OpenAI TTS │
                 │  STT: Web Speech / Groq       │
                 └───────────────────────────────┘
Core Components
VRM Rendering
The 3D avatar system uses Three.js with Threlte (a Svelte wrapper) for integration.
Key files:
- src/lib/components/vrm/Scene.svelte: Main 3D scene with camera, lighting, and post-processing
- src/lib/components/vrm/VrmModel.svelte: VRM model loading, animation, and expression control
- src/lib/stores/vrm.svelte.ts: VRM state, including head tracking for UI positioning
Libraries:
- @pixiv/three-vrm: VRM model loading and runtime
- @pixiv/three-vrm-animation: VRMA animation support
- @threlte/core: Svelte-Three.js integration
- n8ao and postprocessing: Visual effects
How it works:
- User uploads a .vrm file or provides a URL
- VRM loader parses the model and creates a Three.js scene object
- Threlte manages the render loop and integrates with Svelte’s reactivity
- Expressions and animations are applied via the VRM humanoid and expression APIs
Chat System
Messages flow through two paths depending on the platform:
- Web: SvelteKit server route using the xsAI SDK (src/routes/api/chat/+server.ts)
- Desktop (Tauri): Direct fetch to provider APIs (src/lib/services/chat/client-chat.ts)
Key files:
- src/lib/components/chat/BottomChatBar.svelte: User input interface
- src/lib/components/chat/SpeechBubble.svelte: Message display
- src/lib/ai/prompt-builder.ts: System prompt construction
- src/lib/ai/response-parser.ts: Extracts dialogue and state updates from LLM output
- src/lib/engine/: Core companion engine logic
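To make the response parser's job concrete, here is a minimal sketch of splitting a completed LLM reply into dialogue text and numeric state suggestions. The wire format is not documented here, so the `<state>…</state>` wrapper, the `parseReply` function, and the `StateDelta` and `ParsedReply` types are all assumptions for illustration; the actual response-parser.ts may work differently.

```typescript
// Hypothetical types; field names in the delta are not taken from the project.
interface StateDelta {
  [key: string]: number;
}

interface ParsedReply {
  text: string;      // dialogue shown to the user
  delta: StateDelta; // numeric state suggestions, empty if none were found
}

// ASSUMPTION: the model is instructed to append <state>{...}</state> to its reply.
const STATE_BLOCK = /<state>([\s\S]*?)<\/state>/;

function parseReply(raw: string): ParsedReply {
  const match = STATE_BLOCK.exec(raw);
  if (!match) {
    return { text: raw.trim(), delta: {} };
  }

  const text = raw.replace(STATE_BLOCK, "").trim();
  const delta: StateDelta = {};

  try {
    const parsed: unknown = JSON.parse(match[1] ?? "");
    if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
      for (const [key, value] of Object.entries(parsed as Record<string, unknown>)) {
        // Keep only finite numbers so malformed model output cannot corrupt state.
        if (typeof value === "number" && Number.isFinite(value)) {
          delta[key] = value;
        }
      }
    }
  } catch {
    // Invalid JSON: keep the dialogue text and return an empty delta.
  }

  return { text, delta };
}
```

Dropping anything that is not a finite number is a deliberately defensive choice: the dialogue still reaches the user even when the model produces a malformed state block.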
Flow:
User Input
       │
       ▼
┌──────────────┐
│  Heuristics  │ ── Calculate baseline state changes
│    Engine    │    (energy decay, streak updates)
└──────┬───────┘
       │
       ▼
┌──────────────┐
│    Memory    │ ── Retrieve relevant facts and recent turns
│  Retrieval   │    using semantic search (embeddings)
└──────┬───────┘
       │
       ▼
┌──────────────┐
│    Prompt    │ ── Combine system prompt + character state
│   Builder    │    + memory context + instructions
└──────┬───────┘
       │
       ▼
┌──────────────┐
│ LLM Provider │ ── Stream response from OpenAI/Anthropic/etc.
│ xsAI / fetch │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│   Response   │ ── Extract text + JSON state suggestions
│    Parser    │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│    State     │ ── Merge heuristic + LLM deltas
│    Merger    │    and persist to IndexedDB
└──────────────┘
TTS Pipeline
Text-to-speech converts LLM responses to audio with lip-sync.
Key files:
- src/lib/services/lipsync/analyzer.ts: Lip-sync audio analysis
- src/lib/services/tts/elevenlabs.ts: ElevenLabs provider
- src/lib/services/tts/openai-tts.ts: OpenAI TTS provider
- src/lib/services/tts/index.ts: Provider factory and shared audio context
Supported providers:
- ElevenLabs (high quality, requires API key)
- OpenAI TTS (requires API key)
Flow:
- LLM response text is sent to TTS provider
- Audio is received as a buffer
- Web Audio API plays the audio
- Audio analyzer extracts volume/frequency data
- VRM model maps audio data to mouth blend shapes in real-time
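The last two steps can be illustrated with a small volume-based sketch: compute the RMS level of one frame of audio samples and turn it into a smoothed mouth-open weight between 0 and 1. The function names and the three tuning constants are hypothetical, and the project's analyzer may also use frequency data, which this sketch ignores. In a browser, the samples could come from `AnalyserNode.getFloatTimeDomainData`, and the resulting weight could drive a mouth expression such as the VRM `aa` preset.

```typescript
// Root-mean-square level of one frame of time-domain samples in [-1, 1].
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

// Hypothetical tuning constants, not taken from the project.
const NOISE_FLOOR = 0.02; // levels below this keep the mouth closed
const FULL_OPEN = 0.3;    // levels at or above this open the mouth fully
const SMOOTHING = 0.5;    // fraction of the gap to the target closed per frame

// Returns the next mouth-open weight given the previous frame's weight.
function mouthWeight(samples: Float32Array, previous: number): number {
  const level = rms(samples);
  const normalized = (level - NOISE_FLOOR) / (FULL_OPEN - NOISE_FLOOR);
  const target = Math.min(1, Math.max(0, normalized));
  // Move part of the way toward the target to avoid jittery mouth movement.
  return previous + (target - previous) * SMOOTHING;
}
```

Calling `mouthWeight` once per rendered frame, feeding back the previous result, gives a mouth that opens with louder audio and closes smoothly during silence.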
Memory System
Three-tier memory architecture for context and recall.
Key files:
- src/lib/engine/memory.ts: Memory management
- src/lib/types/memory.ts: Memory type definitions
- src/lib/db/index.ts: Database schema
Tiers:
- Working Memory — In-memory buffer of recent conversation turns
- Facts — IndexedDB-stored facts with vector embeddings for semantic search
- Sessions — Conversation summaries for long-term context
Semantic search: Uses @xenova/transformers to run the all-MiniLM-L6-v2 embedding model locally on the user’s device. Facts are embedded as 384-dimensional vectors and can be retrieved by cosine similarity to the current conversation.
See Companion System and Memory Graph for detailed memory documentation.
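As a rough illustration of retrieval by cosine similarity, the sketch below ranks stored facts against a query embedding and returns the best matches. The `Fact` shape and the `topKFacts` helper are hypothetical (the real schema lives in src/lib/db/index.ts and may differ), and the embedding step itself is left out: in the app the vectors would be the 384-dimensional outputs of all-MiniLM-L6-v2, while any equal-length vectors work here.

```typescript
// Hypothetical fact record; the project's actual schema may differ.
interface Fact {
  id: number;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot product divided by the product of vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  // A zero vector has no direction, so treat it as unrelated to everything.
  return denominator === 0 ? 0 : dot / denominator;
}

// Returns the k facts most similar to the query, best match first.
function topKFacts(query: number[], facts: Fact[], k: number): Fact[] {
  return facts
    .map((fact) => ({ fact, score: cosineSimilarity(query, fact.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.fact);
}
```

A linear scan like this is usually adequate for a single user's fact store; an index only becomes worthwhile at much larger scales.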
State Management
Svelte 5 runes-based stores for reactive state.
Key stores:
- src/lib/stores/character.svelte.ts: Character/companion state
- src/lib/stores/vrm.svelte.ts: 3D model state, head tracking
- src/lib/stores/settings.svelte.ts: Provider configurations (LLM, TTS, STT)
- src/lib/stores/persona.svelte.ts: Persona card management
- src/lib/stores/chat.svelte.ts: Chat session state
- src/lib/stores/tts.svelte.ts: Text-to-speech state
- src/lib/stores/stt.svelte.ts: Speech-to-text state
- src/lib/stores/display.svelte.ts: Camera distance and display settings
- src/lib/stores/overlay.svelte.ts: Desktop overlay mode state
Pattern:
// Svelte 5 runes pattern
let count = $state(0);
const doubled = $derived(count * 2);
$effect(() => {
  console.log('Count changed:', count);
});
Storage Layer
All data persists client-side via IndexedDB using Dexie.js.
Database tables:
- characterStates: Character state and relationship data
- facts: Memory facts with embeddings
- sessions: Conversation session summaries
- conversationTurns: Conversation history
- completedEvents: Milestone events that have fired
Key file: src/lib/db/index.ts
Benefits:
- No server required
- Data stays on user’s device
- Works offline after initial load
- Large storage capacity (the quota depends on the browser and available disk space, but is typically well above 50 MB)
Project Structure
src/
├── lib/
│   ├── ai/                   # LLM prompt building and response parsing
│   ├── components/
│   │   ├── chat/             # Chat UI (BottomChatBar, SpeechBubble)
│   │   ├── docs/             # Documentation site components
│   │   ├── events/           # Event scene and choice UI
│   │   ├── icons/            # Icon components
│   │   ├── layout/           # App layout components
│   │   ├── memory/           # Memory graph visualization
│   │   ├── onboarding/       # First-run setup
│   │   ├── overlay/          # Desktop overlay UI
│   │   ├── settings/         # Settings page components
│   │   ├── ui/               # Shared UI primitives
│   │   └── vrm/              # 3D scene and model
│   ├── config/               # App and docs configuration
│   ├── data/                 # Static data (event definitions)
│   ├── db/                   # Database schema and export/import
│   ├── engine/               # Companion engine (heuristics, stages, state, events, memory)
│   ├── services/
│   │   ├── chat/             # Chat client
│   │   ├── lipsync/          # Audio analysis for lip-sync
│   │   ├── modules/          # Module system
│   │   ├── platform/         # Tauri/web platform abstraction
│   │   ├── providers/        # LLM provider registry and model fetching
│   │   ├── storage/          # IndexedDB storage layer
│   │   ├── stt/              # Speech-to-text providers
│   │   └── tts/              # Text-to-speech providers
│   ├── stores/               # Svelte 5 runes stores
│   ├── types/                # TypeScript types
│   └── utils/                # Utility functions
├── routes/
│   ├── (app)/                # Main app and settings routes
│   ├── blog/                 # Blog pages
│   ├── docs/                 # Documentation site
│   └── overlay/              # Desktop overlay route
└── content/
    └── docs/                 # Markdown documentation
Key Interactions
Expression Updates
When the companion’s mood changes:
- Companion engine calculates new mood state
- State is written to the character.svelte.ts store
- VrmModel.svelte component reacts to the store change
- Mood is mapped to VRM blend shapes (expressions)
- VRM model’s face updates in real-time
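The mood-to-expression step might look something like the sketch below. The mood names, the mapping table, and both helper functions are hypothetical; only the expression names (`neutral`, `happy`, `angry`, `sad`, `relaxed`, `surprised`) correspond to standard VRM emotion presets. Blending toward the target a little each frame, rather than snapping, keeps the face from changing abruptly.

```typescript
// Hypothetical mood labels; the companion engine's real mood model may differ.
type Mood = "content" | "joyful" | "annoyed" | "melancholy" | "startled";

// Standard VRM emotion preset names.
type VrmExpression = "neutral" | "happy" | "angry" | "sad" | "relaxed" | "surprised";

type ExpressionWeights = Record<VrmExpression, number>;

const EXPRESSIONS: VrmExpression[] = [
  "neutral", "happy", "angry", "sad", "relaxed", "surprised",
];

// Illustrative mapping from engine mood to a single dominant expression.
const MOOD_TO_EXPRESSION: Record<Mood, VrmExpression> = {
  content: "relaxed",
  joyful: "happy",
  annoyed: "angry",
  melancholy: "sad",
  startled: "surprised",
};

// Target weights: 1 for the mood's expression, 0 for every other preset.
function targetWeights(mood: Mood): ExpressionWeights {
  const active = MOOD_TO_EXPRESSION[mood];
  const weights = {} as ExpressionWeights;
  for (const name of EXPRESSIONS) {
    weights[name] = name === active ? 1 : 0;
  }
  return weights;
}

// Moves each weight a fraction `alpha` (0..1) of the way toward its target.
function stepWeights(
  current: ExpressionWeights,
  target: ExpressionWeights,
  alpha: number,
): ExpressionWeights {
  const next = {} as ExpressionWeights;
  for (const name of EXPRESSIONS) {
    next[name] = current[name] + (target[name] - current[name]) * alpha;
  }
  return next;
}
```

In a render loop, each resulting weight could then be passed to the model's expression manager once per frame.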
Event Triggering
When relationship thresholds are crossed:
- State merger detects threshold crossing
- Event system checks for eligible events
- Matching event is marked as triggered
- UI displays event content (if any)
- Event ID is added to completedEvents
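A minimal sketch of the merge and threshold check, under several assumptions: stats are plain numbers clamped to a 0 to 100 range, heuristic and LLM deltas are simply added, and each event fires once when a single stat crosses a fixed threshold. The `Stats` and `RelationshipEvent` types, both function names, and the example event IDs are hypothetical; the engine's actual rules are not documented here.

```typescript
type Stats = Record<string, number>;

// Hypothetical event definition: fires when `stat` reaches `threshold`.
interface RelationshipEvent {
  id: string;
  stat: string;
  threshold: number;
}

// ASSUMPTION: deltas are additive and every stat is clamped to [0, 100].
function mergeDeltas(state: Stats, heuristic: Stats, llm: Stats): Stats {
  const merged: Stats = { ...state };
  for (const delta of [heuristic, llm]) {
    for (const [key, value] of Object.entries(delta)) {
      const updated = (merged[key] ?? 0) + value;
      merged[key] = Math.min(100, Math.max(0, updated));
    }
  }
  return merged;
}

// Returns events whose threshold was crossed upward by this update
// and that are not already recorded as completed.
function findTriggeredEvents(
  before: Stats,
  after: Stats,
  events: RelationshipEvent[],
  completed: ReadonlySet<string>,
): RelationshipEvent[] {
  return events.filter((event) => {
    if (completed.has(event.id)) return false;
    const previous = before[event.stat] ?? 0;
    const current = after[event.stat] ?? 0;
    return previous < event.threshold && current >= event.threshold;
  });
}
```

For example, with affection at 48, a heuristic delta of +1 and an LLM delta of +3 produce 52, which crosses a threshold of 50 and triggers that event once. Comparing the state before and after the merge, rather than only checking the new value, is what prevents an event from firing again on every later turn.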
Desktop Application (Tauri)
The desktop app wraps the same SvelteKit application using Tauri v2.
Platform Layer
A platform abstraction layer allows code to behave differently on web vs desktop:
Key files:
- src/lib/services/platform/platform.ts: isTauri() / isWeb() detection
- src/lib/services/platform/window.ts: Window management (position, drag, click-through)
- src/lib/services/platform/hotkeys.ts: Global shortcut registration
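One plausible way to implement such a check, not necessarily how platform.ts does it, is to look for the global object that Tauri injects into its webview. The sketch assumes Tauri v2 exposes `__TAURI_INTERNALS__` on the global scope; the optional `scope` parameter is an addition made here so the functions can be exercised outside a webview.

```typescript
// ASSUMPTION: a Tauri v2 webview defines `__TAURI_INTERNALS__` globally,
// while a regular browser tab does not.
function isTauri(scope: object = globalThis): boolean {
  return "__TAURI_INTERNALS__" in scope;
}

// Anything that is not the desktop shell is treated as the web build.
function isWeb(scope: object = globalThis): boolean {
  return !isTauri(scope);
}
```

Keeping the check behind two small functions means the rest of the code never touches the global directly, so only this file would need to change if the detection mechanism does.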
Detection pattern:
import { isTauri } from '$lib/services/platform';
if (isTauri()) {
  // Desktop-only code.
  // startDragging is assumed to be a window helper imported from the platform layer.
  await startDragging();
}
Multi-Window Architecture
The desktop app uses two windows:
| Window | Purpose |
|---|---|
| main | Full application with all features |
| overlay | Transparent, always-on-top companion view |
Switching logic:
- Main → Overlay: Invoke the show_overlay command, hide the main window
- Overlay → Main: Show the main window, hide the overlay
Overlay Rendering
For transparent backgrounds in overlay mode:
- Tauri window configured with transparent: true, decorations: false
- HTML/body backgrounds set to transparent via CSS
- Three.js renderer uses alpha: true and setClearColor(0x000000, 0)
- Scene background set to null (no skybox)
Key file: src/routes/overlay/+page.svelte
Technologies
| Category | Technology |
|---|---|
| Framework | SvelteKit 2 |
| Language | TypeScript |
| 3D Rendering | Three.js + Threlte |
| VRM Support | @pixiv/three-vrm |
| LLM Integration | xsAI SDK (web) / direct fetch (desktop) |
| Desktop | Tauri v2 |
| Styling | Tailwind CSS 4 |
| Database | IndexedDB (Dexie.js) |
| Embeddings | Transformers.js |
| Build Tool | Vite |