Local LLM Setup

Running a local LLM means your conversations never leave your machine. No API keys, no usage costs, and full offline support.

Ollama

Ollama is a lightweight tool for running open-source LLMs locally. Available on macOS, Linux, and Windows.

Installation

macOS

brew install ollama

Linux

curl -fsSL https://ollama.ai/install.sh | sh

Windows

Download the installer from ollama.ai and run it.

Pulling a Model

You need to download a model before you can use it:

ollama pull llama3.2

Other options worth trying: mistral, phi3, codellama.
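
To see which models you’ve already downloaded, run:

ollama list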

Starting the Server

ollama serve

This starts the Ollama API on http://localhost:11434.
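
To confirm the API is reachable, you can ask it to list the models it has available:

curl http://localhost:11434/api/tags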

Connecting to Aikeya

  1. Open Settings (gear icon)
  2. Navigate to the Character tab
  3. Enable the LLM toggle, then select Ollama from the provider dropdown
  4. Select a model from the model dropdown (click the refresh icon to reload the list from your server)
  5. Start chatting
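
If you want to sanity-check the model before chatting, you can send it a prompt directly from the terminal (assuming you pulled llama3.2):

ollama run llama3.2 "Say hello"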

LM Studio

LM Studio provides a GUI for downloading and running local models. It’s a good option if you prefer not to use the terminal.

Installation

Download from lmstudio.ai and install it.

Downloading Models

Open LM Studio and browse the built-in model catalog. Search for a model, click download, and wait for it to finish.

Starting the Server

  1. Go to the Server tab in LM Studio
  2. Click Start Server

This starts an OpenAI-compatible API on http://localhost:1234.
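
To confirm the server is reachable, you can list the available models through the OpenAI-compatible endpoint:

curl http://localhost:1234/v1/models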

Connecting to Aikeya

  1. Open Settings (gear icon)
  2. Navigate to the Character tab
  3. Enable the LLM toggle, then select LM Studio from the provider dropdown
  4. Select a model from the model dropdown (click the refresh icon to reload the list from your server)
  5. Start chatting
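
To verify the model is loaded and answering, you can send a test message to LM Studio’s chat endpoint (the model name below is a placeholder; use the one you downloaded):

curl http://localhost:1234/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "your-model-name", "messages": [{"role": "user", "content": "Hello"}]}'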

Recommended Models

Model              Size     Best For                             RAM Required
Llama 3.2 (3B)     ~2GB     General chat, fast responses         8GB
Llama 3.1 (8B)     ~4.7GB   Better quality responses             16GB
Mistral (7B)       ~4.1GB   Good balance of speed and quality    16GB
Phi-3 (3.8B)       ~2.3GB   Lightweight, efficient               8GB

Start with Llama 3.2 (3B) if you’re unsure. It runs well on most hardware and gives solid results for conversational use.

Custom Base URL

If you’re running the LLM server on a different machine or non-default port, enter the full URL in the provider settings. For example:

  • Remote machine: http://192.168.1.50:11434
  • Custom port: http://localhost:8080
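
For the remote machine case, note that Ollama listens on localhost only by default. To make it reachable from other machines, set OLLAMA_HOST on the machine running the server before starting it:

OLLAMA_HOST=0.0.0.0 ollama serve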

Troubleshooting

“Failed to fetch models”

The LLM server isn’t running. Start it:

  • Ollama: ollama serve
  • LM Studio: Go to the Server tab and click Start Server

“Connection refused”

The URL in Aikeya is pointing at a port your server isn’t listening on. Default ports:

  • Ollama: 11434
  • LM Studio: 1234

Make sure the URL in Aikeya matches the port your server is using.
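
A quick way to check is to hit the server’s root URL; a running Ollama instance responds with "Ollama is running":

curl http://localhost:11434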

Slow responses

  • Try a smaller model (3B parameters instead of 7B+)
  • Check that GPU acceleration is enabled in your LLM tool’s settings (a quick way to check with Ollama is shown below)
  • Close other memory-heavy applications
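
With Ollama, you can see whether the loaded model is running on the GPU or the CPU (the PROCESSOR column shows the split):

ollama ps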

CORS errors in browser

If you’re running Aikeya in a browser and getting CORS errors with Ollama, set the origins environment variable before starting the server:

OLLAMA_ORIGINS=* ollama serve

This isn’t needed when using the desktop app.