Configure Local Models

Set up Ollama to run AI model inference entirely on your own machine.

Install Ollama

Download the installer for your platform from ollama.com and run it.

Pull a Model

ollama pull llama3.2

Popular models for CoreLayer:

Model          Size     Use Case
llama3.2       2-3 GB   General purpose, fast
llama3.3:70b   40 GB    Complex reasoning
qwen2.5:14b    9 GB     Good balance of speed and capability
codellama:13b  7 GB     Code generation
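
To pick a model from the table for a given machine, a small helper like the following may be useful. It is only a sketch: the sizes are copied from the table above (using the upper bound for llama3.2), and the `headroom` factor of 1.2 is an assumption, since actual memory use also depends on context length and quantization.

```python
# Approximate sizes in GB, copied from the table above.
# llama3.2 is listed as 2-3 GB; the upper bound is used here.
MODEL_SIZES_GB = {
    "llama3.2": 3,
    "llama3.3:70b": 40,
    "qwen2.5:14b": 9,
    "codellama:13b": 7,
}


def models_that_fit(free_memory_gb, headroom=1.2):
    """Return the models whose size (times headroom) fits, largest first.

    `headroom` is an assumed safety margin, not a value documented by
    Ollama or CoreLayer.
    """
    fitting = [
        (size, name)
        for name, size in MODEL_SIZES_GB.items()
        if size * headroom <= free_memory_gb
    ]
    return [name for size, name in sorted(fitting, reverse=True)]
```

For example, `models_that_fit(16)` lists qwen2.5:14b first, followed by codellama:13b and llama3.2, while llama3.3:70b is excluded.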

Configure CoreLayer

Add Ollama as a provider:

{
  "models": {
    "providers": [
      {
        "name": "ollama",
        "type": "ollama",
        "endpoint": "http://localhost:11434",
        "defaultModel": "llama3.2"
      }
    ]
  }
}
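
Before relying on this configuration, it can help to confirm that the endpoint is reachable and that the default model has actually been pulled. The sketch below uses Ollama's `GET /api/tags` endpoint, which lists locally installed models. The helper names are hypothetical, and treating the four keys from the example above as required is an assumption, not a documented CoreLayer rule.

```python
import json
import urllib.request


def installed_models(endpoint, timeout=5):
    """Return the model names reported by Ollama's GET /api/tags."""
    url = f"{endpoint.rstrip('/')}/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        data = json.load(resp)
    return [m["name"] for m in data.get("models", [])]


def has_model(installed, wanted):
    """True if `wanted` is installed; a name without a tag means ':latest'."""
    if ":" not in wanted:
        wanted += ":latest"
    return wanted in installed


def check_provider(provider, installed):
    """Return a list of problems found in one provider entry (empty if none).

    Assumption: the keys shown in the example config are all required.
    """
    problems = []
    for key in ("name", "type", "endpoint", "defaultModel"):
        if key not in provider:
            problems.append(f"missing key: {key}")
    model = provider.get("defaultModel")
    if model is not None and not has_model(installed, model):
        problems.append(f"model not pulled: {model}")
    return problems
```

With Ollama running, `check_provider(provider, installed_models(provider["endpoint"]))` returns an empty list when the entry looks consistent. Separating the network call from the check keeps the check itself easy to test.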

Or via the Control Center:

  1. Open Settings → Models
  2. Click Add Provider
  3. Select Ollama
  4. Set endpoint to http://localhost:11434
  5. Select your model

Verify

Ask Jarvis a question and check that it responds using the local model:

What model are you running on?
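
If the assistant does not respond as expected, it may help to query Ollama directly and rule out the model server as the cause. The following sketch sends a single non-streaming prompt to Ollama's `POST /api/generate` endpoint. It bypasses CoreLayer entirely, and the function names are hypothetical.

```python
import json
import urllib.request


def build_generate_request(endpoint, model, prompt):
    """Build a non-streaming POST request for Ollama's /api/generate."""
    body = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{endpoint.rstrip('/')}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_ollama(endpoint, model, prompt, timeout=120):
    """Send the prompt and return the generated text from the reply."""
    req = build_generate_request(endpoint, model, prompt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["response"]
```

Calling `ask_ollama("http://localhost:11434", "llama3.2", "Say hello")` should return text if Ollama is running and the model is pulled. The first call can be slow while the model loads into memory, which is why the default timeout is generous.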

Performance Tips

  • GPU acceleration: Ollama automatically uses a supported GPU if one is available
  • Model size: Smaller models respond faster but are less capable
  • Quantization: Quantized variants (for example Q4 or Q5) need less memory and run faster on limited hardware, at some cost in output quality
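
To check whether a loaded model is actually running on the GPU, one option is to compare how much of it sits in GPU memory. The sketch below assumes Ollama's `GET /api/ps` endpoint, which lists loaded models with `size` and `size_vram` fields. Field names may differ between Ollama versions, so verify them against your installation.

```python
import json
import urllib.request


def running_models(endpoint, timeout=5):
    """Fetch the list of currently loaded models from GET /api/ps."""
    url = f"{endpoint.rstrip('/')}/api/ps"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def gpu_share(ps_response):
    """Map each loaded model name to the fraction of it held in GPU memory.

    Assumes each entry has `size` and `size_vram` in bytes; a missing or
    zero `size` is reported as 0.0.
    """
    shares = {}
    for m in ps_response.get("models", []):
        size = m.get("size", 0)
        shares[m["name"]] = round(m.get("size_vram", 0) / size, 2) if size else 0.0
    return shares
```

A share of 1.0 suggests the model is fully offloaded to the GPU. A lower value means part of it runs on the CPU, in which case a smaller or more heavily quantized model may respond faster.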
