Open Brain Documentation

Persistent, searchable memory for every AI tool you use.

What is Open Brain?

Open Brain is a personal, database-backed AI knowledge system that gives every AI tool persistent memory. Based on Nate B Jones' architecture, it turns your scattered conversations across Copilot, Claude, ChatGPT, Cursor, and other tools into a single, searchable knowledge base.

Credit: Open Brain was originally created by Nate B Jones. This version by Scott Nichols extends the original with self-hosted Kubernetes deployment, Ollama local embeddings, Tailscale Funnel networking, Docker Compose quickstart, and multi-replica SSE support. See Nate's setup guide, prompt kit, and Substack post for the original vision.

The Problem

Every AI conversation starts from zero. Decisions, preferences, and context are lost across sessions and tools.

The Solution

A unified memory backend — capture once, recall everywhere. Semantic search finds thoughts by meaning, not keywords.

Philosophy

→ One row = one retrievable idea — Zettelkasten-style atomic notes
→ Vector search = associative retrieval — search by meaning, not keywords
→ Metadata extraction is automatic — LLM classifies and tags on ingest
→ Backend, not frontend — use with any UI or AI tool you prefer
→ Your memory is portable — self-hosted, open protocol, compounding

Cost

Deployment	Time	Monthly Cost
🖥️ Docker Desktop dev box (Win/Mac)	~10 min	$0
🐳 Docker Compose (Linux server / NAS / Pi)	~10 min	$0–5
☁️ Cheap hosted (Supabase / Neon + Fly.io)	~30 min	$0–5
☸️ Kubernetes homelab	~1 hour	$0 (own hardware)
🚀 Azure managed (Bicep)	~20 min	~$15–20

Architecture

Open Brain has a simple, layered architecture. AI clients connect to the MCP server over SSE; the server handles tool dispatch, embedding generation, metadata extraction, and database operations.

AI Client (Copilot, Claude, ChatGPT, Cursor, etc.)
    ↓ MCP Protocol (SSE transport)
    ↓ Auth: x-brain-key header or ?key= param
    ↓
MCP Server (Node.js + Hono / Deno Edge Function)
    ├─ Tool dispatch (7 MCP tools)
    ├─ Embedding generation (OpenRouter or Ollama)
    ├─ Metadata extraction (LLM)
    └─ Database client
    ↓
PostgreSQL + pgvector
    ├─ thoughts table (content + embedding + metadata)
    ├─ HNSW index (vector search)
    ├─ GIN index (metadata filtering)
    └─ match_thoughts() RPC function

Data Flows

Capture Flow

Client → capture_thought → Embed (parallel) + Extract metadata (parallel) → Insert row → Return confirmation with metadata
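The parallel step in the capture flow can be sketched in Python with asyncio. This is only an illustration under stated assumptions: embed, extract_metadata, and the in-memory store are hypothetical stand-ins, not the server's actual functions, and the real server runs on Node.js.

```python
import asyncio


async def embed(text: str) -> list[float]:
    # Hypothetical stand-in for the embedding call (OpenRouter or Ollama in the real server).
    await asyncio.sleep(0.01)
    return [float(len(text)), 0.0, 0.0]


async def extract_metadata(text: str) -> dict:
    # Hypothetical stand-in for the LLM metadata extraction step.
    await asyncio.sleep(0.01)
    return {"type": "decision" if text.lower().startswith("decision") else "note"}


async def capture_thought(content: str, store: list[dict]) -> dict:
    # Run both slow calls concurrently, then insert a single row.
    embedding, metadata = await asyncio.gather(embed(content), extract_metadata(content))
    store.append({"content": content, "embedding": embedding, "metadata": metadata})
    # Return a confirmation that includes the extracted metadata.
    return {"status": "captured", "metadata": metadata}


store: list[dict] = []
result = asyncio.run(capture_thought("Decision: use pgvector", store))
print(result)  # {'status': 'captured', 'metadata': {'type': 'decision'}}
```

Because the two calls are independent, running them together means capture latency is bounded by the slower of the two rather than their sum.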

Search Flow

Client → search_thoughts → Embed query → match_thoughts() RPC → Return ranked results with similarity scores

Scaling

Scale	Thoughts	Notes
Personal	1 – 10K	Single instance, no tuning needed
Power User	10K – 100K	HNSW index handles smoothly
Team	100K+	Consider dedicated Postgres, read replicas

Database Schema

Open Brain uses a single-table design on PostgreSQL with the pgvector extension. Rich JSONB metadata enables flexible filtering without schema migrations.

The `thoughts` Table

Column	Type	Purpose
`id`	UUID	Primary key (auto-generated)
`content`	TEXT	The thought itself
`embedding`	VECTOR(768) or VECTOR(1536)	Semantic embedding vector; dimension depends on the embedder
`metadata`	JSONB	Type, topics, people, action items, source
`created_at`	TIMESTAMPTZ	When captured
`updated_at`	TIMESTAMPTZ	Last modified (auto-trigger)
`created_by`	TEXT	User provenance for multi-dev teams

Metadata Fields

Extracted automatically by the LLM on capture:

type, topics[], people[], action_items[], dates[], source, project, created_by, supersedes, provenance (v0.7.0+)

Indexes

Index	Type	Purpose
`thoughts_embedding_idx`	HNSW	Fast vector similarity search
`thoughts_metadata_idx`	GIN	JSONB metadata containment queries
`thoughts_created_at_idx`	B-tree	Date range filtering

`match_thoughts()` Function

The core search RPC — combines vector similarity with a configurable threshold:

SELECT * FROM match_thoughts(
    query_embedding := '[0.1, 0.2, ...]'::vector,
    match_count := 10,
    match_threshold := 0.5
);
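To make the roles of match_count and match_threshold concrete, here is a pure-Python sketch of threshold-then-rank retrieval. It assumes cosine similarity and an inclusive threshold; the actual SQL function body is not shown in this document, so treat the scoring details as assumptions rather than the real implementation.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_thoughts(query_embedding, rows, match_count=10, match_threshold=0.5):
    # Score every row, drop anything below the threshold, return the best matches first.
    scored = [
        {"content": r["content"], "similarity": cosine_similarity(query_embedding, r["embedding"])}
        for r in rows
    ]
    kept = [s for s in scored if s["similarity"] >= match_threshold]
    kept.sort(key=lambda s: s["similarity"], reverse=True)
    return kept[:match_count]


rows = [
    {"content": "pgvector decision", "embedding": [1.0, 0.0]},
    {"content": "lunch order", "embedding": [0.0, 1.0]},
    {"content": "database notes", "embedding": [1.0, 1.0]},
]
hits = match_thoughts([1.0, 0.0], rows)
print([h["content"] for h in hits])  # ['pgvector decision', 'database notes']
```

In the database the HNSW index avoids scoring every row; the sketch scans everything only to keep the logic visible.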

Provenance Helpers v0.7.0+

Migration 003-add-provenance-helpers.sql adds two generated columns projected from metadata->'provenance' (the Hallmark v1 envelope written by Plan Forge), two partial indexes, and a sibling RPC for source-hash lookups. Existing rows are untouched — provenance stays optional.

Generated Column	Projection	Index
`source_file_hash`	`metadata->'provenance'->>'contentHash'`	`idx_thoughts_source_file_hash` (partial, WHERE NOT NULL)
`code_hash`	`metadata->'provenance'->>'codeHash'`	`idx_thoughts_code_hash` (partial, WHERE NOT NULL)

New sibling RPC — match_thoughts() is unchanged:

SELECT * FROM match_thoughts_by_source(
    source_hash      := 'sha256:abcd...',
    max_count        := 25,
    project_filter   := NULL,
    include_archived := false
);

MCP Server

Open Brain's MCP server exposes 7 tools via the Model Context Protocol, an open standard for AI-to-tool integration. It uses SSE transport over HTTP.

Tools

search_thoughts query, limit?, threshold?

Semantic vector search — finds thoughts by meaning, ranked by similarity score.

capture_thought content, metadata?

Store a thought. Auto-generates embedding and extracts metadata (type, topics, people, action items) in parallel. v0.7.0+: consumers may pass metadata.provenance (Hallmark envelope) for source-hash traceability.

capture_thoughts thoughts[]

Batch capture — store multiple thoughts in one call.

list_thoughts type?, topic?, person?, days?

Filtered listing — browse by type, topic, person mentioned, or date range.

update_thought id, content

Edit a thought's content and regenerate its embedding.

delete_thought id

Remove a thought by ID.

thought_stats

Aggregate stats — total count, type distribution, top topics, top people mentioned.
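MCP clients invoke these tools with JSON-RPC 2.0 requests using the tools/call method. The sketch below builds such a request body in Python. The tool and argument names come from the list above; the helper name tool_call and the id counter are illustrative assumptions.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call(name: str, arguments: dict) -> dict:
    # Build a JSON-RPC 2.0 request for the MCP "tools/call" method.
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tool_call("search_thoughts", {"query": "database decisions", "limit": 5, "threshold": 0.5})
print(json.dumps(request, indent=2))
```

Most users never write this by hand, since the AI client generates it, but it helps when debugging traffic on the /messages endpoint.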

Authentication

The API key is sent via the x-brain-key header (preferred) or the ?key= URL parameter.

# Auth is enforced on /sse connection only.
# /messages endpoint uses sessionId for implicit auth.
GET /sse?key=YOUR_64_CHAR_HEX_KEY     → SSE stream
POST /messages?sessionId=xxx           → JSON-RPC calls (no key needed)
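A key check along these lines could look like the following Python sketch. It assumes the header takes precedence when both are supplied and uses a constant-time comparison; the function names are hypothetical and the real server may differ on both points.

```python
import hmac
from typing import Optional


def extract_key(headers: dict[str, str], query: dict[str, str]) -> Optional[str]:
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    # Assumption: the x-brain-key header wins over the ?key= parameter.
    return lowered.get("x-brain-key") or query.get("key")


def is_authorized(headers: dict[str, str], query: dict[str, str], expected_key: str) -> bool:
    supplied = extract_key(headers, query)
    if supplied is None:
        return False
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(supplied.encode("utf-8"), expected_key.encode("utf-8"))


print(is_authorized({"X-Brain-Key": "abc123"}, {}, "abc123"))  # True
print(is_authorized({}, {"key": "wrong"}, "abc123"))           # False
```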

Capture Pipeline

Multiple ways to feed thoughts into Open Brain — from AI tool conversations to Slack messages to bulk imports.

Capture Methods

Method	How	Best For
MCP Tool	`capture_thought` from any AI client	Daily workflow captures
REST API	`POST /memories`	Scripts, automations, webhooks
Slack Webhook	DM the Slack bot	Quick captures, mobile
Bulk Import	Migration scripts	Notion, Obsidian, Apple Notes, ChatGPT exports

Quick Capture Templates

Decision: "Decision: Using PostgreSQL with pgvector instead of Pinecone. Reason: self-hosted, lower cost."

Person Note: "Mike prefers async communication, Slack over email. Timezone: PST."

Insight: "Vector search with 768-dim nomic-embed-text is nearly as good as 1536-dim for short content."

Prompt Kit

Five core prompts covering the full lifecycle — from migration to daily capture to weekly review.

Prompt 1: Memory Migration

Import memories from Copilot, Claude, ChatGPT, or other AI platforms into Open Brain.

Prompt 2: Second Brain Migration

Migrate from Notion, Obsidian, Apple Notes, Google Keep, or Evernote.

Prompt 3: Open Brain Spark

System prompt that teaches your AI to proactively capture and search thoughts.

Prompt 4: Quick Capture Templates

Structured templates for decisions, meetings, person notes, and insights.

Prompt 5: The Weekly Review

Review your week's captures — surface patterns, stale tasks, and connections.

Daily Rhythm

☀️ Morning — Quick review of recent captures

💻 During work — Capture decisions and insights as they happen

🤝 After meetings — Debrief with capture templates

🌙 End of day — Save key takeaways

📊 Friday — Run the weekly review prompt

🔨 Plan Forge integration

v0.7.0+

Plan Forge is an agentic project execution system. In Plan Forge's unified memory architecture, Open Brain is the L3 layer — permanent, cross-project, semantic memory. Skills run search_thoughts before acting, and capture_thought after, so architecture decisions, patterns, and postmortems are preserved across runs and tools.

Memory layers (Plan Forge's view)

Layer	Backing	Lifetime	What lives there
L1	process state	per-call	tool inputs / outputs in flight
L2	`.forge/*.jsonl`	per-project	queues, dead-letter, run trajectories
L3	Open Brain (Postgres + pgvector)	permanent, cross-project	decisions, patterns, conventions, postmortems

Hallmark provenance envelope

Plan Forge wraps every L3 write in a Hallmark v1 envelope under metadata.provenance. Open Brain projects two fields out as generated columns (source_file_hash, code_hash) with partial indexes, plus a sibling RPC (match_thoughts_by_source) and a REST endpoint (GET /memories/by-source) for exact-source deduplication.

{
  "content": "Decision: pgvector over pinecone for OSS-friendliness",
  "metadata": {
    "provenance": {
      "schemaVersion": "hallmark-provenance.v1",
      "contentHash":   "sha256:abcd...",
      "codeHash":      "sha256:ef01...",
      "phase":         "Phase-3-Slice-2",
      "tool":          "forge_step3_execute_slice"
    }
  }
}
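As a rough illustration of how a client might assemble such an envelope, the Python sketch below hashes two byte strings and fills in the fields shown above. What exactly Plan Forge hashes for contentHash and codeHash is not specified here, so the inputs and the helper names (sha256_tag, wrap_with_provenance) are assumptions.

```python
import hashlib


def sha256_tag(data: bytes) -> str:
    # Produce the "sha256:<hex digest>" form used in the example envelope.
    return "sha256:" + hashlib.sha256(data).hexdigest()


def wrap_with_provenance(content: str, source_bytes: bytes, code_bytes: bytes,
                         phase: str, tool: str) -> dict:
    # Assumption: contentHash covers the source bytes and codeHash the code bytes.
    return {
        "content": content,
        "metadata": {
            "provenance": {
                "schemaVersion": "hallmark-provenance.v1",
                "contentHash": sha256_tag(source_bytes),
                "codeHash": sha256_tag(code_bytes),
                "phase": phase,
                "tool": tool,
            }
        },
    }


envelope = wrap_with_provenance(
    "Decision: pgvector over pinecone for OSS-friendliness",
    source_bytes=b"# notes.md contents",
    code_bytes=b"def handler(): ...",
    phase="Phase-3-Slice-2",
    tool="forge_step3_execute_slice",
)
print(envelope["metadata"]["provenance"]["contentHash"])
```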

Capability negotiation

Before stamping provenance, Plan Forge checks Open Brain's /health response. The presence of "by-source" in capabilities[] means the server understands the Hallmark envelope and the by-source lookup path. Older servers get bare thoughts (no Hallmark) — everything degrades transparently.

curl https://openbrain.example.com/health
# {
#   "status": "healthy",
#   "service": "open-brain-api",
#   "capabilities": ["capture","search","list","batch","update","delete","stats","by-source"]
# }
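The negotiation reduces to a membership check on capabilities[]. A minimal Python sketch, assuming the /health JSON has already been fetched and parsed; the function names are hypothetical:

```python
def supports_by_source(health: dict) -> bool:
    # Older servers may omit capabilities entirely, so default to an empty list.
    return "by-source" in health.get("capabilities", [])


def build_payload(content: str, provenance: dict, health: dict) -> dict:
    payload = {"content": content}
    if supports_by_source(health):
        # Only stamp the Hallmark envelope when the server advertises support.
        payload["metadata"] = {"provenance": provenance}
    return payload


new_server = {"status": "healthy", "capabilities": ["capture", "search", "by-source"]}
old_server = {"status": "healthy"}
envelope = {"schemaVersion": "hallmark-provenance.v1", "contentHash": "sha256:abcd..."}

print(build_payload("Decision: use pgvector", envelope, new_server))
print(build_payload("Decision: use pgvector", envelope, old_server))
```

The second call prints a bare thought with no metadata key, which is the degraded path described above.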

Resilience: the OpenBrain queue

When Open Brain is unreachable, Plan Forge does not drop the thought. It writes it to .forge/openbrain-queue.jsonl (L2) and drains the queue back to Open Brain later via forge_anvil_dlq_drain. Failed writes that Open Brain rejects (e.g. a malformed Hallmark envelope) land in the Slag-Heap DLQ for inspection. This means Plan Forge can keep working offline, on a flight, or during a brief outage — no thought is lost.
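The queue-and-drain behavior can be sketched in Python as an append-only JSONL file that is replayed in order. This is a simplified model, not Plan Forge's code: the function names are hypothetical, ConnectionError stands in for "unreachable", and the Slag-Heap DLQ for rejected writes is not modeled.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def capture_or_queue(thought: dict, send: Callable[[dict], None], queue_path: Path) -> bool:
    """Try to send; on a connection failure, append the thought to the JSONL queue."""
    try:
        send(thought)
        return True
    except ConnectionError:
        with queue_path.open("a", encoding="utf-8") as handle:
            handle.write(json.dumps(thought) + "\n")
        return False


def drain_queue(send: Callable[[dict], None], queue_path: Path) -> int:
    """Replay queued thoughts in order; keep whatever still cannot be sent."""
    if not queue_path.exists():
        return 0
    lines = [ln for ln in queue_path.read_text(encoding="utf-8").splitlines() if ln.strip()]
    sent = 0
    for line in lines:
        try:
            send(json.loads(line))
        except ConnectionError:
            break
        sent += 1
    queue_path.write_text("".join(ln + "\n" for ln in lines[sent:]), encoding="utf-8")
    return sent


def offline_send(thought: dict) -> None:
    raise ConnectionError("Open Brain unreachable")


delivered: list[dict] = []
queue_file = Path(tempfile.mkdtemp()) / "openbrain-queue.jsonl"

queued = not capture_or_queue({"content": "Decision: keep the queue"}, offline_send, queue_file)
replayed = drain_queue(delivered.append, queue_file)
print(queued, replayed, delivered)
```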

Recommended Plan Forge tools

Tool	Purpose
`captureMemory()` SDK	L3 write — auto-wraps in Hallmark envelope
`forge_sync_memories`	Mirror L3 hot thoughts into Copilot's `.github/instructions/`
`forge_hallmark_show` · `forge_hallmark_verify`	Inspect / verify a thought's provenance envelope
`forge_anvil_dlq_drain`	Replay queued / DLQ writes against a live Open Brain

You don't need Plan Forge. Open Brain works standalone with any MCP client. Provenance is purely opt-in — bare thoughts (no Hallmark envelope) are still accepted, indexed and searchable. Plan Forge just gets you exact-source deduplication and traceability for free if you're already using it.

Full integration spec in Plan Forge · MEMORY-ARCHITECTURE.md and UNIFIED-SYSTEM-ARCHITECTURE.md.

Deployment — pick a path

Open Brain runs anywhere PostgreSQL + Node can run. There are five canonical paths, all using the same MCP tools and AI clients. Pick the one that matches how you want to use it.

Path	Best for	Time	Cost	Skill
🖥️ Docker Desktop dev box	Solo dev, Win/Mac laptop, fully local	~10 min	$0	Beginner
🐳 Docker Compose (Linux)	Headless server / NAS / Pi / VPS	~10 min	$0–5	Beginner
☁️ Cheap hosted	Always-on, accessible anywhere	~30 min	$0–5	Beginner
☸️ Kubernetes	Homelab / on-prem / privacy	~1 hour	$0 hw	Intermediate
🚀 Azure	Teams / production / managed	~20 min	~$15–20	Intermediate

Using AWS or GCP? Open Brain is provider-agnostic — only Azure ships ready-made IaC. See the AWS & GCP equivalents table for service mapping (Container Apps → ECS / Cloud Run, Azure PostgreSQL Flex → RDS / Cloud SQL, Azure OpenAI → Bedrock / Vertex AI).

🖥️ Docker Desktop dev box

The friendliest path for Windows / macOS laptops. ~10 minutes, $0, fully private. Recommended starting point for most people.

git clone https://github.com/srnichols/OpenBrain.git
cd OpenBrain

# Wizard (PowerShell on Windows, bash on Mac)
.\setup.ps1   # or:  ./setup.sh

# Verify
.\scripts\verify.ps1 http://localhost:8000   # or:  ./scripts/verify.sh http://localhost:8000

Ollama runs natively on your host; everything else lives in Docker. Full walkthrough in 11-DOCKER-DESKTOP-DEVBOX.md.

☁️ Cheap hosted (Supabase / Neon + Fly.io)

Always-on, accessible from any device, ~$0–5/month. The closest match to Nate B Jones' original Open Brain spirit — hosted Postgres + a tiny serverless MCP runtime.

Pick Postgres

Provider	Free tier	Best for
Supabase	500 MB, pauses when idle	Most people — nice UI, closest to Nate's original
Neon	0.5 GB, serverless, ~1 sec cold start	Devs who want git-style DB branches
Railway	$5 trial, then ~$5/mo	One-stop deploy with the MCP server
Render	1 GB free, expires after 90 days	Short-term testing

Pick MCP runtime

Recommended: Fly.io with the ready-made deploy/hosted/fly/fly.toml. Free tier covers a personal install with ~1 sec wake from suspend.

cd deploy/hosted/fly
fly launch --copy-config --no-deploy --name openbrain-<handle>
fly secrets set \
  DATABASE_URL='postgresql://...' \
  MCP_ACCESS_KEY="$(openssl rand -hex 32)" \
  EMBEDDER_PROVIDER=openrouter \
  OPENROUTER_API_KEY='sk-or-...' \
  EMBEDDING_DIMENSIONS=1536
fly deploy
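If openssl is not available, an equivalent 64-character hex key can be generated with Python's standard library. The helper names below are illustrative; the validity check only mirrors the "64-char hex" shape mentioned in the Authentication section.

```python
import secrets
import string


def generate_access_key() -> str:
    # 32 random bytes rendered as 64 hex characters, like `openssl rand -hex 32`.
    return secrets.token_hex(32)


def looks_like_access_key(value: str) -> bool:
    return len(value) == 64 and all(ch in string.hexdigits for ch in value)


key = generate_access_key()
print(key)
```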

Render and Railway templates are also included. Full walkthrough in 12-HOSTED-CHEAP.md.

🐳 Docker Compose (Linux servers)

For headless Linux hosts — a NAS, Raspberry Pi, VPS, or any always-on server. Same compose stack the dev box uses, just without Docker Desktop's host helpers.

# Clone the repo
git clone https://github.com/srnichols/OpenBrain.git
cd OpenBrain

# Configure
cp .env.example .env
# Edit .env with your settings (MCP_ACCESS_KEY, embedder, etc.)

# Start everything
docker compose up -d

# Verify (universal smoke test)
./scripts/verify.sh http://localhost:8000

The Docker Compose stack includes PostgreSQL with pgvector pre-configured and the Open Brain API + MCP server. For embeddings it uses Ollama on the host, OpenRouter, or Azure OpenAI, depending on your .env.

☸️ Kubernetes Deployment

Full homelab deployment with Tailscale networking, MetalLB, Ollama GPU, and monitoring integration.

Stack

Component	Technology
Container Runtime	K8s + containerd
Database	PostgreSQL + pgvector (StatefulSet)
Embeddings	Ollama (nomic-embed-text, local GPU)
Networking	MetalLB (LAN) + Tailscale (VPN + Funnel)
Monitoring	Prometheus + Grafana + Loki
Cost	$0/month (uses existing cluster resources)

Networking Options

Option	Access	URL Pattern
Tailscale MagicDNS	Tailnet only	`http://openbrain.your-tailnet.ts.net:8080`
MetalLB	LAN only	`http://192.168.x.x:8080`
Tailscale Funnel	Public internet	`https://openbrain.your-tailnet.ts.net`
Cloudflare Tunnel	Public (custom domain)	`https://brain.yourdomain.com`

Important Notes

• Session Affinity — Required for multi-replica SSE. Set sessionAffinity: ClientIP on the ClusterIP service.
• Tailscale Funnel — The K8s Operator (v1.92.4) doesn't auto-configure Funnel serve. Manual tailscale funnel command needed inside the proxy pod.
• Auth — API key is checked on /sse only. /messages uses sessionId for implicit auth.
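The session-affinity requirement follows from SSE sessions living in one pod's memory. The toy Python model below shows the failure and the fix. It is a deliberately simplified stand-in: the Replica class and the hash-based routing are assumptions for illustration and do not reproduce how Kubernetes actually implements ClientIP affinity.

```python
import hashlib


class Replica:
    """One API pod; SSE sessions live only in this pod's memory."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sessions: set[str] = set()

    def open_sse(self, session_id: str) -> None:
        self.sessions.add(session_id)

    def post_message(self, session_id: str) -> str:
        if session_id not in self.sessions:
            return "No active session. Connect to /sse first."
        return "ok"


def route_by_client_ip(replicas: list[Replica], client_ip: str) -> Replica:
    # Simplified stand-in for ClientIP affinity: same IP, same replica.
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return replicas[digest[0] % len(replicas)]


replicas = [Replica("pod-a"), Replica("pod-b")]

# Without affinity: the SSE connection and the POST can land on different pods.
replicas[0].open_sse("session-1")
without_affinity = replicas[1].post_message("session-1")

# With affinity: both requests from one client IP reach the same pod.
route_by_client_ip(replicas, "192.168.1.50").open_sse("session-2")
with_affinity = route_by_client_ip(replicas, "192.168.1.50").post_message("session-2")

print(without_affinity)  # No active session. Connect to /sse first.
print(with_affinity)     # ok
```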

Full guide in docs/09-SELF-HOSTED-K8S.md.

🚀 Azure (managed)

One-command Bicep deploy: Azure Container Apps + Azure PostgreSQL Flexible Server + Azure OpenAI + Key Vault. Fully managed, scale-to-zero, ~$15–20/month for personal use.

git clone https://github.com/srnichols/OpenBrain.git
cd OpenBrain

.\deploy\azure\deploy.ps1 -ResourceGroup rg-openbrain -Location eastus2

The script generates secrets, deploys Bicep, seeds the database, and prints the MCP endpoint + key. Embeddings come from Azure OpenAI text-embedding-3-small (1536 dimensions).

AWS & GCP equivalents

Open Brain itself is provider-agnostic; only the Bicep is Azure-specific. The same architecture maps to:

Concern	Azure	AWS	GCP
Container runtime	Container Apps	ECS Fargate / App Runner	Cloud Run
Managed Postgres	PostgreSQL Flex (pgvector)	RDS / Aurora PostgreSQL	Cloud SQL / AlloyDB
Embeddings + LLM	Azure OpenAI	Amazon Bedrock	Vertex AI
Secrets	Key Vault	Secrets Manager	Secret Manager

Full mapping & cost-parity table in 10-AZURE-DEPLOYMENT.md → Equivalents. PRs welcome for AWS / GCP IaC.

Full guide in docs/10-AZURE-DEPLOYMENT.md.

🤖 AI-prompted setup (EASY-SETUP.md)

Don't want to run the commands yourself? Paste a prompt into any AI agent (VS Code Copilot, Claude Code, Cursor, Claude Desktop with terminal) and it'll do the install for you.

EASY-SETUP.md has one prompt per deployment path, plus a "Help me decide" prompt that asks 3 questions and routes you to the right one:

• 🖥️ Docker Desktop dev box prompt
• 🐳 Docker Compose (Linux) prompt
• ☁️ Cheap hosted (Fly + Supabase) prompt
• ☸️ Kubernetes prompt
• 🚀 Azure prompt
• 🤔 Help me decide prompt

Full prompts in EASY-SETUP.md. The repo also ships an AGENTS.md so any AI agent helping a user follows the canonical install procedure.

✅ verify script — universal smoke test

A single script that confirms any Open Brain deployment is healthy. It captures a marker thought, semantically searches for it, then deletes it. An exit code of 0 means every step passed.

# Linux / macOS
./scripts/verify.sh http://localhost:8000

# Windows
.\scripts\verify.ps1 http://localhost:8000

# Hosted
./scripts/verify.sh https://openbrain-<handle>.fly.dev
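The capture, search, delete round trip can be sketched in Python as follows. This is not the contents of verify.sh: InMemoryBrain is a hypothetical stand-in for a real deployment, and its substring match only imitates semantic search so the sketch runs without a server.

```python
import uuid


class InMemoryBrain:
    """Hypothetical stand-in for a deployment, used only to exercise the sketch."""

    def __init__(self) -> None:
        self.rows: dict[str, str] = {}

    def capture(self, content: str) -> str:
        thought_id = str(uuid.uuid4())
        self.rows[thought_id] = content
        return thought_id

    def search(self, query: str) -> list[str]:
        # Substring match stands in for semantic search.
        return [tid for tid, content in self.rows.items() if query in content]

    def delete(self, thought_id: str) -> None:
        self.rows.pop(thought_id, None)


def smoke_test(brain: InMemoryBrain) -> int:
    # A unique marker keeps the probe from matching real thoughts.
    marker = f"verify-marker-{uuid.uuid4().hex}"
    thought_id = brain.capture(f"Smoke test thought {marker}")
    try:
        found = thought_id in brain.search(marker)
    finally:
        # Always clean up so the probe leaves no trace.
        brain.delete(thought_id)
    return 0 if found else 1


brain = InMemoryBrain()
exit_code = smoke_test(brain)
print(exit_code)  # 0
```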

If anything fails, see docs/TROUBLESHOOTING.md — a cross-cutting troubleshooting guide organized by symptom (API won't start, capture / embedder failures, search returns nothing, client config, Azure-specific, hosted-specific).

⏱️ Your first hour

Server's running. Now what? 13-FIRST-HOUR.md walks through the first 60 minutes of actually using Open Brain:

• Minutes 0–5 — confirm it works with the verify script + a thought_stats probe
• Minutes 5–20 — your first 5 captures (small, varied, real)
• Minutes 20–40 — search the way you'd actually think; semantic recall in practice
• Minutes 40–60 — bake it into your real workflow (after-bug captures, project-start captures, cross-tool handoffs)

The piece that turns "it's running" into "I'm using it."

Implementation Roadmap

Four phases, each independently functional. Total time: 2-4 hours.

Phase 1

Foundation (45 min)

Database → Edge functions → MCP server → CLI test

Phase 2

Capture Pipeline (30 min)

Slack app → Ingest webhook → Confirmation replies

Phase 3

Knowledge Migration (30-60 min)

Memory migration from AI platforms + Second brain migration (Notion, Obsidian)

Phase 4

Optimization & Habits (30 min)

Multi-client setup + Daily habits + Weekly review

Success Criteria

📅 Week 1 — Deployed, 20+ thoughts captured

📅 Month 1 — 100+ thoughts, 2+ weekly reviews, established daily habit

📅 Month 3 — Full compounding, knowledge graph effect, AI tools feel contextual

Client Configuration

Copy-paste configs for every supported AI client.

Claude Desktop

Requires mcp-remote bridge (Claude Desktop doesn't support SSE natively).

File: %APPDATA%\Claude\claude_desktop_config.json (Windows) or ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)

{
  "mcpServers": {
    "openbrain": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://YOUR_HOST/sse?key=YOUR_KEY"]
    }
  }
}

Claude Code / VS Code Copilot

File: ~/.claude/settings.json

{
  "mcpServers": {
    "openbrain": {
      "type": "sse",
      "url": "http://YOUR_HOST:8080/sse?key=YOUR_KEY"
    }
  }
}

Cursor

File: .cursor/mcp.json

{
  "mcpServers": {
    "openbrain": {
      "url": "http://YOUR_HOST:8080/sse?key=YOUR_KEY",
      "transport": "sse"
    }
  }
}

ChatGPT

Enable Developer Mode, then add an MCP connector with your Funnel or other public URL. Set auth to "none" because the key is passed in the URL.

REST API

Every MCP tool has a REST equivalent on port 8000. Useful for scripts, testing, and non-MCP integrations.

Method	Endpoint	Purpose
`GET`	/health	Health check — response includes `capabilities[]`. Presence of `"by-source"` signals provenance / Hallmark support (v0.7.0+)
`POST`	/memories	Capture a thought (validates `metadata.provenance` when present, v0.7.0+)
`POST`	/memories/search	Semantic search
`POST`	/memories/list	Filtered listing
`POST`	/memories/batch	Transactional batch capture (all-or-nothing: the whole batch is rejected if any item has bad provenance)
`GET`	/memories/by-source (v0.7.0+)	Look up thoughts by source/content hash
`GET`	/stats	Aggregate statistics

Example: Capture

curl -X POST http://localhost:8000/memories \
  -H "Content-Type: application/json" \
  -d '{"content": "Decision: Using pgvector for embeddings"}'

Example: Search

curl -X POST http://localhost:8000/memories/search \
  -H "Content-Type: application/json" \
  -d '{"query": "database decisions", "limit": 5}'
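For scripts, the same search request can be built with Python's standard library. The sketch below only constructs the request, mirroring the curl example; build_search_request is a hypothetical helper, and the response shape is not documented here, so it is not parsed.

```python
import json
import urllib.request


def build_search_request(base_url: str, query: str, limit: int = 5) -> urllib.request.Request:
    # Same method, path, header, and JSON body as the curl example.
    body = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/memories/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("http://localhost:8000", "database decisions")
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen sends it to a running server.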

Example: Capability Probe v0.7.0+

Single round trip — clients learn which optional features the server speaks:

curl http://localhost:8000/health
# { "status": "healthy", "service": "open-brain-api",
#   "capabilities": ["capture","search","list","batch","update","delete","stats","by-source"] }

Example: Capture with Provenance v0.7.0+

curl -X POST http://localhost:8000/memories \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Phase-PROVENANCE Slice 3 shipped: by-source RPC live",
    "metadata": {
      "provenance": {
        "schemaVersion": "hallmark/v1",
        "toolName": "forge_memory_capture",
        "capturedAt": "2026-05-16T20:00:00Z",
        "contentHash": "sha256:abcd...",
        "codeHash":    "sha256:ef01..."
      }
    }
  }'

Example: Look up by Source Hash v0.7.0+

curl 'http://localhost:8000/memories/by-source?hash=sha256:abcd...&limit=10'
# Returns thoughts whose metadata.provenance.contentHash matches,
# ordered by created_at DESC. limit defaults to 25, max 100.
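When building this URL in code, the hash should be URL-encoded and the limit kept within the documented range. A small Python sketch follows. The client-side clamping and the lower bound of 1 are assumptions; the server's behavior for out-of-range values is not stated here.

```python
from urllib.parse import urlencode

DEFAULT_LIMIT = 25  # documented default
MAX_LIMIT = 100     # documented maximum


def by_source_url(base_url: str, content_hash: str, limit: int = DEFAULT_LIMIT) -> str:
    # Assumption: clamp on the client rather than rely on server-side handling.
    clamped = max(1, min(limit, MAX_LIMIT))
    query = urlencode({"hash": content_hash, "limit": clamped})
    return f"{base_url.rstrip('/')}/memories/by-source?{query}"


print(by_source_url("http://localhost:8000", "sha256:abcd", limit=500))
# http://localhost:8000/memories/by-source?hash=sha256%3Aabcd&limit=100
```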

Troubleshooting

Start here: run ./scripts/verify.sh <your-api-url> first — it pinpoints which layer is broken. Then jump to docs/TROUBLESHOOTING.md for the full cross-cutting guide organized by symptom (API won't start, capture / embedder failures, search returns nothing, client config, Azure-specific, hosted-specific).

Common gotchas

"No active session. Connect to /sse first."

Cause: SSE connection and /messages POST hitting different pods (multi-replica without session affinity).

Fix: kubectl patch svc openbrain-api -n openbrain -p '{"spec":{"sessionAffinity":"ClientIP"}}'

mcp-remote ServerError / OAuth errors

Cause: /messages endpoint returning 401, triggering mcp-remote's OAuth flow.

Fix: Auth must only be enforced on /sse, not on /messages. The sessionId proves authentication.

Claude Desktop doesn't show OpenBrain tools

Check: Verify %APPDATA%\Claude\claude_desktop_config.json has the mcpServers entry. Claude Desktop may overwrite the file on launch.

Fix: Fully quit (system tray → Quit) and relaunch. Check logs at %APPDATA%\Claude\logs\mcp-server-openbrain.log.

Search returns no results

Check: Run thought_stats. Fewer than 20-30 entries means the data is sparse, not that search is broken.

Fix: Lower similarity threshold (try 0.3 instead of 0.5). Test with exact captured terminology.

Tailscale Funnel not serving

Cause: K8s Operator v1.92.4 doesn't auto-configure Funnel serve from the annotation.

Fix: Run tailscale funnel --bg --https=443 http://openbrain-api.openbrain.svc.cluster.local:8080 inside the proxy pod. Re-run after pod restarts.