Documentation

Complete reference for the Knol REST API, Python SDK, TypeScript SDK, and framework integrations.

Quick Start

Setup

# Option 1: Docker Compose (recommended)
docker compose up -d

# Option 2: pip install
pip install knol

# Option 3: npm install
npm install @knol-dev/sdk

# Option 4: Build from source
cd knol-oss
cargo build --workspace --release

knol-local — Local MCP & CLI

A lightweight, standalone memory server backed by SQLite. No Docker, no PostgreSQL, no API key required. Works with Claude Desktop, Cursor, Windsurf, and Claude Code out of the box.

Installation

Global install

npm install -g knol-local

The postinstall script automatically patches any existing Claude Desktop or Cursor config files. Re-run node $(npm root -g)/knol-local/setup.mjs at any time to re-apply.

Manual Client Setup

Setup commands

knol-local setup claude       # Claude Desktop (creates config if missing)
knol-local setup cursor       # Cursor (~/.cursor/mcp.json)
knol-local setup claude-code  # Claude Code CLI (claude mcp add)
knol-local setup codex        # Codex — shows HTTP API instructions
knol-local setup              # Auto-detect and configure all found clients

For Claude Code you can also add it per-project in .claude/settings.json:

.claude/settings.json

{ "mcpServers": { "knol-local": { "command": "knol-local" } } }
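If the settings file already holds other keys, the entry above has to be merged in rather than pasted over the file. A minimal sketch of that merge in Python; the helper name `add_knol_local` is hypothetical and is not part of knol-local, which ships its own `knol-local setup` commands for this purpose.

```python
import json
from pathlib import Path


def add_knol_local(settings_path: Path) -> dict:
    """Merge the knol-local MCP entry into a JSON settings file.

    Hypothetical helper for illustration only. Existing keys and any
    other MCP servers already listed in the file are preserved.
    """
    settings = {}
    if settings_path.exists():
        text = settings_path.read_text().strip()
        if text:
            settings = json.loads(text)

    # Entry copied from the documented .claude/settings.json snippet.
    servers = settings.setdefault("mcpServers", {})
    servers["knol-local"] = {"command": "knol-local"}

    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings


# Usage (writes to the current project):
#   add_knol_local(Path(".claude/settings.json"))
```

Running the helper twice leaves the file unchanged after the first run, so it is safe to re-apply.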

MCP Tools

Once connected, Claude (or any MCP client) can call these tools:

remember

Store a memory — accepts content, optional tags, and an importance score (0–1).

recall

Full-text search across memories. Returns ranked results with relevance scores.

forget

Delete a memory by ID.

list_memories

List recent memories with optional tag filters and a limit.

update_memory

Update the content, tags, or importance of an existing memory.

memory_stats

Return the total memory count and the timestamps of the oldest and newest memories.
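MCP clients invoke tools through JSON-RPC 2.0 `tools/call` requests. The sketch below builds such requests for the tools listed above. The argument key names (`content`, `tags`, `importance`, `query`, `limit`) are assumptions inferred from the descriptions; the schemas the server advertises through `tools/list` are authoritative.

```python
import json


def tool_call(request_id: int, name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request for an MCP server."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Assumed argument names, see the note above.
remember_request = tool_call(1, "remember", {
    "content": "Prefer strict TypeScript and functional patterns",
    "tags": ["coding"],
    "importance": 0.8,  # documented range is 0 to 1
})
recall_request = tool_call(2, "recall", {"query": "TypeScript preferences", "limit": 5})
```

In practice the MCP client (Claude Desktop, Cursor, and so on) produces these messages itself; the sketch only shows what travels over the wire.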

CLI Commands

knol-local CLI

# Add a memory
knol-local add "Prefer strict TypeScript and functional patterns" --tag coding

# Search memories
knol-local search "TypeScript preferences" --limit 5

# List all memories
knol-local list --limit 20 --tag coding

# Summary statistics
knol-local stats

# Export / import
knol-local export --out backup.json
knol-local import backup.json

# Backup / restore the SQLite database
knol-local backup --out ~/backups/
knol-local restore ~/backups/memories-2026-05-09.db

# Start the HTTP REST API (useful for Codex)
knol-local serve --port 3001 --key my-secret

HTTP REST API

Start the server with knol-local serve to expose a local REST API — useful for Codex or any tool without native MCP support.

Method	Path	Description
GET	/memories	List memories (supports ?tag=&limit=)
POST	/memories	Add a memory { content, tags?, importance? }
GET	/memories/search	Full-text search (?q=&limit=&tag=)
DELETE	/memories/:id	Delete a memory by ID
GET	/export	Export all memories as JSON
POST	/import	Import memories from JSON
GET	/health	Health check
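The endpoints above can be called from any HTTP client. A minimal sketch using only the Python standard library follows. It assumes the server was started with `knol-local serve --port 3001 --key my-secret` and that the key is sent as a Bearer token; the source does not say how the key is transmitted, so check the knol-local documentation before relying on that header.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:3001"  # port taken from the `serve` example


def _headers(key=None):
    headers = {"Content-Type": "application/json"}
    if key:
        # Assumption: the --key value is presented as a Bearer token.
        headers["Authorization"] = f"Bearer {key}"
    return headers


def add_memory_request(content, tags=None, importance=None, key=None):
    """Build POST /memories with body { content, tags?, importance? }."""
    body = {"content": content}
    if tags is not None:
        body["tags"] = tags
    if importance is not None:
        body["importance"] = importance
    return urllib.request.Request(
        f"{BASE_URL}/memories",
        data=json.dumps(body).encode("utf-8"),
        headers=_headers(key),
        method="POST",
    )


def search_request(q, limit=5, tag=None, key=None):
    """Build GET /memories/search?q=&limit=&tag=."""
    params = {"q": q, "limit": limit}
    if tag:
        params["tag"] = tag
    url = f"{BASE_URL}/memories/search?{urllib.parse.urlencode(params)}"
    return urllib.request.Request(url, headers=_headers(key), method="GET")


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage, with the server running:
#   send(add_memory_request("Prefer strict TypeScript", tags=["coding"], key="my-secret"))
#   send(search_request("TypeScript preferences", key="my-secret"))
```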

Environment & Data

Variable	Default	Description
KNOL_LOCAL_DB	~/.knol-local/memories.db	Path to the SQLite database file

Node 22.5+ uses the built-in node:sqlite module, so no extra dependencies are needed. On older Node versions (for example, the Node 18 runtime that Claude Desktop embeds), the postinstall script installs better-sqlite3 automatically.


Authentication

All API requests require an API key in the Authorization header. Keys are hashed with SHA-256 before they are stored, so the plaintext key is not kept. Create keys via the admin dashboard or the admin API.

Authorization: Bearer YOUR_API_KEY
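To illustrate what storing hashed keys means in practice, here is a minimal sketch of hash-and-compare key checking. It is not Knol's implementation (the gateway is a Rust service); the function names are hypothetical, and only the use of SHA-256 and the Bearer scheme comes from the text above.

```python
import hashlib
import hmac
import secrets
from typing import Optional


def generate_api_key() -> str:
    """Create a random key; shown to the user once, never stored."""
    return secrets.token_urlsafe(32)


def hash_api_key(api_key: str) -> str:
    """Return the hex SHA-256 digest that would be stored."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def parse_bearer(header_value: str) -> Optional[str]:
    """Extract the token from an `Authorization: Bearer ...` value."""
    scheme, _, token = header_value.partition(" ")
    token = token.strip()
    if scheme.lower() == "bearer" and token:
        return token
    return None


def verify_api_key(header_value: str, stored_hash: str) -> bool:
    """Hash the presented key and compare in constant time."""
    token = parse_bearer(header_value)
    if token is None:
        return False
    presented = hash_api_key(token).encode("ascii")
    return hmac.compare_digest(presented, stored_hash.encode("ascii"))
```

Because only the digest is stored, a lost key cannot be recovered from the database; it has to be revoked and reissued.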

API Endpoints

Method	Path	Description
POST	/v1/memory	Store a new memory (async extraction)
POST	/v1/memory/batch	Store multiple memories in one call
POST	/v1/memory/search	Hybrid search (vector + BM25 + graph)
GET	/v1/memory/:id	Get a specific memory by ID
PUT	/v1/memory/:id	Update a memory
DELETE	/v1/memory/:id	Delete a memory
GET	/v1/graph/entities	List entities in knowledge graph
GET	/v1/graph/entities/:id/edges	Get entity relationships (N-hop)
POST	/v1/memory/export	Export memories (JSON)
POST	/v1/memory/import	Import memories (JSON)
GET	/v1/admin/memories	List all memories (admin)
GET	/v1/webhooks	List webhooks
POST	/v1/webhooks	Create a webhook
DELETE	/v1/webhooks/:id	Delete a webhook
GET	/v1/admin/audit	Browse audit log
GET	/health	Health check
GET	/metrics	Prometheus metrics

Store a Memory

Memories are accepted instantly and processed asynchronously. The pipeline extracts entities, generates embeddings, detects conflicts, and fires webhook events — all in the background.

POST /v1/memory

curl -X POST http://localhost:3000/v1/memory \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "User prefers dark mode and concise responses",
    "user_id": "user-123",
    "metadata": {
      "source": "settings",
      "session_id": "sess-456"
    }
  }'

# Response
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "accepted",
  "message": "Memory queued for processing"
}

Search Memories

The search endpoint uses adaptive hybrid retrieval. It classifies the query intent (preference, temporal, relational, or general), then fuses vector, BM25, and graph signals with Reciprocal Rank Fusion (RRF). The response reports the detected intent and the chosen strategy in query_intent and retrieval_strategy.

POST /v1/memory/search

curl -X POST http://localhost:3000/v1/memory/search \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What does the user prefer?",
    "user_id": "user-123",
    "limit": 5,
    "memory_types": ["semantic", "episodic"]
  }'

# Response — hybrid retrieval fuses vector + BM25 + graph
{
  "results": [
    {
      "id": "550e8400-...",
      "content": "User prefers dark mode and concise responses",
      "memory_type": "semantic",
      "score": 0.94,
      "created_at": "2026-02-15T10:30:00Z"
    }
  ],
  "query_intent": "preference",
  "retrieval_strategy": "vector_primary"
}
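Reciprocal Rank Fusion combines several ranked lists by summing `1 / (k + rank)` for each item across the lists. The sketch below shows the textbook form of the technique. The constant `k = 60` is the value commonly used in the literature, not a documented Knol setting, and the memory IDs are made up for the example; how Knol weights each signal per intent is not described here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of IDs into one list of (id, score) pairs.

    Each list contributes 1 / (k + rank) per item, with ranks starting
    at 1. Items that rank well in several lists rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical per-signal rankings for one query.
vector_hits = ["m1", "m2", "m3"]
bm25_hits = ["m2", "m1", "m4"]
graph_hits = ["m2", "m5"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits, graph_hits])
# "m2" comes first: it is the only ID present in all three lists.
```

RRF works on ranks rather than raw scores, so vector similarities and BM25 scores never need to be normalized onto a common scale.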

Python SDK

Install with pip install knol. Both sync and async clients are included.

Sync Client

from knol import KnolClient

client = KnolClient(
    base_url="http://localhost:3000",
    api_key="your-api-key"
)

# Store — extraction & embedding happen automatically
memory = client.add(
    content="User prefers dark mode and concise responses",
    user_id="user-123",
    metadata={"source": "settings"}
)

# Search — hybrid retrieval (vector + BM25 + graph)
results = client.search(
    query="user preferences",
    user_id="user-123",
    limit=5
)

# Knowledge graph
entities = client.list_entities(user_id="user-123")

# CRUD
memory = client.get(memory_id="550e8400-...")
client.update(memory_id="550e8400-...", content="Updated content")
client.delete(memory_id="550e8400-...")

Async Client

from knol import AsyncKnolClient
import asyncio

async def main():
    client = AsyncKnolClient(
        base_url="http://localhost:3000",
        api_key="your-api-key"
    )

    # Async store + search
    await client.add(
        content="User prefers TypeScript and functional patterns",
        user_id="user-456"
    )

    results = await client.search(
        query="programming preferences",
        user_id="user-456"
    )

asyncio.run(main())

TypeScript SDK

Install with npm install @knol-dev/sdk. Fully typed with TypeScript generics.


import { KnolClient } from '@knol-dev/sdk';

const knol = new KnolClient({
  baseUrl: 'http://localhost:3000',
  apiKey: 'your-api-key',
});

// Store
await knol.add({
  content: 'User prefers TypeScript and functional patterns',
  userId: 'user-123',
});

// Search
const results = await knol.search({
  query: 'programming preferences',
  userId: 'user-123',
});

// Knowledge graph
const entities = await knol.listEntities({ userId: 'user-123' });

Framework Integrations

Drop-in memory backends for popular AI agent frameworks.

LangChain

from knol.langchain import KnolMemory
from langchain.agents import AgentExecutor

memory = KnolMemory(
    base_url="http://localhost:3000",
    api_key="your-api-key",
    user_id="user-123"
)

# Use as LangChain memory backend
agent = AgentExecutor(
    ...,
    memory=memory
)

CrewAI

from knol.crewai import KnolMemory
from crewai import Crew

memory = KnolMemory(
    base_url="http://localhost:3000",
    api_key="your-api-key"
)

crew = Crew(
    agents=[...],
    tasks=[...],
    memory=memory
)

Error Codes

All errors are returned as JSON with a consistent format. The response body always contains an error field with a descriptive message.

Status	Code	Description
400	Bad Request	Validation error (malformed JSON, missing required fields)
401	Unauthorized	Missing or invalid API key
402	Payment Required	Plan limit exceeded
403	Forbidden	API key role insufficient for this operation
404	Not Found	Resource does not exist
429	Too Many Requests	Rate limit exceeded
500	Internal Server Error	Unexpected server error

Error Response Format

{
  "error": "Missing required field: content"
}
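Since every error body carries an `error` field, a client can turn failures into a single exception type. A minimal sketch follows; `KnolAPIError` and `raise_for_error` are hypothetical names and not part of the Knol SDKs, which may expose their own exception classes.

```python
import json


class KnolAPIError(Exception):
    """Hypothetical exception wrapping a Knol error response."""

    def __init__(self, status: int, message: str):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message

    @property
    def retryable(self) -> bool:
        # Design choice for this sketch: retry only rate limits and
        # server errors; 4xx validation and auth errors will not
        # succeed on a second attempt.
        return self.status == 429 or self.status >= 500


def raise_for_error(status: int, body: bytes) -> None:
    """Raise KnolAPIError for 4xx/5xx responses, else do nothing."""
    if status < 400:
        return
    try:
        message = json.loads(body).get("error", "")
    except (ValueError, AttributeError):
        message = ""  # body was not the documented JSON object
    raise KnolAPIError(status, message or "unknown error")
```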

Rate Limiting

Rate limits are enforced per-tenant at the gateway level. When you exceed the limit, the API returns HTTP 429 with an error response.

Plan	Requests/Minute
Free	10
Developer	100
Pro	500
Team	2,000
Enterprise	10,000

Rate Limit Exceeded Response

{
  "error": "Rate limit exceeded. Maximum 10 requests per minute on Free plan"
}
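A client that can hit the limit should back off and retry rather than fail outright. The sketch below retries on HTTP 429 with exponential backoff and jitter. The delays are illustrative assumptions; the source does not say whether the API returns a Retry-After header, so none is consulted here.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status.

    `send` is any zero-argument callable returning (status, body).
    The wait doubles after each 429, capped at max_delay, plus random
    jitter so that many clients do not retry in lockstep.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, body
        delay = min(max_delay, base_delay * 2 ** attempt)
        sleep(delay + random.uniform(0, delay / 2))
```

Because limits are counted per minute, a long burst may still need waits approaching a full minute; raise `max_delay` accordingly for batch jobs.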

Webhooks

Webhooks allow you to receive real-time events from Knol. Events are sent as HTTP POST requests to your configured URL with HMAC-SHA256 signatures for verification.

Webhook Management

GET /v1/webhooks — List all webhooks for your tenant
POST /v1/webhooks — Create a new webhook subscription
DELETE /v1/webhooks/:id — Remove a webhook

Event Types

memory.created — New memory stored and accepted
entity.created — New entity extracted and added to graph
edge.created — New relationship discovered between entities
conflict.detected — Memory conflict identified and flagged for review

Signature Verification

Each webhook request includes an X-Webhook-Signature header containing an HMAC-SHA256 signature of the request body using your webhook secret. Verify this signature to confirm the request came from Knol.
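A receiver can check the signature by recomputing the HMAC over the raw request body and comparing it in constant time. The sketch below assumes the header carries a lowercase hex digest with no prefix; the encoding is not specified above, so adjust it to whatever Knol actually sends.

```python
import hashlib
import hmac


def sign_body(secret: str, body: bytes) -> str:
    """Compute the hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Check an X-Webhook-Signature value against the body.

    Assumption: the header is a bare hex digest. Use the body bytes
    exactly as received; re-serializing parsed JSON changes the digest.
    """
    expected = sign_body(secret, body).encode("ascii")
    received = signature_header.strip().encode("utf-8")
    return hmac.compare_digest(expected, received)
```

Reject the request (for example with HTTP 401) when `verify_signature` returns False, before acting on the payload.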

POST /v1/webhooks

curl -X POST http://localhost:3000/v1/webhooks \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-app.com/webhook",
    "events": ["memory.created", "conflict.detected"],
    "active": true
  }'

# Response
{
  "id": "webhook-550e8400-e29b",
  "url": "https://your-app.com/webhook",
  "events": ["memory.created", "conflict.detected"],
  "active": true,
  "created_at": "2026-02-15T10:30:00Z"
}

Batch Operations

Store multiple memories in a single request. Batches are processed asynchronously, and the call returns immediately with the accepted IDs. Prefer this endpoint over issuing many individual POST /v1/memory requests.

POST /v1/memory/batch

curl -X POST http://localhost:3000/v1/memory/batch \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "memories": [
      {
        "content": "User prefers dark mode",
        "user_id": "user-123",
        "metadata": {"source": "settings"}
      },
      {
        "content": "User knows TypeScript and Rust",
        "user_id": "user-123",
        "metadata": {"source": "profile"}
      }
    ]
  }'

# Response
{
  "inserted": 2,
  "ids": ["550e8400-e29b-41d4-...", "550e8400-e29c-41d4-..."],
  "status": "accepted"
}
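Large imports are usually split into several batch requests. The sketch below chunks a list of memories into request bodies of the shape shown above. The `batch_size` of 100 is a hypothetical value; the source does not state a maximum batch size.

```python
import json
from itertools import islice


def batch_payloads(memories, batch_size=100):
    """Yield JSON bodies for POST /v1/memory/batch, one per chunk.

    batch_size is an assumed value, not a documented limit.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(memories)
    while chunk := list(islice(iterator, batch_size)):
        yield json.dumps({"memories": chunk})


# Example input in the documented shape.
memories = [
    {"content": f"Fact {i}", "user_id": "user-123", "metadata": {"source": "import"}}
    for i in range(5)
]
payloads = list(batch_payloads(memories, batch_size=2))
```

Each payload can then be sent with the same headers as the curl example, ideally through a rate-limit-aware sender.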

Configuration

Configuration follows a three-tier hierarchy: database (system_config table) → environment variable → compiled default. Runtime changes via the admin API take effect without restarts.

Variable	Default	Description
DATABASE_URL	postgres://memory:memory_dev@localhost/memory	PostgreSQL + pgvector connection
REDIS_URL	redis://localhost:6379	Redis for rate limiting & caching
NATS_URL	nats://localhost:4222	NATS JetStream for async pipeline
LLM_PROVIDER	anthropic	LLM provider (anthropic, openai, gemini)
LLM_API_KEY	(required)	API key for LLM extraction
LLM_MODEL	claude-haiku	Model for entity extraction
JWT_SECRET	(required)	JWT signing key (min 32 chars)
GATEWAY_PORT	3000	Gateway listen port
WEBHOOK_ENABLED	true	Enable webhook event dispatch
RUST_LOG	info	Log level (trace, debug, info, warn, error)
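The three-tier lookup can be pictured as checking each source in turn and taking the first hit. The sketch below assumes the arrows in the hierarchy denote precedence, with the database value winning over the environment and the environment winning over the compiled default. The function and the in-memory `db_config` mapping are hypothetical stand-ins for the system_config table.

```python
import os

# A few defaults copied from the table above.
COMPILED_DEFAULTS = {
    "GATEWAY_PORT": "3000",
    "RUST_LOG": "info",
    "WEBHOOK_ENABLED": "true",
}


def resolve(name, db_config, env=None, defaults=COMPILED_DEFAULTS):
    """Return the first value found: database, then env, then default.

    Assumption: sources are listed highest precedence first.
    """
    env = os.environ if env is None else env
    for source in (db_config, env, defaults):
        if name in source:
            return source[name]
    # Mirrors variables marked "(required)", such as LLM_API_KEY.
    raise KeyError(f"{name} is required and has no default")


# Example: a runtime override stored in the database wins.
port = resolve("GATEWAY_PORT", db_config={"GATEWAY_PORT": "8080"}, env={})
```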

Architecture

Knol uses a microservice architecture with each concern isolated in its own Rust binary. All services share a single PostgreSQL database with pgvector for vector storage.

┌──────────────┐      ┌──────────────┐      ┌────────────────┐
│   Gateway    │─────▶│  Write Svc   │─────▶│  NATS Stream   │
│  auth/rate   │      │  episodes    │      │  extraction    │
│  /metrics    │      │  webhooks    │      │  embedding     │
└──────┬───────┘      └──────────────┘      └───────┬────────┘
       │                                            │
       │             ┌──────────────┐      ┌────────▼────────┐
       └────────────▶│ Retrieve Svc │      │   Graph Svc     │
                     │ vector+BM25  │      │  LLM extraction │
                     │  RRF fusion  │      │  conflict detect│
                     │  graph walk  │      │  embedding gen  │
                     └──────────────┘      └─────────────────┘
                            │
                     ┌──────▼──────┐
                     │ PostgreSQL  │
                     │  + pgvector │
                     └─────────────┘

Gateway: Auth, rate limiting, metrics
Write Service: Episodes, webhooks
NATS Stream: Extraction, embedding pipeline
Retrieve Service: Vector + BM25 + RRF + graph walk
Graph Service: LLM extraction, conflict detection
PostgreSQL + pgvector: Single database for all data
