Documentation
Complete reference for the Knol REST API, Python SDK, TypeScript SDK, and framework integrations.
Quick Start
# Option 1: Docker Compose (recommended)
docker compose up -d
# Option 2: pip install
pip install knol
# Option 3: npm install
npm install @knol-dev/sdk
# Option 4: Build from source
cd knol-oss
cargo build --workspace --release
knol-local: Local MCP & CLI
A lightweight, standalone memory server backed by SQLite. No Docker, no PostgreSQL, no API key required. Works with Claude Desktop, Cursor, Windsurf, and Claude Code out of the box.
Installation
npm install -g knol-local
The postinstall script automatically patches any existing Claude Desktop or Cursor config files. Re-run node $(npm root -g)/knol-local/setup.mjs at any time to re-apply.
Manual Client Setup
knol-local setup claude # Claude Desktop (creates config if missing)
knol-local setup cursor # Cursor (~/.cursor/mcp.json)
knol-local setup claude-code # Claude Code CLI (claude mcp add)
knol-local setup codex # Codex — shows HTTP API instructions
knol-local setup # Auto-detect and configure all found clients
For Claude Code you can also add it per-project in .claude/settings.json:
{ "mcpServers": { "knol-local": { "command": "knol-local" } } }
MCP Tools
Once connected, Claude (or any MCP client) can call these tools:
| Tool | Description |
|---|---|
| remember | Store a memory. Accepts content, optional tags, and an importance score (0–1). |
| recall | Full-text search across memories. Returns ranked results with relevance scores. |
| forget | Delete a memory by ID. |
| list_memories | List recent memories with optional tag filters and a limit. |
| update_memory | Update the content, tags, or importance of an existing memory. |
| memory_stats | Return total count, oldest, and newest memory timestamps. |
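To make the tool arguments concrete, here is a minimal sketch of the JSON-RPC message an MCP client would send to invoke remember. The tools/call envelope follows the MCP specification. The argument names content, tags, and importance are taken from the descriptions above, but their exact spelling in knol-local's schema is an assumption, and build_remember_call is a hypothetical helper, not part of knol-local.

```python
import json
from typing import Optional


def build_remember_call(content: str, tags: Optional[list] = None,
                        importance: Optional[float] = None,
                        request_id: int = 1) -> dict:
    """Build an MCP `tools/call` request for the `remember` tool."""
    # The docs give the importance range as 0-1, so reject anything outside it.
    if importance is not None and not 0.0 <= importance <= 1.0:
        raise ValueError("importance must be between 0 and 1")
    arguments = {"content": content}  # argument names are assumed
    if tags:
        arguments["tags"] = list(tags)
    if importance is not None:
        arguments["importance"] = importance
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "remember", "arguments": arguments},
    }


message = build_remember_call("Prefer strict TypeScript", tags=["coding"], importance=0.8)
print(json.dumps(message, indent=2))
```

In practice the MCP client (Claude Desktop, Cursor, and so on) builds this message for you; the sketch only shows what travels over the wire.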
CLI Commands
# Add a memory
knol-local add "Prefer strict TypeScript and functional patterns" --tag coding
# Search memories
knol-local search "TypeScript preferences" --limit 5
# List all memories
knol-local list --limit 20 --tag coding
# Summary statistics
knol-local stats
# Export / import
knol-local export --out backup.json
knol-local import backup.json
# Backup / restore the SQLite database
knol-local backup --out ~/backups/
knol-local restore ~/backups/memories-2026-05-09.db
# Start the HTTP REST API (useful for Codex)
knol-local serve --port 3001 --key my-secret
HTTP REST API
Start the server with knol-local serve to expose a local REST API — useful for Codex or any tool without native MCP support.
| Method | Path | Description |
|---|---|---|
| GET | /memories | List memories (supports ?tag=&limit=) |
| POST | /memories | Add a memory { content, tags?, importance? } |
| GET | /memories/search | Full-text search (?q=&limit=&tag=) |
| DELETE | /memories/:id | Delete a memory by ID |
| GET | /export | Export all memories as JSON |
| POST | /import | Import memories from JSON |
| GET | /health | Health check |
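The routes in this table can be exercised from any HTTP client. The following is a sketch using only the Python standard library. It assumes the server runs on port 3001 as in the serve example. How the --key secret must be presented is not documented here, so sending it as a bearer token is an assumption, and build_request is a hypothetical helper.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:3001"  # matches `knol-local serve --port 3001`


def build_request(method: str, path: str, params: Optional[dict] = None,
                  body: Optional[dict] = None,
                  key: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a request for the local REST API."""
    url = BASE_URL + path
    if params:
        url += "?" + urllib.parse.urlencode(params)
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    if key:
        # Assumption: the --key secret is sent as a bearer token.
        headers["Authorization"] = f"Bearer {key}"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# POST /memories with the documented body fields
add_req = build_request(
    "POST", "/memories",
    body={"content": "Prefer strict TypeScript", "tags": ["coding"]},
    key="my-secret",
)
# GET /memories/search?q=...&limit=...
search_req = build_request(
    "GET", "/memories/search",
    params={"q": "TypeScript preferences", "limit": 5},
)

# To send a request (needs a running server):
#   with urllib.request.urlopen(add_req) as resp:
#       print(json.load(resp))
```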
Environment & Data
| Variable | Default | Description |
|---|---|---|
| KNOL_LOCAL_DB | ~/.knol-local/memories.db | Path to the SQLite database file |
Node 22.5+ uses the built-in node:sqlite module with no extra dependencies. On older Node versions (for example, the Node 18 runtime embedded in Claude Desktop), the postinstall script automatically installs better-sqlite3 instead.
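If an external script needs to locate the same database file (for example, to copy it before an upgrade), it can mirror the lookup described in the table. This is a sketch: resolve_db_path is a hypothetical helper, and treating an empty KNOL_LOCAL_DB value as unset is an assumption.

```python
import os
from pathlib import Path

DEFAULT_DB = "~/.knol-local/memories.db"  # default from the table above


def resolve_db_path(env=None) -> Path:
    """Return the database path: KNOL_LOCAL_DB if set, else the default."""
    env = os.environ if env is None else env
    # Assumption: an empty value falls back to the default.
    raw = env.get("KNOL_LOCAL_DB") or DEFAULT_DB
    return Path(raw).expanduser()


print(resolve_db_path())
```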
Authentication
All API requests require an API key in the Authorization header. Keys are hashed with SHA-256 before they are stored. Create keys via the admin dashboard or admin API.
Authorization: Bearer YOUR_API_KEY
API Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v1/memory | Store a new memory (async extraction) |
| POST | /v1/memory/batch | Store multiple memories in one call |
| POST | /v1/memory/search | Hybrid search (vector + BM25 + graph) |
| GET | /v1/memory/:id | Get a specific memory by ID |
| PUT | /v1/memory/:id | Update a memory |
| DELETE | /v1/memory/:id | Delete a memory |
| GET | /v1/graph/entities | List entities in knowledge graph |
| GET | /v1/graph/entities/:id/edges | Get entity relationships (N-hop) |
| POST | /v1/memory/export | Export memories (JSON) |
| POST | /v1/memory/import | Import memories (JSON) |
| GET | /v1/admin/memories | List all memories (admin) |
| GET | /v1/webhooks | List webhooks |
| POST | /v1/webhooks | Create a webhook |
| DELETE | /v1/webhooks/:id | Delete a webhook |
| GET | /v1/admin/audit | Browse audit log |
| GET | /health | Health check |
| GET | /metrics | Prometheus metrics |
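Every /v1 route above goes through the bearer-key check described under Authentication. The sketch below illustrates the general technique of storing only a SHA-256 hash and comparing hashes on each request. It is not Knol's actual gateway code (which is written in Rust); the function names and the in-memory set of hashes are assumptions made for illustration.

```python
import hashlib
import hmac
from typing import Optional


def hash_key(api_key: str) -> str:
    """Hex-encoded SHA-256 of an API key, suitable for storage."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def extract_bearer(header_value: str) -> Optional[str]:
    """Pull the token out of an `Authorization: Bearer <token>` value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def is_authorized(header_value: str, stored_hashes: set) -> bool:
    """Hash the presented key and compare it with the stored hashes."""
    token = extract_bearer(header_value)
    if token is None:
        return False
    candidate = hash_key(token)
    # compare_digest avoids leaking how many leading characters matched.
    return any(hmac.compare_digest(candidate, h) for h in stored_hashes)


stored = {hash_key("YOUR_API_KEY")}  # stand-in for the key table
print(is_authorized("Bearer YOUR_API_KEY", stored))
```

Because only the hash is stored, a leaked key table does not directly reveal usable keys.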
Store a Memory
Memories are accepted instantly and processed asynchronously. The pipeline extracts entities, generates embeddings, detects conflicts, and fires webhook events — all in the background.
curl -X POST http://localhost:3000/v1/memory \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"content": "User prefers dark mode and concise responses",
"user_id": "user-123",
"metadata": {
"source": "settings",
"session_id": "sess-456"
}
}'
# Response
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"status": "accepted",
"message": "Memory queued for processing"
}
Search Memories
The search endpoint uses adaptive hybrid retrieval. It classifies query intent (preference, temporal, relational, or general) and fuses vector, BM25, and graph signals via Reciprocal Rank Fusion (RRF), which merges the ranked list from each retriever into a single ranking.
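For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal sketch of the standard formula, in which each item scores the sum of 1 / (k + rank) over every list it appears in. The constant k = 60 is the conventional value from the RRF literature; the value Knol uses, and any intent-based weighting it applies on top, are not specified here, so treat both as assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked ID lists (best first) into one ranking.

    Returns (id, score) pairs sorted by descending score, with ties
    broken by ID so the output is deterministic.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the three retrievers
vector_hits = ["m1", "m2", "m3"]
bm25_hits = ["m2", "m1", "m4"]
graph_hits = ["m2", "m5"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits, graph_hits])
for memory_id, score in fused:
    print(memory_id, round(score, 4))
```

RRF uses only rank positions, so it needs no score normalization across retrievers whose raw scores live on different scales.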
curl -X POST http://localhost:3000/v1/memory/search \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"query": "What does the user prefer?",
"user_id": "user-123",
"limit": 5,
"memory_types": ["semantic", "episodic"]
}'
# Response — hybrid retrieval fuses vector + BM25 + graph
{
"results": [
{
"id": "550e8400-...",
"content": "User prefers dark mode and concise responses",
"memory_type": "semantic",
"score": 0.94,
"created_at": "2026-02-15T10:30:00Z"
}
],
"query_intent": "preference",
"retrieval_strategy": "vector_primary"
}
Python SDK
Install with pip install knol. Both sync and async clients are included.
from knol import KnolClient
client = KnolClient(
    base_url="http://localhost:3000",
    api_key="your-api-key"
)
# Store: extraction & embedding happen automatically
memory = client.add(
    content="User prefers dark mode and concise responses",
    user_id="user-123",
    metadata={"source": "settings"}
)
# Search: hybrid retrieval (vector + BM25 + graph)
results = client.search(
    query="user preferences",
    user_id="user-123",
    limit=5
)
# Knowledge graph
entities = client.list_entities(user_id="user-123")
# CRUD
memory = client.get(memory_id="550e8400-...")
client.update(memory_id="550e8400-...", content="Updated content")
client.delete(memory_id="550e8400-...")

from knol import AsyncKnolClient
import asyncio
async def main():
    client = AsyncKnolClient(
        base_url="http://localhost:3000",
        api_key="your-api-key"
    )
    # Async store + search
    await client.add(
        content="User prefers TypeScript and functional patterns",
        user_id="user-456"
    )
    results = await client.search(
        query="programming preferences",
        user_id="user-456"
    )
asyncio.run(main())
TypeScript SDK
Install with npm install @knol-dev/sdk. Fully typed with TypeScript generics.
import { KnolClient } from '@knol-dev/sdk';
const knol = new KnolClient({
baseUrl: 'http://localhost:3000',
apiKey: 'your-api-key',
});
// Store
await knol.add({
content: 'User prefers TypeScript and functional patterns',
userId: 'user-123',
});
// Search
const results = await knol.search({
query: 'programming preferences',
userId: 'user-123',
});
// Knowledge graph
const entities = await knol.listEntities({ userId: 'user-123' });
Framework Integrations
Drop-in memory backends for popular AI agent frameworks.
from knol.langchain import KnolMemory
from langchain.agents import AgentExecutor
memory = KnolMemory(
base_url="http://localhost:3000",
api_key="your-api-key",
user_id="user-123"
)
# Use as LangChain memory backend
agent = AgentExecutor(
...,
memory=memory
)

from knol.crewai import KnolMemory
from crewai import Crew
memory = KnolMemory(
base_url="http://localhost:3000",
api_key="your-api-key"
)
crew = Crew(
agents=[...],
tasks=[...],
memory=memory
)
Error Codes
All errors are returned as JSON with a consistent format. The response body always contains an error field with a descriptive message.
| Status | Reason | Description |
|---|---|---|
| 400 | Bad Request | Validation error (malformed JSON, missing required fields) |
| 401 | Unauthorized | Missing or invalid API key |
| 402 | Payment Required | Plan limit exceeded |
| 403 | Forbidden | API key role insufficient for this operation |
| 404 | Not Found | Resource does not exist |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | Unexpected server error |
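Since every error body carries an error field, a client can turn any non-2xx response into one exception type. The sketch below shows one way to do that. KnolAPIError and parse_error are hypothetical names (the SDKs may expose their own error classes), and treating 429 and 500 as retryable is a design assumption rather than documented behavior.

```python
import json


class KnolAPIError(Exception):
    """Hypothetical client-side wrapper for an error response."""

    def __init__(self, status: int, message: str):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message

    @property
    def retryable(self) -> bool:
        # Assumption: only rate limits and server errors are worth retrying.
        return self.status in (429, 500)


def parse_error(status: int, body: bytes) -> KnolAPIError:
    """Read the `error` field, tolerating bodies that are not valid JSON."""
    try:
        payload = json.loads(body)
        message = payload.get("error", "") if isinstance(payload, dict) else ""
    except ValueError:
        message = ""
    return KnolAPIError(status, message or "unknown error")


err = parse_error(400, b'{"error": "Missing required field: content"}')
print(err, "| retryable:", err.retryable)
```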
{
"error": "Missing required field: content"
}
Rate Limiting
Rate limits are enforced per-tenant at the gateway level. When you exceed the limit, the API returns HTTP 429 with an error response.
| Plan | Requests/Minute |
|---|---|
| Free | 10 |
| Developer | 100 |
| Pro | 500 |
| Team | 2,000 |
| Enterprise | 10,000 |
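Clients that may hit these limits usually retry 429 responses with exponential backoff. The following is a generic sketch of that pattern, not SDK behavior: call_with_backoff is a hypothetical helper, the delay values are illustrative, and the docs do not say whether the API sends a Retry-After header, so none is read here.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a status other than 429.

    send() must return a (status, body) tuple. The wait doubles after
    each 429, plus random jitter so clients do not retry in lockstep.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    return status, body  # still rate limited after the final attempt


# Demo with a fake transport: one 429, then success. Delays are recorded
# instead of slept so the example runs instantly.
responses = iter([(429, {"error": "Rate limit exceeded"}), (200, {"results": []})])
delays = []
result = call_with_backoff(lambda: next(responses), sleep=delays.append)
print(result, delays)
```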
{
"error": "Rate limit exceeded. Maximum 10 requests per minute on Free plan"
}
Webhooks
Webhooks allow you to receive real-time events from Knol. Events are sent as HTTP POST requests to your configured URL with HMAC-SHA256 signatures for verification.
Webhook Management
| Method | Path | Description |
|---|---|---|
| GET | /v1/webhooks | List all webhooks for your tenant |
| POST | /v1/webhooks | Create a new webhook subscription |
| DELETE | /v1/webhooks/:id | Remove a webhook |
Event Types
| Event | Description |
|---|---|
| memory.created | New memory stored and accepted |
| entity.created | New entity extracted and added to graph |
| edge.created | New relationship discovered between entities |
| conflict.detected | Memory conflict identified and flagged for review |
Signature Verification
Each webhook request includes an X-Webhook-Signature header containing an HMAC-SHA256 signature of the request body using your webhook secret. Verify this signature to confirm the request came from Knol.
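A receiver can check the signature with the standard library alone. The sketch below assumes the header carries a bare hex-encoded digest of the raw request body. The encoding (hex versus base64) and any prefix such as sha256= are not specified above, so confirm them against a real delivery before relying on this.

```python
import hashlib
import hmac


def sign_body(secret: str, body: bytes) -> str:
    """HMAC-SHA256 of the raw body, hex-encoded (encoding is assumed)."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Compare the X-Webhook-Signature value with our own computation."""
    expected = sign_body(secret, body).encode("utf-8")
    received = signature_header.strip().encode("utf-8")
    # Constant-time comparison, so timing does not reveal partial matches.
    return hmac.compare_digest(expected, received)


# Simulated delivery: sign a body the way the sender would, then verify it.
raw_body = b'{"event": "memory.created"}'
header = sign_body("my-webhook-secret", raw_body)
print(verify_signature("my-webhook-secret", raw_body, header))
```

Always verify against the raw bytes as received; re-serializing parsed JSON can change whitespace or key order and invalidate the signature.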
curl -X POST http://localhost:3000/v1/webhooks \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"url": "https://your-app.com/webhook",
"events": ["memory.created", "conflict.detected"],
"active": true
}'
# Response
{
"id": "webhook-550e8400-e29b",
"url": "https://your-app.com/webhook",
"events": ["memory.created", "conflict.detected"],
"active": true,
"created_at": "2026-02-15T10:30:00Z"
}
Batch Operations
Store multiple memories in a single request. Batch operations are processed asynchronously and return immediately with accepted IDs. This is more efficient than multiple individual requests.
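When importing a large backlog, a client may want to split it into several batch requests. Here is a sketch of a chunking helper that produces payloads in the shape this endpoint accepts, an object with a memories array. The batch size of 100 is a hypothetical default: no maximum batch size is documented here.

```python
from typing import Iterator


def chunk_memories(memories: list, batch_size: int = 100) -> Iterator[dict]:
    """Yield request bodies for POST /v1/memory/batch, one per chunk."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(memories), batch_size):
        yield {"memories": memories[start:start + batch_size]}


backlog = [
    {"content": f"Fact number {i}", "user_id": "user-123"}
    for i in range(5)
]
payloads = list(chunk_memories(backlog, batch_size=2))
print([len(p["memories"]) for p in payloads])
```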
curl -X POST http://localhost:3000/v1/memory/batch \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"memories": [
{
"content": "User prefers dark mode",
"user_id": "user-123",
"metadata": {"source": "settings"}
},
{
"content": "User knows TypeScript and Rust",
"user_id": "user-123",
"metadata": {"source": "profile"}
}
]
}'
# Response
{
"inserted": 2,
"ids": ["550e8400-e29b-41d4-...", "550e8400-e29c-41d4-..."],
"status": "accepted"
}
Configuration
Configuration follows a three-tier hierarchy: database (system_config table) → environment variable → compiled default. Runtime changes via the admin API take effect without restarts.
| Variable | Default | Description |
|---|---|---|
| DATABASE_URL | postgres://memory:memory_dev@localhost/memory | PostgreSQL + pgvector connection |
| REDIS_URL | redis://localhost:6379 | Redis for rate limiting & caching |
| NATS_URL | nats://localhost:4222 | NATS JetStream for async pipeline |
| LLM_PROVIDER | anthropic | LLM provider (anthropic, openai, gemini) |
| LLM_API_KEY | (required) | API key for LLM extraction |
| LLM_MODEL | claude-haiku | Model for entity extraction |
| JWT_SECRET | (required) | JWT signing key (min 32 chars) |
| GATEWAY_PORT | 3000 | Gateway listen port |
| WEBHOOK_ENABLED | true | Enable webhook event dispatch |
| RUST_LOG | info | Log level (trace, debug, info, warn, error) |
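The three-tier lookup can be pictured as a chain of fallbacks. The sketch below reads the arrows in the hierarchy as precedence order, with the database winning over the environment and the environment winning over compiled defaults; that reading is an assumption. The dictionaries stand in for the system_config table and the compiled defaults, and resolve_setting is a hypothetical name, not Knol's actual implementation.

```python
import os

# A few defaults from the table above; required settings have none.
COMPILED_DEFAULTS = {
    "GATEWAY_PORT": "3000",
    "WEBHOOK_ENABLED": "true",
    "RUST_LOG": "info",
}


def resolve_setting(name, db_config, env=None, defaults=COMPILED_DEFAULTS):
    """Return the first value found: database, then environment, then default."""
    env = os.environ if env is None else env
    for source in (db_config, env, defaults):
        if name in source:
            return source[name]
    raise KeyError(f"{name} is required but not set in any tier")


db_rows = {"RUST_LOG": "debug"}      # stand-in for system_config rows
fake_env = {"GATEWAY_PORT": "4000"}  # stand-in for process environment

print(resolve_setting("RUST_LOG", db_rows, fake_env))         # database tier
print(resolve_setting("GATEWAY_PORT", db_rows, fake_env))     # environment tier
print(resolve_setting("WEBHOOK_ENABLED", db_rows, fake_env))  # compiled default
```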
Architecture
Knol uses a microservice architecture with each concern isolated in its own Rust binary. All services share a single PostgreSQL database with pgvector for vector storage.
| Component | Responsibility |
|---|---|
| Gateway | Auth, rate limiting, metrics |
| Write Service | Episodes, webhooks |
| NATS Stream | Extraction, embedding pipeline |
| Retrieve Service | Vector + BM25 + RRF + graph walk |
| Graph Service | LLM extraction, conflict detection |
| PostgreSQL + pgvector | Single database for all data |