Documentation

Complete reference for the Knol REST API, Python SDK, TypeScript SDK, and framework integrations.

Quick Start

Setup

# Option 1: Docker Compose (recommended)
docker compose up -d

# Option 2: pip install
pip install knol

# Option 3: npm install
npm install @knol-dev/sdk

# Option 4: Build from source
cd knol-oss
cargo build --workspace --release

Authentication

All API requests require an API key in the Authorization header. Keys are stored as SHA-256 hashes rather than in plaintext. Create keys via the admin dashboard or admin API.

Authorization: Bearer YOUR_API_KEY
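
As a minimal sketch of attaching this header from Python with only the standard library, the helper below builds an authenticated request without sending it. The function name build_request is illustrative and not part of the Knol SDK; the base URL and path are taken from the examples elsewhere on this page.

```python
import json
import urllib.request


def build_request(base_url: str, path: str, api_key: str, body: dict) -> urllib.request.Request:
    """Build a POST request carrying the Bearer token; nothing is sent here."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request(
    "http://localhost:3000",
    "/v1/memory/search",
    "YOUR_API_KEY",
    {"query": "user preferences", "user_id": "user-123"},
)
# urllib.request.urlopen(req) would send it to a running gateway.
```

Keeping construction separate from sending makes it easy to check the header in a unit test before any network call happens.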

API Endpoints

Method	Path	Description
POST	/v1/memory	Store a new memory (async extraction)
POST	/v1/memory/batch	Store multiple memories in one call
POST	/v1/memory/search	Hybrid search (vector + BM25 + graph)
GET	/v1/memory/:id	Get a specific memory by ID
PUT	/v1/memory/:id	Update a memory
DELETE	/v1/memory/:id	Delete a memory
GET	/v1/graph/entities	List entities in knowledge graph
GET	/v1/graph/entities/:id/edges	Get entity relationships (N-hop)
POST	/v1/memory/export	Export memories (JSON)
POST	/v1/memory/import	Import memories (JSON)
GET	/v1/admin/memories	List all memories (admin)
GET	/v1/webhooks	List webhooks
POST	/v1/webhooks	Create a webhook
DELETE	/v1/webhooks/:id	Delete a webhook
GET	/v1/admin/audit	Browse audit log
GET	/health	Health check
GET	/metrics	Prometheus metrics

Store a Memory

Memories are accepted immediately and processed asynchronously. The pipeline extracts entities, generates embeddings, detects conflicts, and fires webhook events in the background.

POST /v1/memory

curl -X POST http://localhost:3000/v1/memory \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "User prefers dark mode and concise responses",
    "user_id": "user-123",
    "metadata": {
      "source": "settings",
      "session_id": "sess-456"
    }
  }'

# Response
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "accepted",
  "message": "Memory queued for processing"
}

Search Memories

The search endpoint uses adaptive hybrid retrieval. It classifies query intent (preference, temporal, relational, or general), then fuses vector, BM25, and graph signals into a single ranking via Reciprocal Rank Fusion (RRF).

POST /v1/memory/search

curl -X POST http://localhost:3000/v1/memory/search \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What does the user prefer?",
    "user_id": "user-123",
    "limit": 5,
    "memory_types": ["semantic", "episodic"]
  }'

# Response — hybrid retrieval fuses vector + BM25 + graph
{
  "results": [
    {
      "id": "550e8400-...",
      "content": "User prefers dark mode and concise responses",
      "memory_type": "semantic",
      "score": 0.94,
      "created_at": "2026-02-15T10:30:00Z"
    }
  ],
  "query_intent": "preference",
  "retrieval_strategy": "vector_primary"
}
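
To make the fusion step concrete, here is a small sketch of Reciprocal Rank Fusion over three ranked lists of memory IDs. It illustrates the general technique only. The constant k = 60 is the value commonly used in the RRF literature and is an assumption here, as are the function name and the sample IDs; Knol's actual parameters and weighting are not documented on this page.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to an item's score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical per-signal rankings (best match first).
vector_hits = ["m1", "m2", "m3"]
bm25_hits = ["m2", "m1", "m4"]
graph_hits = ["m2", "m5"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits, graph_hits])
```

Because RRF uses only rank positions, the three signals do not need comparable score scales, which is the usual reason for choosing it over a weighted sum of raw scores.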

Python SDK

Install with pip install knol. Both sync and async clients are included.

Sync Client

from knol import KnolClient

client = KnolClient(
    base_url="http://localhost:3000",
    api_key="your-api-key"
)

# Store — extraction & embedding happen automatically
memory = client.add(
    content="User prefers dark mode and concise responses",
    user_id="user-123",
    metadata={"source": "settings"}
)

# Search — hybrid retrieval (vector + BM25 + graph)
results = client.search(
    query="user preferences",
    user_id="user-123",
    limit=5
)

# Knowledge graph
entities = client.list_entities(user_id="user-123")

# CRUD
memory = client.get(memory_id="550e8400-...")
client.update(memory_id="550e8400-...", content="Updated content")
client.delete(memory_id="550e8400-...")

Async Client

from knol import AsyncKnolClient
import asyncio

async def main():
    client = AsyncKnolClient(
        base_url="http://localhost:3000",
        api_key="your-api-key"
    )

    # Async store + search
    await client.add(
        content="User prefers TypeScript and functional patterns",
        user_id="user-456"
    )

    results = await client.search(
        query="programming preferences",
        user_id="user-456"
    )

asyncio.run(main())

TypeScript SDK

Install with npm install @knol-dev/sdk. Fully typed with TypeScript generics.

import { KnolClient } from '@knol-dev/sdk';

const knol = new KnolClient({
  baseUrl: 'http://localhost:3000',
  apiKey: 'your-api-key',
});

// Store
await knol.add({
  content: 'User prefers TypeScript and functional patterns',
  userId: 'user-123',
});

// Search
const results = await knol.search({
  query: 'programming preferences',
  userId: 'user-123',
});

// Knowledge graph
const entities = await knol.listEntities({ userId: 'user-123' });

Framework Integrations

Drop-in memory backends for popular AI agent frameworks.

LangChain

from knol.langchain import KnolMemory
from langchain.agents import AgentExecutor

memory = KnolMemory(
    base_url="http://localhost:3000",
    api_key="your-api-key",
    user_id="user-123"
)

# Use as LangChain memory backend
agent = AgentExecutor(
    ...,
    memory=memory
)

CrewAI

from knol.crewai import KnolMemory
from crewai import Crew

memory = KnolMemory(
    base_url="http://localhost:3000",
    api_key="your-api-key"
)

crew = Crew(
    agents=[...],
    tasks=[...],
    memory=memory
)

Error Codes

All errors are returned as JSON with a consistent format. The response body always contains an error field with a descriptive message.

Status	Code	Description
400	Bad Request	Validation error (malformed JSON, missing required fields)
401	Unauthorized	Missing or invalid API key
402	Payment Required	Plan limit exceeded
403	Forbidden	API key role insufficient for this operation
404	Not Found	Resource does not exist
429	Too Many Requests	Rate limit exceeded
500	Internal Server Error	Unexpected server error

Error Response Format

{
  "error": "Missing required field: content"
}
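
One way a client might consume this format is sketched below: the error field is lifted into an exception that also records the HTTP status. The class KnolAPIError, the retryable property, and the function parse_error are hypothetical names for illustration and are not part of the Knol SDK. Treating 429 and 5xx as retryable is a common client convention, not documented Knol behavior.

```python
import json


class KnolAPIError(Exception):
    """Hypothetical exception type; not part of the Knol SDK."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message

    @property
    def retryable(self) -> bool:
        # 429 and 5xx may succeed later; other 4xx need a changed request.
        return self.status == 429 or self.status >= 500


def parse_error(status: int, body: bytes) -> KnolAPIError:
    """Turn an error response into an exception, tolerating non-JSON bodies."""
    try:
        message = json.loads(body)["error"]
    except (ValueError, KeyError, TypeError):
        # Fall back to the raw text, e.g. an HTML page from a proxy.
        message = body.decode("utf-8", errors="replace")
    return KnolAPIError(status, str(message))


err = parse_error(400, b'{"error": "Missing required field: content"}')
```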

Rate Limiting

Rate limits are enforced per-tenant at the gateway level. When you exceed the limit, the API returns HTTP 429 with an error response.

Plan	Requests/Minute
Free	10
Developer	100
Pro	500
Team	2,000
Enterprise	10,000

Rate Limit Exceeded Response

{
  "error": "Rate limit exceeded. Maximum 10 requests per minute on Free plan"
}
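
A client can absorb occasional 429 responses by retrying with exponential backoff. The sketch below assumes a caller-supplied send function that returns an (HTTP status, decoded body) pair; that signature, the name call_with_backoff, and the delay values are all illustrative assumptions. This page does not document a Retry-After header, so the wait is computed locally.

```python
import random
import time
from typing import Callable, Tuple

Response = Tuple[int, dict]  # (HTTP status, decoded JSON body)


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Retry `send` while it returns 429, doubling the wait each attempt."""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, body
        # Exponential backoff with jitter so clients do not retry in lockstep.
        delay = base_delay * (2 ** attempt)
        sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("max_attempts must be at least 1")
```

The sleep parameter exists so tests can pass a stub instead of actually waiting. If the final attempt is still rate limited, the 429 response is returned to the caller rather than raised.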

Webhooks

Webhooks allow you to receive real-time events from Knol. Events are sent as HTTP POST requests to your configured URL with HMAC-SHA256 signatures for verification.

Webhook Management

GET /v1/webhooks — List all webhooks for your tenant
POST /v1/webhooks — Create a new webhook subscription
DELETE /v1/webhooks/:id — Remove a webhook

Event Types

memory.created — New memory stored and accepted
entity.created — New entity extracted and added to graph
edge.created — New relationship discovered between entities
conflict.detected — Memory conflict identified and flagged for review

Signature Verification

Each webhook request includes an X-Webhook-Signature header containing an HMAC-SHA256 signature of the request body using your webhook secret. Verify this signature to confirm the request came from Knol.
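
A receiving endpoint might verify the signature as in the sketch below. It assumes the header carries a hex-encoded digest and that the secret is a UTF-8 string; neither encoding is specified on this page, so confirm both against a real delivery. The function name, sample secret, and sample payload are placeholders.

```python
import hashlib
import hmac


def verify_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Recompute the HMAC-SHA256 of the raw body and compare in constant time."""
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    # Assumption: the header value is a hex digest (case-insensitive).
    received = signature_header.strip().lower()
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


# Placeholder values; a real handler reads these from the incoming request.
secret = "example-webhook-secret"
body = b'{"event": "memory.created"}'
signature = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()

is_valid = verify_signature(secret, body, signature)
```

Verify against the raw request bytes, before any JSON parsing, since re-serializing the payload can change whitespace or key order and invalidate the digest.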

POST /v1/webhooks

curl -X POST http://localhost:3000/v1/webhooks \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-app.com/webhook",
    "events": ["memory.created", "conflict.detected"],
    "active": true
  }'

# Response
{
  "id": "webhook-550e8400-e29b",
  "url": "https://your-app.com/webhook",
  "events": ["memory.created", "conflict.detected"],
  "active": true,
  "created_at": "2026-02-15T10:30:00Z"
}

Batch Operations

Store multiple memories in a single request. Batch operations are processed asynchronously and return immediately with the accepted IDs. Sending one batch avoids the per-request overhead of storing each memory individually.

POST /v1/memory/batch

curl -X POST http://localhost:3000/v1/memory/batch \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "memories": [
      {
        "content": "User prefers dark mode",
        "user_id": "user-123",
        "metadata": {"source": "settings"}
      },
      {
        "content": "User knows TypeScript and Rust",
        "user_id": "user-123",
        "metadata": {"source": "profile"}
      }
    ]
  }'

# Response
{
  "inserted": 2,
  "ids": ["550e8400-e29b-41d4-...", "550e8400-e29c-41d4-..."],
  "status": "accepted"
}
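
When loading a large backlog, a client may want to split it into several batch requests. The helper below is a sketch that yields request bodies in the shape shown above. The name batch_payloads and the default batch size of 100 are assumptions; this page does not state a maximum batch size.

```python
from typing import Iterable, Iterator


def batch_payloads(memories: Iterable[dict], batch_size: int = 100) -> Iterator[dict]:
    """Yield {"memories": [...]} bodies holding at most batch_size items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    chunk: list[dict] = []
    for memory in memories:
        chunk.append(memory)
        if len(chunk) == batch_size:
            yield {"memories": chunk}
            chunk = []
    if chunk:
        # Emit the final, possibly shorter, batch.
        yield {"memories": chunk}


items = [{"content": f"fact {i}", "user_id": "user-123"} for i in range(5)]
payloads = list(batch_payloads(items, batch_size=2))
```

Each yielded dictionary can be JSON-encoded and posted to /v1/memory/batch as-is.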

Configuration

Configuration follows a three-tier hierarchy: database (system_config table) → environment variable → compiled default. Runtime changes via the admin API take effect without restarts.

Variable	Default	Description
DATABASE_URL	postgres://memory:memory_dev@localhost/memory	PostgreSQL + pgvector connection
REDIS_URL	redis://localhost:6379	Redis for rate limiting & caching
NATS_URL	nats://localhost:4222	NATS JetStream for async pipeline
LLM_PROVIDER	anthropic	LLM provider (anthropic, openai, gemini)
LLM_API_KEY	(required)	API key for LLM extraction
LLM_MODEL	claude-haiku	Model for entity extraction
JWT_SECRET	(required)	JWT signing key (min 32 chars)
GATEWAY_PORT	3000	Gateway listen port
WEBHOOK_ENABLED	true	Enable webhook event dispatch
RUST_LOG	info	Log level (trace, debug, info, warn, error)
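
The lookup order can be pictured with the sketch below. It reads the arrows in the hierarchy as precedence, so a database value wins over an environment variable, which wins over the compiled default. That reading is an assumption, and the function resolve and the in-memory db_config mapping are illustrative stand-ins for the system_config table, not Knol's implementation.

```python
import os
from typing import Mapping, Optional

# Compiled defaults, copied from a subset of the table above.
DEFAULTS = {
    "REDIS_URL": "redis://localhost:6379",
    "GATEWAY_PORT": "3000",
    "RUST_LOG": "info",
}


def resolve(key: str, db_config: Mapping[str, str], env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first value found: database row, then environment, then default."""
    env = os.environ if env is None else env
    if key in db_config:
        return db_config[key]
    if key in env:
        return env[key]
    if key in DEFAULTS:
        return DEFAULTS[key]
    # Variables marked (required) in the table have no default to fall back on.
    raise KeyError(f"{key} is required and has no default")
```

Passing env explicitly keeps the function testable without touching the real process environment.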

Architecture

Knol uses a microservice architecture with each concern isolated in its own Rust binary. All services share a single PostgreSQL database with pgvector for vector storage.

┌──────────────┐      ┌──────────────┐      ┌────────────────┐
│   Gateway    │─────▶│  Write Svc   │─────▶│  NATS Stream   │
│  auth/rate   │      │  episodes    │      │  extraction    │
│  /metrics    │      │  webhooks    │      │  embedding     │
└──────┬───────┘      └──────────────┘      └───────┬────────┘
       │                                            │
       │             ┌──────────────┐      ┌────────▼────────┐
       └────────────▶│ Retrieve Svc │      │   Graph Svc     │
                     │ vector+BM25  │      │  LLM extraction │
                     │  RRF fusion  │      │  conflict detect│
                     │  graph walk  │      │  embedding gen  │
                     └──────────────┘      └─────────────────┘
                            │
                     ┌──────▼──────┐
                     │ PostgreSQL  │
                     │  + pgvector │
                     └─────────────┘
