Documentation

Complete reference for the Knol REST API, Python SDK, TypeScript SDK, and framework integrations.

Quick Start

Setup
# Option 1: Docker Compose (recommended)
docker compose up -d

# Option 2: pip install
pip install knol

# Option 3: npm install
npm install @knol-dev/sdk

# Option 4: Build from source
cd knol-oss
cargo build --workspace --release

Authentication

Every API request must carry an API key as a bearer token in the Authorization header. Knol stores keys as SHA-256 hashes rather than in plain text. Create keys through the admin dashboard or the admin API.

Authorization: Bearer YOUR_API_KEY
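
To illustrate what SHA-256 hashing of keys means in practice, here is a minimal sketch in Python. It is not Knol's server code: the unsalted hex digest, the constant-time comparison, and the helper names hash_api_key, key_matches, and auth_headers are all assumptions made for illustration.

```python
import hashlib
import hmac


def hash_api_key(api_key: str) -> str:
    """Return the hex SHA-256 digest of a key (assumed storage format)."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def key_matches(presented_key: str, stored_hash: str) -> bool:
    """Compare a presented key against a stored hash in constant time."""
    return hmac.compare_digest(hash_api_key(presented_key), stored_hash)


def auth_headers(api_key: str) -> dict:
    """Build the Authorization header shown above."""
    return {"Authorization": f"Bearer {api_key}"}


stored = hash_api_key("YOUR_API_KEY")
print(key_matches("YOUR_API_KEY", stored))  # True
print(auth_headers("YOUR_API_KEY"))
```

Because only the digest is kept, a lost key cannot be recovered from storage under this scheme; it has to be replaced with a new one.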

API Endpoints

Method  Path                          Description
POST    /v1/memory                    Store a new memory (async extraction)
POST    /v1/memory/batch              Store multiple memories in one call
POST    /v1/memory/search             Hybrid search (vector + BM25 + graph)
GET     /v1/memory/:id                Get a specific memory by ID
PUT     /v1/memory/:id                Update a memory
DELETE  /v1/memory/:id                Delete a memory
GET     /v1/graph/entities            List entities in knowledge graph
GET     /v1/graph/entities/:id/edges  Get entity relationships (N-hop)
POST    /v1/memory/export             Export memories (JSON)
POST    /v1/memory/import             Import memories (JSON)
GET     /v1/admin/memories            List all memories (admin)
GET     /v1/webhooks                  List webhooks
POST    /v1/webhooks                  Create a webhook
DELETE  /v1/webhooks/:id              Delete a webhook
GET     /v1/admin/audit               Browse audit log
GET     /health                       Health check
GET     /metrics                      Prometheus metrics

Store a Memory

The endpoint accepts a memory immediately and processes it asynchronously. In the background, the pipeline extracts entities, generates embeddings, detects conflicts, and fires webhook events.

POST /v1/memory
curl -X POST http://localhost:3000/v1/memory \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "User prefers dark mode and concise responses",
    "user_id": "user-123",
    "metadata": {
      "source": "settings",
      "session_id": "sess-456"
    }
  }'

# Response
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "accepted",
  "message": "Memory queued for processing"
}

Search Memories

The search endpoint uses adaptive hybrid retrieval: it classifies the query intent (preference, temporal, relational, or general), then fuses vector, BM25, and graph signals with Reciprocal Rank Fusion (RRF). The response reports the detected intent in query_intent and the chosen strategy in retrieval_strategy.

POST /v1/memory/search
curl -X POST http://localhost:3000/v1/memory/search \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What does the user prefer?",
    "user_id": "user-123",
    "limit": 5,
    "memory_types": ["semantic", "episodic"]
  }'

# Response — hybrid retrieval fuses vector + BM25 + graph
{
  "results": [
    {
      "id": "550e8400-...",
      "content": "User prefers dark mode and concise responses",
      "memory_type": "semantic",
      "score": 0.94,
      "created_at": "2026-02-15T10:30:00Z"
    }
  ],
  "query_intent": "preference",
  "retrieval_strategy": "vector_primary"
}
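
The Reciprocal Rank Fusion step named above can be illustrated with a short sketch. This is the textbook formulation, not Knol's implementation: each result scores 1 / (k + rank) in every list that contains it, and the scores are summed. The constant k = 60 is a commonly used default, and the function name and the three example rankings are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked ID lists into one list of (id, score) pairs.

    rankings: iterable of lists, each ordered best-first.
    k: smoothing constant; 60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings:
        for rank, memory_id in enumerate(ranked_ids, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-signal rankings for one query.
vector_hits = ["a", "b", "c"]
bm25_hits = ["b", "a", "d"]
graph_hits = ["b", "c"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits, graph_hits])
print([memory_id for memory_id, _ in fused])  # ['b', 'a', 'c', 'd']
```

RRF uses only rank positions, so the three signals do not need comparable score scales before they are combined.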

Python SDK

Install with pip install knol. Both sync and async clients are included.

Sync Client
from knol import KnolClient

client = KnolClient(
    base_url="http://localhost:3000",
    api_key="your-api-key"
)

# Store — extraction & embedding happen automatically
memory = client.add(
    content="User prefers dark mode and concise responses",
    user_id="user-123",
    metadata={"source": "settings"}
)

# Search — hybrid retrieval (vector + BM25 + graph)
results = client.search(
    query="user preferences",
    user_id="user-123",
    limit=5
)

# Knowledge graph
entities = client.list_entities(user_id="user-123")

# CRUD
memory = client.get(memory_id="550e8400-...")
client.update(memory_id="550e8400-...", content="Updated content")
client.delete(memory_id="550e8400-...")

Async Client
from knol import AsyncKnolClient
import asyncio

async def main():
    client = AsyncKnolClient(
        base_url="http://localhost:3000",
        api_key="your-api-key"
    )

    # Async store + search
    await client.add(
        content="User prefers TypeScript and functional patterns",
        user_id="user-456"
    )

    results = await client.search(
        query="programming preferences",
        user_id="user-456"
    )

asyncio.run(main())

TypeScript SDK

Install with npm install @knol-dev/sdk. Fully typed with TypeScript generics.

import { KnolClient } from '@knol-dev/sdk';

const knol = new KnolClient({
  baseUrl: 'http://localhost:3000',
  apiKey: 'your-api-key',
});

// Store
await knol.add({
  content: 'User prefers TypeScript and functional patterns',
  userId: 'user-123',
});

// Search
const results = await knol.search({
  query: 'programming preferences',
  userId: 'user-123',
});

// Knowledge graph
const entities = await knol.listEntities({ userId: 'user-123' });

Framework Integrations

Drop-in memory backends for popular AI agent frameworks.

LangChain
from knol.langchain import KnolMemory
from langchain.agents import AgentExecutor

memory = KnolMemory(
    base_url="http://localhost:3000",
    api_key="your-api-key",
    user_id="user-123"
)

# Use as LangChain memory backend
agent = AgentExecutor(
    ...,
    memory=memory
)

CrewAI
from knol.crewai import KnolMemory
from crewai import Crew

memory = KnolMemory(
    base_url="http://localhost:3000",
    api_key="your-api-key"
)

crew = Crew(
    agents=[...],
    tasks=[...],
    memory=memory
)

Error Codes

All errors are returned as JSON with a consistent format. The response body always contains an error field with a descriptive message.

Status  Code                   Description
400     Bad Request            Validation error (malformed JSON, missing required fields)
401     Unauthorized           Missing or invalid API key
402     Payment Required       Plan limit exceeded
403     Forbidden              API key role insufficient for this operation
404     Not Found              Resource does not exist
429     Too Many Requests      Rate limit exceeded
500     Internal Server Error  Unexpected server error

Error Response Format
{
  "error": "Missing required field: content"
}
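
Since every error body carries an error field, a client can turn any non-2xx response into one exception type. The sketch below uses only the standard library. KnolAPIError and parse_error are hypothetical names, not part of the Knol SDK, and the fallback for a non-JSON body is an assumption.

```python
import json


class KnolAPIError(Exception):
    """Hypothetical exception carrying the HTTP status and error message."""

    def __init__(self, status: int, message: str):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message


def parse_error(status: int, body: str) -> KnolAPIError:
    """Build an exception from an error response body."""
    try:
        message = json.loads(body).get("error", "unknown error")
    except (json.JSONDecodeError, AttributeError):
        # Body was not a JSON object; fall back to the raw text (assumption).
        message = body or "unknown error"
    return KnolAPIError(status, message)


err = parse_error(400, '{"error": "Missing required field: content"}')
print(err)  # HTTP 400: Missing required field: content
```

Callers can then branch on err.status, for example treating 401 and 403 as configuration problems and 429 as a signal to slow down.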

Rate Limiting

Rate limits are enforced per-tenant at the gateway level. When you exceed the limit, the API returns HTTP 429 with an error response.

Plan        Requests/Minute
Free        10
Developer   100
Pro         500
Team        2,000
Enterprise  10,000

Rate Limit Exceeded Response
{
  "error": "Rate limit exceeded. Maximum 10 requests per minute on Free plan"
}
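
A client that hits HTTP 429 can wait and retry. The following is a minimal sketch of exponential backoff, not documented Knol behavior: the delays, the attempt limit, and the send_with_backoff helper are assumptions, and the send callable stands in for whatever HTTP client you use.

```python
import time


def send_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a status other than 429.

    send: callable returning a (status, body) tuple.
    Waits base_delay, 2 * base_delay, 4 * base_delay, ... between attempts.
    Returns the last (status, body) if every attempt is rate limited.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return status, body


# Simulated responses: two rate-limit rejections, then success.
responses = iter([(429, "{}"), (429, "{}"), (200, "ok")])
delays = []
result = send_with_backoff(lambda: next(responses), sleep=delays.append)
print(result, delays)  # (200, 'ok') [1.0, 2.0]
```

On the Free plan (10 requests per minute), spacing calls at least six seconds apart keeps a single client under the limit without retries.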

Webhooks

Webhooks allow you to receive real-time events from Knol. Events are sent as HTTP POST requests to your configured URL with HMAC-SHA256 signatures for verification.

Webhook Management

  • GET /v1/webhooks — List all webhooks for your tenant
  • POST /v1/webhooks — Create a new webhook subscription
  • DELETE /v1/webhooks/:id — Remove a webhook

Event Types

  • memory.created — New memory stored and accepted
  • entity.created — New entity extracted and added to graph
  • edge.created — New relationship discovered between entities
  • conflict.detected — Memory conflict identified and flagged for review

Signature Verification

Each webhook request includes an X-Webhook-Signature header containing an HMAC-SHA256 signature of the request body using your webhook secret. Verify this signature to confirm the request came from Knol.
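
A receiver can check the signature with the Python standard library. This sketch assumes the X-Webhook-Signature header holds a hex-encoded digest with no prefix, computed over the raw request bytes; the text above does not specify the encoding, so confirm it before relying on this. The name verify_signature is hypothetical.

```python
import hashlib
import hmac


def verify_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Return True if signature_header matches HMAC-SHA256(secret, body).

    Assumes a hex-encoded signature; body must be the raw, unparsed bytes.
    """
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(
        expected.encode("utf-8"), signature_header.strip().encode("utf-8")
    )


# Self-check with a locally computed signature (placeholder secret and body).
secret = "your-webhook-secret"
body = b'{"event": "memory.created"}'
signature = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
print(verify_signature(secret, body, signature))  # True
```

Verify against the bytes exactly as received. Re-serializing parsed JSON can change whitespace or key order and make a valid signature fail.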

POST /v1/webhooks
curl -X POST http://localhost:3000/v1/webhooks \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-app.com/webhook",
    "events": ["memory.created", "conflict.detected"],
    "active": true
  }'

# Response
{
  "id": "webhook-550e8400-e29b",
  "url": "https://your-app.com/webhook",
  "events": ["memory.created", "conflict.detected"],
  "active": true,
  "created_at": "2026-02-15T10:30:00Z"
}

Batch Operations

Store multiple memories in a single request. The endpoint processes the batch asynchronously and returns immediately with the accepted IDs, which avoids the per-request overhead of sending each memory individually.

POST /v1/memory/batch
curl -X POST http://localhost:3000/v1/memory/batch \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "memories": [
      {
        "content": "User prefers dark mode",
        "user_id": "user-123",
        "metadata": {"source": "settings"}
      },
      {
        "content": "User knows TypeScript and Rust",
        "user_id": "user-123",
        "metadata": {"source": "profile"}
      }
    ]
  }'

# Response
{
  "inserted": 2,
  "ids": ["550e8400-e29b-41d4-...", "550e8400-e29c-41d4-..."],
  "status": "accepted"
}
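
When importing many items, it can help to build the request bodies programmatically. The sketch below produces payloads in the shape shown above and splits them into chunks. The batch_size of 100 is an arbitrary assumption, since no maximum batch size is stated here, and build_batches is a hypothetical helper.

```python
def build_batches(contents, user_id, source, batch_size=100):
    """Return a list of /v1/memory/batch request bodies.

    batch_size is an assumed chunk size, not a documented limit.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    memories = [
        {"content": text, "user_id": user_id, "metadata": {"source": source}}
        for text in contents
    ]
    return [
        {"memories": memories[start:start + batch_size]}
        for start in range(0, len(memories), batch_size)
    ]


batches = build_batches(
    ["User prefers dark mode", "User knows TypeScript and Rust", "User is in UTC+1"],
    user_id="user-123",
    source="profile",
    batch_size=2,
)
print(len(batches))  # 2
```

Each element of batches can be sent as the JSON body of one POST /v1/memory/batch request.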

Configuration

Configuration follows a three-tier hierarchy: database (system_config table) → environment variable → compiled default. Runtime changes via the admin API take effect without restarts.

Variable         Default                                        Description
DATABASE_URL     postgres://memory:memory_dev@localhost/memory  PostgreSQL + pgvector connection
REDIS_URL        redis://localhost:6379                         Redis for rate limiting & caching
NATS_URL         nats://localhost:4222                          NATS JetStream for async pipeline
LLM_PROVIDER     anthropic                                      LLM provider (anthropic, openai, gemini)
LLM_API_KEY      (required)                                     API key for LLM extraction
LLM_MODEL        claude-haiku                                   Model for entity extraction
JWT_SECRET       (required)                                     JWT signing key (min 32 chars)
GATEWAY_PORT     3000                                           Gateway listen port
WEBHOOK_ENABLED  true                                           Enable webhook event dispatch
RUST_LOG         info                                           Log level (trace, debug, info, warn, error)
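
The three-tier lookup described above can be sketched as a simple fallback chain. This assumes the arrows denote precedence, so a database value overrides an environment variable, which overrides the compiled default. The dictionary standing in for the system_config table and the resolve_setting name are illustrative, not Knol's actual code.

```python
import os

# A few of the compiled defaults from the table above.
COMPILED_DEFAULTS = {
    "GATEWAY_PORT": "3000",
    "RUST_LOG": "info",
    "WEBHOOK_ENABLED": "true",
}


def resolve_setting(name, db_config, env=os.environ):
    """Resolve a setting: database first, then environment, then default.

    db_config: mapping standing in for rows of the system_config table.
    Raises KeyError for required settings that are not set anywhere.
    """
    if name in db_config:
        return db_config[name]
    if name in env:
        return env[name]
    if name in COMPILED_DEFAULTS:
        return COMPILED_DEFAULTS[name]
    raise KeyError(f"{name} is required but not configured")


print(resolve_setting("GATEWAY_PORT", {}, env={}))                        # 3000
print(resolve_setting("GATEWAY_PORT", {}, env={"GATEWAY_PORT": "8080"}))  # 8080
print(resolve_setting("GATEWAY_PORT", {"GATEWAY_PORT": "9000"},
                      env={"GATEWAY_PORT": "8080"}))                      # 9000
```

Putting the database tier first is what would let a change made through the admin API take effect without a restart, because the value is read at lookup time rather than at process start.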

Architecture

Knol uses a microservice architecture with each concern isolated in its own Rust binary. All services share a single PostgreSQL database with pgvector for vector storage.

┌─────────────┐      ┌──────────────┐      ┌────────────────┐
│   Gateway   │─────▶│  Write Svc   │─────▶│  NATS Stream   │
│  auth/rate  │      │  episodes    │      │  extraction    │
│  /metrics   │      │  webhooks    │      │  embedding     │
└──────┬──────┘      └──────────────┘      └───────┬────────┘
       │                                           │
       │             ┌──────────────┐     ┌────────▼────────┐
       └────────────▶│ Retrieve Svc │     │   Graph Svc     │
                     │ vector+BM25  │     │  LLM extraction │
                     │  RRF fusion  │     │  conflict detect│
                     │  graph walk  │     │  embedding gen  │
                     └──────────────┘     └─────────────────┘
                            │
                     ┌──────▼──────┐
                     │ PostgreSQL  │
                     │  + pgvector │
                     └─────────────┘