Knowledge Bases

Knowledge bases provide semantic search over your document content. They store chunked and embedded text from one or more sources for retrieval-augmented generation (RAG). When you add a source to a knowledge base, the platform chunks the source’s page texts, generates vector embeddings, and stores them for similarity search.

Common Patterns

Create a knowledge base, add one or more sources to trigger indexing, then use the search endpoint to query. Check indexing status by fetching the knowledge base details. Reindex when you change chunking parameters or want to re-process sources.
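The flow above depends on knowing when indexing has finished. Here is a minimal sketch of the polling step. It assumes you poll the source list endpoint (GET /api/knowledge-bases/{id}/sources, documented further down) rather than the KB details, because that response shape is shown on this page. It treats indexed, failed, and cancelled as terminal. The function name, the list_sources callable, and the timing parameters are hypothetical and not part of the API.

```python
import time
from typing import Callable

# Terminal values of index_status, per the source list endpoint.
TERMINAL = {"indexed", "failed", "cancelled"}


def wait_for_indexing(
    list_sources: Callable[[], dict],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until every listed source has a terminal index_status.

    `list_sources` is a hypothetical callable returning the parsed JSON of
    one GET /api/knowledge-bases/{id}/sources call. Only the page it returns
    is inspected, so pass a large enough limit for your KB.
    Returns a mapping of index_status -> count from the final poll.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        counts: dict = {}
        for item in list_sources()["items"]:
            status = item["index_status"]
            counts[status] = counts.get(status, 0) + 1
        if all(status in TERMINAL for status in counts):
            return counts
        if time.monotonic() >= deadline:
            raise TimeoutError(f"indexing still in progress: {counts}")
        sleep(poll_seconds)
```

A caller would typically check the returned counts for a failed entry before running the first search.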

GET /api/knowledge-bases

List all knowledge bases.

response = requests.get(f"{BASE_URL}/api/knowledge-bases", headers=headers)

const res = await fetch(`${BASE_URL}/api/knowledge-bases`, { headers });

curl '{BASE_URL}/api/knowledge-bases' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}"

POST /api/knowledge-bases

Create a new knowledge base. Only name is required. You can also pass indexing_config (how sources are chunked/embedded) and retrieval_config (how the KB is searched); anything you supply is merged over the strategy’s defaults rather than replacing them. The retrieval-side features (reranking, query enrichment, and multimodal retrieval) all live inside retrieval_config and are persisted on the KB, so the minimal example and the fully-configured example below differ only in how much of that object you fill in.

{
  "name": "Product Docs",
  "description": "Product documentation"
}

{
  "name": "Product Docs",
  "description": "Product documentation",
  "indexing_config": {
    "strategy": "chunk_embed",
    "chunk_size": 2000,
    "overlap": 50,
    "embedding_model": "text-embedding-3-small"
  },
  "retrieval_config": {
    "method": "hybrid",
    "top_k": 5,
    "vector_weight": 0.5,
    "context_mode": "image",
    "reranker": {
      "model": "cohere/rerank-english-v3.0",
      "candidate_count": 20
    },
    "query_enrichment": {
      "enabled": true,
      "model": "gpt-5-mini"
    }
  }
}

| `retrieval_config` field | Type | Notes |
| --- | --- | --- |
| `method` | string | `vector_search` / `full_text` / `hybrid` / `tree_search`. |
| `top_k` | int | Results returned after any reranking. |
| `vector_weight` | float | Hybrid only: balance of semantic vs keyword (default `0.5`). |
| `context_mode` | string | `text` (default) or `image` for multimodal retrieval. All strategies except Doc2JSON. |
| `reranker` | object | `{ model, candidate_count }`. Present ⇒ reranking on; omit ⇒ off. |
| `query_enrichment` | object | `{ enabled, model }`. LLM query rewriting; `enabled` defaults to `false`. |

All of these are editable later via PATCH (see below). Reranking, query enrichment, and context_mode take effect on the next search with no reindex. Changing indexing_config (chunking, embedding model, strategy) requires a reindex to take effect. See Knowledge bases & indexing for the per-strategy indexing_config fields.
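The field rules in the table can be captured in a small client-side builder. This is a sketch, not a platform helper: build_retrieval_config and its parameter names are hypothetical, and the checks only mirror what the table states (the four method values, vector_weight being hybrid-only, and reranking being on exactly when the reranker object is present).

```python
from typing import Optional

# Values accepted by retrieval_config.method, per the table above.
METHODS = {"vector_search", "full_text", "hybrid", "tree_search"}


def build_retrieval_config(
    method: str = "hybrid",
    top_k: int = 5,
    vector_weight: Optional[float] = None,
    reranker_model: Optional[str] = None,
    candidate_count: int = 20,
    enrich_queries: bool = False,
) -> dict:
    """Assemble a retrieval_config dict for a create or PATCH body."""
    if method not in METHODS:
        raise ValueError(f"unknown retrieval method: {method}")
    config: dict = {"method": method, "top_k": top_k}
    if vector_weight is not None:
        if method != "hybrid":
            raise ValueError("vector_weight applies to hybrid retrieval only")
        config["vector_weight"] = vector_weight
    # Reranking is on only when the reranker object is present.
    if reranker_model is not None:
        config["reranker"] = {
            "model": reranker_model,
            "candidate_count": candidate_count,
        }
    # `enabled` defaults to false, so send the object only to turn it on.
    if enrich_queries:
        config["query_enrichment"] = {"enabled": True}
    return config
```

The result goes under the retrieval_config key of the request body, next to name.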

response = requests.post(f"{BASE_URL}/api/knowledge-bases", headers=headers, json={"name": "Product Docs"})

const res = await fetch(`${BASE_URL}/api/knowledge-bases`, { method: "POST", headers, body: JSON.stringify({ name: "Product Docs" }) });

curl -X POST '{BASE_URL}/api/knowledge-bases' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}" -H "Content-Type: application/json" -d '{"name": "Product Docs"}'

GET /api/knowledge-bases/{id}

Get a knowledge base with its indexed sources and status.

id (string, required): KB ID

response = requests.get(f"{BASE_URL}/api/knowledge-bases/{kb_id}", headers=headers)

const res = await fetch(`${BASE_URL}/api/knowledge-bases/${kbId}`, { headers });

curl '{BASE_URL}/api/knowledge-bases/{id}' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}"

PATCH /api/knowledge-bases/{id}

Update knowledge base configuration or strategy. Accepts the same name, description, indexing_config, and retrieval_config fields as create. Send the full retrieval_config object you want; it is stored as-is. Reranking, query enrichment, and context_mode changes apply to the next search immediately; indexing_config changes need a reindex to take effect.

id (string, required): KB ID

{
  "retrieval_config": {
    "method": "hybrid",
    "top_k": 5,
    "reranker": { "model": "voyage/rerank-2.5", "candidate_count": 30 },
    "query_enrichment": { "enabled": true }
  }
}

response = requests.patch(f"{BASE_URL}/api/knowledge-bases/{kb_id}", headers=headers, json={"description": "Updated"})

await fetch(`${BASE_URL}/api/knowledge-bases/${kbId}`, { method: "PATCH", headers, body: JSON.stringify({ description: "Updated" }) });

curl -X PATCH '{BASE_URL}/api/knowledge-bases/{id}' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}" -H "Content-Type: application/json" -d '{"description": "Updated"}'
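Because retrieval_config is stored as-is, a PATCH that carries only the keys you want to change would presumably drop the rest. One way to avoid that is to start from the config you last read and send the complete result. The sketch below is a pure helper under that assumption; updated_retrieval_config is a hypothetical name, and it assumes you already hold the KB's current retrieval_config as a dict.

```python
import copy


def updated_retrieval_config(current: dict, changes: dict) -> dict:
    """Return the complete retrieval_config to send in a PATCH body.

    `current` is the retrieval_config last read for the KB; `changes` holds
    the top-level keys to set. A value of None removes the key, which is how
    reranking is switched off (the reranker object is omitted).
    """
    merged = copy.deepcopy(current)
    for key, value in changes.items():
        if value is None:
            merged.pop(key, None)
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

The returned dict is what you would place under retrieval_config in the PATCH body; the input is left unmodified.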

DELETE /api/knowledge-bases/{id}

Delete a knowledge base and all its indexed data.

id (string, required): KB ID

response = requests.delete(f"{BASE_URL}/api/knowledge-bases/{kb_id}", headers=headers)

await fetch(`${BASE_URL}/api/knowledge-bases/${kbId}`, { method: "DELETE", headers });

curl -X DELETE '{BASE_URL}/api/knowledge-bases/{id}' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}"

GET /api/knowledge-bases/{id}/sources

Paginated, filterable, sortable list of the sources indexed into a knowledge base. Each item joins the indexed_sources row with its underlying sources row, returning both index status and source metadata in one response.

id (string, required): KB ID

(string): Case-insensitive substring match on the source’s name.

status (string): Exact match on index_status (pending, indexing, indexed, failed, cancelled).

sort (string): name or created_at. When omitted, sorts failed-first then newest-source-first, which surfaces problems at the top of a UI list.

(string): asc or desc (default desc). Ignored when sort is omitted.

limit (integer): Page size. Defaults to 50, capped at 200.

offset (integer): Skip the first N rows. Defaults to 0.

response = requests.get(f"{BASE_URL}/api/knowledge-bases/{kb_id}/sources", headers=headers, params={"status": "failed", "limit": 20})
print(response.json())

const res = await fetch(`${BASE_URL}/api/knowledge-bases/${kbId}/sources?status=failed&limit=20`, { headers });
const { items, total } = await res.json();

curl '{BASE_URL}/api/knowledge-bases/{id}/sources?status=failed&limit=20' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}"

Response:

{
  "items": [
    {
      "id": "indexed-source-uuid",
      "source_id": "source-uuid",
      "index_status": "failed",
      "indexed_at": null,
      "stats": {},
      "error_message": "embedding provider rate-limited",
      "source_name": "manual.pdf",
      "file_type": "application/pdf",
      "source_created_at": "2026-01-01T00:00:00Z"
    }
  ],
  "total": 1,
  "limit": 20,
  "offset": 0
}

id is the indexed_sources.id; use it with the cancel, reindex, and DELETE endpoints below. source_id is the ID of the underlying ai.sources row.
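Since the page size is capped at 200, listing every source in a large KB means walking offset pages until total rows have been seen. A minimal sketch follows; iter_sources and the fetch_page callable are hypothetical names, and fetch_page is assumed to wrap one GET call with the given limit and offset and return the parsed JSON.

```python
from typing import Callable, Iterator


def iter_sources(
    fetch_page: Callable[[int, int], dict], limit: int = 200
) -> Iterator[dict]:
    """Yield every source row by walking limit/offset pages.

    `fetch_page(limit, offset)` is a hypothetical callable returning the
    parsed JSON of one GET /api/knowledge-bases/{id}/sources call.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        items = page["items"]
        yield from items
        offset += len(items)
        # Stop on an empty page or once `total` rows have been seen.
        if not items or offset >= page["total"]:
            break
```

Filters such as status=failed can be applied inside fetch_page; the loop only relies on items and total.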

POST /api/knowledge-bases/{id}/sources

Add a source to the knowledge base. Triggers asynchronous indexing.

id (string, required): KB ID

{ "source_id": "source-uuid" }

response = requests.post(f"{BASE_URL}/api/knowledge-bases/{kb_id}/sources", headers=headers, json={"source_id": source_id})

await fetch(`${BASE_URL}/api/knowledge-bases/${kbId}/sources`, { method: "POST", headers, body: JSON.stringify({ source_id: sourceId }) });

curl -X POST '{BASE_URL}/api/knowledge-bases/{id}/sources' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}" -H "Content-Type: application/json" -d '{"source_id": "uuid"}'

POST /api/knowledge-bases/{id}/sources/{indexed_source_id}/cancel

Cancel an in-progress indexing task. Only pending and indexing rows can be cancelled; anything already indexed, failed, or cancelled returns 409. Returns 404 if no indexed_sources row matches the given (kb_id, indexed_source_id) pair.

id (string, required): KB ID

indexed_source_id (string, required): The indexed_sources.id returned when the source was added (not the source UUID itself).

response = requests.post(f"{BASE_URL}/api/knowledge-bases/{kb_id}/sources/{indexed_source_id}/cancel", headers=headers)

await fetch(`${BASE_URL}/api/knowledge-bases/${kbId}/sources/${indexedSourceId}/cancel`, { method: "POST", headers });

curl -X POST '{BASE_URL}/api/knowledge-bases/{id}/sources/{indexed_source_id}/cancel' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}"

DELETE /api/knowledge-bases/{id}/sources/{indexed_source_id}

Remove a source from a knowledge base. The underlying ai.sources row is not touched; only its link to this KB and everything the indexing pipeline produced from it are removed. The source stays available to re-add to other KBs.

Deleting the indexed_sources row cascades through all eight derivative tables: ai.chunks, ai.embeddings, ai.full_documents, ai.doc2json_documents, ai.page_index_toc, ai.page_index_nodes, ai.graph_index_toc, ai.graph_index_nodes. After the call, the source contributes nothing to retrieval in this KB.

If the source is mid-indexing (status pending or indexing with a celery_task_id), the Celery task is revoked before the row is deleted. A revoke failure logs a warning but does not block the deletion; the row is deleted regardless.

Returns 200 with a small JSON body, or 404 if no indexed_sources row matches the given (kb_id, indexed_source_id) pair.

id (string, required): KB ID

indexed_source_id (string, required): The indexed_sources.id returned when the source was added (not the source UUID itself).

requests.delete(f"{BASE_URL}/api/knowledge-bases/{kb_id}/sources/{indexed_source_id}", headers=headers)

await fetch(`${BASE_URL}/api/knowledge-bases/${kbId}/sources/${indexedSourceId}`, { method: "DELETE", headers });

curl -X DELETE '{BASE_URL}/api/knowledge-bases/{id}/sources/{indexed_source_id}' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}"

Response:

{
  "message": "Source removed from knowledge base",
  "deleted_indexed_source_id": "indexed-source-uuid",
  "kb_id": "kb-uuid"
}

POST /api/knowledge-bases/{id}/reindex

Re-index sources in the knowledge base. The body is optional; an empty body re-indexes every source. Pass indexed_source_ids to re-index a specific subset, or failed_only: true to retry only sources currently in failed status.

id (string, required): KB ID

indexed_source_ids (string[]): Restrict to specific indexed_sources.id values. If supplied, this wins and failed_only is ignored.

failed_only (boolean): When true (and indexed_source_ids is empty), re-index only sources currently in failed status. Returns {"status": "noop"} if there are none.

{ "indexed_source_ids": ["uuid-1", "uuid-2"] }

{ "failed_only": true }

# Re-index everything
requests.post(f"{BASE_URL}/api/knowledge-bases/{kb_id}/reindex", headers=headers)

# Retry only failed sources
requests.post(f"{BASE_URL}/api/knowledge-bases/{kb_id}/reindex", headers=headers, json={"failed_only": True})

await fetch(`${BASE_URL}/api/knowledge-bases/${kbId}/reindex`, { method: "POST", headers, body: JSON.stringify({ failed_only: true }) });

curl -X POST '{BASE_URL}/api/knowledge-bases/{id}/reindex' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}" -H "Content-Type: application/json" -d '{"failed_only": true}'
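The precedence between the two body fields is easy to get wrong in client code, so it can help to build the body in one place. A small sketch, with reindex_body as a hypothetical helper name:

```python
from typing import Optional, Sequence


def reindex_body(
    indexed_source_ids: Optional[Sequence[str]] = None,
    failed_only: bool = False,
) -> Optional[dict]:
    """Build the JSON body for POST .../reindex.

    Returns None when no body is needed (re-index every source).
    """
    if indexed_source_ids:
        # An explicit ID list wins; failed_only would be ignored anyway,
        # so it is left out to keep the request unambiguous.
        return {"indexed_source_ids": list(indexed_source_ids)}
    if failed_only:
        return {"failed_only": True}
    return None
```

When the helper returns None, send the POST without a JSON body.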

POST /api/knowledge-bases/{id}/build-bm25

Dispatch a one-shot BM25 rebuild for this KB. Re-tokenizes the entire item table for the KB’s strategy (chunks / full_documents / graph_index_nodes) and writes a fresh BM25 index, replacing whatever was there. This is an operator path: most KBs never need it, but it helps after changing retrieval_config tuning knobs or recovering from a partial index.

id (string, required): KB ID

Only valid for KBs whose retrieval_config.method is hybrid or full_text. Vector-only KBs return 400 (BM25 isn’t part of their retrieval path).

response = requests.post(f"{BASE_URL}/api/knowledge-bases/{kb_id}/build-bm25", headers=headers)
print(response.json())  # {"task_id": "...", "knowledge_base_id": "..."}

const res = await fetch(`${BASE_URL}/api/knowledge-bases/${kbId}/build-bm25`, { method: "POST", headers });
const { task_id } = await res.json();

curl -X POST '{BASE_URL}/api/knowledge-bases/{id}/build-bm25' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}"

Returns 202 + {"task_id": "celery-task-uuid", "knowledge_base_id": "kb-uuid"}. Poll the KB’s bm25_status field via GET /api/knowledge-bases/{id} to observe completion. Returns 503 if the Celery worker can’t be reached.

POST /api/knowledge-bases/{id}/items

Fetch the indexed content items (chunks, nodes, or extracted JSON) for one or more source documents. The response shape depends on the KB’s indexing_config.strategy; every item carries id, source_id, text, and meta, plus strategy-specific extras. Use this endpoint to enumerate everything the platform produced from a source: for export, debugging, or to consume Doc2JSON extracted_json directly without going through retrieval.

id (string, required): KB ID

source_ids (string[], required): Non-empty list of source UUIDs (sources.id). A source that was never added to this KB yields zero rows for that ID rather than an error.

limit (integer): Max items to return. Default 1000, capped at 10000.

offset (integer): Pagination offset. Default 0.

{
  "source_ids": ["src-uuid-1", "src-uuid-2"],
  "limit": 1000,
  "offset": 0
}

The text field is the embeddable representation of the item (full chunk text for chunk_embed, node body for page_index/graph_index, document summary for full_document/doc2json). Strategy-specific extras:

| Strategy | Source table | `text` is | Extra fields |
| --- | --- | --- | --- |
| `chunk_embed` | `ai.chunks` | full chunk text | `chunk_index`, `start_char`, `end_char`, `tokens` |
| `page_index` | `ai.page_index_nodes` | node text | `node_id`, `title`, `depth`, `parent_node_id` |
| `graph_index` | `ai.graph_index_nodes` | node text | `node_id`, `title`, `depth`, `parent_node_id` |
| `full_document` | `ai.full_documents` | document summary | `full_text_path` |
| `doc2json` | `ai.doc2json_documents` | document summary | `extracted_json` |

{
  "items": [
    {
      "id": "doc-uuid",
      "source_id": "src-uuid-1",
      "text": "Q3 2025 financials summary...",
      "meta": { "page_count": 42 },
      "extracted_json": {
        "revenue_usd": 12345678,
        "fiscal_period": "Q3-2025",
        "key_risks": ["supply chain", "fx"]
      }
    }
  ],
  "total": 1,
  "strategy": "doc2json",
  "source_ids": ["src-uuid-1"]
}

{
  "items": [
    {
      "id": "chunk-uuid",
      "source_id": "src-uuid-1",
      "text": "Lorem ipsum...",
      "meta": {},
      "chunk_index": 0,
      "start_char": 0,
      "end_char": 487,
      "tokens": 112
    }
  ],
  "total": 348,
  "strategy": "chunk_embed",
  "source_ids": ["src-uuid-1"]
}

response = requests.post(
    f"{BASE_URL}/api/knowledge-bases/{kb_id}/items",
    headers=headers,
    json={"source_ids": [source_id], "limit": 1000},
)
data = response.json()
for item in data["items"]:
    if data["strategy"] == "doc2json":
        print(item["extracted_json"])
    else:
        print(item["text"])

const res = await fetch(`${BASE_URL}/api/knowledge-bases/${kbId}/items`, {
  method: "POST",
  headers,
  body: JSON.stringify({ source_ids: [sourceId], limit: 1000 }),
});
const data = await res.json();

curl -X POST '{BASE_URL}/api/knowledge-bases/{id}/items' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}" -H "Content-Type: application/json" -d '{"source_ids": ["src-uuid"], "limit": 1000}'
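When several source IDs are requested at once, it is often convenient to regroup the flat items list per source, keeping an empty entry for any source that produced no rows. The sketch below assumes the response's source_ids list echoes the IDs you requested, as in the sample responses; group_items_by_source is a hypothetical helper name.

```python
def group_items_by_source(response: dict) -> dict:
    """Map each source ID in the response to its list of items.

    Sources that produced no rows (for example, never added to this KB)
    map to an empty list instead of being dropped.
    """
    grouped = {source_id: [] for source_id in response["source_ids"]}
    for item in response["items"]:
        # setdefault guards against an item whose source_id is not echoed.
        grouped.setdefault(item["source_id"], []).append(item)
    return grouped
```

An empty list for a source is then a quick signal that it was never indexed into this KB or has not produced items yet.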

POST /api/knowledge-bases/{id}/search

Run a search against the knowledge base.

id (string, required): KB ID

{
  "query": "search text",
  "top_k": 5,
  "retrieval_method": "hybrid",
  "filter_metadata": { "topic": "billing" },
  "similarity_threshold": 0.3,
  "source_ids": ["src-uuid-1", "src-uuid-2"]
}

| Body field | Type | Notes |
| --- | --- | --- |
| `query` | string (required) | Natural-language search query |
| `top_k` | int | Number of results to return. Default `5` (or `KB_DEFAULT_TOP_K` setting) |
| `retrieval_method` | string | `vector_search` / `full_text` / `hybrid` / `tree_search`. Omit to use the KB’s configured default; the response then carries `retrieval_method: "auto"`. |
| `similarity_threshold` | float | Minimum vector score (0–1). Items below this are filtered out. Default `0.0`. |
| `filter_metadata` | object | Narrow to chunks whose enrichment-metadata fields match. Keys are field names from the KB’s enrichment config (see `/enrichment`); values can be scalars or `{ "op": "...", "value": ... }` shapes. |
| `source_ids` | array of UUIDs | Restrict to chunks from this set of sources only. Useful for “search inside one specific document.” |

Per-request overrides for retrieval tuning are also accepted: vector_weight (hybrid only), context_mode (text or image; see Multimodal Retrieval), ts_language (full-text language for stemming), and reranker fields. These override the KB’s stored retrieval_config for this one request only.

The defaults for reranking, query enrichment, and multimodal retrieval (context_mode) are stored on the KB’s retrieval_config. Set them at creation or change them later with PATCH. See the POST /api/knowledge-bases request body above and the knowledge bases concept guide for the full field reference.

response = requests.post(f"{BASE_URL}/api/knowledge-bases/{kb_id}/search", headers=headers, json={"query": "search text", "top_k": 5, "retrieval_method": "hybrid"})

const res = await fetch(`${BASE_URL}/api/knowledge-bases/${kbId}/search`, { method: "POST", headers, body: JSON.stringify({ query: "search text", top_k: 5, retrieval_method: "hybrid" }) });

curl -X POST '{BASE_URL}/api/knowledge-bases/{id}/search' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}" -H "Content-Type: application/json" -d '{"query": "search text", "top_k": 5, "retrieval_method": "hybrid"}'
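A search body usually carries only a few of the optional fields, and leaving retrieval_method out is meaningful because it selects the KB's configured default. The sketch below builds the body and omits anything unset; search_body is a hypothetical helper, and its checks only restate the field table (required query, the four method values, a 0 to 1 threshold).

```python
from typing import Optional, Sequence

# Values accepted by retrieval_method, per the body field table.
METHODS = {"vector_search", "full_text", "hybrid", "tree_search"}


def search_body(
    query: str,
    top_k: int = 5,
    retrieval_method: Optional[str] = None,
    similarity_threshold: Optional[float] = None,
    filter_metadata: Optional[dict] = None,
    source_ids: Optional[Sequence[str]] = None,
) -> dict:
    """Build the JSON body for POST .../search, omitting unset options."""
    if not query.strip():
        raise ValueError("query is required")
    body: dict = {"query": query, "top_k": top_k}
    if retrieval_method is not None:
        if retrieval_method not in METHODS:
            raise ValueError(f"unknown retrieval method: {retrieval_method}")
        body["retrieval_method"] = retrieval_method
    if similarity_threshold is not None:
        if not 0.0 <= similarity_threshold <= 1.0:
            raise ValueError("similarity_threshold must be between 0 and 1")
        body["similarity_threshold"] = similarity_threshold
    if filter_metadata:
        body["filter_metadata"] = dict(filter_metadata)
    if source_ids:
        body["source_ids"] = list(source_ids)
    return body
```

Passing source_ids with a single ID gives the "search inside one document" behaviour described in the table.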

Metadata Enrichment

Metadata enrichment runs an LLM over each indexed item to extract structured fields (text, boolean, number, enum) into a per-KB metadata table. The enriched values can then be used as filters in the /search filter_metadata parameter. PUT triggers re-enrichment only when something material changes (see the endpoint notes below).

PUT /api/knowledge-bases/{id}/enrichment

Create or replace the enrichment config. Re-enrichment behavior depends on what changed:

Changes to fields or llm_model: drop the metadata table, recreate it, and re-enrich every item from scratch. Returns 409 if a run is already in progress.
Toggle of use_multimodal: keep the table, but re-enrich because results differ with/without images. Returns 409 if a run is in progress.
max_tokens-only change: lightweight update, no re-enrichment.
Identical body: no-op.

The response body is { "config": {...}, "re_enrichment_triggered": <bool> }.
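To anticipate what a PUT will do before sending it, the four rules above can be mirrored client-side. This is only a sketch of the documented rules, not the server's logic: classify_enrichment_change and its return labels are hypothetical, use_multimodal is assumed to default to false when absent, and the heaviest applicable rule is assumed to win when several things change at once.

```python
def classify_enrichment_change(old: dict, new: dict) -> str:
    """Predict the effect of replacing enrichment config `old` with `new`."""
    # Field definitions or model changed: table is dropped and rebuilt.
    if old.get("fields") != new.get("fields") or old.get("llm_model") != new.get("llm_model"):
        return "rebuild_table_and_reenrich"
    # Multimodal toggled: table is kept, items are re-enriched.
    if bool(old.get("use_multimodal", False)) != bool(new.get("use_multimodal", False)):
        return "reenrich_keep_table"
    # Only max_tokens differs: lightweight update.
    if old.get("max_tokens") != new.get("max_tokens"):
        return "update_only"
    return "noop"
```

The first two outcomes are the ones that can return 409 while a run is in progress, so a client may want to check the run status before sending those.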

id (string, required): KB ID

fields (object[], required): Field definitions. Each: { name, description, type: "text"|"boolean"|"number"|"enum", enum_values?: string[] }. name must be SQL-safe (alphanumeric + underscores, starts with a letter) and not in the reserved set (id, item_id, item_type, enriched_at, _enrichment_error). enum types require enum_values with at least 2 entries.

llm_model (string): Model identifier (e.g. gpt-4o).

max_tokens (integer): Max tokens per enrichment call.

use_multimodal (boolean): When true, the enricher sees page images alongside text.

{
  "fields": [
    { "name": "topic", "description": "Main topic of the chunk", "type": "text" },
    { "name": "is_legal", "description": "True if discusses legal matters", "type": "boolean" },
    { "name": "severity", "description": "Risk level", "type": "enum", "enum_values": ["low", "medium", "high"] }
  ],
  "llm_model": "gpt-4o",
  "max_tokens": 500,
  "use_multimodal": false
}

requests.put(f"{BASE_URL}/api/knowledge-bases/{kb_id}/enrichment", headers=headers, json={"fields": [...], "llm_model": "gpt-4o"})

await fetch(`${BASE_URL}/api/knowledge-bases/${kbId}/enrichment`, { method: "PUT", headers, body: JSON.stringify({ fields: [...], llm_model: "gpt-4o" }) });

curl -X PUT '{BASE_URL}/api/knowledge-bases/{id}/enrichment' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}" -H "Content-Type: application/json" -d '{"fields": [{"name": "topic", "description": "Main topic", "type": "text"}], "llm_model": "gpt-4o"}'
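The naming and enum rules for field definitions can be checked before the request is sent, which avoids a round trip for a 400. The following is a client-side sketch of those rules only; field_errors is a hypothetical helper, and "alphanumeric" is assumed to mean ASCII letters and digits.

```python
import re

RESERVED = {"id", "item_id", "item_type", "enriched_at", "_enrichment_error"}
TYPES = {"text", "boolean", "number", "enum"}
# Assumption: ASCII letters, digits, underscores; must start with a letter.
NAME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def field_errors(fields: list) -> list:
    """Return a list of human-readable problems; empty means no issue found."""
    errors = []
    for index, field in enumerate(fields):
        name = field.get("name", "")
        if not NAME_RE.match(name):
            errors.append(f"fields[{index}]: name {name!r} is not SQL-safe")
        elif name in RESERVED:
            errors.append(f"fields[{index}]: name {name!r} is reserved")
        field_type = field.get("type")
        if field_type not in TYPES:
            errors.append(f"fields[{index}]: unknown type {field_type!r}")
        elif field_type == "enum" and len(field.get("enum_values") or []) < 2:
            errors.append(f"fields[{index}]: enum needs at least 2 enum_values")
    return errors
```

An empty result does not guarantee acceptance by the server; it only rules out the violations listed in the parameter description.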

GET /api/knowledge-bases/{id}/enrichment

Get the current enrichment config and run status. Returns {"config": null} if no config exists. The config includes status (idle, enriching, completed, completed_with_errors, failed), enriched_count, total_count, and the dynamic metadata_table_name.

id (string, required): KB ID

requests.get(f"{BASE_URL}/api/knowledge-bases/{kb_id}/enrichment", headers=headers)

await fetch(`${BASE_URL}/api/knowledge-bases/${kbId}/enrichment`, { headers });

curl '{BASE_URL}/api/knowledge-bases/{id}/enrichment' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}"
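The status and count fields are enough to drive a progress indicator. Below is a minimal sketch that turns one response into a summary; enrichment_progress is a hypothetical helper, and treating completed, completed_with_errors, and failed as the finished states is an assumption drawn from the status names.

```python
from typing import Optional

# Assumption: these statuses mean the run is over.
FINISHED = {"completed", "completed_with_errors", "failed"}


def enrichment_progress(response: dict) -> Optional[dict]:
    """Summarize a GET .../enrichment response, or None if no config exists."""
    config = response.get("config")
    if config is None:
        return None
    total = config.get("total_count") or 0
    done = config.get("enriched_count") or 0
    return {
        "status": config["status"],
        # Guard against division by zero before the first run.
        "fraction": done / total if total else 0.0,
        "finished": config["status"] in FINISHED,
    }
```

A polling loop can call this on each response and stop once finished is true.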

DELETE /api/knowledge-bases/{id}/enrichment

Remove the enrichment config and drop its metadata table. Returns 404 if no enrichment config exists, or 409 if a run is in progress.

id (string, required): KB ID

requests.delete(f"{BASE_URL}/api/knowledge-bases/{kb_id}/enrichment", headers=headers)

await fetch(`${BASE_URL}/api/knowledge-bases/${kbId}/enrichment`, { method: "DELETE", headers });

curl -X DELETE '{BASE_URL}/api/knowledge-bases/{id}/enrichment' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}"

POST /api/knowledge-bases/{id}/enrichment/run

Manually trigger enrichment. With incremental: true only items missing metadata are processed; with retry_failed: true items previously marked failed are retried.

id (string, required): KB ID

incremental (boolean): Skip items that already have metadata.

retry_failed (boolean): Re-enrich items currently marked failed.

{ "incremental": true, "retry_failed": false }

requests.post(f"{BASE_URL}/api/knowledge-bases/{kb_id}/enrichment/run", headers=headers, json={"incremental": True})

await fetch(`${BASE_URL}/api/knowledge-bases/${kbId}/enrichment/run`, { method: "POST", headers, body: JSON.stringify({ incremental: true }) });

curl -X POST '{BASE_URL}/api/knowledge-bases/{id}/enrichment/run' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}" -H "Content-Type: application/json" -d '{"incremental": true}'

GET /api/knowledge-bases/{id}/enrichment/results

Fetch enriched metadata for specific items. Pass a comma-separated item_ids query parameter. Returns per-item field values plus item_errors for any items that failed enrichment.

id (string, required): KB ID

item_ids (string, required): Comma-separated item UUIDs (chunk/node IDs from /items).

{
  "results": {
    "chunk-uuid-1": { "topic": "billing", "is_legal": false, "severity": "low" }
  },
  "fields": [...],
  "item_errors": {
    "chunk-uuid-2": "LLM returned invalid JSON"
  }
}

ids = ",".join([chunk_a, chunk_b])
requests.get(f"{BASE_URL}/api/knowledge-bases/{kb_id}/enrichment/results?item_ids={ids}", headers=headers)

await fetch(`${BASE_URL}/api/knowledge-bases/${kbId}/enrichment/results?item_ids=${ids}`, { headers });

curl '{BASE_URL}/api/knowledge-bases/{id}/enrichment/results?item_ids=uuid1,uuid2' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}"
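Building the query string with the standard library avoids hand-concatenating IDs into the URL. A small sketch follows; results_url is a hypothetical helper, and keeping the comma unescaped via safe="," is a choice made here to match the curl example, not a documented requirement.

```python
from typing import Sequence
from urllib.parse import urlencode


def results_url(base_url: str, kb_id: str, item_ids: Sequence[str]) -> str:
    """Build the GET .../enrichment/results URL for a set of item IDs."""
    if not item_ids:
        raise ValueError("item_ids must not be empty")
    # safe="," keeps the separator literal, as in the curl example.
    query = urlencode({"item_ids": ",".join(item_ids)}, safe=",")
    return f"{base_url}/api/knowledge-bases/{kb_id}/enrichment/results?{query}"
```

After the call, items present in item_errors rather than results are candidates for POST .../enrichment/run with retry_failed set.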

Graph-index re-enrichment

Specific to KBs whose indexing_config.strategy is graph_index. These endpoints re-run reference enrichment (Stages 2+3) without a full reindex.

POST /api/knowledge-bases/{id}/graph-enrichment/run

Re-run graph reference enrichment. Optionally limit to a single indexed_source_id, or set retry_failed: true to retry only the previously-failed references. Returns 400 if the KB strategy is not graph_index.

id (string, required): KB ID

indexed_source_id (string): Limit to a single source’s references.

retry_failed (boolean): Retry only previously-failed references.

requests.post(f"{BASE_URL}/api/knowledge-bases/{kb_id}/graph-enrichment/run", headers=headers, json={"retry_failed": True})

await fetch(`${BASE_URL}/api/knowledge-bases/${kbId}/graph-enrichment/run`, { method: "POST", headers, body: JSON.stringify({ retry_failed: true }) });

curl -X POST '{BASE_URL}/api/knowledge-bases/{id}/graph-enrichment/run' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}" -H "Content-Type: application/json" -d '{"retry_failed": true}'

GET /api/knowledge-bases/{id}/graph-enrichment/errors

Per-source enrichment error counts. Useful for surfacing which sources need retry. Returns 400 if the KB is not graph_index.

id (string, required): KB ID

requests.get(f"{BASE_URL}/api/knowledge-bases/{kb_id}/graph-enrichment/errors", headers=headers)

await fetch(`${BASE_URL}/api/knowledge-bases/${kbId}/graph-enrichment/errors`, { headers });

curl '{BASE_URL}/api/knowledge-bases/{id}/graph-enrichment/errors' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}"

Discoverable defaults

GET /api/config/kb-defaults

Returns the platform’s KB-creation defaults so a UI or self-service KB-creation flow can present valid options without hardcoding them. The Studio’s KB-create wizard uses this endpoint. Returns:

strategies: map of strategy name → { label, compatible_retrievers, retriever_labels, default_retrieval_method, supports_reranker, default_indexing_config, default_retrieval_config }. The strategies are the indexing strategies described in Knowledge bases & indexing.
reranker: { default_model, candidate_count, options }. options is the list of supported reranker models (Cohere v3, Jina v2, Voyage 2.5, ZeroEntropy zerank-2, etc.).
query_enrichment: { model } (the model used to rewrite/expand user queries).
enrichment: { model, max_tokens } (the LLM used for chunk metadata enrichment).
hybrid_vector_weight: default weight for hybrid search (vector vs. sparse contribution).
extraction: { default_method, fallback_chain, options }. options lists every supported extraction method (mistral, paddleocr, lighton, opendataloader, fitz, pdfplumber, plus auto) with a one-line description of each.

A new strategy, reranker, or extraction method shows up here without any docs update, so this is the place to read what’s selectable when creating or reconfiguring a KB.

defaults = requests.get(f"{BASE_URL}/api/config/kb-defaults", headers=headers).json()
print(list(defaults["strategies"].keys()))
print([opt["value"] for opt in defaults["extraction"]["options"]])

const defaults = await fetch(`${BASE_URL}/api/config/kb-defaults`, { headers }).then(r => r.json());

curl '{BASE_URL}/api/config/kb-defaults' -H "apikey: {API_KEY}" -H "Authorization: Bearer {API_KEY}"

Error Responses

KB routes return {"error": "<message>"} (no structured error code field).

| Status | Description |
| --- | --- |
| 400 | Invalid chunking, embedding, or enrichment field configuration |
| 400 | Endpoint requires a specific `indexing_config.strategy` (e.g. graph-enrichment endpoints require `graph_index`) |
| 404 | No knowledge base or indexed source exists with the given ID |
| 404 | No enrichment config exists (DELETE `/enrichment`, GET `/enrichment/results`) |
| 409 | Indexing for this source has already finished or been cancelled (`/sources/{id}/cancel`) |
| 409 | Cannot update or delete enrichment config while a run is active |
| 503 | Failed to dispatch the indexing task (worker unavailable) |
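Since errors arrive as a bare {"error": "<message>"} object with no code field, client code has to branch on the HTTP status. One way to centralize that is sketched below; KnowledgeBaseError and check_response are hypothetical names, not part of any SDK.

```python
from typing import Any


class KnowledgeBaseError(Exception):
    """Raised for a 4xx/5xx response from a KB route."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def check_response(status: int, body: Any) -> Any:
    """Return `body` on success, otherwise raise KnowledgeBaseError.

    `status` is the HTTP status code and `body` the parsed JSON (or raw
    text if the response was not JSON).
    """
    if status < 400:
        return body
    if isinstance(body, dict):
        message = str(body.get("error", "unknown error"))
    else:
        message = str(body)
    raise KnowledgeBaseError(status, message)
```

With the requests examples on this page, the call would look like check_response(response.status_code, response.json()).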
